HBB-MODULATING COMPOSITIONS AND METHODS

Abstract
The disclosure provides, e.g., compositions, systems, and methods for targeting, editing, modifying, or manipulating a host cell's genome at one or more locations in a DNA sequence in a cell, tissue, or subject. Gene modifying systems for treating sickle cell disease (SCD) are described.
Description
SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in XML format compliant with WIPO Standard ST.26 and is hereby incorporated by reference in its entirety. Said XML copy, created on Feb. 27, 2024, is named V2065-702720FT_SL.XML and is 30,054,845 bytes in size.


CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application No. PCT/US2022/076063, filed Sep. 7, 2022, which claims the benefit of U.S. Provisional Application No. 63/241,994, filed Sep. 8, 2021, U.S. Provisional Application No. 63/250,143, filed Sep. 29, 2021, and U.S. Provisional Application No. 63/303,900, filed Jan. 27, 2022. The contents of the aforementioned applications are hereby incorporated by reference in their entirety.


BACKGROUND

Integration of a nucleic acid of interest into a genome occurs at low frequency and with little site specificity, in the absence of a specialized protein to promote the insertion event. Some existing approaches, like CRISPR/Cas9, are more suited for small edits that rely on host repair pathways and are less effective at integrating longer sequences. Other existing approaches, like Cre/loxP, require a first step of inserting a loxP site into the genome and then a second step of inserting a sequence of interest into the loxP site. There is a need in the art for improved compositions (e.g., proteins and nucleic acids) and methods for inserting, altering, or deleting sequences of interest in a genome.


Sickle cell disease is an inherited blood disorder that affects red blood cells. There are several types of sickle cell disease (e.g., hemoglobin SS disease, hemoglobin SC disease, sickle beta-plus thalassemia, and sickle beta-zero thalassemia). People with sickle cell disease have red blood cells that contain mostly hemoglobin S, an abnormal type of hemoglobin. Sickle-shaped cells die prematurely, which can lead to a shortage of red blood cells (anemia). Sickle-shaped cells are rigid and can block small blood vessels, causing severe pain and organ damage. Tissue that does not receive a normal blood flow eventually becomes damaged. This is what causes the complications of sickle cell disease.


The HBB gene provides instructions for making a protein, beta-globin. Beta-globin is a component (subunit) of a larger protein called hemoglobin, which is located inside red blood cells. In adults, hemoglobin normally consists of four protein subunits: two subunits of beta-globin and two subunits of another protein called alpha-globin, which is produced from another gene called HBA. Each of these protein subunits is bound to an iron-containing molecule called heme; each heme contains an iron atom in its center that can bind to one oxygen molecule. Hemoglobin within red blood cells binds to oxygen molecules in the lungs. These cells then travel through the bloodstream and deliver oxygen to tissues throughout the body.


Sickle cell anemia, a common form of sickle cell disease, is caused by a particular mutation in the HBB gene. This mutation results in the production of an abnormal version of beta-globin called hemoglobin S or HbS. In this condition, hemoglobin S replaces both beta-globin subunits in hemoglobin. The mutation changes a single amino acid in beta-globin. Specifically, the amino acid glutamic acid is replaced with the amino acid valine at position 6 in beta-globin, written as Glu6Val or E6V. Replacing glutamic acid with valine causes the abnormal hemoglobin S subunits to stick together and form long, rigid molecules that bend red blood cells into a sickle or crescent shape. Mutations in the HBB gene can also cause other abnormalities in beta-globin, leading to other types of sickle cell disease. In these other types of sickle cell disease, just one beta-globin subunit is replaced with hemoglobin S. The other beta-globin subunit is replaced with a different abnormal variant, such as hemoglobin C or hemoglobin E.
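
As an illustration only, the single amino acid change described above can be sketched in Python. The short coding sequence, the A-to-T change in the GAG codon, and the convention of numbering the mature protein without the initiator methionine are assumptions drawn from commonly cited descriptions of HBB; they are not taken from the tables of this disclosure. The names CODON_TABLE, translate, WILD_TYPE, and SICKLE are hypothetical.

```python
# Minimal sketch (illustrative assumptions, not sequences from the disclosure's tables).
# Partial codon table covering only the codons used in this example.
CODON_TABLE = {
    "ATG": "M", "GTG": "V", "CAT": "H", "CTG": "L",
    "ACT": "T", "CCT": "P", "GAG": "E", "AAG": "K",
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon (complete codons only)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Assumed start of the human HBB coding sequence (initiator ATG included).
WILD_TYPE = "ATGGTGCATCTGACTCCTGAGGAGAAG"

# Position 6 of mature beta-globin is the seventh codon when the initiator
# methionine is counted, i.e. zero-based nucleotides 18-20 (GAG).
# Changing the middle base A to T gives GTG (valine).
SICKLE = WILD_TYPE[:19] + "T" + WILD_TYPE[20:]

wild_type_protein = translate(WILD_TYPE)[1:]  # drop initiator methionine
sickle_protein = translate(SICKLE)[1:]

print(wild_type_protein[5], "->", sickle_protein[5])
```

Under these assumptions, the residue at position 6 of the mature protein changes from glutamic acid (E) to valine (V), while every other residue in the translated window is unchanged.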


There is currently no universal cure for sickle cell disease. The available options for treating sickle cell disease are limited to a bone marrow or stem cell transplant. Accordingly, there is a need for new and more effective treatments for sickle cell disease that target the HBB E6V mutation.


SUMMARY OF THE INVENTION

This disclosure relates to novel compositions, systems, and methods for altering a genome at one or more locations in a host cell, tissue, or subject, in vivo or in vitro. In particular, the invention features compositions, systems, and methods for inserting, altering, or deleting sequences of interest in a host genome. For example, the disclosure provides systems that are capable of modulating (e.g., inserting, altering, or deleting sequences of interest) the HBB gene activity and methods of treating sickle cell disease (SCD) by administering one or more such systems to alter a genomic sequence at an HBB nucleotide to correct a pathogenic mutation causing SCD.


In one aspect, the disclosure relates to a system for modifying DNA to correct a human HBB gene mutation causing SCD comprising (a) a nucleic acid encoding a gene modifying polypeptide capable of target primed reverse transcription, the polypeptide comprising (i) a reverse transcriptase domain and (ii) a Cas9 nickase that binds DNA and has endonuclease activity, and (b) a template RNA comprising (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, (ii) a gRNA scaffold that binds the polypeptide, (iii) a heterologous object sequence comprising a mutation region to correct the mutation, and (iv) a primer binding site (PBS) sequence comprising at least 3, 4, 5, 6, 7, or 8 bases of 100% homology to a target DNA strand at the 3′ end of the template RNA. The HBB gene may comprise an E6V mutation. The template RNA sequence may comprise a sequence described herein, e.g., in Table 1, 3, 4, A, AA, B, B1, 5A-5D, X4, or X4A.
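
By way of a non-limiting sketch, the four-part template RNA described above can be modeled as an ordered concatenation of its segments. The class name TemplateRNA, its field names, and the scaffold, heterologous object, and PBS strings below are hypothetical placeholders chosen for illustration; only the example spacer (SEQ ID NO: 21668) appears in this disclosure, and the 5′ to 3′ order shown is the exemplary order recited in the enumerated embodiments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TemplateRNA:
    """Hypothetical container for the four template RNA segments, 5' to 3'."""
    spacer: str
    scaffold: str
    heterologous_object: str
    pbs: str

    def segments(self) -> List[Tuple[str, str]]:
        # Exemplary 5' to 3' order: spacer, scaffold, heterologous object, PBS.
        return [
            ("spacer", self.spacer),
            ("scaffold", self.scaffold),
            ("heterologous_object", self.heterologous_object),
            ("pbs", self.pbs),
        ]

    def sequence(self) -> str:
        """Concatenate the segments and write the result in RNA letters."""
        joined = "".join(seq for _, seq in self.segments())
        return joined.upper().replace("T", "U")

    def layout(self) -> List[Tuple[str, int, int]]:
        """Return (name, start, end) per segment, 0-based and half-open."""
        out, offset = [], 0
        for name, seq in self.segments():
            out.append((name, offset, offset + len(seq)))
            offset += len(seq)
        return out


example = TemplateRNA(
    spacer="CATGGTGCATCTGACTCCTG",      # SEQ ID NO: 21668, written as DNA
    scaffold="GGGAAACCC",               # placeholder, not a Table 12 scaffold
    heterologous_object="ACGTACGTAC",   # placeholder RT template
    pbs="ACGTACGT",                     # placeholder 8-base PBS
)
print(example.sequence())
print(example.layout())
```

Keeping the segments separate until the final concatenation makes it straightforward to swap one segment (for example, a different PBS length) without touching the others.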


The gRNA spacer may comprise at least 15 bases of 100% homology to the target DNA at the 5′ end of the template RNA. The template RNA may further comprise a PBS sequence comprising at least 5 bases of at least 80% homology to the target DNA strand. The template RNA may comprise one or more chemical modifications.
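
The length and homology thresholds above (e.g., at least 15 bases of 100% homology, or at least 5 bases of at least 80% homology) can be checked computationally. The following is a minimal sketch under simplifying assumptions: both sequences are written in the same orientation, U is treated as T, and homology is scored as position-by-position identity over an ungapped window. The function names and the reference string are hypothetical and are not taken from this disclosure.

```python
def normalize(seq: str) -> str:
    """Upper-case the sequence and treat RNA U as DNA T for comparison."""
    return seq.upper().replace("U", "T")


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def has_homologous_window(query: str, reference: str,
                          min_len: int, min_identity: float = 1.0) -> bool:
    """True if any min_len window of query matches some ungapped window of
    reference with at least min_identity fractional identity."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    q, r = normalize(query), normalize(reference)
    for i in range(len(q) - min_len + 1):
        window = q[i:i + min_len]
        for j in range(len(r) - min_len + 1):
            if identity(window, r[j:j + min_len]) >= min_identity:
                return True
    return False


# Hypothetical reference strand used only to exercise the functions.
reference = "GGACCATGGTGCATCTGACTCCTGTGGAGAA"

# Spacer check: at least 15 bases of 100% identity.
print(has_homologous_window("CAUGGUGCAUCUGACUCCUG", reference, 15))

# PBS-style check: at least 5 bases of at least 80% identity.
print(has_homologous_window("GGTCC", "AAGGACCAA", 5, 0.8))
```

The brute-force window scan is adequate for sequences of the lengths discussed here; a sequence alignment tool would be a more appropriate choice if gaps or genome-scale references were involved.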


The domains of the gene modifying polypeptide may be joined by a peptide linker. The polypeptide may comprise one or more peptide linkers. The gene modifying polypeptide may further comprise a nuclear localization signal. The polypeptide may comprise more than one nuclear localization signal, e.g., multiple adjacent nuclear localization signals or one or more nuclear localization signals in different regions of the polypeptide, e.g., one or more nuclear localization signals in the N-terminus of the polypeptide and one or more nuclear localization signals in the C-terminus of the polypeptide. The nucleic acid encoding the gene modifying polypeptide may encode one or more intein domains.


Introduction of the system into a target cell may result in insertion of at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 500, or 1000 base pairs of exogenous DNA. Introduction of the system into a target cell may result in deletion, wherein the deletion is less than 2, 3, 4, 5, 10, 50, or 100 base pairs of genomic DNA upstream or downstream of the insertion. Introduction of the system into a target cell may result in substitution, e.g., substitution of 1, 2, or 3 nucleotides, e.g., consecutive nucleotides.


The heterologous object sequence may be at least 5, 10, 25, 50, 100, 150, 200, 250, 300, 400, 500, 600, or 700 base pairs.


In one aspect, the disclosure relates to a pharmaceutical composition comprising the system described above and a pharmaceutically acceptable excipient or carrier, wherein the pharmaceutically acceptable excipient or carrier is selected from the group consisting of a plasmid vector, a viral vector, a vesicle, and a lipid nanoparticle. In one aspect, the disclosure relates to a pharmaceutical composition comprising the system described above and multiple pharmaceutically acceptable excipients or carriers, wherein the pharmaceutically acceptable excipients or carriers are selected from the group consisting of a plasmid vector, a viral vector, a vesicle, and a lipid nanoparticle, e.g., where the system described above is delivered by two distinct excipients or carriers, e.g., two lipid nanoparticles, two viral vectors, or one lipid nanoparticle and one viral vector. The viral vector may be an adeno-associated virus (AAV).


In one aspect, the disclosure relates to a host cell (e.g., a mammalian cell, e.g., a human cell) comprising the system described above.


In one aspect, the disclosure relates to a method of correcting a mutation in the human HBB gene in a cell, tissue or subject, the method comprising administering the system described above to the cell, tissue or subject, wherein optionally the correction of the mutant HBB gene comprises an amino acid substitution of V6E (reversing the pathogenic substitution, which is E6V). The system may be introduced in vivo, in vitro, ex vivo, or in situ. The nucleic acid of (a) may be integrated into the genome of the host cell. In some embodiments, the nucleic acid of (a) is not integrated into the genome of the host cell. In some embodiments, the heterologous object sequence is inserted at only one target site in the host cell genome. The heterologous object sequence may be inserted at two or more target sites in the host cell genome, e.g., at the same corresponding site in two homologous chromosomes or at two different sites on the same or different chromosomes. The heterologous object sequence may encode a mammalian polypeptide, or a fragment or a variant thereof. The components of the system may be delivered on 1, 2, 3, 4, or more distinct nucleic acid molecules. The system may be introduced into a host cell by electroporation or by using at least one vehicle selected from a plasmid vector, a viral vector, a vesicle, and a lipid nanoparticle.


Features of the compositions or methods can include one or more of the following enumerated embodiments.


Enumerated Embodiments



  • 1. A template RNA comprising, e.g., from 5′ to 3′:
    • (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a sequence comprising the core nucleotides of a gRNA spacer sequence of Table 1, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer (e.g., comprises one or more flanking nucleotides that are adjacent to the core nucleotides), or wherein the gRNA spacer has a sequence of a spacer chosen from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A;
    • (ii) a gRNA scaffold that binds a gene modifying polypeptide (e.g., binds the Cas domain of the gene modifying polypeptide),
    • (iii) a heterologous object sequence comprising a mutation region to introduce a mutation into (e.g., to correct a mutation in) a second portion of the human HBB gene (wherein optionally the heterologous object sequence comprises, from 5′ to 3′, a post-edit homology region, a mutation region, and a pre-edit homology region), and
    • (iv) a primer binding site (PBS) sequence comprising at least 3, 4, 5, 6, 7, or 8 bases with 100% identity to a third portion of the human HBB gene.

  • 2. The template RNA of embodiment 1, wherein the heterologous object sequence comprises the core nucleotides of an RT template sequence from Table 3, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises a sequence of an RT template sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A.

  • 3. The template RNA of embodiment 1, wherein the heterologous object sequence comprises the core nucleotides of the RT template sequence of Table 3 that corresponds to the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence (e.g., comprises one or more flanking nucleotides that are adjacent to the core nucleotides), or wherein the heterologous object sequence comprises a sequence of an RT template sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the gRNA spacer sequence.

  • 4. The template RNA according to any one of embodiments 1-3 wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence from the same row of Table 3 as the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence (e.g., comprises one or more flanking nucleotides that are adjacent to the core nucleotides).

  • 5. The template RNA according to any one of embodiments 1-3, wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table 3 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, the gRNA spacer sequence, or both, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence, or wherein the PBS sequence has a sequence comprising a sequence of a PBS from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, the gRNA spacer sequence, or both.

  • 6. The template RNA according to any of embodiments 1-5, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 7. The template RNA according to any of embodiments 1-5, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12 that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 8. A template RNA comprising, e.g., from 5′ to 3′:
    • (i) a gRNA spacer that is complementary to a first portion of the human HBB gene,
    • (ii) a gRNA scaffold that binds a gene modifying polypeptide (e.g., binds the Cas domain of the gene modifying polypeptide),
    • (iii) a heterologous object sequence comprising a mutation region to introduce a mutation into (e.g., to correct a mutation in) a second portion of the human HBB gene, wherein the heterologous object sequence comprises the core nucleotides of an RT template sequence of Table 3, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises an RT template sequence of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A; and
    • (iv) a PBS sequence comprising at least 3, 4, 5, 6, 7, or 8 bases of 100% identity to a third portion of the human HBB gene.

  • 9. The template RNA of embodiment 8, wherein the gRNA spacer comprises the core nucleotides of a gRNA spacer sequence of Table 1, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer sequence, or wherein the gRNA spacer comprises a gRNA spacer sequence of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A.

  • 10. The template RNA of any one of embodiments 1-9, wherein the gRNA spacer comprises CATGGTGCATCTGACTCCTG (SEQ ID NO: 21668) or CATGGTGCACCTGACTCCTG (SEQ ID NO: 19249), or a sequence having 1, 2, or 3 substitutions thereto.

  • 11. The template RNA of any one of embodiments 1-9, wherein the gRNA spacer comprises GTAACGGCAGACTTCTCCAC (SEQ ID NO: 19971), or a sequence having 1, 2, or 3 substitutions thereto.

  • 12. The template RNA of embodiment 8, wherein the heterologous object sequence comprises the core nucleotides of the gRNA spacer sequence of Table 1 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer sequence, or wherein the heterologous object sequence comprises the nucleotides of the gRNA spacer sequence of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto.

  • 13. The template RNA according to any one of embodiments 8-12, wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence from the same row of Table 3 as the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence.

  • 14. The template RNA according to any one of embodiments 8-12, wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table 3 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, the gRNA spacer sequence, or both, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence, or wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, the gRNA spacer sequence, or both.

  • 15. The template RNA according to any of embodiments 8-14, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 16. The template RNA according to any of embodiments 8-14, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12 that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 17. The template RNA according to any of the preceding embodiments, wherein the gRNA spacer has a sequence of a gRNA spacer sequence of Table A, or Table B, or a sequence having 1, 2, or 3 substitutions thereto.

  • 18. The template RNA according to embodiment 17, wherein the gRNA spacer has a sequence of SEQ ID NO: 21668.

  • 19. The template RNA of embodiment 17 or 18, wherein the PBS sequence has a sequence of a PBS sequence from the same row of Table A or B as the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto.

  • 20. The template RNA of any of embodiments 17-19, wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence of SEQ ID NO:21669, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence.

  • 21. The template RNA of any of embodiments 17-19, wherein the gRNA scaffold has a sequence of a gRNA scaffold from the same row of Table A or B as the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto.

  • 22. The template RNA of any of embodiments 17-20, wherein the heterologous object sequence has a sequence of the RT template sequence from the same row of Table A or B as the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto, wherein optionally the bolded T shown in the RT template sequence of Table A is replaced with a G (e.g., a sequence without a PAM-kill mutation), or wherein further optionally the bolded C shown in the RT template of Table B is replaced with a T or U (e.g., a sequence without a SNP that is present in HEK293T cells but absent in the hg38 human reference genome).

  • 23. The template RNA of any of embodiments 17-22, wherein the heterologous object sequence has a sequence comprising the core nucleotides of the RT template sequence of SEQ ID NO:21670, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence.

  • 24. The template RNA of any of embodiments 17-23, wherein the heterologous object sequence has a sequence comprising the core nucleotides of the RT template sequence of SEQ ID NO:21671, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence.

  • 25. The template RNA of any of embodiments 17-24, wherein the template RNA has a sequence of a template RNA of Table A or Table B, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto, wherein optionally the template RNA comprises one or more (e.g., all) chemical modifications shown in the sequence of Table A or Table B.

  • 26. A gene modifying system for modifying DNA, comprising:
    • (a) a first RNA comprising, from 5′ to 3′, (i) a guide RNA sequence that is complementary to a first portion of the human HBB gene, wherein the guide RNA sequence has a sequence comprising the core nucleotides of a spacer sequence of Table 1, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the guide RNA sequence, or wherein the guide RNA sequence has a sequence comprising a spacer from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A; and (ii) a sequence (e.g., a scaffold region) that binds a gene modifying polypeptide (e.g., binds the Cas domain of the gene modifying polypeptide), and
    • (b) a second RNA comprising (iii) a heterologous object sequence comprising a nucleotide substitution to introduce a mutation into a second portion of the human HBB gene (wherein optionally the heterologous object sequence comprises, from 5′ to 3′, a post-edit homology region, a mutation region, and a pre-edit homology region), (iv) a primer region comprising at least 5, 6, 7, or 8 bases of 100% identity to a third portion of the human HBB gene, and (v) an RRS (RNA binding protein recognition sequence) that binds a gene modifying protein.

  • 27. The gene modifying system of embodiment 26, wherein the heterologous object sequence comprises the core nucleotides of an RT template sequence from Table 3, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises a sequence of an RT template sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having 1, 2, or 3 substitutions thereto.

  • 28. The gene modifying system of embodiment 26, wherein the heterologous object sequence comprises the core nucleotides of the RT template sequence of Table 3 that corresponds to the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises a sequence of an RT template sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the gRNA spacer sequence, or a sequence having 1, 2, or 3 substitutions thereto.

  • 29. The gene modifying system of any one of embodiments 26-28, wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence from the same row of Table 3 as the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence.

  • 30. The gene modifying system of one of embodiments 26-28, wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table 3 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, the gRNA spacer sequence, or both, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence, or wherein the PBS sequence comprises a PBS sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, the gRNA spacer sequence, or both.

  • 31. The gene modifying system of any one of embodiments 26-30, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 32. The gene modifying system of any one of embodiments 26-30, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12 that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 33. A gene modifying system for modifying DNA, comprising:
    • (a) a first RNA comprising, from 5′ to 3′, (i) a guide RNA sequence that is complementary to a first portion of the human HBB gene, and (ii) a sequence (e.g., a scaffold region) that binds a gene modifying polypeptide (e.g., binds the Cas domain of the gene modifying polypeptide), and
    • (b) a second RNA comprising (iii) a heterologous object sequence comprising a nucleotide substitution to introduce a mutation into a second portion of the human HBB gene, wherein the heterologous object sequence comprises the core nucleotides of an RT template sequence of Table 3, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises an RT sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having 1, 2, or 3 substitutions thereto, and (iv) a primer region comprising at least 5, 6, 7, or 8 bases of 100% homology to a third portion of the human HBB gene, and (v) an RRS (RNA binding protein recognition sequence) that binds a gene modifying protein.

  • 34. The gene modifying system of embodiment 33, wherein the gRNA spacer comprises the core nucleotides of a gRNA spacer sequence of Table 1, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer sequence, or wherein the gRNA spacer comprises a gRNA spacer sequence of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A.

  • 35. The gene modifying system of embodiment 33, wherein the heterologous object sequence comprises the core nucleotides of the gRNA spacer sequence of Table 1 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer sequence, or wherein the gRNA spacer comprises a gRNA spacer sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto.

  • 36. The gene modifying system of any one of embodiments 33-35, wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence from the same row of Table 3 as the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence.

  • 37. The gene modifying system of any one of embodiments 33-35, wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table 3 that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence, or wherein the PBS sequence comprises a PBS sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having 1, 2, or 3 substitutions thereto.

  • 38. The gene modifying system of any one of embodiments 33-37, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 39. The gene modifying system of any one of embodiments 33-37, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12 that corresponds to the RT template sequence, the gRNA spacer sequence, or both, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 40. A gRNA comprising (i) a gRNA spacer sequence that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a sequence comprising the core nucleotides of a gRNA spacer sequence of Table 1, Table 2, or Table 4, or a sequence having 1, 2, or 3 substitutions thereto and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer sequence; and (ii) a gRNA scaffold, or wherein the gRNA spacer has a sequence of a gRNA spacer sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having 1, 2, or 3 substitutions thereto.

  • 41. The gRNA of embodiment 40, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 42. The gRNA of embodiment 40, wherein the gRNA scaffold comprises a sequence of a gRNA scaffold of Table 12 that corresponds to the gRNA spacer sequence, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 43. A template RNA comprising: (iii) a heterologous object sequence comprising a mutation region to introduce a mutation into a second portion of the human HBB gene, wherein the heterologous object sequence comprises the core nucleotides of an RT template sequence of Table 3, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence, or wherein the heterologous object sequence comprises an RT sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having 1, 2, or 3 substitutions thereto, and (iv) a PBS sequence comprising at least 5, 6, 7, or 8 bases of 100% homology to a third portion of the human HBB gene.

  • 44. The template RNA according to embodiment 43, wherein the PBS sequence has a sequence comprising the core nucleotides of the PBS sequence from the same row of Table 3 as the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence.

  • 45. The template RNA according to embodiment 43, wherein the PBS sequence has a sequence comprising the core nucleotides of a PBS sequence of Table 3 that corresponds to the RT template sequence, or a sequence having 1, 2, or 3 substitutions thereto, and optionally comprises one or more consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the PBS sequence, or wherein the PBS sequence has a sequence comprising a PBS sequence from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having 1, 2, or 3 substitutions thereto.

  • 46. The template RNA according to any one of embodiments 1-16 or 43-45, the gene modifying system of any one of embodiments 26-39, or the gRNA of any one of embodiments 31-33, wherein the mutation introduced by the system is a V6E mutation (e.g., to correct a pathogenic E6V mutation) of the HBB gene.

  • 47. The template RNA according to any one of embodiments 1-16 or 43-46 or the gene modifying system of any one of embodiments 36-39 or 46, wherein the pre-edit sequence is between about 1 nucleotide and about 35 nucleotides (e.g., about 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, or 30-35 nucleotides) in length.

  • 48. The template RNA according to any one of embodiments 1-16 or 43-47 or the gene modifying system of any one of embodiments 36-39, 46, or 47, wherein the mutation region comprises a single nucleotide.

  • 49. The template RNA according to any one of embodiments 1-16 or 43-47 or the gene modifying system of any one of embodiments 26-39, 46, or 47, wherein the mutation region is at least two nucleotides in length.

  • 50. The template RNA according to any one of embodiments 1-14, 41-45, or 47 or the gene modifying system of any one of embodiments 24-37, 44-45 or 47, wherein the mutation region is up to 32 (e.g., up to 5, 10, 15, 20, 25, 30, or 32) nucleotides in length and comprises one, two, or three sequence differences relative to a second portion of the human HBB gene.

  • 51. The template RNA according to any one of embodiments 1-16, 43-47, 49, or 50 or the gene modifying system of any one of embodiments 26-39, 46, 47, 49, or 50, wherein the mutation region comprises two sequence differences relative to a second portion of the human HBB gene.

  • 52. The template RNA according to any one of embodiments 1-16, 43-47, or 49-51 or the gene modifying system of any one of embodiments 26-39, 46, 47, or 49-51, wherein the mutation region comprises a first region (e.g., a first nucleotide) designed to correct a pathogenic mutation in the HBB gene and a second region (e.g., a second nucleotide) designed to inactivate a PAM sequence (e.g., a “PAM-kill” mutation exemplified in Table A, AA, B, or B1).

  • 53. The template RNA according to any one of embodiments 1-16 or 43-51 or the gene modifying system of any one of embodiments 26-39 or 46-51, wherein the mutation region comprises less than 80%, 70%, 60%, 50%, 40%, or 30% identity to the corresponding portion of the human HBB gene.

  • 54. The template RNA of any one of the preceding embodiments, wherein the template RNA comprises one or more silent mutations (e.g., silent substitutions), e.g., as exemplified in Table 7A, X4, or X4A.

  • 55. The template RNA of embodiment 54, wherein the one or more silent mutations comprise a silent substitution at the codon encoding the 6th amino acid, counting the initial methionine, of the HBB gene (proline), e.g., to CCC or CCG.

  • 56. The template RNA of any of the preceding embodiments, wherein the mutation region comprises a first region designed to correct a pathogenic mutation in the HBB gene and a second region designed to introduce a silent substitution.

  • 57. The template RNA of any one of the preceding embodiments, which comprises one or more chemically modified nucleotides.

  • 58. A gene modifying system comprising:
    • a template RNA of any of embodiments 1-16, 43-57, or a system of any of embodiments 26-39 or 46-57, and
    • a gene modifying polypeptide, or a nucleic acid (e.g., RNA) encoding the gene modifying polypeptide.

  • 59. The gene modifying system of embodiment 58, wherein the gene modifying polypeptide comprises:
    • a reverse transcriptase (RT) domain (e.g., an RT domain from a retrovirus, or a polypeptide domain having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% amino acids sequence identity thereto); and
    • a Cas domain that binds to the target DNA molecule and is heterologous to the RT domain (e.g., a Cas9 domain); and
    • optionally, a linker disposed between the RT domain and the Cas domain.

  • 60. The gene modifying system of embodiment 59, wherein:
    • (a) the RT domain comprises:
      • (i) an RT domain of Table 6, or
      • (ii) an RT domain from a murine leukemia virus (MMLV), a porcine endogenous retrovirus (PERV), an avian reticuloendotheliosis virus (AVIRE), a feline leukemia virus (FLV), simian foamy virus (SFV) (e.g., SFV3L), bovine leukemia virus (BLV), Mason-Pfizer monkey virus (MPMV), human foamy virus (HFV), or bovine foamy/syncytial virus (BFV/BSV); or
    • (b) the gene modifying polypeptide comprises an amino acid sequence according to Table C, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 61. The gene modifying system of embodiment 59 or 60, wherein the Cas domain comprises a Cas domain of Table 7 or Table 8.

  • 62. The gene modifying system of any one of embodiments 59-61, wherein the Cas domain:
    • (a) is a Cas9 domain;
    • (b) is a SpCas9 domain, a BlatCas9 domain, a Nme2Cas9 domain, a PnpCas9 domain, a SauCas9 domain, a SauCas9-KKH domain, a SauriCas9 domain, a SauriCas9-KKH domain, a ScaCas9-Sc++ domain, a SpyCas9 domain, a SpyCas9-NG domain, a SpyCas9-SpRY domain, or a St1Cas9 domain; and/or
    • (c) is a Cas9 domain comprising an N670A mutation, an N611A mutation, an N605A mutation, an N580A mutation, an N588A mutation, an N872A mutation, an N863A mutation, an N622A mutation, or an H840A mutation.

  • 63. The gene modifying system of embodiment 62, wherein the Cas9 domain binds a PAM sequence listed in Table 7 or Table 12.

  • 64. The gene modifying system of embodiment 63, wherein a second portion of the human HBB gene overlaps with a PAM recognized by the Cas domain (e.g., wherein the second portion of the human HBB gene is within the PAM or wherein the PAM is within the second portion of the human HBB gene).

  • 65. The gene modifying system of any one of embodiments 58-64, wherein the gRNA spacer is a gRNA spacer according to Table 1, and the Cas domain comprises a Cas domain listed in the same row of Table 1, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 66. The gene modifying system of any one of embodiments 58-64, wherein the template RNA comprises a sequence of a template RNA sequence of Table 3, Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 67. The gene modifying system of any one of embodiments 58-66, wherein:
    • (a) the template RNA comprises a sequence of a template RNA sequence of Table 3, Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A;
    • (b) the Cas domain comprises a Cas domain of Table 7 or Table 8;
    • (c) the linker comprises a linker sequence of Table 10 (e.g., of any of SEQ ID NOs: 5217, 5106, 5190, and 5218); and
    • (d) the gene modifying polypeptide comprises one or two NLS sequences from Table 11 (e.g., of any of SEQ ID NOs: 5245, 5290, 5323, 5330, 5349, 5350, 5351, and 4001).

  • 68. The gene modifying system of any of embodiments 58-67, which produces a first nick in a first strand of the human HBB gene.

  • 69. The gene modifying system of embodiment 68, which further comprises a second strand-targeting gRNA that directs a second nick to the second strand of the human HBB gene.

  • 70. The gene modifying system of embodiment 69, wherein the second strand-targeting gRNA comprises:
    • (i) a sequence comprising the core nucleotides of a left gRNA spacer sequence or a right gRNA spacer sequence from Table 2, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the left gRNA spacer sequence or right gRNA spacer sequence; or
    • (ii) a second-strand-targeting gRNA comprising a spacer sequence of Table 6A, or a spacer sequence having 1, 2, or 3 substitutions thereto.

  • 71. The gene modifying system of embodiment 69, wherein the second strand-targeting gRNA comprises a sequence comprising the core nucleotides of a left gRNA spacer sequence or a right gRNA spacer sequence from Table 2 that corresponds to the gRNA spacer sequence of (i), and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the left gRNA spacer sequence or right gRNA spacer sequence.

  • 72. The gene modifying system of embodiment 69, wherein the second strand-targeting gRNA comprises:
    • (i) a sequence comprising the core nucleotides of a second nick gRNA sequence from Table 4, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the second nick gRNA sequence; or
    • (ii) a second-strand-targeting gRNA comprising a spacer sequence from Table 6A or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 73. The gene modifying system of embodiment 69, wherein the second strand-targeting gRNA comprises a sequence comprising the core nucleotides of the second nick gRNA sequence from Table 4 that corresponds to the gRNA spacer sequence of (i), or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the second nick gRNA sequence.

  • 74. The gene modifying system of any one of embodiments 58-73, wherein the second strand-targeting gRNA has a “PAM-in orientation” with the template RNA of the gene modifying system, e.g., as exemplified in Table 4, 6A, X4, or X4A.

  • 75. The gene modifying system of any one of embodiments 58-63, wherein the second strand-targeting gRNA targets a sequence overlapping the target mutation of the template RNA.

  • 76. The gene modifying system of embodiment 75, wherein the second strand-targeting gRNA comprises:
    • (i) a sequence (e.g., a spacer sequence) complementary to the sickle cell mutation;
    • (ii) a sequence (e.g., a spacer sequence) complementary to the wild-type sequence at the sickle cell locus;
    • (iii) a sequence (e.g., a spacer sequence) complementary to the Makassar sequence at the sickle cell locus;
    • (iv) a sequence (e.g., a spacer sequence) complementary to a SNP proximal to the sickle cell locus, e.g., a SNP contained in the genomic DNA of a subject (e.g., a patient);
    • (v) a sequence (e.g., spacer sequence) complementary to or comprising one or more silent substitutions proximal to the sickle cell locus.

  • 77. The template RNA, gene modifying system, or gRNA, of any one of the preceding embodiments, wherein the gRNA spacer comprises about 1, 2, 3, or more flanking nucleotides of the gRNA spacer.

  • 78. The template RNA or gene modifying system of any one of the preceding embodiments, wherein the heterologous object sequence comprises about 2, 3, 4, 5, 10, 20, 30, 40, or more flanking nucleotides of the RT template sequence.

  • 79. The template RNA or gene modifying system, of any one of the preceding embodiments, wherein the heterologous object sequence comprises between about 8-30, 9-25, 10-20, 11-16, or 12-15 (e.g., about 11-16) nucleotides.

  • 80. The template RNA or gene modifying system, of any one of the preceding embodiments, wherein the mutation region comprises 1, 2, or 3 nucleotide positions of sequence differences relative to the corresponding portion of the human HBB gene.

  • 81. The template RNA or gene modifying system of any one of the preceding embodiments, wherein the mutation region comprises at least 2 nucleotide positions of sequence difference relative to the corresponding portion of the human HBB gene.

  • 82. The template RNA or gene modifying system, of any one of the preceding embodiments, wherein the post-edit homology region and/or pre-edit homology region comprises 100% identity to the HBB gene.

  • 83. The template RNA or gene modifying system of any one of the preceding embodiments, wherein the PBS sequence additionally comprises about 1, 2, 3, 4, 5, 6, 7, or more flanking nucleotides.

  • 84. The template RNA or gene modifying system of any one of the preceding embodiments, wherein the PBS sequence comprises about 5-20, 8-16, 8-14, 8-13, 9-13, 9-12, or 10-12 (e.g., about 9-12) nucleotides.

  • 85. The template RNA or gene modifying system of any one of the preceding embodiments, wherein the PBS sequence binds within 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of a nick site in the HBB gene.

  • 86. The gene modifying system of any one of the preceding embodiments, wherein the domains of the gene modifying polypeptide are joined by a peptide linker.

  • 87. The gene modifying system of embodiment 86, wherein the linker comprises a sequence of a linker of Table 10 (e.g., of any of SEQ ID NOs: 5217, 5106, 5190, and 5218).

  • 88. The gene modifying system of any one of the preceding embodiments, wherein the gene modifying polypeptide further comprises one or more nuclear localization sequences (NLS).

  • 89. The gene modifying system of embodiment 88, wherein the gene modifying polypeptide comprises a first NLS and a second NLS.

  • 90. The gene modifying system of embodiment 88 or 89, wherein the NLS comprises a sequence of a NLS of Table 11 (e.g., of any of SEQ ID NOs: 5245, 5290, 5323, 5330, 5349, 5350, 5351, and 4001).

  • 91. A template RNA comprising a sequence of a template RNA of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 92. A template RNA comprising a sequence of a template RNA of Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A.

  • 93. A gene modifying system comprising:
    • (i) a template RNA comprising a sequence of a template RNA of Table 4, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto; and
    • (ii) a second-nick gRNA sequence from the same row of Table 4 as (i), or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.

  • 94. A gene modifying system comprising:
    • (i) a template RNA comprising a sequence of a template RNA of Table 4; and
    • (ii) a second-nick gRNA sequence from the same row of Table 4 as (i).

  • 95. A DNA encoding the template RNA of any one of embodiments 1-16, 43-53, 77-85, 91, or 92, or the gRNA of any one of embodiments 40-42.

  • 96. A pharmaceutical composition, comprising the system of any one of embodiments 58-90, 93, or 94, or one or more nucleic acids encoding the same, and a pharmaceutically acceptable excipient or carrier.

  • 97. The pharmaceutical composition of embodiment 96, wherein the pharmaceutically acceptable excipient or carrier is selected from the group consisting of a plasmid vector, a viral vector, a vesicle, and a lipid nanoparticle.

  • 98. The pharmaceutical composition of embodiment 97, wherein the viral vector is an adeno-associated virus.

  • 99. A host cell (e.g., a mammalian cell, e.g., a human cell) comprising the template RNA or gene modifying system of any one of the preceding embodiments.

  • 100. A method of making the template RNA of any one of embodiments 1-16, 43-53, 77-85, 91, or 92, the method comprising synthesizing the template RNA by in vitro transcription (e.g., solid state synthesis) or by introducing a DNA encoding the template RNA into a host cell under conditions that allow for production of the template RNA.

  • 101. A method for modifying a target site in the human HBB gene in a cell, the method comprising contacting the cell with the gene modifying system of any one of embodiments 58-90, 93, or 94, or DNA encoding the same, thereby modifying the target site in the human HBB gene in a cell.

  • 102. A method for modifying a target site in the human HBB gene in a cell, the method comprising contacting the cell with: (i) the template RNA of any one of embodiments 58-90, 93, or 94, or DNA encoding the same; and (ii) a gene modifying polypeptide or a nucleic acid encoding a gene modifying polypeptide, thereby modifying the target site in the human HBB gene in a cell.

  • 103. A method for treating a subject having a disease or condition associated with a mutation in the human HBB gene, the method comprising administering to the subject the gene modifying system of any one of embodiments 58-90, 93, or 94, or DNA encoding the same, thereby treating the subject having a disease or condition associated with a mutation in the human HBB gene.

  • 104. A method for treating a subject having a disease or condition associated with a mutation in the human HBB gene, the method comprising administering to the subject (i) the template RNA of any one of embodiments 58-90, 93, or 94, or DNA encoding the same; and (ii) a gene modifying polypeptide or a nucleic acid encoding a gene modifying polypeptide, thereby treating the subject having a disease or condition associated with a mutation in the human HBB gene.

  • 105. The method of embodiment 103 or 104, wherein the disease or condition is sickle cell disease (SCD) (e.g., sickle cell anemia).

  • 106. The method of any one of embodiments 103-105, wherein the subject has a pathogenic E6V mutation.

  • 107. A method for treating a subject having SCD, the method comprising administering to the subject the gene modifying system of any one of embodiments 58-90, 93, or 94, or DNA encoding the same, thereby treating the subject having SCD.

  • 108. A method for treating a subject having SCD, the method comprising administering to the subject (i) the template RNA of any one of embodiments 58-90, 93, or 94, or DNA encoding the same, and (ii) a gene modifying polypeptide or a nucleic acid encoding a gene modifying polypeptide, thereby treating the subject having SCD.

  • 109. The gene modifying system or method of any one of the preceding embodiments, wherein introduction of the system into a target cell results in a correction of a pathogenic mutation in the HBB gene.

  • 110. The gene modifying system or method of any one of the preceding embodiments, wherein the pathogenic mutation is an E6V mutation, and wherein the correction comprises an amino acid substitution of V6E.

  • 111. The gene modifying system or method of any of the preceding embodiments, wherein correction of the mutation occurs in at least 30% (e.g., 30%, 40%, 50%, 60%, 70%, or more) of target nucleic acids.

  • 112. The gene modifying system or method of any of the preceding embodiments, wherein correction of the mutation occurs in at least 30% (e.g., 30%, 40%, 50%, 60%, 70%, or more) of target cells.

  • 113. The gene modifying system or method of any of the preceding embodiments, wherein the gene modifying system comprises a second strand-targeting gRNA, and wherein correction of the mutation in a population of target cells is increased relative to a population of target cells treated with a gene modifying system comprising a template RNA without a second strand-targeting gRNA.

  • 114. The gene modifying system or method of any of the preceding embodiments, wherein the template RNA comprises one or more silent substitutions (e.g., as exemplified in Tables 7A, X4, and X4A), and wherein correction of the mutation in a population of target cells is increased relative to a population of target cells treated with a gene modifying system comprising a template RNA that does not comprise one or more silent substitutions.

  • 115. The method of any of the preceding embodiments, wherein the cell is a mammalian cell, such as a human cell.

  • 116. The method of any one of the preceding embodiments, wherein the subject is a human.

  • 117. The method of any of the preceding embodiments, wherein the contacting occurs ex vivo, e.g., wherein the cell's or subject's DNA is modified ex vivo.

  • 118. The method of any of the preceding embodiments, wherein the contacting occurs in vivo, e.g., wherein the cell's or subject's DNA is modified in vivo.

  • 119. The method of any of the preceding embodiments, wherein contacting the cell or the subject with the system comprises contacting the cell or a cell within the subject with a nucleic acid (e.g., DNA or RNA) encoding the gene modifying polypeptide under conditions that allow for production of the gene modifying polypeptide.

  • 120. The method of any of the preceding embodiments, wherein the gRNA spacer is perfectly complementary at all nucleotide positions to the first portion of the human HBB gene in the cell, wherein the first portion is situated on the second strand of the HBB gene.

  • 121. The method of any of the preceding embodiments, wherein the heterologous object sequence is perfectly complementary to the second portion of the human HBB gene in the cell, at all nucleotide positions except the mutation region, wherein the second portion is situated on the first strand of the HBB gene.

  • 122. The method of any of the preceding embodiments, wherein the PBS sequence is perfectly complementary to the third portion of the human HBB gene, wherein the third portion is situated on the first strand of the HBB gene.
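
Many of the embodiments above are bounded by "1, 2, or 3 substitutions" or by a percent-identity threshold. The following Python sketch illustrates one simple way such a comparison could be computed. It assumes an ungapped, position-by-position comparison of two equal-length sequences, which is a simplification and may differ from how identity is determined elsewhere in the disclosure. The function names and the two 20-nucleotide sequences are hypothetical and are not taken from any table herein.

```python
def count_substitutions(reference: str, candidate: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(reference) != len(candidate):
        # Assumption: only substitutions are considered, so lengths must match.
        raise ValueError("substitution counting assumes equal-length sequences")
    return sum(1 for r, c in zip(reference.upper(), candidate.upper()) if r != c)


def percent_identity(reference: str, candidate: str) -> float:
    """Ungapped percent identity over the full length of the reference."""
    mismatches = count_substitutions(reference, candidate)
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Hypothetical 20-nt sequences used only for illustration.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGAACGTACGTACCT"  # differs at positions 8 and 19 (1-based)

n_subs = count_substitutions(reference, candidate)
identity = percent_identity(reference, candidate)

print(f"substitutions: {n_subs}")        # substitutions: 2
print(f"identity: {identity:.1f}%")      # identity: 90.0%
print(f"within 1-3 substitutions: {1 <= n_subs <= 3}")
```

Under these assumptions, a 20-nucleotide candidate with two substitutions falls within a "1, 2, or 3 substitutions" limit and also meets a 90% identity threshold.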



Further Enumerated Embodiments



  • A1. A template RNA comprising, from 5′ to 3′:
    • (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a nucleotide sequence comprising CATGGTGCATCTGACTCCTG (SEQ ID NO: 21668), or a nucleotide sequence having 1 substitution thereto;
    • (ii) a gRNA scaffold that binds a Cas domain of a gene modifying polypeptide,
    • (iii) a heterologous object sequence comprising a mutation region to correct a mutation in a second portion of the human HBB gene, and
    • (iv) a primer binding site (PBS) sequence comprising at least 5 bases with 100% identity to a third portion of the human HBB gene.

  • A2. The template RNA of embodiment A1, wherein the gRNA spacer has a nucleotide sequence comprising CATGGTGCATCTGACTCCTG (SEQ ID NO: 21668) or CATGGTGCACCTGACTCCTG (SEQ ID NO: 19249).

  • A3. The template RNA of embodiment A1 or A2, wherein the gRNA spacer has a nucleotide sequence consisting of CATGGTGCATCTGACTCCTG (SEQ ID NO: 21668) or CATGGTGCACCTGACTCCTG (SEQ ID NO: 19249).

  • A4. The template RNA of any of the preceding embodiments, wherein the gRNA spacer has a length of 20 nucleotides.

  • A5. The template RNA of embodiment A1, wherein the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012), or a sequence having at least 90% identity thereto.

  • A6. The template RNA of embodiment A1, wherein the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012).

  • A7. The template RNA of embodiment A1, wherein the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954), or a sequence having 1, 2, or 3 substitutions thereto.

  • A8. The template RNA of embodiment A1, wherein the heterologous object sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954), or a sequence having 1, 2, or 3 substitutions thereto.

  • A9. The template RNA of embodiment A1, wherein the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954).

  • A10. The template RNA of embodiment A1, wherein the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTGCAG (SEQ ID NO: 20955).

  • A11. The template RNA of embodiment A1, wherein the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGTCAGGTGCACCATG (SEQ ID NO: 19431), or a sequence having 1 substitution thereto.

  • A12. The template RNA of embodiment A1, wherein the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGTCAGGTGCACCATG (SEQ ID NO: 19431).

  • A13. The template RNA of embodiment A1, wherein the PBS sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides from the 5′ end of a sequence according to GAGTCAGGTGCACCATG (SEQ ID NO: 19431), or a sequence having 1 substitution thereto.

  • A14. The template RNA of embodiment A1, wherein:
    • the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012), or a sequence having at least 90% identity thereto;
    • the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954), or a sequence having 1, 2, or 3 substitutions thereto; and
    • the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGTCAGGTGCACCATG (SEQ ID NO: 19431), or a sequence having 1 substitution thereto.

  • A15. The template RNA of embodiment A1, wherein:
    • the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012);
    • the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954); and
    • the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGTCAGGTGCACCATG (SEQ ID NO: 19431).



  • A16. The template RNA of any of the preceding embodiments, which does not comprise a sequence according to GCATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCCACAGGAGTCAGGTGCAC (SEQ ID NO: 21997).
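
The sequence relationships recited in embodiments A1-A13 can be checked directly from the listed strings. The Python sketch below is illustrative only. It shows that the two gRNA spacers of embodiment A2 differ at a single position, that the PBS sequence of SEQ ID NO: 19431 equals the reverse complement of the first 17 nucleotides of the spacer of SEQ ID NO: 19249, and how a PBS of a given length "from the 5′ end" could be taken as a prefix. The helper names are hypothetical, and the observation about the reverse complement is a property of the strings as written rather than a statement of how the sequences were designed.

```python
# DNA-alphabet complement table; the sequences are written with T, as listed.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def from_5prime_end(seq: str, n: int) -> str:
    """Hypothetical helper: the first n nucleotides counted from the 5' end."""
    if not 0 < n <= len(seq):
        raise ValueError("n must be between 1 and the sequence length")
    return seq[:n]


# Sequences copied from embodiments A1, A2, and A11.
SPACER_21668 = "CATGGTGCATCTGACTCCTG"
SPACER_19249 = "CATGGTGCACCTGACTCCTG"
PBS_19431 = "GAGTCAGGTGCACCATG"

# 1-based positions at which the two spacers differ.
diff_positions = [
    i
    for i, (a, b) in enumerate(zip(SPACER_21668, SPACER_19249), start=1)
    if a != b
]

# Compare the PBS with the reverse complement of the spacer's 5' portion.
spacer_5prime = from_5prime_end(SPACER_19249, len(PBS_19431))
pbs_matches = reverse_complement(spacer_5prime) == PBS_19431

# A PBS of 8 nucleotides taken from the 5' end, as in embodiment A11.
pbs_8 = from_5prime_end(PBS_19431, 8)

print(diff_positions)  # [10]
print(pbs_matches)     # True
print(pbs_8)           # GAGTCAGG
```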






  • A17. A template RNA comprising, from 5′ to 3′:
    • (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a nucleotide sequence comprising GTAACGGCAGACTTCTCCAC (SEQ ID NO: 19971), or a nucleotide sequence having 1 substitution thereto;
    • (ii) a gRNA scaffold that binds a Cas domain of a gene modifying polypeptide,
    • (iii) a heterologous object sequence comprising a mutation region to introduce a mutation into a second portion of the human HBB gene, and
    • (iv) a primer binding site (PBS) sequence comprising at least 5 bases with 100% identity to a third portion of the human HBB gene.

  • A18. The template RNA of embodiment A17, wherein the gRNA spacer has a nucleotide sequence comprising GTAACGGCAGACTTCTCCAC (SEQ ID NO: 19971).

  • A19. The template RNA of embodiment A17 or A18, wherein the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012), or a sequence having at least 90% identity thereto.

  • A20. The template RNA of any of embodiments A17-19, wherein the gRNA scaffold has a sequence according to GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 11,012).

  • A21. The template RNA of any of embodiments A17-20, wherein the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to CCATGGTGCACCTGACTCCTGAG (SEQ ID NO: 20956) or CCATGGTGCACCTGACTCCTGCG (SEQ ID NO: 21906), or a sequence having 1, 2, or 3 substitutions thereto.

  • A22. The template RNA of any of embodiments A17-21, wherein the heterologous object sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23 nucleotides from the 3′ end of a sequence according to CCATGGTGCACCTGACTCCTGAG (SEQ ID NO: 20956) or CCATGGTGCACCTGACTCCTGCG (SEQ ID NO: 21906), or a sequence having 1, 2, or 3 substitutions thereto.

  • A23. The template RNA of any of embodiments A17-22, wherein the heterologous object sequence comprises a sequence of at least 8 nucleotides from the 3′ end of a sequence according to CCATGGTGCACCTGACTCCTGAG (SEQ ID NO: 20956) or CCATGGTGCACCTGACTCCTGCG (SEQ ID NO: 21906).

  • A24. The template RNA of any of embodiments A17-23, wherein the heterologous object sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23 nucleotides from the 3′ end of a sequence according to CCATGGTGCACCTGACTCCTGAG (SEQ ID NO: 20956) or CCATGGTGCACCTGACTCCTGCG (SEQ ID NO: 21906).

  • A25. The template RNA of any of embodiments A17-24, wherein the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957), or a sequence having 1 substitution thereto.

  • A26. The template RNA of any of embodiments A17-25, wherein the PBS sequence comprises a sequence of at least 8 nucleotides from the 5′ end of a sequence according to GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957).

  • A27. The template RNA of any of embodiments A17-26, wherein the PBS sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides from the 5′ end of a sequence according to GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957), or a sequence having 1 substitution thereto.

  • A28. The template RNA of any of embodiments A17-27, wherein the PBS sequence comprises a sequence of 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides from the 5′ end of a sequence according to GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957).

  • A29. The template RNA of any of embodiments A17-28, which does not comprise a sequence according to GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGGACCGAGTCGGTCCGACTCCTGaGGAGAAGTCTGCC (SEQ ID NO: 21998).

  • A30. The template RNA of any of the preceding embodiments, wherein the mutation region comprises a single nucleotide.

  • A31. The template RNA of any of the preceding embodiments, wherein the mutation region is at least two nucleotides in length.

  • A32. The template RNA of any of the preceding embodiments, wherein the mutation region is up to 20 nucleotides in length and comprises one, two, or three sequence differences relative to the second portion of the human HBB gene.

  • A33. The template RNA of any of the preceding embodiments, wherein the mutation region comprises a first region designed to correct a pathogenic mutation in the HBB gene and a second region designed to inactivate a PAM sequence.

  • A34. The template RNA of any of the preceding embodiments, wherein the mutation region comprises a first region designed to correct a pathogenic mutation in the HBB gene and a second region designed to introduce a silent substitution.

  • A35. The template RNA of any of the preceding embodiments, which is configured to edit an E6V mutation in the human HBB gene.

  • A36. The template RNA of embodiment A35, which is configured to convert an E6V mutation to glutamine or alanine.

  • A37. The template RNA of any of the preceding embodiments, which comprises one or more chemically modified nucleotides.

  • A38. A gene modifying system comprising:
    • a template RNA of any of the preceding embodiments, and
    • a gene modifying polypeptide, or a nucleic acid encoding the gene modifying polypeptide.

  • A39. The gene modifying system of embodiment A38, wherein the gene modifying polypeptide comprises an RT domain having a sequence according to SEQ ID NO: 8,003, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.

  • A40. The gene modifying system of embodiment A38, wherein the gene modifying polypeptide comprises an RT domain having a sequence according to SEQ ID NO: 8,020, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.

  • A41. The gene modifying system of embodiment A38, wherein the gene modifying polypeptide comprises an RT domain having a sequence according to SEQ ID NO: 8,074, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.

  • A42. The gene modifying system of embodiment A38, wherein the gene modifying polypeptide comprises an RT domain having a sequence according to SEQ ID NO: 8,113, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.

  • A43. The gene modifying system of embodiment A38, wherein the gene modifying polypeptide comprises a DNA binding domain having a sequence of a Cas9 nickase comprising an N863A mutation, e.g., a sequence according to SEQ ID NO: 11,096, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.

  • A44. The gene modifying system of embodiment A38, which produces a first nick in a first strand of the human HBB gene.

  • A45. The gene modifying system of embodiment A44, which further comprises a second strand-targeting gRNA that directs a second nick to the second strand of the human HBB gene.

  • A46. The gene modifying system of embodiment A45, wherein the first nick and the second nick are 80-120 nucleotides apart.

  • A47. The gene modifying system of embodiment A45, wherein the template RNA and the second strand-targeting gRNA are configured to produce an outward nick orientation.

  • A48. The gene modifying system of embodiment A45, wherein the second strand-targeting gRNA comprises a spacer sequence that is complementary to a human HBB gene having a sickle cell disease mutation, a wild-type sequence, or a Makassar variant.

  • A49. A method for modifying a target site in the human HBB gene in a cell, the method comprising contacting the cell with the gene modifying system of embodiment A38, thereby modifying the target site in the human HBB gene in the cell.

  • A50. The method of embodiment A49, wherein correction of the mutation occurs in at least 30% of target nucleic acids.

  • A51. A method for treating a subject having a disease or condition associated with a mutation in the human HBB gene, wherein the disease or condition is sickle cell disease (SCD), the method comprising administering to the subject the gene modifying system of embodiment A38, thereby treating the subject having a disease or condition associated with a mutation in the human HBB gene.

  • A52. A template RNA comprising, from 5′ to 3′:
    • (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a nucleotide sequence comprising the core nucleotides of a gRNA spacer sequence of Table 1, and optionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer, or a nucleotide sequence having 1, 2, or 3 substitutions thereto;
    • (ii) a gRNA scaffold that binds a Cas domain of a gene modifying polypeptide,
    • (iii) a heterologous object sequence comprising a mutation region to correct a mutation in a second portion of the human HBB gene, and
    • (iv) a primer binding site (PBS) sequence comprising at least 5 bases with 100% identity to a third portion of the human HBB gene.
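By way of non-limiting illustration, the PBS sequence criteria recited in embodiments A25-A28 can be sketched in Python as follows. The function names (hamming, pbs_matches), the parameters min_len and max_subs, and the reading of “comprises” as “contains as a contiguous stretch” are assumptions introduced here for illustration only and are not part of the disclosed compositions.

```python
PBS_REFERENCE = "GAGAAGTCTGCCGTTAC"  # SEQ ID NO: 20957, as recited in A25-A28


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pbs_matches(candidate: str, min_len: int = 8, max_subs: int = 0) -> bool:
    """Return True if `candidate` contains the first `min_len` nucleotides
    of PBS_REFERENCE (counted from its 5' end), tolerating up to `max_subs`
    substitutions.

    Checking the shortest allowed prefix is sufficient: any candidate that
    contains a longer prefix necessarily contains the shorter one as well.
    Hypothetical helper; RNA (U) and DNA (T) spellings are treated alike.
    """
    if not 1 <= min_len <= len(PBS_REFERENCE):
        raise ValueError("min_len must be between 1 and the reference length")
    candidate = candidate.upper().replace("U", "T")
    prefix = PBS_REFERENCE[:min_len]
    # Slide a window of length min_len along the candidate sequence.
    for start in range(len(candidate) - min_len + 1):
        window = candidate[start:start + min_len]
        if hamming(window, prefix) <= max_subs:
            return True
    return False
```

Under these assumptions, pbs_matches(seq) corresponds to the criterion of A26, pbs_matches(seq, max_subs=1) to A25, and pbs_matches(seq, min_len=9) to the shortest length recited in A28.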






BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.



FIG. 1 depicts a gene modifying system as described herein. The left hand diagram shows the gene modifying polypeptide, which comprises a Cas nickase domain (e.g., spCas9 N863A) and a reverse transcriptase domain (RT domain) which are linked by a linker. The right hand diagram shows the template RNA which comprises, from 5′ to 3′, a gRNA spacer, a gRNA scaffold, a heterologous object sequence, and a primer binding site sequence (PBS sequence). The heterologous object sequence can comprise a mutation region that comprises one or more sequence differences relative to the target site. The heterologous object sequence can also comprise a pre-edit homology region and a post-edit homology region, which flank the mutation region. Without wishing to be bound by theory, it is thought that the gRNA spacer of the template RNA binds to the second strand of a target site in the genome, and the gRNA scaffold of the template RNA binds to the gene modifying polypeptide, e.g., localizing the gene modifying polypeptide to the target site in the genome. It is thought that the Cas domain of the gene modifying polypeptide nicks the target site (e.g., the first strand of the target site), e.g., allowing the PBS sequence to bind to a sequence adjacent to the site to be altered on the first strand of the target site. It is thought that the RT domain of the gene modifying polypeptide uses the first strand of the target site that is bound to the complementary sequence comprising the PBS sequence of the template RNA as a primer and the heterologous object sequence of the template RNA as a template to, e.g., polymerize a sequence complementary to the heterologous object sequence. Without wishing to be bound by theory, it is thought that reverse transcription can then proceed through the pre-edit homology region, then through the mutation region, and then through the post-edit homology region, thereby producing a DNA strand comprising a mutation specified by the heterologous object sequence.
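Without wishing to be bound by theory, the priming and reverse transcription steps described for FIG. 1 can be sketched in Python as follows. All sequences in the sketch are short hypothetical placeholders rather than sequences of the disclosure, and the names TemplateRNA, reverse_complement_dna, and reverse_transcribed_extension are assumptions made for illustration. The sketch models base pairing only; it does not model the gene modifying polypeptide, the gRNA spacer or scaffold, or cellular resolution of the newly synthesized strand.

```python
from dataclasses import dataclass
from typing import Tuple

# Map each RNA or DNA base to its DNA complement.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class TemplateRNA:
    """Template RNA components in 5'-to-3' order, as described for FIG. 1."""
    spacer: str
    scaffold: str
    heterologous_object: str
    pbs: str

    def sequence(self) -> str:
        """Concatenate the components from 5' to 3'."""
        return self.spacer + self.scaffold + self.heterologous_object + self.pbs


def reverse_transcribed_extension(
    first_strand: str, template: TemplateRNA
) -> Tuple[int, str]:
    """Locate the priming site and return (nick_index, new_dna).

    The PBS pairs with the first-strand sequence immediately 5' of the nick,
    so that first-strand sequence is the reverse complement of the PBS.
    The new DNA polymerized from the primer's 3' end is the reverse
    complement of the heterologous object sequence.
    """
    primer = reverse_complement_dna(template.pbs)
    start = first_strand.upper().find(primer)
    if start < 0:
        raise ValueError("PBS has no complementary site on the first strand")
    nick_index = start + len(primer)
    new_dna = reverse_complement_dna(template.heterologous_object)
    return nick_index, new_dna


# Hypothetical toy example (placeholder sequences only).
toy_first_strand = "TTGACCATGCAAGGCTTAGC"
toy_template = TemplateRNA(
    spacer="N" * 20,      # placeholder; spacer targeting is not modeled
    scaffold="N" * 10,    # placeholder; scaffold binding is not modeled
    heterologous_object="AGCGUU",
    pbs="GCAUGGUC",
)
nick, flap = reverse_transcribed_extension(toy_first_strand, toy_template)
# Assumption: a same-length (substitution-only) edit replaces the original
# bases downstream of the nick with the newly synthesized DNA.
toy_edited = toy_first_strand[:nick] + flap + toy_first_strand[nick + len(flap):]
```

In this toy example the newly synthesized DNA differs from the original first strand at a single position, which stands in for the mutation specified by the mutation region of the heterologous object sequence.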



FIG. 2 is a pair of graphs showing rewrite levels in 293T cells (left panel) and CD34+ primary human HSCs (right panel) following transfection of gene modifying systems comprising a gene modifying polypeptide and various template RNAs.



FIG. 3 is a pair of graphs showing rewrite levels in 293T cells (left panel) and CD34+ primary human HSCs (right panel) following transfection of gene modifying systems comprising a gene modifying polypeptide and various template RNAs.



FIG. 4 is a graph showing the percent editing in primary human fibroblasts following electroporation with a gene modifying system comprising tgRNA14 with or without a second nick.



FIG. 5 is a graph showing percent editing in wild type human primary fibroblasts (to install the Makassar mutation) and sickle human primary fibroblasts (to install the wild-type sequence) following electroporation with a gene modifying system comprising tgRNA14 with or without a second nick.



FIG. 6 is a graph showing the percent rewriting achieved using the RNAV209-013 or RNAV214-040 gene modifying polypeptides with the indicated template RNAs.



FIG. 7 is a graph showing the amount of Fah mRNA relative to wild type when template RNAs are used with the RNAV209-013 or RNAV214-040 gene modifying polypeptides.



FIG. 8 is a graph showing the percentage of Cas9-positive hepatocytes 6 hours following dosing with LNPs containing various gene modifying polypeptides and template RNAs.



FIG. 9 is a graph showing the rewrite levels in liver samples 6 days following dosing with LNPs containing various gene modifying polypeptides and template RNAs.



FIG. 10 is a graph showing wild type Fah mRNA restoration compared to littermate heterozygous mice in liver samples following dosing with LNPs containing various gene modifying polypeptides and template RNAs.



FIG. 11 is a graph showing Fah protein distribution in liver samples following dosing with LNPs containing various gene modifying polypeptides and template RNAs.



FIG. 12 is a series of western blots showing Cas9-RT expression 6 hours after infusion of Cas9-RT mRNA+TTR guide LNP. Each lane represents an individual animal, with 20 μg of tissue homogenate loaded per lane. The positive control was from an in vitro cell experiment in which Cas9-RT was expressed (described previously). GAPDH was used as a loading control for each sample. n=4 per group, vehicle or treated.



FIG. 13 is a graph showing gene editing of the TTR locus after treatment with Cas9-RT mRNA+TTR guide LNP. The level of indels at the TTR locus was measured by TIDE analysis of Sanger sequencing of the TTR locus at the protospacer target site.



FIG. 14 is a graph showing that TTR Serum levels decrease after treatment with Cas9-RT mRNA+TTR guide LNP. Measurement of circulating TTR levels 5 days after mice were treated with LNPs encapsulating Cas9-RT+TTR guide RNA.



FIG. 15 is a graph showing Cas9-RT Expression after infusion of Cas9-RT mRNA+TTR guide LNP. Relative expression quantified by ProteinSimple Jess capillary electrophoresis Western blot. Numbers in the symbols are animal number in group. Vehicle n=2, Cas9-RT+TTR guide n=3.



FIG. 16 is a graph showing gene editing of the TTR locus after infusion of Cas9-RT mRNA+TTR guide LNP. Levels of indels at the TTR locus were measured by amplicon sequencing of the TTR locus at the protospacer target site. Each animal had 8 different biopsies taken across the liver, and amplicon sequencing measured the percentage of reads showing an indel.



FIG. 17 is a graph showing average perfect rewrite levels in primary human HSCs following transfection with various gene modifying polypeptides and template RNAs.



FIGS. 18A and 18B are graphs showing average perfect rewrite levels in primary human HSCs following transfection with various gene modifying polypeptides and template RNAs comprising an HBB5 spacer (FIG. 18A) or an HBB8 spacer (FIG. 18B).



FIGS. 19A and 19B are a heatmap (FIG. 19A) and graph (FIG. 19B) showing average perfect rewrite levels in primary human HSCs following transfection with various gene modifying polypeptides and template RNAs comprising an HBB5 spacer (FIG. 19A) or an HBB8 spacer (FIG. 19B).



FIGS. 20A-20C are graphs showing average perfect rewrite levels in primary human HSCs following transfection with various gene modifying polypeptides and template RNAs comprising an HBB5 spacer (FIGS. 20A and 20C) or an HBB8 spacer (FIG. 20B).



FIGS. 21A and 21B are a pair of graphs showing perfect rewrite levels in primary human HSCs (FIG. 21A) and HSC subpopulation percentages (FIG. 21B) following transfection with various gene modifying polypeptides and template RNAs.



FIGS. 22A and 22B are graphs showing perfect rewrite levels in primary human HSCs subpopulations following transfection with various gene modifying polypeptides and template RNAs.



FIGS. 23A-23C are graphs showing total colony number (FIG. 23A), colony number (FIG. 23B), and percent enucleated CD235+ cells (FIG. 23C) following transfection with various gene modifying polypeptides and template RNAs.





DETAILED DESCRIPTION
Definitions

The term “expression cassette,” as used herein, refers to a nucleic acid construct comprising nucleic acid elements sufficient for the expression of the nucleic acid molecule of the instant invention.


A “gRNA spacer”, as used herein, refers to a portion of a nucleic acid that has complementarity to a target nucleic acid and can, together with a gRNA scaffold, target a Cas protein to the target nucleic acid.


A “gRNA scaffold”, as used herein, refers to a portion of a nucleic acid that can bind a Cas protein and can, together with a gRNA spacer, target the Cas protein to the target nucleic acid. In some embodiments, the gRNA scaffold comprises a crRNA sequence, tetraloop, and tracrRNA sequence.


A “gene modifying polypeptide”, as used herein, refers to a polypeptide comprising a retroviral reverse transcriptase, or a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% amino acid sequence identity to a retroviral reverse transcriptase, which is capable of integrating a nucleic acid sequence (e.g., a sequence provided on a template nucleic acid) into a target DNA molecule (e.g., in a mammalian host cell, such as a genomic DNA molecule in the host cell). In some embodiments, the gene modifying polypeptide is capable of integrating the sequence substantially without relying on host machinery. In some embodiments, the gene modifying polypeptide integrates a sequence into a random position in a genome, and in some embodiments, the gene modifying polypeptide integrates a sequence into a specific target site. In some embodiments, a gene modifying polypeptide includes one or more domains that, collectively, facilitate 1) binding the template nucleic acid, 2) binding the target DNA molecule, and 3) integration of at least a portion of the template nucleic acid into the target DNA. Gene modifying polypeptides include both naturally occurring polypeptides as well as engineered variants of the foregoing, e.g., having one or more amino acid substitutions to the naturally occurring sequence. Gene modifying polypeptides also include heterologous constructs, e.g., where one or more of the domains recited above are heterologous to each other, whether through a heterologous fusion (or other conjugate) of otherwise wild-type domains, as well as fusions of modified domains, e.g., by way of replacement or fusion of a heterologous sub-domain or other substituted domain. Exemplary gene modifying polypeptides, and systems comprising them and methods of using them, that can be used in the methods provided herein are described, e.g., in PCT/US2021/020948, which is incorporated herein by reference with respect to gene modifying polypeptides that comprise a retroviral reverse transcriptase domain. In some embodiments, a gene modifying polypeptide integrates a sequence into a gene. In some embodiments, a gene modifying polypeptide integrates a sequence into a sequence outside of a gene.


A “gene modifying system,” as used herein, refers to a system comprising a gene modifying polypeptide and a template nucleic acid.


The term “domain” as used herein refers to a structure of a biomolecule that contributes to a specified function of the biomolecule. A domain may comprise a contiguous region (e.g., a contiguous sequence) or distinct, non-contiguous regions (e.g., non-contiguous sequences) of a biomolecule. Examples of protein domains include, but are not limited to, an endonuclease domain, a DNA binding domain, a reverse transcription domain; an example of a domain of a nucleic acid is a regulatory domain, such as a transcription factor binding domain. In some embodiments, a domain (e.g., a Cas domain) can comprise two or more smaller domains (e.g., a DNA binding domain and an endonuclease domain).


As used herein, the term “exogenous”, when used with reference to a biomolecule (such as a nucleic acid sequence or polypeptide) means that the biomolecule was introduced into a host genome, cell or organism by the hand of man. For example, a nucleic acid that is added into an existing genome, cell, tissue or subject using recombinant DNA techniques or other methods is exogenous to the existing nucleic acid sequence, cell, tissue or subject.


As used herein, “first strand” and “second strand”, as used to describe the individual DNA strands of target DNA, distinguish the two DNA strands based upon which strand the reverse transcriptase domain initiates polymerization, e.g., based upon where target primed synthesis initiates. The first strand refers to the strand of the target DNA upon which the reverse transcriptase domain initiates polymerization, e.g., where target primed synthesis initiates. The second strand refers to the other strand of the target DNA. First and second strand designations do not describe the target site DNA strands in other respects; for example, in some embodiments the first and second strands are nicked by a polypeptide described herein, but the designations ‘first’ and ‘second’ strand have no bearing on the order in which such nicks occur.


The term “heterologous,” as used herein to describe a first element in reference to a second element means that the first element and second element do not exist in nature disposed as described. For example, a heterologous polypeptide, nucleic acid molecule, construct or sequence refers to (a) a polypeptide, nucleic acid molecule or portion of a polypeptide or nucleic acid molecule sequence that is not native to a cell in which it is expressed, (b) a polypeptide or nucleic acid molecule or portion of a polypeptide or nucleic acid molecule that has been altered or mutated relative to its native state, or (c) a polypeptide or nucleic acid molecule with an altered expression as compared to the native expression levels under similar conditions. For example, a heterologous regulatory sequence (e.g., promoter, enhancer) may be used to regulate expression of a gene or a nucleic acid molecule in a way that is different than the gene or a nucleic acid molecule is normally expressed in nature. In another example, a heterologous domain of a polypeptide or nucleic acid sequence (e.g., a DNA binding domain of a polypeptide or nucleic acid encoding a DNA binding domain of a polypeptide) may be disposed relative to other domains or may be a different sequence or from a different source, relative to other domains or portions of a polypeptide or its encoding nucleic acid. In certain embodiments, a heterologous nucleic acid molecule may exist in a native host cell genome, but may have an altered expression level or have a different sequence or both. 
In other embodiments, heterologous nucleic acid molecules may not be endogenous to a host cell or host genome but instead may have been introduced into a host cell by transformation (e.g., transfection, electroporation), wherein the added molecule may integrate into the host genome or can exist as extra-chromosomal genetic material either transiently (e.g., mRNA) or semi-stably for more than one generation (e.g., episomal viral vector, plasmid or other self-replicating vector).


As used herein, “insertion” of a sequence into a target site refers to the net addition of DNA sequence at the target site, e.g., where there are new nucleotides in the heterologous object sequence with no cognate positions in the unedited target site. In some embodiments, a nucleotide alignment of the PBS sequence and heterologous object sequence to the target nucleic acid sequence would result in an alignment gap in the target nucleic acid sequence.


As used herein, a “deletion” generated by a heterologous object sequence in a target site refers to the net deletion of DNA sequence at the target site, e.g., where there are nucleotides in the unedited target site with no cognate positions in the heterologous object sequence. In some embodiments, a nucleotide alignment of the PBS sequence and heterologous object sequence to the target nucleic acid sequence would result in an alignment gap in the molecule comprising the PBS sequence and heterologous object sequence.
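By way of non-limiting illustration, the “net addition” and “net deletion” language of the two preceding definitions can be sketched in Python as follows. The sketch assumes that the unedited target site and the sequence specified by the template are given on the same strand and over the same window; the names NetEdit and classify_net_edit, and the labels “substitution” and “unchanged”, are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetEdit:
    kind: str        # "insertion", "deletion", "substitution", or "unchanged"
    net_change: int  # templated length minus unedited length, in nucleotides


def classify_net_edit(unedited_site: str, templated_site: str) -> NetEdit:
    """Classify an edit by its net change in sequence length.

    A positive net change means the templated sequence has nucleotides with
    no cognate position in the unedited site (an alignment gap in the target),
    and a negative net change means the reverse.
    """
    a = unedited_site.upper()
    b = templated_site.upper()
    delta = len(b) - len(a)
    if delta > 0:
        kind = "insertion"
    elif delta < 0:
        kind = "deletion"
    elif a != b:
        kind = "substitution"
    else:
        kind = "unchanged"
    return NetEdit(kind=kind, net_change=delta)
```

Because the classification depends only on net length, an edit that adds and removes the same number of nucleotides is reported here as a substitution, consistent with the “net” wording of the definitions.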


The term “inverted terminal repeats” or “ITRs” as used herein refers to AAV viral cis-elements named so because of their symmetry. These elements promote efficient multiplication of an AAV genome. It is hypothesized that the minimal elements for ITR function are a Rep-binding site (RBS; 5′-GCGCGCTCGCTCGCTC-3′ for AAV2; SEQ ID NO: 4601) and a terminal resolution site (TRS; 5′-AGTTGG-3′ for AAV2) plus a variable palindromic sequence allowing for hairpin formation. According to the present invention, an ITR comprises at least these three elements (RBS, TRS, and sequences allowing the formation of a hairpin). In addition, in the present invention, the term “ITR” refers to ITRs of known natural AAV serotypes (e.g., ITR of a serotype 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or 11 AAV), to chimeric ITRs formed by the fusion of ITR elements derived from different serotypes, and to functional variants thereof. “Functional variant” refers to a sequence presenting a sequence identity of at least 80%, 85%, 90%, preferably of at least 95% with a known ITR and allowing multiplication of the sequence that includes said ITR in the presence of Rep proteins.


The term “mutation region,” as used herein, refers to a region in a template RNA having one or more sequence differences relative to the corresponding sequence in a target nucleic acid. The sequence difference may comprise, for example, a substitution, insertion, frameshift, or deletion.


The term “mutated” when applied to nucleic acid sequences means that nucleotides in a nucleic acid sequence are inserted, deleted, or changed compared to a reference (e.g., native) nucleic acid sequence. A single alteration may be made at a locus (a point mutation), or multiple nucleotides may be inserted, deleted, or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. A nucleic acid sequence may be mutated by any method known in the art.


“Nucleic acid molecule” refers to both RNA and DNA molecules including, without limitation, complementary DNA (“cDNA”), genomic DNA (“gDNA”), and messenger RNA (“mRNA”), and also includes synthetic nucleic acid molecules, such as those that are chemically synthesized or recombinantly produced, such as RNA templates, as described herein. The nucleic acid molecule can be double-stranded or single-stranded, circular, or linear. If single-stranded, the nucleic acid molecule can be the sense strand or the antisense strand. Unless otherwise indicated, and as an example for all sequences described herein under the general format “SEQ ID NO:,” “nucleic acid comprising SEQ ID NO: 1” refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO: 1, or (ii) a sequence complementary to SEQ ID NO: 1. The choice between the two is dictated by the context in which SEQ ID NO: 1 is used. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target. Nucleic acid sequences of the present disclosure may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more naturally occurring nucleotides with an analog, inter-nucleotide modifications such as uncharged linkages (for example, methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (for example, phosphorothioates, phosphorodithioates, etc.), pendant moieties, (for example, polypeptides), intercalators (for example, acridine, psoralen, etc.), chelators, alkylators, and modified linkages (for example, alpha anomeric nucleic acids, etc.).
Also included are chemically modified bases (see, for example, Table 13), backbones (see, for example, Table 14), and modified caps (see, for example, Table 15). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions.


Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of a molecule, e.g., peptide nucleic acids (PNAs). Other modifications can include, for example, analogs in which the ribose ring contains a bridging moiety or other structure such as modifications found in “locked” nucleic acids (LNAs). In various embodiments, the nucleic acids are in operative association with additional genetic elements, such as tissue-specific expression-control sequence(s) (e.g., tissue-specific promoters and tissue-specific microRNA recognition sequences), as well as additional elements, such as inverted repeats (e.g., inverted terminal repeats, such as elements from or derived from viruses, e.g., AAV ITRs) and tandem repeats, inverted repeats/direct repeats, homology regions (segments with various degrees of homology to a target DNA), untranslated regions (UTRs) (5′, 3′, or both 5′ and 3′ UTRs), and various combinations of the foregoing. The nucleic acid elements of the systems provided by the invention can be provided in a variety of topologies, including single-stranded, double-stranded, circular, linear, linear with open ends, linear with closed ends, and particular versions of these, such as doggybone DNA (dbDNA), closed-ended DNA (ceDNA).


As used herein, a “gene expression unit” is a nucleic acid sequence comprising at least one regulatory nucleic acid sequence operably linked to at least one effector sequence. A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter or enhancer is operably linked to a coding sequence if the promoter or enhancer affects the transcription or expression of the coding sequence. Operably linked DNA sequences may be contiguous or non-contiguous. Where necessary to join two protein-coding regions, operably linked sequences may be in the same reading frame.


The terms “host genome” or “host cell”, as used herein, refer to a cell and/or its genome into which protein and/or genetic material has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell and/or genome, but to the progeny of such a cell and/or the genome of the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term “host cell” as used herein. A host genome or host cell may be an isolated cell or cell line grown in culture, or genomic material isolated from such a cell or cell line, or may be a host cell or host genome which composes living tissue or an organism. In some instances, a host cell may be an animal cell or a plant cell, e.g., as described herein. In certain instances, a host cell may be a mammalian cell, a human cell, avian cell, reptilian cell, bovine cell, horse cell, pig cell, goat cell, sheep cell, chicken cell, or turkey cell. In certain instances, a host cell may be a corn cell, soy cell, wheat cell, or rice cell.


As used herein, “operative association” describes a functional relationship between two nucleic acid sequences, such as 1) a promoter and 2) a heterologous object sequence, and means, in such example, the promoter and heterologous object sequence (e.g., a gene of interest) are oriented such that, under suitable conditions, the promoter drives expression of the heterologous object sequence. For instance, a template nucleic acid carrying a promoter and a heterologous object sequence may be single-stranded, e.g., in either the (+) or (−) orientation. An “operative association” between the promoter and the heterologous object sequence in this template means that, regardless of whether the template nucleic acid will be transcribed in a particular state, when it is in the suitable state (e.g., is in the (+) orientation, in the presence of required catalytic factors, and NTPs, etc.), it is accurately transcribed. Operative association applies analogously to other pairs of nucleic acids, including other tissue-specific expression control sequences (such as enhancers, repressors and microRNA recognition sequences), IR/DR, ITRs, UTRs, or homology regions and heterologous object sequences or sequences encoding a retroviral RT domain.


The term “primer binding site sequence” or “PBS sequence,” as used herein, refers to a portion of a template RNA capable of binding to a region comprised in a target nucleic acid sequence. In some instances, a PBS sequence is a nucleic acid sequence comprising at least 3, 4, 5, 6, 7, or 8 bases with 100% identity to the region comprised in the target nucleic acid sequence. In some embodiments the primer region comprises at least 5, 6, 7, or 8 bases with 100% identity to the region comprised in the target nucleic acid sequence. Without wishing to be bound by theory, in some embodiments when a template RNA comprises a PBS sequence and a heterologous object sequence, the PBS sequence binds to a region comprised in a target nucleic acid sequence, allowing a reverse transcriptase domain to use that region as a primer for reverse transcription, and to use the heterologous object sequence as a template for reverse transcription.
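By way of non-limiting illustration, the “at least N bases with 100% identity” criterion of the preceding definition can be sketched in Python as follows. The sketch assumes that the bases in question are contiguous and that the caller supplies the target region on the strand against which identity is to be assessed; the function names longest_shared_run and has_identity are hypothetical.

```python
def _normalize(seq: str) -> str:
    """Upper-case a sequence and spell RNA uracil as thymine."""
    return seq.upper().replace("U", "T")


def longest_shared_run(pbs: str, region: str) -> int:
    """Length of the longest contiguous stretch of `pbs` found in `region`."""
    pbs, region = _normalize(pbs), _normalize(region)
    best = 0
    for i in range(len(pbs)):
        # Only stretches longer than the current best are of interest.
        for j in range(i + best + 1, len(pbs) + 1):
            if pbs[i:j] in region:
                best = j - i
            else:
                break  # a longer stretch starting at i cannot match either
    return best


def has_identity(pbs: str, region: str, min_bases: int = 5) -> bool:
    """True if `pbs` shares at least `min_bases` contiguous identical bases
    with `region` (5 is one of the thresholds recited above)."""
    return longest_shared_run(pbs, region) >= min_bases
```

A stricter or looser reading (for example, counting non-contiguous identical positions in an alignment) would require a different implementation.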


As used herein, a “stem-loop sequence” refers to a nucleic acid sequence (e.g., RNA sequence) with sufficient self-complementarity to form a stem-loop, e.g., having a stem comprising at least two (e.g., 3, 4, 5, 6, 7, 8, 9, or 10) base pairs, and a loop with at least three (e.g., four) bases. The stem may comprise mismatches or bulges.
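By way of non-limiting illustration, a minimal check for the stem and loop sizes given in the preceding definition can be sketched in Python as follows. The sketch is deliberately simplified: it considers only perfect Watson-Crick stems (no G-U wobble pairs, mismatches, or bulges, although the definition permits mismatches and bulges) and does not evaluate thermodynamic stability. The function name can_form_stem_loop and its parameters are assumptions introduced for illustration.

```python
# Watson-Crick RNA base pairs only (simplifying assumption).
_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_form_stem_loop(seq: str, min_stem: int = 2, min_loop: int = 3) -> bool:
    """Return True if `seq` contains a perfect hairpin with at least
    `min_stem` consecutive base pairs enclosing at least `min_loop`
    unpaired positions."""
    rna = seq.upper().replace("T", "U")
    n = len(rna)
    shortest = 2 * min_stem + min_loop  # smallest hairpin that can qualify
    for i in range(n):
        for j in range(i + shortest - 1, n):
            # Pair i with j, i+1 with j-1, and so on, moving inward while
            # at least `min_loop` positions remain between the paired bases.
            k = 0
            while (i + k) < (j - k) - min_loop and (rna[i + k], rna[j - k]) in _PAIRS:
                k += 1
            if k >= min_stem:
                return True
    return False
```

For example, under these assumptions the seven-nucleotide sequence GCAAAGC qualifies (two base pairs enclosing three unpaired positions), whereas the six-nucleotide sequence GCAAGC is too short to qualify.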


As used herein, a “tissue-specific expression-control sequence” means nucleic acid elements that increase or decrease the level of a transcript comprising the heterologous object sequence in a target tissue in a tissue-specific manner, e.g., preferentially in on-target tissue(s), relative to off-target tissue(s). In some embodiments, a tissue-specific expression-control sequence preferentially drives or represses transcription, activity, or the half-life of a transcript comprising the heterologous object sequence in the target tissue in a tissue-specific manner, e.g., preferentially in an on-target tissue(s), relative to an off-target tissue(s). Exemplary tissue-specific expression-control sequences include tissue-specific promoters, repressors, enhancers, or combinations thereof, as well as tissue-specific microRNA recognition sequences. Tissue specificity refers to on-target (tissue(s) where expression or activity of the template nucleic acid is desired or tolerable) and off-target (tissue(s) where expression or activity of the template nucleic acid is not desired or is not tolerable). For example, a tissue-specific promoter drives expression preferentially in on-target tissues, relative to off-target tissues. In contrast, a microRNA that binds the tissue-specific microRNA recognition sequences is preferentially expressed in off-target tissues, relative to on-target tissues, thereby reducing expression of a template nucleic acid in off-target tissues. Accordingly, a promoter and a microRNA recognition sequence that are specific for the same tissue, such as the target tissue, have contrasting functions (promote and repress, respectively, with concordant expression levels, i.e., high levels of the microRNA in off-target tissues and low levels in on-target tissues, while promoters drive high expression in on-target tissues and low expression in off-target tissues) with regard to the transcription, activity, or half-life of an associated sequence in that tissue.


Table of Contents

1) Introduction


2) Gene modifying systems

    • a) Polypeptide components of gene modifying systems
      • i) Writing domain
      • ii) Endonuclease domains and DNA binding domains
        • (1) Gene modifying polypeptides comprising Cas domains
        • (2) TAL Effectors and Zinc Finger Nucleases
      • iii) Linkers
      • iv) Localization sequences for gene modifying systems
      • v) Evolved Variants of Gene Modifying Polypeptides and Systems
      • vi) Inteins
      • vii) Additional domains
    • b) Template nucleic acids
      • i) gRNA spacer and gRNA scaffold
      • ii) Heterologous object sequence
      • iii) PBS sequence
      • iv) Exemplary Template Sequences
    • c) gRNAs with inducible activity
    • d) Circular RNAs and Ribozymes in Gene Modifying Systems
    • e) Target Nucleic Acid Site
    • f) Second strand nicking


3) Production of Compositions and Systems


4) Therapeutic Applications


5) Administration and Delivery

    • a) Tissue Specific Activity/Administration
      • i) Promoters
      • ii) microRNAs
    • b) Viral vectors and components thereof
    • c) AAV Administration
    • d) Lipid Nanoparticles


6) Kits, Articles of Manufacture, and Pharmaceutical Compositions


7) Chemistry, Manufacturing, and Controls (CMC)


Introduction

This disclosure relates to methods for treating sickle cell disease (SCD) and compositions for targeting, editing, modifying or manipulating a DNA sequence (e.g., inserting a heterologous object sequence into a target site of a mammalian genome) at one or more locations in a DNA sequence in a cell, tissue or subject, e.g., in vivo or in vitro. The heterologous object DNA sequence may include, e.g., a substitution.


More specifically, the disclosure provides methods for treating SCD using reverse transcriptase-based systems for altering a genomic DNA sequence of interest, e.g., by inserting, deleting, or substituting one or more nucleotides into/from the sequence of interest.


The disclosure provides, in part, methods for treating SCD using a gene modifying system comprising a gene modifying polypeptide component and a template nucleic acid (e.g., template RNA) component. In some embodiments, a gene modifying system can be used to introduce an alteration into a target site in a genome. In some embodiments, the gene modifying polypeptide component comprises a writing domain (e.g., a reverse transcriptase domain), a DNA-binding domain, and an endonuclease domain (e.g., nickase domain). In some embodiments, the template nucleic acid (e.g., template RNA) comprises a sequence (e.g., a gRNA spacer) that binds a target site in the genome (e.g., that binds to a second strand of the target site), a sequence (e.g., a gRNA scaffold) that binds the gene modifying polypeptide component, a heterologous object sequence, and a PBS sequence. Without wishing to be bound by theory, it is thought that the template nucleic acid (e.g., template RNA) binds to the second strand of a target site in the genome, and binds to the gene modifying polypeptide component (e.g., localizing the polypeptide component to the target site in the genome). It is thought that the endonuclease (e.g., nickase) of the gene modifying polypeptide component cuts the target site (e.g., the first strand of the target site), e.g., allowing the PBS sequence to bind to a sequence adjacent to the site to be altered on the first strand of the target site. It is thought that the writing domain (e.g., reverse transcriptase domain) of the polypeptide component uses the first strand of the target site that is bound to the complementary sequence comprising the PBS sequence of the template nucleic acid as a primer and the heterologous object sequence of the template nucleic acid as a template to, e.g., polymerize a sequence complementary to the heterologous object sequence. 
Without wishing to be bound by theory, it is thought that selection of an appropriate heterologous object sequence can result in substitution, deletion, and/or insertion of one or more nucleotides at the target site.


Gene Modifying Systems

In some embodiments, a gene modifying system described herein comprises: (A) a gene modifying polypeptide or a nucleic acid encoding the gene modifying polypeptide, wherein the gene modifying polypeptide comprises (i) a reverse transcriptase domain, and either (x) an endonuclease domain that contains DNA binding functionality or (y) an endonuclease domain and separate DNA binding domain; and (B) a template RNA. A gene modifying polypeptide, in some embodiments, acts as a substantially autonomous protein machine capable of integrating a template nucleic acid sequence into a target DNA molecule (e.g., in a mammalian host cell, such as a genomic DNA molecule in the host cell), substantially without relying on host machinery. For example, the gene modifying protein may comprise a DNA-binding domain, a reverse transcriptase domain, and an endonuclease domain. In some embodiments, the DNA-binding function may involve an RNA component that directs the protein to a DNA sequence, e.g., a gRNA spacer. In other embodiments, the gene modifying polypeptide may comprise a reverse transcriptase domain and an endonuclease domain. The RNA template element of a gene modifying system is typically heterologous to the gene modifying polypeptide element and provides an object sequence to be inserted (reverse transcribed) into the host genome. In some embodiments, the gene modifying polypeptide is capable of target primed reverse transcription. In some embodiments, the gene modifying polypeptide is capable of second-strand synthesis.


In some embodiments the gene modifying system is combined with a second polypeptide. In some embodiments, the second polypeptide may comprise an endonuclease domain. In some embodiments, the second polypeptide may comprise a polymerase domain, e.g., a reverse transcriptase domain. In some embodiments, the second polypeptide may comprise a DNA-dependent DNA polymerase domain. In some embodiments, the second polypeptide aids in completion of the genome edit, e.g., by contributing to second-strand synthesis or DNA repair resolution.


A functional gene modifying polypeptide can be made up of unrelated DNA binding, reverse transcription, and endonuclease domains. This modular structure allows combining of functional domains, e.g., dCas9 (DNA binding), MMLV reverse transcriptase (reverse transcription), FokI (endonuclease). In some embodiments, multiple functional domains may arise from a single protein, e.g., Cas9 or Cas9 nickase (DNA binding, endonuclease).


In some embodiments, a gene modifying polypeptide includes one or more domains that, collectively, facilitate 1) binding the template nucleic acid, 2) binding the target DNA molecule, and 3) integration of at least a portion of the template nucleic acid into the target DNA. In some embodiments, the gene modifying polypeptide is an engineered polypeptide that comprises one or more amino acid substitutions to a corresponding naturally occurring sequence. In some embodiments, the gene modifying polypeptide comprises two or more domains that are heterologous relative to each other, e.g., through a heterologous fusion (or other conjugate) of otherwise wild-type domains, as well as fusions of modified domains, e.g., by way of replacement or fusion of a heterologous sub-domain or other substituted domain. For instance, in some embodiments, one or more of: the RT domain is heterologous to the DBD; the DBD is heterologous to the endonuclease domain; or the RT domain is heterologous to the endonuclease domain.


In some embodiments, a template RNA molecule for use in the system comprises, from 5′ to 3′: (1) a gRNA spacer; (2) a gRNA scaffold; (3) a heterologous object sequence; and (4) a primer binding site (PBS) sequence. In some embodiments:

    • (1) Is a gRNA spacer of ~18-22 nt, e.g., is 20 nt.
    • (2) Is a gRNA scaffold comprising one or more hairpin loops, e.g., 1, 2, or 3 loops, for associating the template with a Cas domain, e.g., a nickase Cas9 domain. In some embodiments, the gRNA scaffold comprises the sequence, from 5′ to 3′, GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGGACCGAGTCGGTCC (SEQ ID NO: 5008).

    • (3) In some embodiments, the heterologous object sequence is, e.g., 7-74 nt, e.g., 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, or 80-90 nt in length. In some embodiments, the first (most 5′) base of the sequence is not C.

    • (4) In some embodiments, the PBS sequence that binds the target priming sequence after nicking occurs is, e.g., 3-20 nt, e.g., 7-15 nt, e.g., 12-14 nt. In some embodiments, the PBS sequence has 40-60% GC content.
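
The 5′-to-3′ arrangement of the template RNA components listed above can be illustrated with a short Python sketch. This is a non-limiting illustration under stated assumptions: the function names and the example spacer, heterologous object sequence, and PBS are hypothetical placeholders and are not sequences of this disclosure; only the scaffold is SEQ ID NO: 5008, written in the DNA alphabet as above.

```python
# Illustrative sketch only. The spacer, heterologous object sequence, and PBS
# passed to assemble_template() are hypothetical placeholders.
GRNA_SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC"
    "TTGAAAAAGTGGGACCGAGTCGGTCC"
)  # SEQ ID NO: 5008, DNA alphabet


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def assemble_template(spacer: str, hetero: str, pbs: str) -> str:
    """Join spacer, scaffold, heterologous object sequence, and PBS, 5' to 3'."""
    if not 18 <= len(spacer) <= 22:
        raise ValueError("spacer is outside the ~18-22 nt range")
    if not 3 <= len(pbs) <= 20:
        raise ValueError("PBS is outside the 3-20 nt range")
    return spacer + GRNA_SCAFFOLD + hetero + pbs


# Hypothetical inputs chosen only to exercise the checks.
example_spacer = "ACGTACGTACGTACGTACGT"  # 20 nt
example_hetero = "ATGCATGCATGC"          # 12 nt, first base is not C
example_pbs = "GCATGCATACGT"             # 12 nt
template = assemble_template(example_spacer, example_hetero, example_pbs)
```

The optional preferences in items (3) and (4), such as a first base other than C or a PBS GC content of 40-60%, can be checked on the inputs with `example_hetero[0] != "C"` and `gc_fraction(example_pbs)`.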





In some embodiments, a second gRNA associated with the system may help drive complete integration. In some embodiments, the second gRNA may target a location that is 0-200 nt away from the first-strand nick, e.g., 0-50, 50-100, 100-200 nt away from the first-strand nick. In some embodiments, the second gRNA can only bind its target sequence after the edit is made, e.g., the gRNA binds a sequence present in the heterologous object sequence, but not in the initial target sequence.
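
The condition that a second gRNA binds its target only after the edit is made can be sketched as a simple string check. This is a minimal sketch under simplifying assumptions: exact matching stands in for gRNA binding, PAM requirements and mismatch tolerance are ignored, and the function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def occurs_on_either_strand(site: str, region: str) -> bool:
    """True if site matches region exactly on the given or the opposite strand."""
    site, region = site.upper(), region.upper()
    return site in region or reverse_complement(site) in region


def binds_only_after_edit(site: str, original: str, edited: str) -> bool:
    """True if site is present in the edited sequence but absent from the original."""
    return occurs_on_either_strand(site, edited) and not occurs_on_either_strand(
        site, original
    )


# Toy sequences (hypothetical); a real spacer would be about 20 nt.
original_site = "ACGTTGCAAGGCTA"
edited_site = "ACGTTGCTAGGCTA"  # single A-to-T change relative to original_site
print(binds_only_after_edit("TGCTAGG", original_site, edited_site))
```

A site that spans the edited base satisfies the check, whereas a site lying entirely in unchanged flanking sequence does not.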


In some embodiments, a gene modifying system described herein is used to make an edit in HEK293, K562, U2OS, or HeLa cells. In some embodiments, a gene modifying system is used to make an edit in primary cells, e.g., primary cortical neurons from E18.5 mice.


In some embodiments, a gene modifying polypeptide as described herein comprises a reverse transcriptase or RT domain (e.g., as described herein) that comprises a MoMLV RT sequence or variant thereof. In embodiments, the MoMLV RT sequence comprises one or more mutations selected from D200N, L603W, T330P, T306K, W313F, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, L435G, N454K, H594Q, D653N, R110S, and K103L. In embodiments, the MoMLV RT sequence comprises a combination of mutations, such as D200N, L603W, and T330P, optionally further including T306K and/or W313F.
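
The point-mutation shorthand used above (reference residue, position, replacement residue, e.g., D200N) can be applied to a sequence programmatically. The following Python sketch is illustrative only: the function name and the five-residue toy sequence are hypothetical, and the sketch assumes 1-based positions counted from the first residue of whatever reference sequence is supplied.

```python
import re

# Shorthand such as D200N: reference residue, 1-based position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutations(seq: str, mutations: list[str]) -> str:
    """Apply each mutation label to seq, verifying the reference residue first."""
    residues = list(seq)
    for label in mutations:
        match = MUTATION_RE.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"{label}: position is outside the sequence")
        if residues[pos - 1] != ref:
            raise ValueError(f"{label}: found {residues[pos - 1]} at position {pos}")
        residues[pos - 1] = alt
    return "".join(residues)


TOY_SEQUENCE = "MDLTK"  # hypothetical toy sequence, not an RT domain
mutated = apply_point_mutations(TOY_SEQUENCE, ["D2N", "T4P"])
```

Verifying the reference residue before substituting guards against applying a label to a sequence whose numbering differs from the one the label was defined against.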


In some embodiments, an endonuclease domain (e.g., as described herein) comprises nCas9, e.g., comprising an N863A mutation (e.g., in spCas9) or a H840A mutation.


In some embodiments, the heterologous object sequence (e.g., of a system as described herein) is about 1-50, 50-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, 900-1000, or more, nucleotides in length.


In some embodiments, the RT and endonuclease domains are joined by a flexible linker, e.g., comprising the amino acid sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGSS (SEQ ID NO: 5006).


In some embodiments, the endonuclease domain is N-terminal relative to the RT domain. In some embodiments, the endonuclease domain is C-terminal relative to the RT domain.


In some embodiments, the system incorporates a heterologous object sequence into a target site by TPRT, e.g., as described herein.


In some embodiments, a gene modifying polypeptide comprises a DNA binding domain. In some embodiments, a gene modifying polypeptide comprises an RNA binding domain. In some embodiments, the RNA binding domain comprises an RNA binding domain of B-box protein, MS2 coat protein, dCas, or an element of a sequence of a table herein. In some embodiments, the RNA binding domain is capable of binding to a template RNA with greater affinity than a reference RNA binding domain.


In some embodiments, a gene modifying system is capable of producing an insertion into the target site of at least 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 nucleotides (and optionally no more than 500, 400, 300, 200, or 100 nucleotides). In some embodiments, a gene modifying system is capable of producing an insertion into the target site of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 nucleotides (and optionally no more than 500, 400, 300, 200, or 100 nucleotides). In some embodiments, a gene modifying system is capable of producing an insertion into the target site of at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5 or 10 kilobases (and optionally no more than 1, 5, 10, or 20 kilobases). In some embodiments, a gene modifying system is capable of producing a deletion of at least 81, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides (and optionally no more than 500, 400, 300, or 200 nucleotides). In some embodiments, a gene modifying system is capable of producing a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides (and optionally no more than 500, 400, 300, or 200 nucleotides). In some embodiments, a gene modifying system is capable of producing a deletion of at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5 or 10 kilobases (and optionally no more than 1, 5, 10, or 20 kilobases). 
In some embodiments, a gene modifying system is capable of producing a substitution into the target site of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 or more nucleotides. In some embodiments, a gene modifying system is capable of producing a substitution in the target site of 1-2, 2-3, 3-4, 4-5, 5-10, 10-15, 15-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, or 90-100 nucleotides.


In some embodiments, the substitution is a transition mutation. In some embodiments, the substitution is a transversion mutation. In some embodiments, the substitution converts an adenine to a thymine, an adenine to a guanine, an adenine to a cytosine, a guanine to a thymine, a guanine to a cytosine, a guanine to an adenine, a thymine to a cytosine, a thymine to an adenine, a thymine to a guanine, a cytosine to an adenine, a cytosine to a guanine, or a cytosine to a thymine.
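
The distinction between transition and transversion substitutions can be made concrete with a short sketch. The function name below is hypothetical; the classification itself follows the standard definition, under which a purine-to-purine or pyrimidine-to-pyrimidine change is a transition and any other single-base change is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and replacement bases are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Enumerate the twelve possible substitutions listed in the paragraph above.
all_substitutions = {
    (r, a): classify_substitution(r, a)
    for r in "ACGT"
    for a in "ACGT"
    if r != a
}
```

Of the twelve possible substitutions, four are transitions (A to G, G to A, C to T, T to C) and eight are transversions; an adenine-to-thymine change, for example, is a transversion.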


In some embodiments, an insertion, deletion, substitution, or combination thereof, increases or decreases expression (e.g., transcription or translation) of a gene. In some embodiments, an insertion, deletion, substitution, or combination thereof, increases or decreases expression (e.g., transcription or translation) of a gene by altering, adding, or deleting sequences in a promoter or enhancer, e.g., sequences that bind transcription factors. In some embodiments, an insertion, deletion, substitution, or combination thereof alters translation of a gene (e.g., alters an amino acid sequence), inserts or deletes a start or stop codon, or alters or fixes the translation frame of a gene. In some embodiments, an insertion, deletion, substitution, or combination thereof alters splicing of a gene, e.g., by inserting, deleting, or altering a splice acceptor or donor site. In some embodiments, an insertion, deletion, substitution, or combination thereof alters transcript or protein half-life. In some embodiments, an insertion, deletion, substitution, or combination thereof alters protein localization in the cell (e.g., from the cytoplasm to a mitochondrion, or from the cytoplasm into the extracellular space (e.g., adds a secretion tag)). In some embodiments, an insertion, deletion, substitution, or combination thereof alters (e.g., improves) protein folding (e.g., to prevent accumulation of misfolded proteins). In some embodiments, an insertion, deletion, substitution, or combination thereof, alters, increases, or decreases the activity of a gene, e.g., a protein encoded by the gene.


Exemplary gene modifying polypeptides, and systems comprising them and methods of using them are described, e.g., in PCT/US2021/020948, which is incorporated herein by reference with respect to retroviral RT domains, including the amino acid and nucleic acid sequences therein.


Exemplary gene modifying polypeptides and retroviral RT domain sequences are also described, e.g., in International Application No. PCT/US21/20948 filed Mar. 4, 2021, e.g., at Table 30, Table 31, and Table 44 therein; the entire application is incorporated by reference herein with respect to retroviral RTs, e.g., in said sequences and tables. Accordingly, a gene modifying polypeptide described herein may comprise an amino acid sequence according to any of the Tables mentioned in this paragraph, or a domain thereof (e.g., a retroviral RT domain), or a functional fragment or variant of any of the foregoing, or an amino acid sequence having at least 70%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In some embodiments, a polypeptide for use in any of the systems described herein can be a molecular reconstruction or ancestral reconstruction based upon the aligned polypeptide sequence of multiple homologous proteins. In some embodiments, a reverse transcriptase domain for use in any of the systems described herein can be a molecular reconstruction or an ancestral reconstruction, or can be modified at particular residues, based upon alignments of reverse transcriptase domains from the same or different sources. A skilled artisan can, based on the Accession numbers provided herein, align polypeptides or nucleic acid sequences, e.g., by using routine sequence analysis tools such as Basic Local Alignment Search Tool (BLAST) or CD-Search for conserved domain analysis. Molecular reconstructions can be created based upon sequence consensus, e.g., using approaches described in Ivics et al., Cell 1997, 501-510; Wagstaff et al., Molecular Biology and Evolution 2013, 88-99.


Polypeptide components of gene modifying systems


In some embodiments, the gene modifying polypeptide possesses the functions of DNA target site binding, template nucleic acid (e.g., RNA) binding, DNA target site cleavage, and template nucleic acid (e.g., RNA) writing, e.g., reverse transcription. In some embodiments, each function is contained within a distinct domain. In some embodiments, a function may be attributed to two or more domains (e.g., two or more domains, together, exhibit the functionality). In some embodiments, two or more domains may have the same or similar function (e.g., two or more domains each independently have DNA-binding functionality, e.g., for two different DNA sequences). In other embodiments, one or more domains may be capable of enabling one or more functions, e.g., a Cas9 domain enabling both DNA binding and target site cleavage. In some embodiments, the domains are all located within a single polypeptide. In some embodiments, a first domain is in one polypeptide and a second domain is in a second polypeptide. For example, in some embodiments, the sequences may be split between a first polypeptide and a second polypeptide, e.g., wherein the first polypeptide comprises a reverse transcriptase (RT) domain and wherein the second polypeptide comprises a DNA-binding domain and an endonuclease domain, e.g., a nickase domain. As a further example, in some embodiments, the first polypeptide and the second polypeptide each comprise a DNA binding domain (e.g., a first DNA binding domain and a second DNA binding domain). In some embodiments, the first and second polypeptide may be brought together post-translationally via a split-intein to form a single gene modifying polypeptide.


In some aspects, a gene modifying polypeptide described herein comprises (e.g., a system described herein comprises a gene modifying polypeptide that comprises): 1) a Cas domain (e.g., a Cas nickase domain, e.g., a Cas9 nickase domain); 2) a reverse transcriptase (RT) domain of Table D, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto, wherein the RT domain is C-terminal of the Cas domain; and 3) a linker disposed between the RT domain and the Cas domain, wherein the linker has a sequence from the same row of Table D as the RT domain, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto.


In some embodiments, the RT domain has a sequence with 100% identity to the RT domain of Table D and the linker has a sequence with 100% identity to the linker sequence from the same row of Table D as the RT domain. In some embodiments, the Cas domain comprises a sequence of Table 8, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto. In some embodiments, the gene modifying polypeptide comprises an amino acid sequence according to any of SEQ ID NOs: 1-3332 in the sequence listing, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto.
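
Percent identity thresholds such as those recited above can be evaluated once two sequences have been aligned. The sketch below is a simplified illustration and not a prescribed method: it assumes the two sequences have already been aligned to equal length by a separate tool (e.g., BLAST), uses "-" as the gap character, and counts identity over all aligned columns in which at least one sequence has a residue. Other conventions for the denominator exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues (gaps count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column with gaps in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold (e.g., 70.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two six-residue toy fragments that differ at one position are 5/6 (about 83.3%) identical, which satisfies an "at least 80%" threshold but not an "at least 85%" threshold.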


In some embodiments, the gene modifying polypeptide comprises a GG amino acid sequence between the Cas domain and the linker, an AG amino acid sequence between the RT domain and the second NLS, and/or a GG amino acid sequence between the linker and the RT domain. In some embodiments, the gene modifying polypeptide comprises a sequence of SEQ ID NO: 4000 which comprises the first NLS and the Cas domain, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto. In some embodiments, the gene modifying polypeptide comprises a sequence of SEQ ID NO: 4001 which comprises the second NLS, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identity thereto.









Exemplary N-terminal NLS-Cas9 domain


(SEQ ID NO: 4000)


MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTD





RHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNE





MAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLR





KKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLV





QTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG





NLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADL





FLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV





RQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEEL





LVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREK





IEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQ





SFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAF





LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA





SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTY





AHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFA





NRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQ





TVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIK





ELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVD





HIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNA





KLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRM





NTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYL





NAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFY





SNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSM





PQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTV





AYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKE





VKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLA





SHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK





VLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTST





KEVLDATLIHQSITGLYETRIDLSQLGGDGG





Exemplary C-terminal sequence comprising an NLS


(SEQ ID NO: 4001)


AGKRTADGSEFEKRTADGSEFESPKKKAKVE






Writing Domain (RT Domain)

In certain aspects of the present invention, the writing domain of the gene modifying system possesses reverse transcriptase activity and is also referred to as a reverse transcriptase domain (an RT domain). In some embodiments, the RT domain comprises an RT catalytic portion and an RNA-binding region (e.g., a region that binds the template RNA).


In some embodiments, a nucleic acid encoding the reverse transcriptase is altered from its natural sequence to have altered codon usage, e.g., improved for human cells. In some embodiments, the reverse transcriptase domain is a heterologous reverse transcriptase from a retrovirus. In some embodiments, the RT domain of a gene modifying polypeptide has been mutated from its original amino acid sequence, e.g., has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 substitutions. In some embodiments, the RT domain is derived from the RT of a retrovirus, e.g., HIV-1 RT, Moloney Murine Leukemia Virus (MMLV) RT, avian myeloblastosis virus (AMV) RT, or Rous Sarcoma Virus (RSV) RT.


In some embodiments, the retroviral reverse transcriptase (RT) domain exhibits enhanced stringency of target-primed reverse transcription (TPRT) initiation, e.g., relative to an endogenous RT domain. In some embodiments, the RT domain initiates TPRT when the 3 nt in the target site immediately upstream of the first strand nick, e.g., the genomic DNA priming the RNA template, have at least 66% or 100% complementarity to the 3 nt of homology in the RNA template. In some embodiments, the RT domain initiates TPRT when there are less than 5 nt mismatched (e.g., less than 1, 2, 3, 4, or 5 nt mismatched) between the template RNA homology and the target DNA priming reverse transcription. In some embodiments, the RT domain is modified such that the stringency for mismatches in priming the TPRT reaction is increased, e.g., wherein the RT domain does not tolerate any mismatches or tolerates fewer mismatches in the priming region relative to a wild-type (e.g., unmodified) RT domain. In some embodiments, the RT domain comprises a HIV-1 RT domain. In embodiments, the HIV-1 RT domain initiates lower levels of synthesis even with three nucleotide mismatches relative to an alternative RT domain (e.g., as described by Jamburuthugoda and Eickbush J Mol Biol 407(5):661-672 (2011); incorporated herein by reference in its entirety). In some embodiments, the RT domain forms a dimer (e.g., a heterodimer or homodimer). In some embodiments, the RT domain is monomeric. In some embodiments, an RT domain naturally functions as a monomer or as a dimer (e.g., heterodimer or homodimer). In some embodiments, an RT domain naturally functions as a monomer, e.g., is derived from a virus wherein it functions as a monomer. 
In embodiments, the RT domain is selected from an RT domain from murine leukemia virus (MLV; sometimes referred to as MoMLV) (e.g., UniProt P03355), porcine endogenous retrovirus (PERV) (e.g., UniProt Q4VFZ2), mouse mammary tumor virus (MMTV) (e.g., UniProt P03365), avian reticuloendotheliosis virus (AVIRE) (e.g., UniProtKB accession: P03360), feline leukemia virus (FLV or FeLV) (e.g., UniProtKB accession: P10273), Mason-Pfizer monkey virus (MPMV) (e.g., UniProt P07572), bovine leukemia virus (BLV) (e.g., UniProt P03361), human T-cell leukemia virus-1 (HTLV-1) (e.g., UniProt P03362), human foamy virus (HFV) (e.g., UniProt P14350), simian foamy virus (SFV) (e.g., SFV3L) (e.g., UniProt P23074 or P27401), or bovine foamy/syncytial virus (BFV/BSV) (e.g., UniProt O41894), or a functional fragment or variant thereof (e.g., an amino acid sequence having at least 70%, 80%, 90%, 95%, or 99% identity thereto). In some embodiments, an RT domain is dimeric in its natural functioning. In some embodiments, the RT domain is derived from a virus wherein it functions as a dimer. In embodiments, the RT domain is selected from an RT domain from avian sarcoma/leukemia virus (ASLV) (e.g., UniProt A0A142BKH1), Rous sarcoma virus (RSV) (e.g., UniProt P03354), avian myeloblastosis virus (AMV) (e.g., UniProt Q83133), human immunodeficiency virus type I (HIV-1) (e.g., UniProt P03369), human immunodeficiency virus type II (HIV-2) (e.g., UniProt P15833), simian immunodeficiency virus (SIV) (e.g., UniProt P05896), bovine immunodeficiency virus (BIV) (e.g., UniProt P19560), equine infectious anemia virus (EIAV) (e.g., UniProt P03371), or feline immunodeficiency virus (FIV) (e.g., UniProt P16088) (Herschhorn and Hizi Cell Mol Life Sci 67(16):2717-2747 (2010)), or a functional fragment or variant thereof (e.g., an amino acid sequence having at least 70%, 80%, 90%, 95%, or 99% identity thereto). Naturally heterodimeric RT domains may, in some embodiments, also be functional as homodimers. 
In some embodiments, dimeric RT domains are expressed as fusion proteins, e.g., as homodimeric fusion proteins or heterodimeric fusion proteins. In some embodiments, the RT function of the system is fulfilled by multiple RT domains (e.g., as described herein). In further embodiments, the multiple RT domains are fused or separate, e.g., may be on the same polypeptide or on different polypeptides.
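
The priming-complementarity criterion described above (e.g., at least 66% or 100% complementarity across the 3 nt at the nick) can be sketched as follows. This is an illustrative sketch only: it assumes both strands are written 5′ to 3′ and pair in antiparallel orientation with Watson-Crick DNA:RNA pairing, and the function name and toy sequences are hypothetical.

```python
# Watson-Crick partner in RNA for each DNA base.
DNA_TO_RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def priming_complementarity(dna_primer: str, rna_homology: str) -> float:
    """Fraction of DNA primer bases paired with the RNA homology region.

    Both sequences are written 5' to 3' and are assumed to pair antiparallel,
    so the first DNA base is compared with the last RNA base, and so on.
    """
    dna_primer, rna_homology = dna_primer.upper(), rna_homology.upper()
    if len(dna_primer) != len(rna_homology):
        raise ValueError("sequences must have the same length")
    paired = sum(
        1
        for d, r in zip(dna_primer, reversed(rna_homology))
        if DNA_TO_RNA_PARTNER[d] == r
    )
    return paired / len(dna_primer)


# Toy 3-nt examples (hypothetical): DNA 5'-GAT-3' against two RNA sequences.
full_match = priming_complementarity("GAT", "AUC")  # all three positions pair
one_mismatch = priming_complementarity("GAT", "AUG")  # two of three positions pair
```

With 3 nt of homology, a single mismatch leaves two of three positions paired (about 66.7%), which corresponds to the "at least 66%" case, while a perfect match corresponds to the 100% case.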


In some embodiments, a gene modifying system described herein comprises an integrase domain, e.g., wherein the integrase domain may be part of the RT domain. In some embodiments, an RT domain (e.g., as described herein) comprises an integrase domain. In some embodiments, an RT domain (e.g., as described herein) lacks an integrase domain, or comprises an integrase domain that has been inactivated by mutation or deleted. In some embodiments, a gene modifying system described herein comprises an RNase H domain, e.g., wherein the RNase H domain may be part of the RT domain. In some embodiments, the RNase H domain is not part of the RT domain and is covalently linked via a flexible linker. In some embodiments, an RT domain (e.g., as described herein) comprises an RNase H domain, e.g., an endogenous RNase H domain or a heterologous RNase H domain. In some embodiments, an RT domain (e.g., as described herein) lacks an RNase H domain. In some embodiments, an RT domain (e.g., as described herein) comprises an RNase H domain that has been added, deleted, mutated, or swapped for a heterologous RNase H domain. In some embodiments, the polypeptide comprises an inactivated endogenous RNase H domain. In some embodiments, an endogenous RNase H domain from one of the other domains of the polypeptide is genetically removed such that it is not included in the polypeptide, e.g., the endogenous RNase H domain is partially or completely truncated from the comprising domain. In some embodiments, mutation of an RNase H domain yields a polypeptide exhibiting lower RNase H activity, e.g., as determined by the methods described in Kotewicz et al. Nucleic Acids Res 16(1):265-277 (1988) (incorporated herein by reference in its entirety), e.g., lower by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% compared to an otherwise similar domain without the mutation. In some embodiments, RNase H activity is abolished.


In some embodiments, an RT domain is mutated to increase fidelity compared to an otherwise similar domain without the mutation. For instance, in some embodiments, a YADD (SEQ ID NO: 21999) or YMDD (SEQ ID NO: 22000) motif in an RT domain (e.g., in a reverse transcriptase) is replaced with YVDD (SEQ ID NO: 22001). In embodiments, replacement of the YADD (SEQ ID NO: 21999) or YMDD (SEQ ID NO: 22000) with YVDD (SEQ ID NO: 22001) results in higher fidelity in retroviral reverse transcriptase activity (e.g., as described in Jamburuthugoda and Eickbush J Mol Biol 2011; incorporated herein by reference in its entirety).
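
The motif replacement described above can be sketched as a pattern substitution on an amino acid sequence. This is a simplified illustration under stated assumptions: the function name and toy fragments are hypothetical, and a plain text search cannot distinguish the catalytic-site motif from any other occurrence of the same four residues, so the number of replacements is returned for inspection.

```python
import re

# Matches YADD or YMDD; YVDD is left untouched.
FIDELITY_MOTIF = re.compile(r"Y[AM]DD")


def replace_fidelity_motif(rt_seq: str) -> tuple[str, int]:
    """Replace every YADD or YMDD motif with YVDD.

    Returns the modified sequence and the number of motifs replaced, so the
    caller can confirm that exactly one (the intended) motif was changed.
    """
    return FIDELITY_MOTIF.subn("YVDD", rt_seq.upper())


# Toy fragments (hypothetical), each containing one motif.
from_ymdd, n_ymdd = replace_fidelity_motif("LLQYMDDLL")
from_yadd, n_yadd = replace_fidelity_motif("LLQYADDLL")
unchanged, n_none = replace_fidelity_motif("LLQYVDDLL")
```

A sequence that already carries YVDD is returned unchanged with a replacement count of zero.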


In some embodiments, a gene modifying polypeptide described herein comprises an RT domain having an amino acid sequence according to Table 6, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto. In some embodiments, a nucleic acid described herein encodes an RT domain having an amino acid sequence according to Table 6, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto.









TABLE 6







Exemplary reverse transcriptase domains from retroviruses









RT Name
SEQ ID NO:
RT amino acid sequence





AVIRE_P03360
8,001
TAPLEEEYRLFLEAPIQNVTLLEQWKREIPKVWAEINPPGLASTQAPIHVQLLSTALPVRVRQYPITLEAKRSLRETIRKFRAAGILRPVHSPWNTPLLPV




RKSGTSEYRMVQDLREVNKRVETIHPTVPNPYTLLSLLPPDRIWYSVLDLKDAFFCIPLAPESQLIFAFEWADAEEGESGQLTWTRLPQGFKNSPTLFD




EALNRDLQGFRLDHPSVSLLQYVDDLLIAADTQAACLSATRDLLMTLAELGYRVSGKKAQLCQEEVTYLGFKIHKGSRSLSNSRTQAILQIPVPKTKRQV




REFLGTIGYCRLWIPGFAELAQPLYAATRGGNDPLVWGEKEEEAFQSLKLALTQPPALALPSLDKPFQLFVEETSGAAKGVLTQALGPWKRPVAYLSK




RLDPVAAGWPRCLRAIAAAALLTREASKLTFGQDIEITSSHNLESLLRSPPDKWLTNARITQYQVLLLDPPRVRFKQTAALNPATLLPETDDTLPIHHCLD




TLDSLTSTRPDLTDQPLAQAEATLFTDGSSYIRDGKRYAGAAVVTLDSVIWAEPLPIGTSAQKAELIALTKALEWSKDKSVNIYTDSRYAFATLHVHGMIY




RERGLLTAGGKAIKNAPEILALLTAVWLPKRVAVMHCKGHQKDDAPTSTGNRRADEVAREVAIRPLSTQATIS





AVIRE_P03360_3mut
8,002
TAPLEEEYRLFLEAPIQNVTLLEQWKREIPKVWAEINPPGLASTQAPIHVQLLSTALPVRVRQYPITLEAKRSLRETIRKFRAAGILRPVHSPWNTPLLPV




RKSGTSEYRMVQDLREVNKRVETIHPTVPNPYTLLSLLPPDRIWYSVLDLKDAFFCIPLAPESQLIFAFEWADAEEGESGQLTWTRLPQGFKNSPTLFN




EALNRDLQGFRLDHPSVSLLQYVDDLLIAADTQAACLSATRDLLMTLAELGYRVSGKKAQLCQEEVTYLGFKIHKGSRSLSNSRTQAILQIPVPKTKRQV




REFLGTIGYCRLWIPGFAELAQPLYAATRPGNDPLVWGEKEEEAFQSLKLALTQPPALALPSLDKPFQLFVEETSGAAKGVLTQALGPWKRPVAYLSK




RLDPVAAGWPRCLRAIAAAALLTREASKLTFGQDIEITSSHNLESLLRSPPDKWLTNARITQYQVLLLDPPRVRFKQTAALNPATLLPETDDTLPIHHCLD




TLDSLTSTRPDLTDQPLAQAEATLFTDGSSYIRDGKRYAGAAVVTLDSVIWAEPLPIGTSAQKAELIALTKALEWSKDKSVNIYTDSRYAFATLHVHGMIY




RERGWLTAGGKAIKNAPEILALLTAVWLPKRVAVMHCKGHQKDDAPTSTGNRRADEVAREVAIRPLSTQATIS





AVIRE_P03360_3mutA
8,003
TAPLEEEYRLFLEAPIQNVTLLEQWKREIPKVWAEINPPGLASTQAPIHVQLLSTALPVRVRQYPITLEAKRSLRETIRKFRAAGILRPVHSPWNTPLLPV




RKSGTSEYRMVQDLREVNKRVETIHPTVPNPYTLLSLLPPDRIWYSVLDLKDAFFCIPLAPESQLIFAFEWADAEEGESGQLTWTRLPQGFKNSPTLFN




EALNRDLQGFRLDHPSVSLLQYVDDLLIAADTQAACLSATRDLLMTLAELGYRVSGKKAQLCQEEVTYLGFKIHKGSRSLSNSRTQAILQIPVPKTKRQV




REFLGKIGYCRLFIPGFAELAQPLYAATRPGNDPLVWGEKEEEAFQSLKLALTQPPALALPSLDKPFQLFVEETSGAAKGVLTQALGPWKRPVAYLSKR




LDPVAAGWPRCLRAIAAAALLTREASKLTFGQDIEITSSHNLESLLRSPPDKWLTNARITQYQVLLLDPPRVRFKQTAALNPATLLPETDDTLPIHHQLDT




LDSLTSTRPDLTDQPLAQAEATLFTDGSSYIRDGKRYAGAAVVTLDSVIWAEPLPIGTSAQKAELIALTKALEWSKDKSVNIYTDSRYAFATLHVHGMIY




RERGWLTAGGKAIKNAPEILALLTAVWLPKRVAVMHCKGHQKDDAPTSTGNRRADEVAREVAIRPLSTQATIS





BAEVM_P10272
8,004
TVSLQDEHRLFDIPVTTSLPDVWLQDFPQAWAETGGLGRAKCQAPIIIDLKPTAVPVSIKQYPMSLEAHMGIRQHIIKFLELGVLRPCRSPWNTPLLPVK




KPGTQDYRPVQDLREINKRTVDIHPTVPNPYNLLSTLKPDYSWYTVLDLKDAFFCLPLAPQSQELFAFEWKDPERGISGQLTWTRLPQGFKNSPTLFD




EALHRDLTDFRTQHPEVTLLQYVDDLLLAAPTKKACTQGTRHLLQELGEKGYRASAKKAQICQTKVTYLGYILSEGKRWLTPGRIETVARIPPPRNPRE




VREFLGTAGFCRLWIPGFAELAAPLYALTKESTPFTWQTEHQLAFEALKKALLSAPALGLPDTSKPFTLFLDERQGIAKGVLTQKLGPWKRPVAYLSKK




LDPVAAGWPPCLRIMAATAMLVKDSAKLTLGQPLTVITPHTLEAIVRQPPDRWITNARLTHYQALLLDTDRVQFGPPVTLNPATLLPVPENQPSPHDCR




QVLAETHGTREDLKDQELPDADHTWYTDGSSYLDSGTRRAGAAVVDGHNTIWAQSLPPGTSAQKAELIALTKALELSKGKKANIYTDSRYAFATAHTH




GSIYERRGLLTSEGKEIKNKAEIIALLKALFLPQEVAIIHCPGHQKGQDPVAVGNRQADRVARQAAMAEVLTLATEPDNTSHIT





BAEVM_P10272_3mut
8,005
TVSLQDEHRLFDIPVTTSLPDVWLQDFPQAWAETGGLGRAKCQAPIIIDLKPTAVPVSIKQYPMSLEAHMGIRQHIIKFLELGVLRPCRSPWNTPLLPVK




KPGTQDYRPVQDLREINKRTVDIHPTVPNPYNLLSTLKPDYSWYTVLDLKDAFFCLPLAPQSQELFAFEWKDPERGISGQLTWTRLPQGFKNSPTLFN




EALHRDLTDFRTQHPEVTLLQYVDDLLLAAPTKKACTQGTRHLLQELGEKGYRASAKKAQICQTKVTYLGYILSEGKRWLTPGRIETVARIPPPRNPRE




VREFLGTAGFCRLWIPGFAELAAPLYALTKPSTPFTWQTEHQLAFEALKKALLSAPALGLPDTSKPFTLFLDERQGIAKGVLTQKLGPWKRPVAYLSKK




LDPVAAGWPPCLRIMAATAMLVKDSAKLTLGQPLTVITPHTLEAIVRQPPDRWITNARLTHYQALLLDTDRVQFGPPVTLNPATLLPVPENQPSPHDCR




QVLAETHGTREDLKDQELPDADHTWYTDGSSYLDSGTRRAGAAVVDGHNTIWAQSLPPGTSAQKAELIALTKALELSKGKKANIYTDSRYAFATAHTH




GSIYERRGWLTSEGKEIKNKAEIIALLKALFLPQEVAIIHCPGHQKGQDPVAVGNRQADRVARQAAMAEVLTLATEPDNTSHIT





BAEVM_P10272_3mutA
8,006
TVSLQDEHRLFDIPVTTSLPDVWLQDFPQAWAETGGLGRAKCQAPIIIDLKPTAVPVSIKQYPMSLEAHMGIRQHIIKFLELGVLRPCRSPWNTPLLPVK




KPGTQDYRPVQDLREINKRTVDIHPTVPNPYNLLSTLKPDYSWYTVLDLKDAFFCLPLAPQSQELFAFEWKDPERGISGQLTWTRLPQGFKNSPTLFN




EALHRDLTDFRTQHPEVTLLQYVDDLLLAAPTKKACTQGTRHLLQELGEKGYRASAKKAQICQTKVTYLGYILSEGKRWLTPGRIETVARIPPPRNPRE




VREFLGKAGFCRLFIPGFAELAAPLYALTKPSTPFTWQTEHQLAFEALKKALLSAPALGLPDTSKPFTLFLDERQGIAKGVLTQKLGPWKRPVAYLSKKL




DPVAAGWPPCLRIMAATAMLVKDSAKLTLGQPLTVITPHTLEAIVRQPPDRWITNARLTHYQALLLDTDRVQFGPPVTLNPATLLPVPENQPSPHDCRQ




VLAETHGTREDLKDQELPDADHTWYTDGSSYLDSGTRRAGAAVVDGHNTIWAQSLPPGTSAQKAELIALTKALELSKGKKANIYTDSRYAFATAHTHG




SIYERRGWLTSEGKEIKNKAEIIALLKALFLPQEVAIIHCPGHQKGQDPVAVGNRQADRVARQAAMAEVLTLATEPDNTSHIT





BLVAU_P25059
8,007
GVLDAPPSHIGLEHLPPPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRVTNALTKPIPALSPGPPDLTAIPT




HLPHIICLDLKDAFFQIPVEDRFRSYFAFTLPTPGGLQPHRRFAWRVLPQGFINSPALFERALQEPLRQVSAAFSQSLLVSYMDDILYVSPTEEQRLQCY




QTMAAHLRDLGFQVASEKTRQTPSPVPFLGQMVHERMVTYQSLPTLQISSPISLHQLQTVLGDLQWVSRGTPTTRRPLQLLYSSLKGIDDPRAIIHLSP




EQQQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGSTLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQAQALSSYAKTILKYYHNLPK




TSLDNWIQSSEDPRVQELLQLWPQISSQGIQPPGPWKTLVTRAEVFLTPQFSPEPIPAALCLFSDGAARRGAYCLWKDHLLDFQAVPAPESAQKGELA




GLLAGLAAAPPEPLNIWVDSKYLYSLLRTLVLGAWLQPDPVPSYALLYKSLLRHPAIFVGHVRSHSSASHPIASLNNYVDQL





BLVAU_P25059_2mut
8,008
GVLDAPPSHIGLEHLPPPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRVTNALTKPIPALSPGPPDLTAIPT




HLPHIICLDLKDAFFQIPVEDRFRSYFAFTLPTPGGLQPHRRFAWRVLPQGFINSPALFQRALQEPLRQVSAAFSQSLLVSYMDDILYVSPTEEQRLQCY




QTMAAHLRDLGFQVASEKTRQTPSPVPFLGQMVHERMVTYQSLPTLQISSPISLHQLQTVLGDLQWVSRGTPTTRRPLQLLYSSLKPIDDPRAIIHLSP




EQQQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGSTLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQAQALSSYAKTILKYYHNLPK




TSLDNWIQSSEDPRVQELLQLWPQISSQGIQPPGPWKTLVTRAEVFLTPQFSPEPIPAALCLFSDGAARRGAYCLWKDHLLDFQAVPAPESAQKGELA




GLLAGLAAAPPEPLNIWVDSKYLYSLLRTLVLGAWLQPDPVPSYALLYKSLLRHPAIFVGHVRSHSSASHPIASLNNYVDQL





BLVJ_P03361
8,009
GVLDTPPSHIGLEHLPPPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRATNALTKPIPALSPGPPDLTAIPT




HPPHIICLDLKDAFFQIPVEDRFRFYLSFTLPSPGGLQPHRRFAWRVLPQGFINSPALFERALQEPLRQVSAAFSQSLLVSYMDDILYASPTEEQRSQCY




QALAARLRDLGFQVASEKTSQTPSPVPFLGQMVHEQIVTYQSLPTLQISSPISLHQLQAVLGDLQWVSRGTPTTRRPLQLLYSSLKRHHDPRAIIQLSPE




QLQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGSTLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQTQALSSYAKPILKYYHNLPKTS




LDNWIQSSEDPRVQELLQLWPQISSQGIQPPGPWKTLITRAEVFLTPQFSPDPIPAALCLFSDGATGRGAYCLWKDHLLDFQAVPAPESAQKGELAGL




LAGLAAAPPEPVNIWVDSKYLYSLLRTLVLGAWLQPDPVPSYALLYKSLLRHPAIVVGHVRSHSSASHPIASLNNYVDQL





BLVJ_P03361_2mut
8,010
GVLDTPPSHIGLEHLPPPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRATNALTKPIPALSPGPPDLTAIPT




HPPHIICLDLKDAFFQIPVEDRFRFYLSFTLPSPGGLQPHRRFAWRVLPQGFINSPALFNRALQEPLRQVSAAFSQSLLVSYMDDILYASPTEEQRSQCY




QALAARLRDLGFQVASEKTSQTPSPVPFLGQMVHEQIVTYQSLPTLQISSPISLHQLQAVLGDLQWVSRGTPTTRRPLQLLYSSLKRHHDPRAIIQLSPE




QLQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGSTLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQTQALSSYAKPILKYYHNLPKTS




LDNWIQSSEDPRVQELLQLWPQISSQGIQPPGPWKTLITRAEVFLTPQFSPDPIPAALCLFSDGATGRGAYCLWKDHLLDFQAVPAPESAQKGELAGL




LAGLAAAPPEPVNIWVDSKYLYSLLRTWVLGAWLQPDPVPSYALLYKSLLRHPAIVVGHVRSHSSASHPIASLNNYVDQL





BLVJ_P03361_2mutB
8,011
GVLDTPPSHIGLEHLPPPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRATNALTKPIPALSPGPPDLTAPP




THPPHIICLDLKDAFFQIPVEDRFRFYLSFTLPSPGGLQPHRRFAWRVLPQGFINSPALFQRALQEPLRQVSAAFSQSLLVSYMDDILYASPTEEQRSQC




YQALAARLRDLGFQVASEKTSQTPSPVPFLGQMVHEQIVTYQSLPTLQISSPISLHQLQAVLGDLQWVSRGTPTTRRPLQLLYSSLKRHHDPRAIIQLSP




EQLQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGSTLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQTQALSSYAKPILKYYHNLPKT




SLDNWIQSSEDPRVQELLQLWPQISSQGIQPPGPWKTLITRAEVFLTPQFSPDPIPAALCLFSDGATGRGAYCLWKDHLLDFQAVPAPESAQKGELAG




LLAGLAAAPPEPVNIWVDSKYLYSLLRTWVLGAWLQPDPVPSYALLYKSLLRHPAIVVGHVRSHSSASHPIASLNNYVDQL





FFV_O93209
8,012
MDLLKPLTVERKGVKIKGYWNSQADITCVPKDLLQGEEPVRQQNVTTIHGTQEGDVYYVNLKIDGRRINTEVIGTTLDYAIITPGDVPWILKKPLELTIKLD




LEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGVLIQKESTMNTPVYPV




PKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGFLNSPGLFTGDVVDL




LQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQSILGLLNFARNFIPD




FTELIAPLYALIPKSTKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELKFTELEKLLTTVHKG




LLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHIFYTDGSAITSPTKE




GHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNRKKPLKHISKWKSV




ADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FFV_O93209_2mut
8,013
MDLLKPLTVERKGVKIKGYWNSQADITCVPKDLLQGEEPVRQQNVTTIHGTQEGDVYYVNLKIDGRRINTEVIGTTLDYAIITPGDVPWILKKPLELTIKLD




LEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGVLIQKESTMNTPVYPV




PKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGFLNSPGLFNGDVVDL




LQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQSILGLLNFARNFIPD




FTELIAPLYALIPKSPKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELKFTELEKLLTTVHKG




LLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHIFYTDGSAITSPTKE




GHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNRKKPLKHISKWKSV




ADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FFV_O93209_2mutA
8,014
MDLLKPLTVERKGVKIKGYWNSQADITCVPKDLLQGEEPVRQQNVTTIHGTQEGDVYYVNLKIDGRRINTEVIGTTLDYAIITPGDVPWILKKPLELTIKLD




LEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGVLIQKESTMNTPVYPV




PKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGFLNSPGLFNGDVVDL




LQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQSILGKLNFARNFIPD




FTELIAPLYALIPKSPKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELKFTELEKLLTTVHKG




LLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHIFYTDGSAITSPTKE




GHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNRKKPLKHISKWKSV




ADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FFV_O93209-Pro
8,015
VPWILKKPLELTIKLDLEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGV




LIQKESTMNTPVYPVPKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGF




LNSPGLFTGDVVDLLQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQ




SILGLLNFARNFIPDFTELIAPLYALIPKSTKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELK




FTELEKLLTTVHKGLLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHI




FYTDGSAITSPTKEGHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNR




KKPLKHISKWKSVADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FFV_O93209-Pro_2mut
8,016
VPWILKKPLELTIKLDLEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGV




LIQKESTMNTPVYPVPKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGF




LNSPGLFNGDVVDLLQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQ




SILGLLNFARNFIPDFTELIAPLYALIPKSPKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELK




FTELEKLLTTVHKGLLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHI




FYTDGSAITSPTKEGHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNR




KKPLKHISKWKSVADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FFV_O93209-Pro_2mutA
8,017
VPWILKKPLELTIKLDLEEQQGTLLNNSILSKKGKEELKQLFEKYSALWQSWENQVGHRRIRPHKIATGTVKPTPQKQYHINPKAKPDIQIVINDLLKQGV




LIQKESTMNTPVYPVPKPNGRWRMVLDYRAVNKVTPLIAVQNQHSYGILGSLFKGRYKTTIDLSNGFWAHPIVPEDYWITAFTWQGKQYCWTVLPQGF




LNSPGLFNGDVVDLLQGIPNVEVYVDDVYISHDSEKEHLEYLDILFNRLKEAGYIISLKKSNIANSIVDFLGFQITNEGRGLTDTFKEKLENITAPTTLKQLQ




SILGKLNFARNFIPDFTELIAPLYALIPKSPKNYVPWQIEHSTTLETLITKLNGAEYLQGRKGDKTLIMKVNASYTTGYIRYYNEGEKKPISYVSIVFSKTELK




FTELEKLLTTVHKGLLKALDLSMGQNIHVYSPIVSMQNIQKTPQTAKKALASRWLSWLSYLEDPRIRFFYDPQMPALKDLPAVDTGKDNKKHPSNFQHI




FYTDGSAITSPTKEGHLNAGMGIVYFINKDGNLQKQQEWSISLGNHTAQFAEIAAFEFALKKCLPLGGNILVVTDSNYVAKAYNEELDVWASNGFVNNR




KKPLKHISKWKSVADLKRLRPDVVVTHEPGHQKLDSSPHAYGNNLADQLATQASFKVH





FLV_P10273
8,018
TLQLEEEYRLFEPESTQKQEMDIWLKNFPQAWAETGGMGTAHCQAPVLIQLKATATPISIRQYPMPHEAYQGIKPHIRRMLDQGILKPCQSPWNTPLLP




VKKPGTEDYRPVQDLREVNKRVEDIHPTVPNPYNLLSTLPPSHPWYTVLDLKDAFFCLRLHSESQLLFAFEWRDPEIGLSGQLTWTRLPQGFKNSPTL




FDEALHSDLADFRVRYPALVLLQYVDDLLLAAATRTECLEGTKALLETLGNKGYRASAKKAQICLQEVTYLGYSLKDGQRWLTKARKEAILSIPVPKNSR




QVREFLGTAGYCRLWIPGFAELAAPLYPLTRPGTLFQWGTEQQLAFEDIKKALLSSPALGLPDITKPFELFIDENSGFAKGVLVQKLGPWKRPVAYLSK




KLDTVASGWPPCLRMVAAIAILVKDAGKLTLGQPLTILTSHPVEALVRQPPNKWLSNARMTHYQAMLLDAERVHFGPTVSLNPATLLPLPSGGNHHDC




LQILAETHGTRPDLTDQPLPDADLTWYTDGSSFIRNGEREAGAAVTTESEVIWAAPLPPGTSAQRAELIALTQALKMAEGKKLTVYTDSRYAFATTHVH




GEIYRRRGLLTSEGKEIKNKNEILALLEALFLPKRLSIIHCPGHQKGDSPQAKGNRLADDTAKKAATETHSSLTVLP





FLV_P10273_3mut
8,019
TLQLEEEYRLFEPESTQKQEMDIWLKNFPQAWAETGGMGTAHCQAPVLIQLKATATPISIRQYPMPHEAYQGIKPHIRRMLDQGILKPCQSPWNTPLLP




VKKPGTEDYRPVQDLREVNKRVEDIHPTVPNPYNLLSTLPPSHPWYTVLDLKDAFFCLRLHSESQLLFAFEWRDPEIGLSGQLTWTRLPQGFKNSPTL




FNEALHSDLADFRVRYPALVLLQYVDDLLLAAATRTECLEGTKALLETLGNKGYRASAKKAQICLQEVTYLGYSLKDGQRWLTKARKEAILSIPVPKNSR




QVREFLGTAGYCRLWIPGFAELAAPLYPLTRPGTLFQWGTEQQLAFEDIKKALLSSPALGLPDITKPFELFIDENSGFAKGVLVQKLGPWKRPVAYLSK




KLDTVASGWPPCLRMVAAIAILVKDAGKLTLGQPLTILTSHPVEALVRQPPNKWLSNARMTHYQAMLLDAERVHFGPTVSLNPATLLPLPSGGNHHDC




LQILAETHGTRPDLTDQPLPDADLTWYTDGSSFIRNGEREAGAAVTTESEVIWAAPLPPGTSAQRAELIALTQALKMAEGKKLTVYTDSRYAFATTHVH




GEIYRRRGWLTSEGKEIKNKNEILALLEALFLPKRLSIIHCPGHQKGDSPQAKGNRLADDTAKKAATETHSSLTVLP





FLV_P10273_3mutA
8,020
TLQLEEEYRLFEPESTQKQEMDIWLKNFPQAWAETGGMGTAHCQAPVLIQLKATATPISIRQYPMPHEAYQGIKPHIRRMLDQGILKPCQSPWNTPLLP




VKKPGTEDYRPVQDLREVNKRVEDIHPTVPNPYNLLSTLPPSHPWYTVLDLKDAFFCLRLHSESQLLFAFEWRDPEIGLSGQLTWTRLPQGFKNSPTL




FNEALHSDLADFRVRYPALVLLQYVDDLLLAAATRTECLEGTKALLETLGNKGYRASAKKAQICLQEVTYLGYSLKDGQRWLTKARKEAILSIPVPKNSR




QVREFLGKAGYCRLFIPGFAELAAPLYPLTRPGTLFQWGTEQQLAFEDIKKALLSSPALGLPDITKPFELFIDENSGFAKGVLVQKLGPWKRPVAYLSKK




LDTVASGWPPCLRMVAAIAILVKDAGKLTLGQPLTILTSHPVEALVRQPPNKWLSNARMTHYQAMLLDAERVHFGPTVSLNPATLLPLPSGGNHHDCL




QILAETHGTRPDLTDQPLPDADLTWYTDGSSFIRNGEREAGAAVTTESEVIWAAPLPPGTSAQRAELIALTQALKMAEGKKLTVYTDSRYAFATTHVHG




EIYRRRGWLTSEGKEIKNKNEILALLEALFLPKRLSIIHCPGHQKGDSPQAKGNRLADDTAKKAATETHSSLTVLP





FOAMV_P14350
8,021
MNPLQLLQPLPAEIKGTKLLAHWNSGATITCIPESFLEDEQPIKKTLIKTIHGEKQQNVYYVTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTIL




VPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTPV




YPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFTADV




VDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDLKQLQSILGLLNFAR




NFIPNFAELVQPLYNLIASAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYVFSKAELKFSMLEKL




LTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHPSQYEGVFYTDGSAI




KSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKKPLKHISK




WKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





FOAMV_P14350_2mut
8,022
MNPLQLLQPLPAEIKGTKLLAHWNSGATITCIPESFLEDEQPIKKTLIKTIHGEKQQNVYYVTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTIL




VPLQEYQEKILSKTALPEDQKQQLKTLEVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTPV




YPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFNADV




VDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDLKQLQSILGLLNFAR




NFIPNFAELVQPLYNLIAPAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYVFSKAELKFSMLEKL




LTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHPSQYEGVFYTDGSAI




KSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKKPLKHISK




WKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





FOAMV_P14350_2mutA
8,023
MNPLQLLQPLPAEIKGTKLLAHWNSGATITCIPESFLEDEQPIKKTLIKTIHGEKQQNVYYVTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTIL




VPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTPV




YPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFNADV




VDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDLKQLQSILGKLNFAR




NFIPNFAELVQPLYNLIAPAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYVFSKAELKFSMLEKL




LTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHPSQYEGVFYTDGSAI




KSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKKPLKHISK




WKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





FOAMV_P14350-Pro
8,024
VPWLTQQPLQLTILVPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFTADVVDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDLK




QLQSILGLLNFARNFIPNFAELVQPLYNLIASAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYVF




SKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHPS




QYEGVFYTDGSAIKSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNGF




VNNKKKPLKHISKWKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





FOAMV_P14350-Pro_2mut
8,025
VPWLTQQPLQLTILVPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADVVDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDL




KQLQSILGLLNFARNFIPNFAELVQPLYNLIAPAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYV




FSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHP




SQYEGVFYTDGSAIKSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNG




FVNNKKKPLKHISKWKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





FOAMV_P14350-Pro_2mutA
8,026
VPWLTQQPLQLTILVPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADVVDLLKEIPNVQVYVDDIYLSHDDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDL




KQLQSILGKLNFARNFIPNFAELVQPLYNLIAPAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKKPIMYLNYV




FSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSQSPVKHP




SQYEGVFYTDGSAIKSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNHTAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNG




FVNNKKKPLKHISKWKSIAECLSMKPDITIQHEKGISLQIPVFILKGNALADKLATQGSYVVN





GALV_P21414
8,027
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQKFLDLGVLVPCRSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSYTWYSVLDLKDAFFCLRLHPNSQPLFAFEWKDPEKGNTGQLTWTRLPQGFKNSP




TLFDEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYEDCKKGTQKLLQELSKLGYRVSAKKAQLCQREVTYLGYLLKEGKRWLTPARKATVMKIPVP




TTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTKESIPFIWTEEHQQAFDHIKKALLSAPALALPDLTKPFTLYIDERAGVARGVLTQTLGPWRRPVAY




LSKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLEDQPLPGVPTWYTDGSSFITEGKRRAGAPIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKNINIYTDSRYAFATAHIH




GAIYKQRGLLTSAGKDIKNKEEILALLEAIHLPRRVAIIHCPGHQRGSNPVATGNRRADEAAKQAALSTRVLAGTTKP





GALV_P21414_3mut
8,028
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQKFLDLGVLVPCRSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSYTWYSVLDLKDAFFCLRLHPNSQPLFAFEWKDPEKGNTGQLTWTRLPQGFKNSP




TLFNEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYEDCKKGTQKLLQELSKLGYRVSAKKAQLCQREVTYLGYLLKEGKRWLTPARKATVMKIPVP




TTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTKPSIPFIWTEEHQQAFDHIKKALLSAPALALPDLTKPFTLYIDERAGVARGVLTQTLGPWRRPVAY




LSKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLEDQPLPGVPTWYTDGSSFITEGKRRAGAPIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKNINIYTDSRYAFATAHIH




GAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPRRVAIIHCPGHQRGSNPVATGNRRADEAAKQAALSTRVLAGTTKP





GALV_P21414_3mutA
8,029
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQKFLDLGVLVPCRSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSYTWYSVLDLKDAFFCLRLHPNSQPLFAFEWKDPEKGNTGQLTWTRLPQGFKNSP




TLFNEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYEDCKKGTQKLLQELSKLGYRVSAKKAQLCQREVTYLGYLLKEGKRWLTPARKATVMKIPVP




TTPRQVREFLGKAGFCRLFIPGFASLAAPLYPLTKPSIPFIWTEEHQQAFDHIKKALLSAPALALPDLTKPFTLYIDERAGVARGVLTQTLGPWRRPVAYL




SKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLEDQPLPGVPTWYTDGSSFITEGKRRAGAPIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKNINIYTDSRYAFATAHIH




GAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPRRVAIIHCPGHQRGSNPVATGNRRADEAAKQAALSTRVLAGTTKP





HTL1A_P03362
8,030
AVLGLEHLPRPPQISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTI




DLRDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFEMQLAHILQPIRQAFPQCTILQYMDDILLASPSHEDLLLLSEATMASLI




SHGLPVSENKTQQTPGTIKFLGQIISPNHLTYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQRHTDPRDQIYLNPSQVQSLVQL




RQALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKEQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISTQTFNQFIQTS




DHPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALMPVFTLSPVIINTAPCLFSDGSTSRAAYILWDKQILSQRSFPLPPPHKSAQRAELLGLL




HGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAPFQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQL





HTL1A_P03362_2mut
8,031
AVLGLEHLPRPPQISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTI




DLRDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFQMQLAHILQPIRQAFPQCTILQYMDDILLASPSHEDLLLLSEATMASLI




SHGLPVSENKTQQTPGTIKFLGQIISPNHLTYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQPHTDPRDQIYLNPSQVQSLVQL




RQALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKEQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISTQTFNQFIQTS




DHPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALMPVFTLSPVIINTAPCLFSDGSTSRAAYILWDKQILSQRSFPLPPPHKSAQRAELLGLL




HGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAPFQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQL





HTL1A_P03362_2mutB
8,032
AVLGLEHLPRPPQISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTIDLSSSSPGPPDLSSPPTTLAHLQTI




DLRDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFQMQLAHILQPIRQAFPQCTILQYMDDILLASPSHEDLLLLSEATMASLI




SHGLPVSENKTQQTPGTIKFLGQIISPNHLTYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQPHTDPRDQIYLNPSQVQSLVQL




RQALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKEQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISTQTFNQFIQTS




DHPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALMPVFTLSPVIINTAPCLFSDGSTSRAAYILWDKQILSQRSFPLPPPHKSAQRAELLGLL




HGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAPFQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQL





HTL1C_P14078
8,033
AVLGLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTI




DLKDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWRVLPQGFKNSPTLFEMQLAHILQPIRQAFPQCTILQYMDDILLASPSHADLQLLSEATMASLI




SHGLPVSENKTQQTPGTIKFLGQIISPNHLTYDAVPKVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQRHTDPRDQIYLNPSQVQSLVQL




RQALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISTQTFNQFIQTS




DHPSVPILLHHSHRFKNLGAQTGELWNTFLKTTAPLAPVKALMPVFTLSPVIINTAPCLFSDGSTSQAAYILWDKHILSQRSFPLPPPHKSAQRAELLGLL




HGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAPFQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQL





HTL1C_P14078_2mut
8,034
AVLGLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTI




DLKDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWRVLPQGFKNSPTLFQMQLAHILQPIRQAFPQCTILQYMDDILLASPSHADLQLLSEATMASLI




SHGLPVSENKTQQTPGTIKFLGQIISPNHLTYDAVPKVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQPHTDPRDQIYLNPSQVQSLVQL




RQALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISTQTFNQFIQTS




DHPSVPILLHHSHRFKNLGAQTGELWNTFLKTTAPLAPVKALMPVFTLSPVIINTAPCLFSDGSTSQAAYILWDKHILSQRSFPLPPPHKSAQRAELLGLL




HGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAPFQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQL





HTL1L_P0C211
8,035
GLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTVDLSSSSPGPPDLSSLPTTLAHLQTIDLK




DAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFEMQLASILQPIRQAFPQCVILQYMDDILLASPSPEDLQQLSEATMASLISH




GLPVSQDKTQQTPGTIKFLGQIISPNHITYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQGHTDPRDQIYLNPSQVQSLMQLQ




QALSQNCRSRLAQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISIQTFNQFIQTSD




HPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALTPVFTLSPIIINTAPCLFSDGSTSQAAYILWDKHILSQRSFPLPPPHKSAQQAELLGLLH




GLSSARSWHCLNIFLDSKYLYHYLRTLALGTFQGKSSQAPFQALLPRLLAHKVIYLHHVRSHTNLPDPISKLNALTDALLITPIL





HTL1L_P0C211_2mut
8,036
GLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTVDLSSSSPGPPDLSSLPTTLAHLQTIDLK




DAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFQMQLASILQPIRQAFPQCVILQYMDDILLASPSPEDLQQLSEATMASLISH




GLPVSQDKTQQTPGTIKFLGQIISPNHITYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQGHTDPRDQIYLNPSQVQSLMQLQ




QALSQNCRSRLAQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISIQTFNQFIQTSD




HPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALTPVFTLSPIIINTAPCLFSDGSTSQAAYILWDKHILSQRSFPLPPPHKSAQQAELLGLLH




GLSSARSWHCLNIFLDSKYLYHYLRTLAWGTFQGKSSQAPFQALLPRLLAHKVIYLHHVRSHTNLPDPISKLNALTDALLITPIL





HTL1L_P0C211_2mutB
8,037
GLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTWRFIHDLRATNSLTVDLSSSSPGPPDLSSPPTTLAHLQTIDLK




DAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLPQGFKNSPTLFQMQLASILQPIRQAFPQCVILQYMDDILLASPSPEDLQQLSEATMASLISH




GLPVSQDKTQQTPGTIKFLGQIISPNHITYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQGHTDPRDQIYLNPSQVQSLMQLQ




QALSQNCRSRLAQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQTIHHNISIQTFNQFIQTSD




HPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALTPVFTLSPIIINTAPCLFSDGSTSQAAYILWDKHILSQRSFPLPPPHKSAQQAELLGLLH




GLSSARSWHCLNIFLDSKYLYHYLRTLAWGTFQGKSSQAPFQALLPRLLAHKVIYLHHVRSHTNLPDPISKLNALTDALLITPIL





HTL32_Q0R5R2
8,038
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSVTRDLASPSPGPPDLTSLPQGLPHLRTIDLT




DAFFQIPLPTIFQPYFAFTLPQPNNYGPGTRYSWRVLPQGFKNSPTLFEQQLSHILTPVRKTFPNSLIIQYMDDILLASPAPGELAALTDKVTNALTKEGL




PLSPEKTQATPGPIHFLGQVISQDCITYETLPSINVKSTWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIKLTSIQVQALRTIQKALT




LNCRSRLVNQLPILALIMLRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAVIILDKYSLQHYGQVCKSFHHNISNQALTYYLHTSDQSSV




AILLQHSHRFHNLGAQPSGPWRSLLQMPQIFQNIDVLRPPFTISPVVINHAPCLFSDGSASKAAFIIWDRQVIHQQVLSLPSTCSAQAGELFGLLAGLQK




SQPWVALNIFLDSKFLIGHLRRMALGAFPGPSTQCELHTQLLPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTL32_Q0R5R2_2mut
8,039
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSVTRDLASPSPGPPDLTSLPQGLPHLRTIDLT




DAFFQIPLPTIFQPYFAFTLPQPNNYGPGTRYSWRVLPQGFKNSPTLFQQQLSHILTPVRKTFPNSLIIQYMDDILLASPAPGELAALTDKVTNALTKEGL




PLSPEKTQATPGPIHFLGQVISQDCITYETLPSINVKSTWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIKLTSIQVQALRTIQKALT




LNCRSRLVNQLPILALIMLRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAVIILDKYSLQHYGQVCKSFHHNISNQALTYYLHTSDQSSV




AILLQHSHRFHNLGAQPSGPWRSLLQMPQIFQNIDVLRPPFTISPVVINHAPCLFSDGSASKAAFIIWDRQVIHQQVLSLPSTCSAQAGELFGLLAGLQK




SQPWVALNIFLDSKFLIGHLRRMAWGAFPGPSTQCELHTQLLPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTL32_Q0R5R2_2mutB
8,040
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSVTRDLASPSPGPPDLTSPPQGLPHLRTIDL




TDAFFQIPLPTIFQPYFAFTLPQPNNYGPGTRYSWRVLPQGFKNSPTLFQQQLSHILTPVRKTFPNSLIIQYMDDILLASPAPGELAALTDKVTNALTKEG




LPLSPEKTQATPGPIHFLGQVISQDCITYETLPSINVKSTWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIKLTSIQVQALRTIQKAL




TLNCRSRLVNQLPILALIMLRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAVIILDKYSLQHYGQVCKSFHHNISNQALTYYLHTSDQSS




VAILLQHSHRFHNLGAQPSGPWRSLLQMPQIFQNIDVLRPPFTISPVVINHAPCLFSDGSASKAAFIIWDRQVIHQQVLSLPSTCSAQAGELFGLLAGLQ




KSQPWVALNIFLDSKFLIGHLRRMAWGAFPGPSTQCELHTQLLPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTL3P_Q4U0X6
8,041
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSLTRDLASPSPGPPDLTSLPQDLPHLRTIDLT




DAFFQIPLPAVFQPYFAFTLPQPNNHGPGTRYSWRVLPQGFKNSPTLFEQQLSHILAPVRKAFPNSLIIQYMDDILLASPALRELTALTDKVTNALTKEGL




PMSLEKTQATPGSIHFLGQVISPDCITYETLPSIHVKSIWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIELTSTQVQALKTIQKALA




LNCRSRLVSQLPILALIILRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAIITLDKYSLQHYGQICKSFHHNISNQALTYYLHTSDQSSVAIL




LQHSHRFHNLGAQPSGPWRSLLQVPQIFQNIDVLRPPFIISPVVIDHAPCLFSDGATSKAAFILWDKQVIHQQVLPLPSTCSAQAGELFGLLAGLQKSKP




WPALNIFLDSKFLIGHLRRMALGAFLGPSTQCDLHARLFPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTL3P_Q4U0X6_2mut
8,042
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSLTRDLASPSPGPPDLTSLPQDLPHLRTIDLT




DAFFQIPLPAVFQPYFAFTLPQPNNHGPGTRYSWRVLPQGFKNSPTLFQQQLSHILAPVRKAFPNSLIIQYMDDILLASPALRELTALTDKVTNALTKEG




LPMSLEKTQATPGSIHFLGQVISPDCITYETLPSIHVKSIWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIELTSTQVQALKTIQKAL




ALNCRSRLVSQLPILALIILRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAIITLDKYSLQHYGQICKSFHHNISNQALTYYLHTSDQSSVAI




LLQHSHRFHNLGAQPSGPWRSLLQVPQIFQNIDVLRPPFIISPVVIDHAPCLFSDGATSKAAFILWDKQVIHQQVLPLPSTCSAQAGELFGLLAGLQKSK




PWPALNIFLDSKFLIGHLRRMAWGAFLGPSTQCDLHARLFPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTL3P_Q4U0X6_2mutB
8,043
GLEHLPPPPEVSQFPLNPERLQALTDLVSRALEAKHIEPYQGPGNNPIFPVKKPNGKWRFIHDLRATNSLTRDLASPSPGPPDLTSPPQDLPHLRTIDLT




DAFFQIPLPAVFQPYFAFTLPQPNNHGPGTRYSWRVLPQGFKNSPTLFQQQLSHILAPVRKAFPNSLIIQYMDDILLASPALRELTALTDKVTNALTKEG




LPMSLEKTQATPGSIHFLGQVISPDCITYETLPSIHVKSIWSLAELQSMLGELQWVSKGTPVLRSSLHQLYLALRGHRDPRDTIELTSTQVQALKTIQKAL




ALNCRSRLVSQLPILALIILRPTGTTAVLFQTKQKWPLVWLHTPHPATSLRPWGQLLANAIITLDKYSLQHYGQICKSFHHNISNQALTYYLHTSDQSSVAI




LLQHSHRFHNLGAQPSGPWRSLLQVPQIFQNIDVLRPPFIISPVVIDHAPCLFSDGATSKAAFILWDKQVIHQQVLPLPSTCSAQAGELFGLLAGLQKSK




PWPALNIFLDSKFLIGHLRRMAWGAFLGPSTQCDLHARLFPLLQGKTVYVHHVRSHTLLQDPISRLNEATDALMLAPLLPL





HTLV2_P03363_2mut
8,044
HLPPPPQVDQFPLNLPERLQALNDLVSKALEAGHIEPYSGPGNNPVFPVKKPNGKWRFIHDLRATNAITTTLTSPSPGPPDLTSLPTALPHLQTIDLTDA




FFQIPLPKQYQPYFAFTIPQPCNYGPGTRYAWTVLPQGFKNSPTLFQQQLAAVLNPMRKMFPTSTIVQYMDDILLASPTNEELQQLSQLTLQALTTHGL




PISQEKTQQTPGQIRFLGQVISPNHITYESTPTIPIKSQWTLTELQVILGEIQWVSKGTPILRKHLQSLYSALHPYRDPRACITLTPQQLHALHAIQQALQH




NCRGRLNPALPLLGLISLSTSGTTSVIFQPKQNWPLAWLHTPHPPTSLCPWGHLLACTILTLDKYTLQHYGQLCQSFHHNMSKQALCDFLRNSPHPSV




GILIHHMGRFHNLGSQPSGPWKTLLHLPTLLQEPRLLRPIFTLSPVVLDTAPCLFSDGSPQKAAYVLWDQTILQQDITPLPSHETHSAQKGELLALICGLR




AAKPWPSLNIFLDSKYLIKYLHSLAIGAFLGTSAHQTLQAALPPLLQGKTIYLHHVRSHTNLPDPISTFNEYTDSLILAPLVPL





JSRV_P31623
8,045
PLGTSDSPVTHADPIDWKSEEPVWVDQWPLTQEKLSAAQQLVQEQLRLGHIEPSTSAWNSPIFVIKKKSGKWRLLQDLRKVNETMMHMGALQPGLPT




PSAIPDKSYIIVIDLKDCFYTIPLAPQDCKRFAFSLPSVNFKEPMQRYQWRVLPQGMTNSPTLCQKFVATAIAPVRQRFPQLYLVHYMDDILLAHTDEHLL




YQAFSILKQHLSLNGLVIADEKIQTHFPYNYLGFSLYPRVYNTQLVKLQTDHLKTLNDFQKLLGDINWIRPYLKLPTYTLQPLFDILKGDSDPASPRTLSLE




GRTALQSIEEAIRQQQITYCDYQRSWGLYILPTPRAPTGVLYQDKPLRWIYLSATPTKHLLPYYELVAKIIAKGRHEAIQYFGMEPPFICVPYALEQQDWL




FQFSDNWSIAFANYPGQITHHYPSDKLLQFASSHAFIFPKIVRRQPIPEATLIFTDGSSNGTAALIINHQTYYAQTSFSSAQVVELFAVHQALLTVPTSFNL




FTDSSYVVGALQMIETVPIIGTTSPEVLNLFTLIQQVLHCRQHPCFFGHIRAHSTLPGALVQGNHTADVLTKQVFFQS





JSRV_P31623_2mutB
8,046
PLGTSDSPVTHADPIDWKSEEPVWVDQWPLTQEKLSAAQQLVQEQLRLGHIEPSTSAWNSPIFVIKKKSGKWRLLQDLRKVNETMMHMGALQPGLPT




PSPIPDKSYIIVIDLKDCFYTIPLAPQDCKRFAFSLPSVNFKEPMQRYQWRVLPQGMTNSPTLCQKFVATAIAPVRQRFPQLYLVHYMDDILLAHTDEHLL




YQAFSILKQHLSLNGLVIADEKIQTHFPYNYLGFSLYPRVYNTQLVKLQTDHLKTLNDFQKLLGDINWIRPYLKLPTYTLQPLFDILKGDSDPASPRTLSLE




GRTALQSIEEAIRQQQITYCDYQRSWGLYILPTPRAPTGVLYQDKPLRWIYLSATPTKHLLPYYELVAKIIAKGRHEAIQYFGMEPPFICVPYALEQQDWL




FQFSDNWSIAFANYPGQITHHYPSDKLLQFASSHAFIFPKIVRRQPIPEATLIFTDGSSNGTAALIINHQTYYAQTSFSSAQVVELFAVHQALLTVPTSFNL




FTDSSYVVGALQMIETVPIIGTTSPEVLNLFTLIQQVLHCRQHPCFFGHIRAHSTLPGALVQGNHTADVLTKQVFFQS





KORV_Q9TTC1
8,047
TLGDQGSRGSDPLPEPRVTLTVEGIPTEFLVNTGAEHSVLTKPMGKMGSKRTVVAGATGSKVYPWTTKRLLKIGQKQVTHSFLVIPECPAPLLGRDLLT




KLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPMSKEAREGI




RPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEW




RDPEKGNTGQLTWTRLPQGFKNSPTLFDEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLCREEVTYL




GYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTREKVPFTWTEAHQEAFGRIKEALLSAPALALPDLTKPFAL




YVDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTHYQSLLLN




ERVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQKAELIALT




QALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGLLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTETTKN





KORV_Q9TTC1_3mut
8,048
TLGDQGSRGSDPLPEPRVTLTVEGIPTEFLVNTGAEHSVLTKPMGKMGSKRTVVAGATGSKVYPWTTKRLLKIGQKQVTHSFLVIPECPAPLLGRDLLT




KLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPMSKEAREGI




RPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEW




RDPEKGNTGQLTWTRLPQGFKNSPTLFNEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLCREEVTYL




GYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTRPKVPFTWTEAHQEAFGRIKEALLSAPALALPDLTKPFAL




YVDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTHYQSLLLN




ERVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQKAELIALT




QALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTETTKN





KORV_Q9TTC1_3mutA
8,049
TLGDQGSRGSDPLPEPRVTLTVEGIPTEFLVNTGAEHSVLTKPMGKMGSKRTVVAGATGSKVYPWTTKRLLKIGQKQVTHSFLVIPECPAPLLGRDLLT




KLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPMSKEAREGI




RPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEW




RDPEKGNTGQLTWTRLPQGFKNSPTLFNEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLCREEVTYL




GYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGKAGFCRLFIPGFASLAAPLYPLTRPKVPFTWTEAHQEAFGRIKEALLSAPALALPDLTKPFALY




VDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTHYQSLLLNE




RVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQKAELIALTQ




ALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTETTKN





KORV_Q9TTC1-Pro
8,050
LLGRDLLTKLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPM




SKEAREGIRPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQ




PLFAFEWRDPEKGNTGQLTWTRLPQGFKNSPTLFDEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLC




REEVTYLGYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTREKVPFTWTEAHQEAFGRIKEALLSAPALALPD




LTKPFALYVDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTH




YQSLLLNERVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQ




KAELIALTQALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGLLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTE




TTKN





KORV_Q9TTC1-Pro_3mut
8,051
LLGRDLLTKLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPM




SKEAREGIRPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQ




PLFAFEWRDPEKGNTGQLTWTRLPQGFKNSPTLFNEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLC




REEVTYLGYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTRPKVPFTWTEAHQEAFGRIKEALLSAPALALPD




LTKPFALYVDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTH




YQSLLLNERVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQ




KAELIALTQALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTE




TTKN





KORV_Q9TTC1-Pro_3mutA
8,052
LLGRDLLTKLKAQIQFSTEGPQVTWEDRPAMCLVLNLEEEYRLHEKPVPPSIDPSWLQLFPMVWAEKAGMGLANQVPPVVVELKSDASPVAVRQYPM




SKEAREGIRPHIQRFLDLGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQ




PLFAFEWRDPEKGNTGQLTWTRLPQGFKNSPTLFNEALHRDLASFRALNPQVVMLQYVDDLLVAAPTYRDCKEGTRRLLQELSKLGYRVSAKKAQLC




REEVTYLGYLLKGGKRWLTPARKATVMKIPTPTTPRQVREFLGKAGFCRLFIPGFASLAAPLYPLTRPKVPFTWTEAHQEAFGRIKEALLSAPALALPDL




TKPFALYVDEKEGVARGVLTQTLGPWRRPVAYLSKKLDPVASGWPTCLKAIAAVALLLKDADKLTLGQNVLVIAPHNLESIVRQPPDRWMTNARMTHY




QSLLLNERVSFAPPAILNPATLLPVESDDTPIHICSEILAEETGTRPDLRDQPLPGVPAWYTDGSSFIMDGRRQAGAAIVDNKRTVWASNLPEGTSAQK




AELIALTQALRLAEGKSINIYTDSRYAFATAHVHGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQRGTDPVATGNRKADEAAKQAAQSTRILTE




TTKN





MLVAV_P03356
8,053
TLNLEDEYRLYETSAEPEVSPGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEAKLGIKPHIQRLLDQGILVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHRWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSP




TLFDEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLLTLGNLGYRASAKKAQLCQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLRKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEG




APHDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAF




ATAHIHGEIYRRRGLLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVAV_P03356_3mut
8,054
TLNLEDEYRLYETSAEPEVSPGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEAKLGIKPHIQRLLDQGILVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHRWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSP




TLFNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLLTLGNLGYRASAKKAQLCQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPV




AYLSKKLDPVAAGWPPCLRMVAAIAVLRKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEE




GAPHDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYA




FATAHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVAV_P03356_3mutA
8,055
TLNLEDEYRLYETSAEPEVSPGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEAKLGIKPHIQRLLDQGILVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHRWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSP




TLFNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLLTLGNLGYRASAKKAQLCQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLRKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEG




APHDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAF




ATAHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVBM_Q7SVK7
8,056
TLGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFAT




AHIHGEIYRRRGLLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVBM_Q7SVK7
8,057
TLGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFAT




AHIHGEIYRRRGLLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVBM_Q7SVK7_3mut
8,058
TLGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGA




PHDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL


MLVBM_Q7SVK7_3mut
8,059
TLGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGA




PHDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVBM_Q7SVK7_3mutA_WS
8,060
LGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPV




KKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPTL




FNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKTP




RQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYL




SKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLLI





MLVBM_Q7SVK7_3mutA_WS
8,061
LGIEDEYRLHETSTEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIQQYPMSHEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPV




KKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGMGISGQLTWTRLPQGFKNSPTL




FNEALHRDLADFRIQHPDLILLQYVDDILLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPVPKTP




RQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFSWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYL




SKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTRPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWAGALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLLI





MLVCB_P08361
8,062
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLAGFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPIPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAFQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRSDLMDQPLPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREVATRETPETSTLL





MLVCB_P08361_3mut
8,063
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLAGFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPIPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAFQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGL




QHDCLDILAEAHGTRSDLMDQPLPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAF




ATAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREVATRETPETSTLL





MLVCB_P08361_3mutA
8,064
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLAGFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPIPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAFQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRSDLMDQPLPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREVATRETPETSTLL





MLVF5_P26810
8,065
TLNIEDEYRLHETSKGPDVPLGSTWLSDFPQAWAETGGMGLAFRQAPLIISLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQSLFAFEWKDPEMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGLCRLWIPGFAEMAAPLYPLTKTGTLFKWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDVGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPIVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSFLQEGQRRAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAAGKKLNVYTDSRYAFAT




AHIHGEIYRRRGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNHAEARGNRMADQAAREVATRETPETSTLL





MLVF5_P26810_3mut
8,066
TLNIEDEYRLHETSKGPDVPLGSTWLSDFPQAWAETGGMGLAFRQAPLIISLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQSLFAFEWKDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGLCRLWIPGFAEMAAPLYPLTKPGTLFKWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDVGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPIVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSFLQEGQRRAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAAGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNHAEARGNRMADQAAREVATRETPETSTLL





MLVF5_P26810_3mutA
8,067
TLNIEDEYRLHETSKGPDVPLGSTWLSDFPQAWAETGGMGLAFRQAPLIISLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQSLFAFEWKDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGLCRLFIPGFAEMAAPLYPLTKPGTLFKWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDVGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPIVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSFLQEGQRRAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAAGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNHAEARGNRMADQAAREVATRETPETSTLL





MLVFF_P26809_3mut
8,068
TLNIEDEYRLHETSKGPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQSLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFEWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPIVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVVWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNRAEARGNRMADQAAREVATRETPETSTLL





MLVFF_P26809_3mutA
8,069
TLNIEDEYRLHETSKGPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQSLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGDLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFEWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPIVALNPATLLPLPEEGLQ




HDCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVVWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGNRAEARGNRMADQAAREVATRETPETSTLL





MLVMS_P03355
8,070
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGL




QHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_reference
8,137
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSP





MLVMS_P03355
8,071
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGL




QHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_P03355_3mut
8,072
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGL




QHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_P03355_3mut
8,073
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGL




QHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFA




TAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_P03355_3mutA_WS
8,074
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_P03355_3mutA_WS
8,075
TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLL





MLVMS_P03355_PLV919
8,076
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPSGGSKRTADGSEFE





MLVMS_P03355_PLV919
8,077
TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKT




PRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQ




HNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPSGGSKRTADGSEFE





MLVRD_P11227
8,078
TLNIEDEYRLHEISTEPDVSPGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEAKLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQGLREVNKRVEDIHPTVPNPYNLLSGLPTSHRWYTVLDLKDAFFCLRLHPTSQPLFASEWRDPGMGISGQLTWTRLPQGFKNSPT




LFDEALHRGLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLKTLGNLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPRFAEMAAPLYPLTKTGTLFNWGPDQQKAYHEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTEPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFATA




HIHGEIYKRRGLLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MLVRD_P11227_3mut
8,079
TLNIEDEYRLHEISTEPDVSPGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEAKLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQGLREVNKRVEDIHPTVPNPYNLLSGLPTSHRWYTVLDLKDAFFCLRLHPTSQPLFASEWRDPGMGISGQLTWTRLPQGFKNSPT




LFNEALHRGLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLKTLGNLGYRASAKKAQICQKQVKYLGYLLREGQRWLTEARKETVMGQPTPKT




PRQLREFLGTAGFCRLWIPRFAEMAAPLYPLTKPGTLFNWGPDQQKAYHEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAY




LSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEEGAP




HDCLEILAETHGTEPDLTDQPIPDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKRLNVYTDSRYAFATA




HIHGEIYKRRGWLTSEGREIKNKSEILALLKALFLPKRLSIIHCLGHQKGDSAEARGNRLADQAAREAAIKTPPDTSTLL





MMTVB_P03365
8,080
WVQEISDSRPMLHIYLNGRRFLGLLNTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMK




DIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAV




NATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDS




YIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLF




EILNGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSK




DPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQA




EIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365
8,081
WVQEISDSRPMLHIYLNGRRFLGLLNTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMK




DIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAV




NATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDS




YIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLF




EILNGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSK




DPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQA




EIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365_2mut
8,082
WVQEISDSRPMLHIYLNGRRFLGLLNTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMK




DIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAV




NATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDS




YIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLF




EILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSK




DPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQA




EIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365_2mut_WS
8,083
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365_2mut_WS
8,084
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365_2mutB
8,085
WVQEISDSRPMLHIYLNGRRFLGLLNTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMK




DIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAV




NATMHDMGALQPGLPSPVAPPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDS




YIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLF




EILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSK




DPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQA




EIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365_2mutB
8,086
WVQEISDSRPMLHIYLNGRRFLGLLNTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMK




DIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAV




NATMHDMGALQPGLPSPVAPPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDS




YIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLF




EILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSK




DPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQA




EIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365_2mutB_WS
8,087
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPPAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365_2mutB_WS
8,088
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPPAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365_WS
8,089
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365_WS
8,090
VQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGLGMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDI




KVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNA




TMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYIV




HYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTTGELKPLFEIL




NGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDP




DYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIV




AVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTA





MMTVB_P03365-Pro
8,091
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365-Pro
8,092
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365-Pro_2mut
8,093
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365-Pro_2mut
8,094
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAVPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365-Pro_2mutB
8,095
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAPPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MMTVB_P03365-Pro_2mutB
8,096
GRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQPVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLL




QDLRAVNATMHDMGALQPGLPSPVAPPKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVR




DKYQDSYIVHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLLGNINWIRPFLKLTT




GELKPLFEILNPDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACLWQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHR




SKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHFHLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQ




NTAQQAEIVAVITAFEEVSQPFNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILT





MPMV_P07572
8,097
LTAAIDILAPQQCAEPITWKSDEPVWVDQWPLTNDKLAAAQQLVQEQLEAGHITESSSPWNTPIFVIKKKSGKWRLLQDLRAVNATMVLMGALQPGLP




SPVAIPQGYLKIIIDLKDCFFSIPLHPSDQKRFAFSLPSTNFKEPMQRFQWKVLPQGMANSPTLCQKYVATAIHKVRHAWKQMYIIHYMDDILIAGKDGQ




QVLQCFDQLKQELTAAGLHIAPEKVQLQDPYTYLGFELNGPKITNQKAVIRKDKLQTLNDFQKLLGDINWLRPYLKLTTGDLKPLFDTLKGDSDPNSHR




SLSKEALASLEKVETAIAEQFVTHINYSLPLIFLIFNTALTPTGLFWQDNPIMWIHLPASPKKVLLPYYDAIADLIILGRDHSKKYFGIEPSTIIQPYSKSQIDW




LMQNTEMWPIACASFVGILDNHYPPNKLIQFCKLHTFVFPQIISKTPLNNALLVFTDGSSTGMAAYTLTDTTIKFQTNLNSAQLVELQALIAVLSAFPNQPL




NIYTDSAYLAHSIPLLETVAQIKHISETAKLFLQCQQLIYNRSIPFYIGHVRAHSGLPGPIAQGNQRADLATKIVASNINT





MPMV_P07572_2mutB
8,098
LTAAIDILAPQQCAEPITWKSDEPVWVDQWPLTNDKLAAAQQLVQEQLEAGHITESSSPWNTPIFVIKKKSGKWRLLQDLRAVNATMVLMGALQPGLP




SPVAPPQGYLKIIIDLKDCFFSIPLHPSDQKRFAFSLPSTNFKEPMQRFQWKVLPQGMANSPTLCQKYVATAIHKVRHAWKQMYIIHYMDDILIAGKDGQ




QVLQCFDQLKQELTAAGLHIAPEKVQLQDPYTYLGFELNGPKITNQKAVIRKDKLQTLNDFQKLLGDINWLRPYLKLTTGDLKPLFDTLKPDSDPNSHRS




LSKEALASLEKVETAIAEQFVTHINYSLPLIFLIFNTALTPTGLFWQDNPIMWIHLPASPKKVLLPYYDAIADLIILGRDHSKKYFGIEPSTIIQPYSKSQIDWL




MQNTEMWPIACASFVGILDNHYPPNKLIQFCKLHTFVFPQIISKTPLNNALLVFTDGSSTGMAAYTLTDTTIKFQTNLNSAQLVELQALIAVLSAFPNQPL




NIYTDSAYLAHSIPLLETVAQIKHISETAKLFLQCQQLIYNRSIPFYIGHVRAHSGLPGPIAQGNQRADLATKIVASNINT





PERV_Q4VFZ2
8,099
TLQLDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLL




PVRKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNS




PTIFDEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPT




TAKQVREFLGTAGFCRLWIPGFATLAAPLYPLTKEKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVA




YLSKKLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTH




DCHQLLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAH




VHGAIYKQRGLLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLL





PERV_Q4VFZ2
8,100
TLQLDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLL




PVRKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNS




PTIFDEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPT




TAKQVREFLGTAGFCRLWIPGFATLAAPLYPLTKEKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVA




YLSKKLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTH




DCHQLLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAH




VHGAIYKQRGLLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLL





PERV_Q4VFZ2_3mut
8,101
TLQLDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLL




PVRKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNS




PTIFNEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPT




TAKQVREFLGTAGFCRLWIPGFATLAAPLYPLTKPKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVA




YLSKKLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTH




DCHQLLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAH




VHGAIYKQRGWLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLL





PERV_Q4VFZ2_3mut
8,102
TLQLDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLL




PVRKPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNS




PTIFNEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPT




TAKQVREFLGTAGFCRLWIPGFATLAAPLYPLTKPKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVA




YLSKKLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTH




DCHQLLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAH




VHGAIYKQRGWLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLL





PERV_Q4VFZ2_3mutA_WS
8,103
LDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLLPVR




KPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNSPTIF




NEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPTTAK




QVREFLGKAGFCRLFIPGFATLAAPLYPLTKPKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVAYLSK




KLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTHDCHQ




LLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAHVHGAI




YKQRGWLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLLP





PERV_Q4VFZ2_3mutA_WS
8,104
LDDEYRLYSPLVKPDQNIQFWLEQFPQAWAETAGMGLAKQVPPQVIQLKASATPVSVRQYPLSKEAQEGIRPHVQRLIQQGILVPVQSPWNTPLLPVR




KPGTNDYRPVQDLREVNKRVQDIHPTVPNPYNLLCALPPQRSWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPGTGRTGQLTWTRLPQGFKNSPTIF




NEALHRDLANFRIQHPQVTLLQYVDDLLLAGATKQDCLEGTKALLLELSDLGYRASAKKAQICRREVTYLGYSLRDGQRWLTEARKKTVVQIPAPTTAK




QVREFLGKAGFCRLFIPGFATLAAPLYPLTKPKGEFSWAPEHQKAFDAIKKALLSAPALALPDVTKPFTLYVDERKGVARGVLTQTLGPWRRPVAYLSK




KLDPVASGWPVCLKAIAAVAILVKDADKLTLGQNITVIAPHALENIVRQPPDRWMTNARMTHYQSLLLTERVTFAPPAALNPATLLPEETDEPVTHDCHQ




LLIEETGVRKDLTDIPLTGEVLTWFTDGSSYVVEGKRMAGAAVVDGTRTIWASSLPEGTSAQKAELMALTQALRLAEGKSINIYTDSRYAFATAHVHGAI




YKQRGWLTSAGREIKNKEEILSLLEALHLPKRLAIIHCPGHQKAKDPISRGNQMADRVAKQAAQGVNLLP





SFV1_P23074
8,105
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPEAFLEDERPIQTMLIKTIHGEKQQDVYYLTFKVQGRKVEAEVLASPYDYILLNPSDVPWLMKKPLQL




TVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQGVLIQQNSTMNT




PVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFTAD




VVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLKQLQSILGLLNFAR




NFIPNYSELVKPLYTIVANANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSKAEAKFTQTEKLL




TTMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEFAMVFYTDGSAIK




HPDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNNKKKPLRHVSKW




KSIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV1_P23074_2mut
8,106
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPEAFLEDERPIQTMLIKTIHGEKQQDVYYLTFKVQGRKVEAEVLASPYDYILLNPSDVPWLMKKPLQL




TVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQGVLIQQNSTMNT




PVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFNAD




VVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLKQLQSILGLLNFAR




NFIPNYSELVKPLYTIVAPANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSKAEAKFTQTEKLLT




TMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEFAMVFYTDGSAIKH




PDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNNKKKPLRHVSKWK




SIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV1_P23074_2mutA
8,107
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPEAFLEDERPIQTMLIKTIHGEKQQDVYYLTFKVQGRKVEAEVLASPYDYILLNPSDVPWLMKKPLQL




TVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQGVLIQQNSTMNT




PVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFNAD




VVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLKQLQSILGKLNFAR




NFIPNYSELVKPLYTIVAPANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSKAEAKFTQTEKLLT




TMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEFAMVFYTDGSAIKH




PDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNNKKKPLRHVSKWK




SIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV1_P23074-Pro
8,108
VPWLMKKPLQLTVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQ




GVLIQQNSTMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFTADVVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLKQ




LQSILGLLNFARNFIPNYSELVKPLYTIVANANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSKA




EAKFTQTEKLLTTMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEFA




MVFYTDGSAIKHPDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNNK




KKPLRHVSKWKSIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV1_P23074-Pro_2mut
8,109
VPWLMKKPLQLTVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQ




GVLIQQNSTMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADVVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLK




QLQSILGLLNFARNFIPNYSELVKPLYTIVAPANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSK




AEAKFTQTEKLLTTMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEF




AMVFYTDGSAIKHPDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNN




KKKPLRHVSKWKSIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV1_P23074-Pro_2mutA
8,110
VPWLMKKPLQLTVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTLAPRPQKQYPINPKAKPSIQIVIDDLLKQ




GVLIQQNSTMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADVVDLLKEIPNVQAYVDDIYISHDDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLK




QLQSILGKLNFARNFIPNYSELVKPLYTIVAPANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKRPIMYVNYIFSK




AEAKFTQTEKLLTTMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPRIQFHYDKSLPELQQIPNVTEDVIAKTKHPSEF




AMVFYTDGSAIKHPDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDHTAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNN




KKKPLRHVSKWKSIAECLQLKPDIIIMHEKGHQQPMTTLHTEGNNLADKLATQGSYVVH





SFV3L_P27401
8,111
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPQAFLEEEVPIKNIWIKTIHGEKEQPVYYLTFKIQGRKVEAEVISSPYDYILVSPSDIPWLMKKPLQLTT




LVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQGVLIQQNSIMNTP




VYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQGFLNSPALFTADV




VDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDLKQLQSILGLLNFAR




NFIPNFSELVKPLYNIIATANGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVYTKAEVKFTNTEKLL




TTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEFSMVFYTDGSAIKHP




NVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFNNKKKPLKHVSKWK




SIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFV3L_P27401_2mut
8,112
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPQAFLEEEVPIKNIWIKTIHGEKEQPVYYLTFKIQGRKVEAEVISSPYDYILVSPSDIPWLMKKPLQLTT




LVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQGVLIQQNSIMNTP




VYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQGFLNSPALFNADV




VDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDLKQLQSILGLLNFAR




NFIPNFSELVKPLYNIIATAPGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVYTKAEVKFTNTEKLL




TTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEFSMVFYTDGSAIKHP




NVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFNNKKKPLKHVSKWK




SIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFV3L_P27401_2mutA
8,113
MDPLQLLQPLEAEIKGTKLKAHWNSGATITCVPQAFLEEEVPIKNIWIKTIHGEKEQPVYYLTFKIQGRKVEAEVISSPYDYILVSPSDIPWLMKKPLQLTT




LVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQGVLIQQNSIMNTP




VYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQGFLNSPALFNADV




VDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDLKQLQSILGKLNFA




RNFIPNFSELVKPLYNIIATAPGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVYTKAEVKFTNTEKL




LTTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEFSMVFYTDGSAIKH




PNVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFNNKKKPLKHVSKW




KSIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFV3L_P27401-Pro
8,114
IPWLMKKPLQLTTLVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQ




GVLIQQNSIMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQ




GFLNSPALFTADVVDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDL




KQLQSILGLLNFARNFIPNFSELVKPLYNIIATANGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVY




TKAEVKFTNTEKLLTTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEF




SMVFYTDGSAIKHPNVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFN




NKKKPLKHVSKWKSIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFV3L_P27401-Pro_2mut
8,115
IPWLMKKPLQLTTLVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQ




GVLIQQNSIMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQ




GFLNSPALFNADVVDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDL




KQLQSILGLLNFARNFIPNFSELVKPLYNIIATAPGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVY




TKAEVKFTNTEKLLTTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEF




SMVFYTDGSAIKHPNVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFN




NKKKPLKHVSKWKSIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFV3L_P27401-Pro_2mutA
8,116
IPWLMKKPLQLTTLVPLQEYEERLLKQTMLTGSYKEKLQSLFLKYDALWQHWENQVGHRRIKPHHIATGTVNPRPQKQYPINPKAKASIQTVINDLLKQ




GVLIQQNSIMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILSSIFRGKYKTTLDLSNGFWAHSITPESYWLTAFTWLGQQYCWTRLPQ




GFLNSPALFNADVVDLLKEVPNVQVYVDDIYISHDDPREHLEQLEKVFSLLLNAGYVVSLKKSEIAQHEVEFLGFNITKEGRGLTETFKQKLLNITPPRDL




KQLQSILGKLNFARNFIPNFSELVKPLYNIIATAPGKYITWTTDNSQQLQNIISMLNSAENLEERNPEVRLIMKVNTSPSAGYIRFYNEFAKRPIMYLNYVY




TKAEVKFTNTEKLLTTIHKGLIKALDLGMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMSYLEDPRIQFHYDKTLPELQQVPTVTDDIIAKIKHPSEF




SMVFYTDGSAIKHPNVNKSHNAGMGIAQVQFKPEFTVINTWSIPLGDHTAQLAEVAAVEFACKKALKIDGPVLIVTDSFYVAESVNKELPYWQSNGFFN




NKKKPLKHVSKWKSIADCIQLKPDIIIIHEKGHQPTASTFHTEGNNLADKLATQGSYVVN





SFVCP_Q87040
8,117
MNPLQLLQPLPAEVKGTKLLAHWNSGATITCIPESFLEDEQPIKQTLIKTIHGEKQQNVYYLTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTI




LVPLQEYQDRILNKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTP




VYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQGFLNSPALFTAD




AVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDLKQLQSILGLLNF




ARNFIPNFAELVQTLYNLIASSKGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYVFSKAELKFSMLE




KLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPSQYEGVFCTDGSA




IKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKEPLKHISK




WKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SFVCP_Q87040_2mut
8,118
MNPLQLLQPLPAEVKGTKLLAHWNSGATITCIPESFLEDEQPIKQTLIKTIHGEKQQNVYYLTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTI




LVPLQEYQDRINKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTP




VYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQGFLNSPALFNAD




AVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDLKQLQSILGLLNF




ARNFIPNFAELVQTLYNLIASSPGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYVFSKAELKFSMLE




KLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPSQYEGVFCTDGSA




IKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKEPLKHISK




WKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SFVCP_Q87040_2mutA
8,119
MNPLQLLQPLPAEVKGTKLLAHWNSGATITCIPESFLEDEQPIKQTLIKTIHGEKQQNVYYLTFKVKGRKVEAEVIASPYEYILLSPTDVPWLTQQPLQLTI




LVPLQEYQDRILNKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTP




VYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQGFLNSPALFNAD




AVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDLKQLQSILGKLNF




ARNFIPNFAELVQTLYNLIASSPGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYVFSKAELKFSMLE




KLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPSQYEGVFCTDGSA




IKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKEPLKHISK




WKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SFVCP_Q87040-Pro
8,120
VPWLTQQPLQLTILVPLQEYQDRILNKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFTADAVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDL




KQLQSILGLLNFARNFIPNFAELVQTLYNLIASSKGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYV




FSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPS




QYEGVFCTDGSAIKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGF




VNNKKEPLKHISKWKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SFVCP_Q87040-Pro_2mut
8,121
VPWLTQQPLQLTILVPLQEYQDRILNKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADAVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDL




KQLQSILGLLNFARNFIPNFAELVQTLYNLIASSPGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYV




FSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPS




QYEGVFCTDGSAIKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGF




VNNKKEPLKHISKWKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SFVCP_Q87040-Pro_2mutA
8,122
VPWLTQQPLQLTILVPLQEYQDRILNKTALPEEQKQQLKALFTKYDNLWQHWENQVGHRKIRPHNIATGDYPPRPQKQYPINPKAKPSIQIVIDDLLKQG




VLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILATIVRQKYKTTLDLANGFWAHPITPDSYWLTAFTWQGKQYCWTRLPQ




GFLNSPALFNADAVDLLKEVPNVQVYVDDIYLSHDNPHEHIQQLEKVFQILLQAGYVVSLKKSEIGQRTVEFLGFNITKEGRGLTDTFKTKLLNVTPPKDL




KQLQSILGKLNFARNFIPNFAELVQTLYNLIASSPGKYIEWTEDNTKQLNKVIEALNTASNLEERLPDQRLVIKVNTSPSAGYVRYYNESGKKPIMYLNYV




FSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPRIQFHYDKTLPELKHIPDVYTSSIPPLKHPS




QYEGVFCTDGSAIKSPDPTKSNNAGMGIVHAIYNPEYKILNQWSIPLGHHTAQMAEIAAVEFACKKALKVPGPVLVITDSFYVAESANKELPYWKSNGF




VNNKKEPLKHISKWKSIAECLSIKPDITIQHEKGHQPINTSIHTEGNALADKLATQGSYVVN





SMRVH_P03364
8,123
PRSRAIDIPVPHADKISWKITDPVWVDQWPLTYEKTLAAIALVQEQLAAGHIEPTNSPWNTPIFIIKKKSGSWRLLQDLRAVNKVMVPMGALQPGLPSPV




AIPLNYHKIVIDLKDCFFTIPLHPEDRPYFAFSVPQINFQSPMPRYQWKVLPQGMANSPTLCQKFVAAAIAPVRSQWPEAYILHYMDDILLACDSAEAAK




ACYAHIISCLTSYGLKIAPDKVQVSEPFSYLGFELHHQQVFTPRVCLKTDHLKTLNDFQKLLGDIQWLRPYLKLPTSALVPLNNILKGDPNPLSVRALTPE




AKQSLALINKAIQNQSVQQISYNLPLVLLLLPTPHTPTAVFWQPNGTDPTKNGSPLLWLHLPASPSKVLLTYPSLLAMLIIKGRYTGRQLFGRDPHSIIIPY




TQDQLTWLLQTSDEWAIALSSFTGDIDNHYPSDPVIQFAKLHQFIFPKITKCAPIPQATLVFTDGSSNGIAAYVIDNQPISIKSPYLSAQLVELYAILQVFTV




LAHQPFNLYTDSAYIAQSVPLLETVPFIKSSTNATPLFSKLQQLILNRQHPFFIGHLRAHLNLPGPLAEGNALADAATQIFPIISD





SMRVH_P03364_2mut
8,124
PRSRAIDIPVPHADKISWKITDPVWVDQWPLTYEKTLAAIALVQEQLAAGHIEPTNSPWNTPIFIIKKKSGSWRLLQDLRAVNKVMVPMGALQPGLPSPV




AIPLNYHKIVIDLKDCFFTIPLHPEDRPYFAFSVPQINFQSPMPRYQWKVLPQGMANSPTLCQKFVAAAIAPVRSQWPEAYILHYMDDILLACDSAEAAK




ACYAHIISCLTSYGLKIAPDKVQVSEPFSYLGFELHHQQVFTPRVCLKTDHLKTLNDFQKLLGDIQWLRPYLKLPTSALVPLNNILKPDPNPLSVRALTPE




AKQSLALINKAIQNQSVQQISYNLPLVLLLLPTPHTPTAVFWQPNGTDPTKNGSPLLWLHLPASPSKVLLTYPSLLAMLIIKGRYTGRQLFGRDPHSIIIPY




TQDQLTWLLQTSDEWAIALSSFTGDIDNHYPSDPVIQFAKLHQFIFPKITKCAPIPQATLVFTDGSSNGIAAYVIDNQPISIKSPYLSAQLVELYAILQVFTV




LAHQPFNLYTDSAYIAQSVPLLETVPFIKSSTNATPLFSKLQQLILNRQHPFFIGHLRAHLNLPGPLAEGNALADAATQIFPIISD





SMRVH_P03364_2mutB
8,125
PRSRAIDIPVPHADKISWKITDPVWVDQWPLTYEKTLAAIALVQEQLAAGHIEPTNSPWNTPIFIIKKKSGSWRLLQDLRAVNKVMVPMGALQPGLPSPV




APPLNYHKIVIDLKDCFFTIPLHPEDRPYFAFSVPQINFQSPMPRYQWKVLPQGMANSPTLCQKFVAAAIAPVRSQWPEAYILHYMDDILLACDSAEAAK




ACYAHIISCLTSYGLKIAPDKVQVSEPFSYLGFELHHQQVFTPRVCLKTDHLKTLNDFQKLLGDIQWLRPYLKLPTSALVPLNNILKPDPNPLSVRALTPE




AKQSLALINKAIQNQSVQQISYNLPLVLLLLPTPHTPTAVFWQPNGTDPTKNGSPLLWLHLPASPSKVLLTYPSLLAMLIIKGRYTGRQLFGRDPHSIIIPY




TQDQLTWLLQTSDEWAIALSSFTGDIDNHYPSDPVIQFAKLHQFIFPKITKCAPIPQATLVFTDGSSNGIAAYVIDNQPISIKSPYLSAQLVELYAILQVFTV




LAHQPFNLYTDSAYIAQSVPLLETVPFIKSSTNATPLFSKLQQLILNRQHPFFIGHLRAHLNLPGPLAEGNALADAATQIFPIISD





SRV2_P51517
8,126
LATAVDILAPQRYADPITWKSDEPVWVDQWPLTQEKLAAAQQLVQEQLQAGHIIESNSPWNTPIFVIKKKSGKWRLLQDLRAVNATMVLMGALQPGLP




SPVAIPQGYFKIVIDLKDCFFTIPLQPVDQKRFAFSLPSTNFKQPMKRYQWKVLPQGMANSPTLCQKYVAAAIEPVRKSWAQMYIIHYMDDILIAGKLGE




QVLQCFAQLKQALTTTGLQIAPEKVQLQDPYTYLGFQINGPKITNQKAVIRRDKLQTLNDFQKLLGDINWLRPYLHLTTGDLKPLFDILKGDSNPNSPRS




LSEAALASLQKVETAIAEQFVTQIDYTQPLTFLIFNTTLTPTGLFWQNNPVMWVHLPASPKKVLLPYYDAIADLIILGRDNSKKYFGLEPSTIIQPYSKSQIH




WLMQNTETWPIACASYAGNIDNHYPPNKLIQFCKLHAVVFPRIISKTPLDNALLVFTDGSSTGIAAYTFEKTTVRFKTSHTSAQLVELQALIAVLSAFPHR




ALNVYTDSAYLAHSIPLLETVSHIKHISDTAKFFLQCQQLIYNRSIPFYLGHIRAHSGLPGPLSQGNHITDLATKVVATTLTT





SRV2_P51517_2mutB
8,127
LATAVDILAPQRYADPITWKSDEPVWVDQWPLTQEKLAAAQQLVQEQLQAGHIIESNSPWNTPIFVIKKKSGKWRLLQDLRAVNATMVLMGALQPGLP




SPVAPPQGYFKIVIDLKDCFFTIPLQPVDQKRFAFSLPSTNFKQPMKRYQWKVLPQGMANSPTLCQKYVAAAIEPVRKSWAQMYIIHYMDDILIAGKLGE




QVLQCFAQLKQALTTTGLQIAPEKVQLQDPYTYLGFQINGPKITNQKAVIRRDKLQTLNDFQKLLGDINWLRPYLHLTTGDLKPLFDILKGDSNPNSPRS




LSEAALASLQKVETAIAEQFVTQIDYTQPLTFLIFNTTLTPTGLFWQNNPVMWVHLPASPKKVLLPYYDAIADLIILGRDNSKKYFGLEPSTIIQPYSKSQIH




WLMQNTETWPIACASYAGNIDNHYPPNKLIQFCKLHAVVFPRIISKTPLDNALLVFTDGSSTGIAAYTFEKTTVRFKTSHTSAQLVELQALIAVLSAFPHR




ALNVYTDSAYLAHSIPLLETVSHIKHISDTAKFFLQCQQLIYNRSIPFYLGHIRAHSGLPGPLSQGNHITDLATKVVATTLTT





WDSV_O92815
8,128
SCQTKNTLNIDEYLLQFPDQLWASLPTDIGRMLVPPITIKIKDNASLPSIRQYPLPKDKTEGLRPLISSLENQGILIKCHSPCNTPIFPIKKAGRDEYRMIHD




LRAINNIVAPLTAVVASPTTVLSNLAPSLHWFTVIDLSNAFFSVPIHKDSQYLFAFTFEGHQYTWTVLPQGFIHSPTLFSQALYQSLHKIKFKISSEICIYMD




DVLIASKDRDTNLKDTAVMLQHLASEGHKVSKKKLQLCQQEVVYLGQLLTPEGRKILPDRKVTVSQFQQPTTIRQIRAFLGLVGYCRHWIPEFSIHSKFL




EKQLKKDTAEPFQLDDQQVEAFNKLKHAITTAPVLVVPDPAKPFQLYTSHSEHASIAVLTQKHAGRTRPIAFLSSKFDAIESGLPPCLKACASIHRSLTQA




DSFILGAPLIIYTTHAICTLLQRDRSQLVTASRFSKWEADLLRPELTFVACSAVSPAHLYMQSCENNIPPHDCVLLTHTISRPRPDLSDLPIPDPDMTLFSD




GSYTTGRGGAAVVMHRPVTDDFIIIHQQPGGASAQTAELLALAAACHLATDKTVNIYTDSRYAYGVVHDFGHLWMHRGFVTSAGTPIKNHKEIEYLLKQ




IMKPKQVSVIKIEAHTKGVSMEVRGNAAADEAAKNAVFLVQR





WDSV_O92815_2mut
8,129
SCQTKNTLNIDEYLLQFPDQLWASLPTDIGRMLVPPITIKIKDNASLPSIRQYPLPKDKTEGLRPLISSLENQGILIKCHSPCNTPIFPIKKAGRDEYRMIHD




LRAINNIVAPLTAVVASPTTVLSNLAPSLHWFTVIDLSNAFFSVPIHKDSQYLFAFTFEGHQYTWTVLPQGFIHSPTLFNQALYQSLHKIKFKISSEICIYMD




DVLIASKDRDTNLKDTAVMLQHLASEGHKVSKKKLQLCQQEVVYLGQLLTPEGRKILPDRKVTVSQFQQPTTIRQIRAFLGLVGYCRHWIPEFSIHSKFL




EKQLKPDTAEPFQLDDQQVEAFNKLKHAITTAPVLVVPDPAKPFQLYTSHSEHASIAVLTQKHAGRTRPIAFLSSKFDAIESGLPPCLKACASIHRSLTQA




DSFILGAPLIIYTTHAICTLLQRDRSQLVTASRFSKWEADLLRPELTFVACSAVSPAHLYMQSCENNIPPHDCVLLTHTISRPRPDLSDLPIPDPDMTLFSD




GSYTTGRGGAAVVMHRPVTDDFIIIHQQPGGASAQTAELLALAAACHLATDKTVNIYTDSRYAYGVVHDFGHLWMHRGFVTSAGTPIKNHKEIEYLLKQ




IMKPKQVSVIKIEAHTKGVSMEVRGNAAADEAAKNAVFLVQR





WDSV_O92815_2mutA
8,130
SCQTKNTLNIDEYLLQFPDQLWASLPTDIGRMLVPPITIKIKDNASLPSIRQYPLPKDKTEGLRPLISSLENQGILIKCHSPCNTPIFPIKKAGRDEYRMIHD




LRAINNIVAPLTAVVASPTTVLSNLAPSLHWFTVIDLSNAFFSVPIHKDSQYLFAFTFEGHQYTWTVLPQGFIHSPTLFNQALYQSLHKIKFKISSEICIYMD




DVLIASKDRDTNLKDTAVMLQHLASEGHKVSKKKLQLCQQEVVYLGQLLTPEGRKILPDRKVTVSQFQQPTTIRQIRAFLGKVGYCRHFIPEFSIHSKFL




EKQLKPDTAEPFQLDDQQVEAFNKLKHAITTAPVLVVPDPAKPFQLYTSHSEHASIAVLTQKHAGRTRPIAFLSSKFDAIESGLPPCLKACASIHRSLTQA




DSFILGAPLIIYTTHAICTLLQRDRSQLVTASRFSKWEADLLRPELTFVACSAVSPAHLYMQSCENNIPPHDCVLLTHTISRPRPDLSDLPIPDPDMTLFSD




GSYTTGRGGAAVVMHRPVTDDFIIIHQQPGGASAQTAELLALAAACHLATDKTVNIYTDSRYAYGVVHDFGHLWMHRGFVTSAGTPIKNHKEIEYLLKQ




IMKPKQVSVIKIEAHTKGVSMEVRGNAAADEAAKNAVFLVQR





WMSV_P03359
8,131
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQRFLDLGVLVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEWRDPEKGNTGQLTWTRLPQGFKNSP




TLFDEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYRDCKEGTQKLLQELSKLGYRVSAKKAQLCQKEVTYLGYLLKEGKRWLTPARKATVMKIPPP




TTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTKESIPFIWTEEHQKAFDRIKEALLSAPALALPDLTKPFTLYVDERAGVARGVLTQTLGPWRRPVAY




LSKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLKDQPLPGVPAWYTDGSSFIAEGKRRAGAAIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKDINIYTDSRYAFATAHI




HGAIYKQRGLLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQKGNDPVATGNRRADEAAKQAALSTRVLAETTKP





WMSV_P03359_3mut
8,132
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQRFLDLGVLVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEWRDPEKGNTGQLTWTRLPQGFKNSP




TLFNEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYRDCKEGTQKLLQELSKLGYRVSAKKAQLCQKEVTYLGYLLKEGKRWLTPARKATVMKIPPP




TTPRQVREFLGTAGFCRLWIPGFASLAAPLYPLTKPSIPFIWTEEHQKAFDRIKEALLSAPALALPDLTKPFTLYVDERAGVARGVLTQTLGPWRRPVAY




LSKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLKDQPLPGVPAWYTDGSSFIAEGKRRAGAAIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKDINIYTDSRYAFATAHI




HGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQKGNDPVATGNRRADEAAKQAALSTRVLAETTKP





WMSV_P03359_3mutA
8,133
VLNLEEEYRLHEKPVPSSIDPSWLQLFPTVWAERAGMGLANQVPPVVVELRSGASPVAVRQYPMSKEAREGIRPHIQRFLDLGVLVPCQSPWNTPLL




PVKKPGTNDYRPVQDLREINKRVQDIHPTVPNPYNLLSSLPPSHTWYSVLDLKDAFFCLKLHPNSQPLFAFEWRDPEKGNTGQLTWTRLPQGFKNSP




TLFNEALHRDLAPFRALNPQVVLLQYVDDLLVAAPTYRDCKEGTQKLLQELSKLGYRVSAKKAQLCQKEVTYLGYLLKEGKRWLTPARKATVMKIPPP




TTPRQVREFLGKAGFCRLFIPGFASLAAPLYPLTKPSIPFIWTEEHQKAFDRIKEALLSAPALALPDLTKPFTLYVDERAGVARGVLTQTLGPWRRPVAY




LSKKLDPVASGWPTCLKAVAAVALLLKDADKLTLGQNVTVIASHSLESIVRQPPDRWMTNARMTHYQSLLLNERVSFAPPAVLNPATLLPVESEATPVH




RCSEILAEETGTRRDLKDQPLPGVPAWYTDGSSFIAEGKRRAGAAIVDGKRTVWASSLPEGTSAQKAELVALTQALRLAEGKDINIYTDSRYAFATAHI




HGAIYKQRGWLTSAGKDIKNKEEILALLEAIHLPKRVAIIHCPGHQKGNDPVATGNRRADEAAKQAALSTRVLAETTKP





XMRV6_A1Z651
8,134
TLNIEDEYRLHETSKEPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFDEALHRDLADFRIQHPDLILLQYVDDLLLAATSEQDCQRGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEKEA




PHDCLEILAETHGTRPDLTDQPIPDADYTWYTDGSSFLQEGQRRAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHVHGEIYRRRGLLTSEGREIKNKNEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREAAMKAVLETSTLL





XMRV6_A1Z651_3mut
8,135
TLNIEDEYRLHETSKEPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSEQDCQRGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPV




AYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEKE




APHDCLEILAETHGTRPDLTDQPIPDADYTWYTDGSSFLQEGQRRAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAF




ATAHVHGEIYRRRGWLTSEGREIKNKNEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREAAMKAVLETSTLL





XMRV6_A1Z651_3mutA
8,136
TLNIEDEYRLHETSKEPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP




VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT




LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSEQDCQRGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPK




TPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVA




YLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAMLLDTDRVQFGPVVALNPATLLPLPEKEA




PHDCLEILAETHGTRPDLTDQPIPDADYTWYTDGSSFLQEGQRRAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFAT




AHVHGEIYRRRGWLTSEGREIKNKNEILALLKALFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREAAMKAVLETSTLL









In some embodiments, reverse transcriptase domains are modified, for example by site-specific mutation. In some embodiments, reverse transcriptase domains are engineered to have improved properties, e.g. SuperScript IV (SSIV) reverse transcriptase derived from the MMLV RT. In some embodiments, the reverse transcriptase domain may be engineered to have lower error rates, e.g., as described in WO2001068895, incorporated herein by reference. In some embodiments, the reverse transcriptase domain may be engineered to be more thermostable. In some embodiments, the reverse transcriptase domain may be engineered to be more processive. In some embodiments, the reverse transcriptase domain may be engineered to have tolerance to inhibitors. In some embodiments, the reverse transcriptase domain may be engineered to be faster. In some embodiments, the reverse transcriptase domain may be engineered to better tolerate modified nucleotides in the RNA template. In some embodiments, the reverse transcriptase domain may be engineered to insert modified DNA nucleotides. In some embodiments, the reverse transcriptase domain is engineered to bind a template RNA. In some embodiments, one or more mutations are chosen from D200N, L603W, T330P, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, W313F, L435G, N454K, H594Q, L671P, E69K, H8Y, T306K, or D653N in the RT domain of murine leukemia virus reverse transcriptase or a corresponding mutation at a corresponding position of another RT domain.


In some embodiments, a gene modifying polypeptide comprises the RT domain from a retroviral reverse transcriptase, e.g., a wild-type M-MLV RT, e.g., comprising the following sequence:









M-MLV (WT):


(SEQ ID NO: 5002)


TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII





PLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP





VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLD





LKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFD





EALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNL





GYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQL





REFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQA





LLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLD





PVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDR





WLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILA





EAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAK





ALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRR





RGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNR





MADQAARKAAITETPDTSTLLI






In some embodiments, a gene modifying polypeptide comprises the RT domain from a retroviral reverse transcriptase, e.g., an M-MLV RT, e.g., comprising the following sequence:









(SEQ ID NO: 5003)


TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII





PLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP





VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLD





LKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFD





EALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNL





GYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQL





REFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQA





LLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLD





PVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDR





WLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILA





EAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAK





ALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRR





RGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNR





MADQAARKAAITETPDTSTLL






In some embodiments, a gene modifying polypeptide comprises the RT domain from a retroviral reverse transcriptase comprising the sequence of amino acids 659-1329 of NP_057933. In embodiments, the gene modifying polypeptide further comprises one additional amino acid at the N-terminus of the sequence of amino acids 659-1329 of NP_057933, e.g., as shown below:









(SEQ ID NO: 5004)


TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII





PLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP






VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLD







LKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFD







EALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNL







GYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQL






REFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQA





LLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLD





PVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDR





WLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILA





EAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAK






ALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRR







RGLLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNR







MADQAARKAA








Core RT (bold), annotated per above


RNAseH (underlined), annotated per above


In embodiments, the gene modifying polypeptide further comprises one additional amino acid at the C-terminus of the sequence of amino acids 659-1329 of NP_057933. In embodiments, the gene modifying polypeptide comprises an RNaseH1 domain (e.g., amino acids 1178-1318 of NP_057933).


In some embodiments, a retroviral reverse transcriptase domain, e.g., M-MLV RT, may comprise one or more mutations from a wild-type sequence that may improve features of the RT, e.g., thermostability, processivity, and/or template binding. In some embodiments, an M-MLV RT domain comprises, relative to the M-MLV (WT) sequence above, one or more mutations, e.g., selected from D200N, L603W, T330P, T306K, W313F, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, L435G, N454K, H594Q, D653N, R110S, K103L, e.g., a combination of mutations, such as D200N, L603W, and T330P, optionally further including T306K and W313F. In some embodiments, an M-MLV RT used herein comprises the mutations D200N, L603W, T330P, T306K and W313F. In embodiments, the mutant M-MLV RT comprises the following amino acid sequence:









M-MLV (PE2):


(SEQ ID NO: 5005)


TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII





PLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP





VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLD





LKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFN





EALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNL





GYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQL





REFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQA





LLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLD





PVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDR





WLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILA





EAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAK





ALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRR





RGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNR





MADQAARKAAITETPDTSTLLI
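As an illustration of how the substitution notation used above (e.g., T306K) maps onto a sequence, the following minimal Python sketch applies point mutations using 1-based residue numbering. The function name `apply_point_mutations`, the `start` parameter, and the choice of fragment are assumptions made for illustration only; the fragment is residues 301-350 of the M-MLV (WT) sequence (SEQ ID NO: 5002) as listed above.

```python
import re

# Residues 301-350 of the M-MLV (WT) sequence (SEQ ID NO: 5002) listed above.
WT_FRAGMENT = "REFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQA"
FRAGMENT_START = 301  # 1-based position of the first residue in the fragment

# Matches notation such as "T306K": wild-type residue, position, replacement.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutations(sequence, mutations, start=1):
    """Return `sequence` with each substitution applied (hypothetical helper).

    `start` is the 1-based position of the first residue of `sequence`.
    A ValueError is raised if a mutation is malformed, falls outside the
    sequence, or names a wild-type residue that is not actually present.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"Unrecognized mutation notation: {mutation!r}")
        wild_type, position, replacement = match.groups()
        index = int(position) - start
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the supplied sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


mutated = apply_point_mutations(
    WT_FRAGMENT, ["T306K", "W313F", "T330P"], start=FRAGMENT_START
)
print(mutated)
```

Checking the wild-type residue before substituting guards against numbering offsets, which matter when a corresponding mutation is transferred to another RT domain whose positions differ.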






In some embodiments, a writing domain (e.g., RT domain) comprises an RNA-binding domain, e.g., that specifically binds to an RNA sequence. In some embodiments, a template RNA comprises an RNA sequence that is specifically bound by the RNA-binding domain of the writing domain.


In some embodiments, the reverse transcription domain only recognizes and reverse transcribes a specific template, e.g., a template RNA of the system. In some embodiments, the template comprises a sequence or structure that enables recognition and reverse transcription by a reverse transcription domain. In some embodiments, the template comprises a sequence or structure that enables association with an RNA-binding domain of a polypeptide component of a genome engineering system described herein. In some embodiments, the genome engineering system preferably reverse transcribes a template comprising an association sequence over a template lacking an association sequence.


The writing domain may also comprise DNA-dependent DNA polymerase activity, e.g., comprise enzymatic activity capable of writing DNA into the genome from a template DNA sequence. In some embodiments, DNA-dependent DNA polymerization is employed to complete second-strand synthesis of a target site edit. In some embodiments, the DNA-dependent DNA polymerase activity is provided by a DNA polymerase domain in the polypeptide. In some embodiments, the DNA-dependent DNA polymerase activity is provided by a reverse transcriptase domain that is also capable of DNA-dependent DNA polymerization, e.g., second-strand synthesis. In some embodiments, the DNA-dependent DNA polymerase activity is provided by a second polypeptide of the system. In some embodiments, the DNA-dependent DNA polymerase activity is provided by an endogenous host cell polymerase that is optionally recruited to the target site by a component of the genome engineering system.


In some embodiments, the reverse transcriptase domain has a lower premature termination rate (Poff) in vitro relative to a reference reverse transcriptase domain. In some embodiments, the reference reverse transcriptase domain is a viral reverse transcriptase domain, e.g., the RT domain from M-MLV.


In some embodiments, the reverse transcriptase domain has a premature termination rate (Poff) in vitro of less than about 5×10^-3/nt, 5×10^-4/nt, or 5×10^-6/nt, e.g., as measured on a 1094 nt RNA. In embodiments, the in vitro premature termination rate is determined as described in Bibillo and Eickbush (2002) J Biol Chem 277(38):34836-34845 (incorporated by reference herein in its entirety).
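One way to read a per-nucleotide premature termination rate is as the fraction of reverse transcription events expected to reach the end of a template. The Python sketch below assumes, for illustration only, that termination is independent at each nucleotide, so that the expected full-length fraction on a template of length L is (1 - Poff)^L. This independence model and the function name `full_length_fraction` are assumptions and are not taken from the cited assay.

```python
def full_length_fraction(p_off_per_nt, template_length_nt):
    """Expected fraction of full-length products under an independence model.

    p_off_per_nt: assumed per-nucleotide probability of premature termination.
    template_length_nt: template length in nucleotides.
    """
    if not 0.0 <= p_off_per_nt <= 1.0:
        raise ValueError("p_off_per_nt must be between 0 and 1")
    if template_length_nt < 0:
        raise ValueError("template_length_nt must be non-negative")
    return (1.0 - p_off_per_nt) ** template_length_nt


# Thresholds recited above, evaluated on a 1094 nt template.
for p_off in (5e-3, 5e-4, 5e-6):
    fraction = full_length_fraction(p_off, 1094)
    print(f"Poff = {p_off:g}/nt -> expected full-length fraction {fraction:.4f}")
```

Under this simplified model, lowering Poff by an order of magnitude has a large effect on long templates because the per-nucleotide survival probability is compounded over the full template length.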


In some embodiments, the reverse transcriptase domain is able to complete at least about 30% or 50% of integrations in cells. The percent of complete integrations can be measured by dividing the number of substantially full-length integration events (e.g., genomic sites that comprise at least 98% of the expected integrated sequence) by the number of total (including substantially full-length and partial) integration events in a population of cells. In embodiments, the integrations in cells are determined (e.g., across the integration site) using long-read amplicon sequencing, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated by reference herein in its entirety).


In embodiments, quantifying integrations in cells comprises counting the fraction of integrations that contain at least about 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the DNA sequence corresponding to the template RNA (e.g., a template RNA having a length of at least 0.05, 0.1, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 3, 4, or 5 kb, e.g., a length between 0.5-0.6, 0.6-0.7, 0.7-0.8, 0.8-0.9, 1.0-1.2, 1.2-1.4, 1.4-1.6, 1.6-1.8, 1.8-2.0, 2-3, 3-4, or 4-5 kb).
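The calculation described above can be sketched in Python as follows. The sketch assumes each integration event has already been reduced to a single number: the fraction of the expected integrated sequence recovered at that site (for example, from long-read amplicon data). The function name, the input representation, and the example values are hypothetical and serve only to illustrate the division of substantially full-length events by total events.

```python
def fraction_complete_integrations(recovered_fractions, completeness_threshold=0.98):
    """Fraction of integration events that are substantially full length.

    recovered_fractions: one value per integration event, each the fraction
        (0.0-1.0) of the expected integrated sequence present at that site.
    completeness_threshold: minimum fraction for an event to count as
        substantially full length (0.98 corresponds to "at least 98%").
    """
    if not recovered_fractions:
        raise ValueError("at least one integration event is required")
    complete = sum(1 for f in recovered_fractions if f >= completeness_threshold)
    return complete / len(recovered_fractions)


# Hypothetical per-event values, for illustration only.
events = [1.00, 0.99, 0.98, 0.75, 0.40, 1.00, 0.10, 0.97]

at_98 = fraction_complete_integrations(events)
at_75 = fraction_complete_integrations(events, completeness_threshold=0.75)
print(f"complete at >=98%: {at_98:.0%}")
print(f"complete at >=75%: {at_75:.0%}")
```

Exposing the threshold as a parameter allows the same function to be reused for any of the completeness cutoffs recited above (e.g., 75%, 90%, or 98%).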


In some embodiments, the reverse transcriptase domain is capable of polymerizing dNTPs in vitro. In embodiments, the reverse transcriptase domain is capable of polymerizing dNTPs in vitro at a rate between 0.1-50 nt/sec (e.g., between 0.1-1, 1-10, or 10-50 nt/sec). In embodiments, polymerization of dNTPs by the reverse transcriptase domain is measured by a single-molecule assay, e.g., as described in Schwartz and Quake (2009) PNAS 106(48):20294-20299 (incorporated by reference in its entirety).


In some embodiments, the reverse transcriptase domain has an in vitro error rate (e.g., misincorporation of nucleotides) of between 1×10^-3 and 1×10^-4 or between 1×10^-4 and 1×10^-5 substitutions/nt, e.g., as described in Yasukawa et al. (2017) Biochem Biophys Res Commun 492(2):147-153 (incorporated herein by reference in its entirety). In some embodiments, the reverse transcriptase domain has an error rate (e.g., misincorporation of nucleotides) in cells (e.g., HEK293T cells) of between 1×10^-3 and 1×10^-4 or between 1×10^-4 and 1×10^-5 substitutions/nt, e.g., by long-read amplicon sequencing, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated by reference herein in its entirety).


In some embodiments, the reverse transcriptase domain is capable of performing reverse transcription of a target RNA in vitro. In some embodiments, the reverse transcriptase requires a primer of at least 3 nucleotides to initiate reverse transcription of a template. In some embodiments, reverse transcription of the target RNA is determined by detection of cDNA from the target RNA (e.g., when provided with a ssDNA primer, e.g., which anneals to the target with at least 3, 4, 5, 6, 7, 8, 9, or 10 nt at the 3′ end), e.g., as described in Bibillo and Eickbush (2002) J Biol Chem 277(38):34836-34845 (incorporated herein by reference in its entirety).


In some embodiments, the reverse transcriptase domain performs reverse transcription at least 5 or 10 times more efficiently (e.g., by cDNA production), e.g., when converting its RNA template to cDNA, for example, as compared to an RNA template lacking the protein binding motif (e.g., a 3′ UTR). In embodiments, efficiency of reverse transcription is measured as described in Yasukawa et al. (2017) Biochem Biophys Res Commun 492(2): 147-153 (incorporated by reference herein in its entirety).


In some embodiments, the reverse transcriptase domain specifically binds a specific RNA template with higher frequency (e.g., about 5 or 10-fold higher frequency) than any endogenous cellular RNA, e.g., when expressed in cells (e.g., HEK293T cells). In embodiments, frequency of specific binding between the reverse transcriptase domain and the template RNA is measured by CLIP-seq, e.g., as described in Lin and Miles (2019) Nucleic Acids Res 47(11):5490-5501 (incorporated herein by reference in its entirety).


In some embodiments, an RT domain (e.g., as listed in Table 6) comprises one or more mutations as listed in Table 2A below. In some embodiments, an RT domain as listed in Table 6 comprises one, two, three, four, five, or six of the mutations listed in the corresponding row of Table 2A below.









TABLE 2A

Exemplary RT domain mutations (relative to corresponding wild-type sequences as listed in the corresponding row of Table 6)

RT Domain Name | Mutation(s)
AVIRE_P03360 |
AVIRE_P03360_3mut | D200N, G330P, L605W
AVIRE_P03360_3mutA | D200N, G330P, L605W, T306K, W313F
BAEVM_P10272 |
BAEVM_P10272_3mut | D198N, E328P, L602W
BAEVM_P10272_3mutA | D198N, E328P, L602W, T304K, W311F
BLVAU_P25059 |
BLVAU_P25059_2mut | E159Q, G286P
BLVJ_P03361 |
BLVJ_P03361_2mut | E159Q, L524W
BLVJ_P03361_2mutB | E159Q, L524W, 197P
FFV_O93209 | D21N
FFV_O93209_2mut | D21N, T293N, T419P
FFV_O93209_2mutA | D21N, T293N, T419P, L393K
FFV_O93209-Pro |
FFV_O93209-Pro_2mut | T207N, T333P
FFV_O93209-Pro_2mutA | T207N, T333P, L307K
FLV_P10273 |
FLV_P10273_3mut | D199N, L602W
FLV_P10273_3mutA | D199N, L602W, T305K, W312F
FOAMV_P14350 | D24N
FOAMV_P14350_2mut | D24N, T296N, S420P
FOAMV_P14350_2mutA | D24N, T296N, S420P, L396K
FOAMV_P14350-Pro |
FOAMV_P14350-Pro_2mut | T207N, S331P
FOAMV_P14350-Pro_2mutA | T207N, S331P, L307K
GALV_P21414 |
GALV_P21414_3mut | D198N, E328P, L600W
GALV_P21414_3mutA | D198N, E328P, L600W, T304K, W311F
GHTL1A_P03362 |
GHTL1A_P03362_2mut | E152Q, R279P
GHTL1A_P03362_2mutB | E152Q, R279P, L90P
HTL1C_P14078 |
HTL1C_P14078_2mut | E152Q, R279P
HTL1L_P0C211 |
HTL1L_P0C211_2mut | E149Q, L527W
HTL1L_P0C211_2mutB | E149Q, L527W, L87P
HTL32_Q0R5R2 |
HTL32_Q0R5R2_2mut | E149Q, L526W
HTL32_Q0R5R2_2mutB | E149Q, L526W, L87P
HTL3P_Q4U0X6 |
HTL3P_Q4U0X6_2mut | E149Q, L526W
HTL3P_Q4U0X6_2mutB | E149Q, L526W, L87P
HTLV2_P03363_2mut | E147Q, G274P
JSRV_P31623 |
JSRV_P31623_2mutB | A100P
KORV_Q9TTC1 | D32N
KORV_Q9TTC1_3mut | D32N, D322N, E452P, L724W
KORV_Q9TTC1_3mutA | D32N, D322N, E452P, L724W, T428K, W435F
KORV_Q9TTC1-Pro |
KORV_Q9TTC1-Pro_3mut | D231N, E361P, L633W
KORV_Q9TTC1-Pro_3mutA | D231N, E361P, L633W, T337K, W344F
MLVAV_P03356 |
MLVAV_P03356_3mut | D200N, T330P, L603W
MLVAV_P03356_3mutA | D200N, T330P, L603W, T306K, W313F
MLVBM_Q7SVK7 |
MLVBM_Q7SVK7 |
MLVBM_Q7SVK7_3mut | D200N, T330P, L603W
MLVBM_Q7SVK7_3mut | D200N, T330P, L603W
MLVBM_Q7SVK7_3mutA_WS | D199N, T329P, L602W, T305K, W312F
MLVBM_Q7SVK7_3mutA_WS | D199N, T329P, L602W, T305K, W312F
MLVCB_P08361 |
MLVCB_P08361_3mut | D200N, T330P, L603W
MLVCB_P08361_3mutA | D200N, T330P, L603W, T306K, W313F
MLVF5_P26810 |
MLVF5_P26810_3mut | D200N, T330P, L603W
MLVF5_P26810_3mutA | D200N, T330P, L603W, T306K, W313F
MLVFF_P26809_3mut | D200N, T330P, L603W
MLVFF_P26809_3mutA | D200N, T330P, L603W, T306K, W313F
MLVMS_P03355 |
MLVMS_P03355 |
MLVMS_P03355_3mut | D200N, T330P, L603W
MLVMS_P03355_3mut | D200N, T330P, L603W
MLVMS_P03355_3mutA_WS | D200N, T330P, L603W, T306K, W313F
MLVMS_P03355_3mutA_WS | D200N, T330P, L603W, T306K, W313F
MLVMS_P03355_PLV919 | D200N, T330P, L603W, T306K, W313F, H8Y
MLVMS_P03355_PLV919 | D200N, T330P, L603W, T306K, W313F, H8Y
MLVRD_P11227 |
MLVRD_P11227_3mut | D200N, T330P, L603W
MMTVB_P03365 | D26N
MMTVB_P03365 | D26N
MMTVB_P03365_2mut | D26N, G401P
MMTVB_P03365_2mut_WS | G400P
MMTVB_P03365_2mut_WS | G400P
MMTVB_P03365_2mutB | D26N, G401P, V215P
MMTVB_P03365_2mutB | D26N, G401P, V215P
MMTVB_P03365_2mutB_WS | G400P, V212P
MMTVB_P03365_2mutB_WS | G400P, V212P
MMTVB_P03365_WS |
MMTVB_P03365_WS |
MMTVB_P03365-Pro |
MMTVB_P03365-Pro |
MMTVB_P03365-Pro_2mut | G309P
MMTVB_P03365-Pro_2mut | G309P
MMTVB_P03365-Pro_2mutB | G309P, V123P
MMTVB_P03365-Pro_2mutB | G309P, V123P
MPMV_P07572 |
MPMV_P07572_2mutB | G289P, I103P
PERV_Q4VFZ2 |
PERV_Q4VFZ2 |
PERV_Q4VFZ2_3mut | D199N, E329P, L602W
PERV_Q4VFZ2_3mut | D199N, E329P, L602W
PERV_Q4VFZ2_3mutA_WS | D196N, E326P, L599W, T302K, W309F
PERV_Q4VFZ2_3mutA_WS | D196N, E326P, L599W, T302K, W309F
SFV1_P23074 | D24N
SFV1_P23074_2mut | D24N, T296N, N420P
SFV1_P23074_2mutA | D24N, T296N, N420P, L396K
SFV1_P23074-Pro |
SFV1_P23074-Pro_2mut | T207N, N331P
SFV1_P23074-Pro_2mutA | T207N, N331P, L307K
SFV3L_P27401 | D24N
SFV3L_P27401_2mut | D24N, T296N, N422P
SFV3L_P27401_2mutA | D24N, T296N, N422P, L396K
SFV3L_P27401-Pro |
SFV3L_P27401-Pro_2mut | T307N, N333P
SFV3L_P27401-Pro_2mutA | T307N, N333P, L307K
SFVCP_Q87040 | D24N
SFVCP_Q87040_2mut | D24N, T296N, K422P
SFVCP_Q87040_2mutA | D24N, T296N, K422P, L396K
SFVCP_Q87040-Pro |
SFVCP_Q87040-Pro_2mut | T207N, K333P
SFVCP_Q87040-Pro_2mutA | T207N, K333P, L307K
SMRVH_P03364 |
SMRVH_P03364_2mut | G288P
SMRVH_P03364_2mutB | G288P, I102P
SRV2_P51517 |
SRV2_P51517_2mutB | I103P
WDSV_O92815 |
WDSV_O92815_2mut | S183N, K312P
WDSV_O92815_2mutA | S183N, K312P, L288K, W295F
WMSV_P03359 |
WMSV_P03359_3mut | D198N, E328P, L600W
WMSV_P03359_3mutA | D198N, E328P, L600W, T304K, W311F
XMRV6_A1Z651 |
XMRV6_A1Z651_3mut | D200N, T330P, L603W
XMRV6_A1Z651_3mutA | D200N, T330P, L603W, T306K, W313F









Template Nucleic Acid Binding Domain

The gene modifying polypeptide typically contains regions capable of associating with the template nucleic acid (e.g., template RNA). In some embodiments, the template nucleic acid binding domain is an RNA binding domain. In some embodiments, the RNA binding domain is a modular domain that can associate with RNA molecules containing specific signatures, e.g., structural motifs. In other embodiments, the template nucleic acid binding domain (e.g., RNA binding domain) is contained within the reverse transcription domain, e.g., the reverse transcriptase-derived component has a known signature for RNA preference.


In other embodiments, the template nucleic acid binding domain (e.g., RNA binding domain) is contained within the target DNA binding domain. For example, in some embodiments, the DNA binding domain is a CRISPR-associated protein that recognizes the structure of a template nucleic acid (e.g., template RNA) comprising a gRNA. In some embodiments, a gene modifying polypeptide comprises a DNA-binding domain comprising a CRISPR-associated protein that associates with a gRNA scaffold that allows the DNA-binding domain to bind a target genomic DNA sequence. In some embodiments, the gRNA scaffold and gRNA spacer are comprised within the template nucleic acid (e.g., template RNA), thus the DNA-binding domain is also the template nucleic acid binding domain. In some embodiments, the polypeptide possesses RNA binding function in multiple domains, e.g., can bind a gRNA structure in a CRISPR-associated DNA binding domain and an additional sequence or structure in a reverse transcriptase domain.


In some embodiments, the RNA binding domain is capable of binding to a template RNA with greater affinity than a reference RNA binding domain. In some embodiments, the reference RNA binding domain is an RNA binding domain from Cas9 of S. pyogenes. In some embodiments, the RNA binding domain is capable of binding to a template RNA with an affinity between 100 pM-10 nM (e.g., between 100 pM-1 nM or 1 nM-10 nM). In some embodiments, the affinity of an RNA binding domain for its template RNA is measured in vitro, e.g., by thermophoresis, e.g., as described in Asmari et al. Methods 146:107-119 (2018) (incorporated by reference herein in its entirety). In some embodiments, the affinity of an RNA binding domain for its template RNA is measured in cells (e.g., by FRET or CLIP-Seq).


In some embodiments, the RNA binding domain is associated with the template RNA in vitro at a frequency at least about 5-fold or 10-fold higher than with a scrambled RNA. In some embodiments, the frequency of association between the RNA binding domain and the template RNA or scrambled RNA is measured by CLIP-seq, e.g., as described in Lin and Miles (2019) Nucleic Acids Res 47(11): 5490-5501 (incorporated by reference herein in its entirety). In some embodiments, the RNA binding domain is associated with the template RNA in cells (e.g., in HEK293T cells) at a frequency at least about 5-fold or 10-fold higher than with a scrambled RNA. In some embodiments, the frequency of association between the RNA binding domain and the template RNA or scrambled RNA is measured by CLIP-seq, e.g., as described in Lin and Miles (2019), supra.


Endonuclease Domains and DNA Binding Domains

In some embodiments, a gene modifying polypeptide possesses the function of DNA target site cleavage via an endonuclease domain. In some embodiments, a gene modifying polypeptide comprises a DNA binding domain, e.g., for binding to a target nucleic acid. In some embodiments, a domain (e.g., a Cas domain) of the gene modifying polypeptide comprises two or more smaller domains, e.g., a DNA binding domain and an endonuclease domain. It is understood that when a DNA binding domain (e.g., a Cas domain) is said to bind to a target nucleic acid sequence, in some embodiments, the binding is mediated by a gRNA.


In some embodiments, a domain has two functions. For example, in some embodiments, the endonuclease domain is also a DNA-binding domain. In some embodiments, the endonuclease domain is also a template nucleic acid (e.g., template RNA) binding domain. For example, in some embodiments, a polypeptide comprises a CRISPR-associated endonuclease domain that binds a template RNA comprising a gRNA, binds a target DNA sequence (e.g., with complementarity to a portion of the gRNA), and cuts the target DNA sequence. In some embodiments, an endonuclease domain or endonuclease/DNA-binding domain from a heterologous source can be used or can be modified (e.g., by insertion, deletion, or substitution of one or more residues) in a gene modifying system described herein.


In some embodiments, a nucleic acid encoding the endonuclease domain or endonuclease/DNA binding domain is altered from its natural sequence to have altered codon usage, e.g., improved for human cells. In some embodiments, the endonuclease element is a heterologous endonuclease element, such as a Cas endonuclease (e.g., Cas9), a type-II restriction endonuclease (e.g., FokI), a meganuclease (e.g., I-SceI), or other endonuclease domain.


In certain aspects, the DNA-binding domain of a gene modifying polypeptide described herein is selected, designed, or constructed for binding to a desired host DNA target sequence. In certain embodiments, the DNA-binding domain of the polypeptide is a heterologous DNA-binding element. In some embodiments the heterologous DNA binding element is a zinc-finger element or a TAL effector element, e.g., a zinc-finger or TAL polypeptide or functional fragment thereof. In some embodiments the heterologous DNA binding element is a sequence-guided DNA binding element, such as Cas9, Cpf1, or other CRISPR-related protein that has been altered to have no endonuclease activity. In some embodiments the heterologous DNA binding element retains endonuclease activity. In some embodiments, the heterologous DNA binding element retains partial endonuclease activity to cleave ssDNA, e.g., possesses nickase activity. In specific embodiments, the heterologous DNA-binding domain can be any one or more of Cas9, TAL domain, ZF domain, Myb domain, combinations thereof, or multiples thereof.


In some embodiments, DNA-binding domains are modified, for example by site-specific mutation, increasing or decreasing DNA-binding elements (for example, number and/or specificity of zinc fingers), etc., to alter DNA-binding specificity and affinity. In some embodiments a nucleic acid sequence encoding the DNA binding domain is altered from its natural sequence to have altered codon usage, e.g. improved for human cells. In embodiments, the DNA binding domain comprises one or more modifications relative to a wild-type DNA binding domain, e.g., a modification via directed evolution, e.g., phage-assisted continuous evolution (PACE).


In some embodiments, the DNA binding domain comprises a meganuclease domain (e.g., as described herein, e.g., in the endonuclease domain section), or a functional fragment thereof. In some embodiments, the meganuclease domain possesses endonuclease activity, e.g., double-strand cleavage and/or nickase activity. In other embodiments, the meganuclease domain has reduced activity, e.g., lacks endonuclease activity, e.g., the meganuclease is catalytically inactive. In some embodiments, a catalytically inactive meganuclease is used as a DNA binding domain, e.g., as described in Fonfara et al. Nucleic Acids Res 40(2):847-860 (2012), incorporated herein by reference in its entirety.


In some embodiments, a gene modifying polypeptide comprises a modification to a DNA-binding domain, e.g., relative to the wild-type polypeptide. In some embodiments, the DNA-binding domain comprises an addition, deletion, replacement, or modification to the amino acid sequence of the original DNA-binding domain. In some embodiments, the DNA-binding domain is modified to include a heterologous functional domain that binds specifically to a target nucleic acid (e.g., DNA) sequence of interest. In some embodiments, the functional domain replaces at least a portion (e.g., the entirety) of the prior DNA-binding domain of the polypeptide. In some embodiments, the functional domain comprises a zinc finger (e.g., a zinc finger that specifically binds to the target nucleic acid (e.g., DNA) sequence of interest). In some embodiments, the functional domain comprises a Cas domain (e.g., a Cas domain that specifically binds to the target nucleic acid (e.g., DNA) sequence of interest). In some embodiments, the Cas domain comprises a Cas9 or a mutant or variant thereof (e.g., as described herein). In embodiments, the Cas domain is associated with a guide RNA (gRNA), e.g., as described herein. In embodiments, the Cas domain is directed to a target nucleic acid (e.g., DNA) sequence of interest by the gRNA. In embodiments, the Cas domain is encoded in the same nucleic acid (e.g., RNA) molecule as the gRNA. In embodiments, the Cas domain is encoded in a different nucleic acid (e.g., RNA) molecule from the gRNA.


In some embodiments, the DNA binding domain is capable of binding to a target sequence (e.g., a dsDNA target sequence) with greater affinity than a reference DNA binding domain. In some embodiments, the reference DNA binding domain is a DNA binding domain from Cas9 of S. pyogenes. In some embodiments, the DNA binding domain is capable of binding to a target sequence (e.g., a dsDNA target sequence) with an affinity between 100 pM-10 nM (e.g., between 100 pM-1 nM or 1 nM-10 nM).


In some embodiments, the affinity of a DNA binding domain for its target sequence (e.g., dsDNA target sequence) is measured in vitro, e.g., by thermophoresis, e.g., as described in Asmari et al. Methods 146:107-119 (2018) (incorporated by reference herein in its entirety).


In embodiments, the DNA binding domain is capable of binding to its target sequence (e.g., dsDNA target sequence), e.g., with an affinity between 100 pM-10 nM (e.g., between 100 pM-1 nM or 1 nM-10 nM) in the presence of a molar excess of scrambled sequence competitor dsDNA, e.g., of about 100-fold molar excess.


In some embodiments, the DNA binding domain is found associated with its target sequence (e.g., dsDNA target sequence) more frequently than any other sequence in the genome of a target cell, e.g., human target cell, e.g., as measured by ChIP-seq (e.g., in HEK293T cells), e.g., as described in He and Pu (2010) Curr. Protoc Mol Biol Chapter 21 (incorporated herein by reference in its entirety). In some embodiments, the DNA binding domain is found associated with its target sequence (e.g., dsDNA target sequence) at least about 5-fold or 10-fold more frequently than any other sequence in the genome of a target cell, e.g., as measured by ChIP-seq (e.g., in HEK293T cells), e.g., as described in He and Pu (2010), supra.


In some embodiments, the endonuclease domain has nickase activity and cleaves one strand of a target DNA. In some embodiments, nickase activity reduces the formation of double-stranded breaks at the target site. In some embodiments, the endonuclease domain creates a staggered nick structure in the first and second strands of a target DNA. In some embodiments, a staggered nick structure generates free 3′ overhangs at the target site. In some embodiments, free 3′ overhangs at the target site improve editing efficiency, e.g., by enhancing access and annealing of a 3′ homology region of a template nucleic acid. In some embodiments, a staggered nick structure reduces the formation of double-stranded breaks at the target site.


In some embodiments, the endonuclease domain cleaves both strands of a target DNA, e.g., results in blunt-end cleavage of a target with no ssDNA overhangs on either side of the cut-site. The amino acid sequence of an endonuclease domain of a gene modifying system described herein may be at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% identical to the amino acid sequence of an endonuclease domain described herein, e.g., an endonuclease domain from Table 8.


In certain embodiments, the heterologous endonuclease is FokI or a functional fragment thereof. In certain embodiments, the heterologous endonuclease is a Holliday junction resolvase or homolog thereof, such as the Holliday junction resolving enzyme from Sulfolobus solfataricus (Ssol Hje) (Govindaraju et al., Nucleic Acids Research 44:7, 2016). In certain embodiments, the heterologous endonuclease is the endonuclease of the large fragment of a spliceosomal protein, such as Prp8 (Mahbub et al., Mobile DNA 8:16, 2017). In certain embodiments, the heterologous endonuclease is derived from a CRISPR-associated protein, e.g., Cas9. In certain embodiments, the heterologous endonuclease is engineered to have only ssDNA cleavage activity, e.g., only nickase activity, e.g., be a Cas9 nickase, e.g., SpCas9 with D10A, H840A, or N863A mutations. Table 8 provides exemplary Cas proteins and mutations associated with nickase activity. In still other embodiments, homologous endonuclease domains are modified, for example by site-specific mutation, to alter DNA endonuclease activity. In still other embodiments, endonuclease domains are modified to reduce DNA-sequence specificity, e.g., by truncation to remove domains that confer DNA-sequence specificity or mutation to inactivate regions conferring DNA-sequence specificity.


In some embodiments, the endonuclease domain has nickase activity and does not form double-stranded breaks. In some embodiments, the endonuclease domain forms single-stranded breaks at a higher frequency than double-stranded breaks, e.g., at least 90%, 95%, 96%, 97%, 98%, or 99% of the breaks are single-stranded breaks, or less than 10%, 5%, 4%, 3%, 2%, or 1% of the breaks are double-stranded breaks. In some embodiments, the endonuclease forms substantially no double-stranded breaks. In some embodiments, the endonuclease does not form detectable levels of double-stranded breaks.


In some embodiments, the endonuclease domain has nickase activity that nicks the target site DNA of the first strand; e.g., in some embodiments, the endonuclease domain cuts the genomic DNA of the target site near to the site of alteration on the strand that will be extended by the writing domain. In some embodiments, the endonuclease domain has nickase activity that nicks the target site DNA of the first strand and does not nick the target site DNA of the second strand. For example, when a polypeptide comprises a CRISPR-associated endonuclease domain having nickase activity, in some embodiments, said CRISPR-associated endonuclease domain nicks the target site DNA strand containing the PAM site (e.g., and does not nick the target site DNA strand that does not contain the PAM site). As a further example, when a polypeptide comprises a CRISPR-associated endonuclease domain having nickase activity, in some embodiments, said CRISPR-associated endonuclease domain nicks the target site DNA strand not containing the PAM site (e.g., and does not nick the target site DNA strand that contains the PAM site).


In some other embodiments, the endonuclease domain has nickase activity that nicks the target site DNA of the first strand and the second strand. Without wishing to be bound by theory, after a writing domain (e.g., RT domain) of a polypeptide described herein polymerizes (e.g., reverse transcribes) from the heterologous object sequence of a template nucleic acid (e.g., template RNA), the cellular DNA repair machinery must repair the nick on the first DNA strand. The target site DNA now contains two different sequences for the first DNA strand: one corresponding to the original genomic DNA (e.g., having a free 5′ end) and a second corresponding to that polymerized from the heterologous object sequence (e.g., having a free 3′ end). It is thought that the two different sequences equilibrate with one another, first one hybridizing the second strand, then the other, and which sequence the cellular DNA repair apparatus incorporates into its repaired target site may be a stochastic process. Without wishing to be bound by theory, it is thought that introducing an additional nick to the second-strand may bias the cellular DNA repair machinery to adopt the heterologous object sequence-based sequence more frequently than the original genomic sequence (Anzalone et al. Nature 576:149-157 (2019)). In some embodiments, the additional nick is positioned at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, or 150 nucleotides 5′ or 3′ of the target site modification (e.g., the insertion, deletion, or substitution) or to the nick on the first strand.


Alternatively or additionally, without wishing to be bound by theory, it is thought that an additional nick to the second strand may promote second-strand synthesis. In some embodiments, where the gene modifying system has inserted or substituted a portion of the first strand, synthesis of a new sequence corresponding to the insertion/substitution in the second strand is necessary.


In some embodiments, the polypeptide comprises a single domain having endonuclease activity (e.g., a single endonuclease domain) and said domain nicks both the first strand and the second strand. For example, in such an embodiment the endonuclease domain may be a CRISPR-associated endonuclease domain, and the template nucleic acid (e.g., template RNA) comprises a gRNA spacer that directs nicking of the first strand and an additional gRNA spacer that directs nicking of the second strand. In some embodiments, the polypeptide comprises a plurality of domains having endonuclease activity, and a first endonuclease domain nicks the first strand and a second endonuclease domain nicks the second strand (optionally, the first endonuclease domain does not (e.g., cannot) nick the second strand and the second endonuclease domain does not (e.g., cannot) nick the first strand).


In some embodiments, the endonuclease domain is capable of nicking a first strand and a second strand. In some embodiments, the first and second strand nicks occur at the same position in the target site but on opposite strands. In some embodiments, the second strand nick occurs in a staggered location, e.g., upstream or downstream, from the first nick. In some embodiments, the endonuclease domain generates a target site deletion if the second strand nick is upstream of the first strand nick. In some embodiments, the endonuclease domain generates a target site duplication if the second strand nick is downstream of the first strand nick. In some embodiments, the endonuclease domain generates no duplication and/or deletion if the first and second strand nicks occur in the same position of the target site. In some embodiments, the endonuclease domain has altered activity depending on protein conformation or RNA-binding status, e.g., which promotes the nicking of the first or second strand (e.g., as described in Christensen et al. PNAS 2006; incorporated by reference herein in its entirety).
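The relationship between the relative position of the two nicks and the expected target-site outcome can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the coordinate convention (integer positions that increase in the downstream direction), and the returned stagger distance are assumptions made for the example and are not part of the disclosure.

```python
from typing import Tuple


def classify_nick_outcome(first_nick: int, second_nick: int) -> Tuple[str, int]:
    """Classify the expected target-site outcome from two nick positions.

    Positions are hypothetical integer coordinates along the target site
    that increase in the downstream direction. The second element of the
    result is the stagger distance between the nicks; it is not a claim
    about the exact length of any deletion or duplication.
    """
    offset = second_nick - first_nick
    if offset < 0:
        # Second-strand nick lies upstream of the first-strand nick.
        return ("deletion", -offset)
    if offset > 0:
        # Second-strand nick lies downstream of the first-strand nick.
        return ("duplication", offset)
    # Both nicks fall at the same position on opposite strands.
    return ("none", 0)


if __name__ == "__main__":
    for first, second in [(100, 95), (100, 105), (100, 100)]:
        print(first, second, classify_nick_outcome(first, second))
```

Under these assumptions, a second-strand nick 5 positions upstream of the first-strand nick is classified as a deletion, one 5 positions downstream as a duplication, and coincident nicks as neither.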


In some embodiments, the endonuclease domain comprises a meganuclease, or a functional fragment thereof. In some embodiments, the endonuclease domain comprises a homing endonuclease, or a functional fragment thereof. In some embodiments, the endonuclease domain comprises a meganuclease from the LAGLIDADG (SEQ ID NO: 22002), GIY-YIG, HNH, His-Cys Box, or PD-(D/E)XK families, or a functional fragment or variant thereof, e.g., which possess conserved amino acid motifs, e.g., as indicated in the family names. In some embodiments, the endonuclease domain comprises a meganuclease, or fragment thereof, chosen from, e.g., I-SmaMI (Uniprot F7WD42), I-SceI (Uniprot P03882), I-AniI (Uniprot P03880), I-DmoI (Uniprot P21505), I-CreI (Uniprot P05725), I-TevI (Uniprot P13299), I-OnuI (Uniprot Q4VWW5), or I-BmoI (Uniprot Q9ANR6). In some embodiments, the meganuclease is naturally monomeric, e.g., I-SceI, I-TevI, or dimeric, e.g., I-CreI, in its functional form. For example, the LAGLIDADG (SEQ ID NO: 22002) meganucleases with a single copy of the LAGLIDADG (SEQ ID NO: 22002) motif generally form homodimers, whereas members with two copies of the LAGLIDADG (SEQ ID NO: 22002) motif are generally found as monomers. In some embodiments, a meganuclease that normally forms as a dimer is expressed as a fusion, e.g., the two subunits are expressed as a single ORF and, optionally, connected by a linker, e.g., an I-CreI dimer fusion (Rodriguez-Fornes et al. Gene Therapy 2020; incorporated by reference herein in its entirety). In some embodiments, a meganuclease, or a functional fragment thereof, is altered to favor nickase activity for one strand of a double-stranded DNA molecule, e.g., I-SceI (K122I and/or K223I) (Niu et al. J Mol Biol 2008), I-AniI (K227M) (McConnell Smith et al. PNAS 2009), I-DmoI (Q42A and/or K120M) (Molina et al. J Biol Chem 2015).
In some embodiments, a meganuclease or functional fragment thereof possessing this preference for single-strand cleavage is used as an endonuclease domain, e.g., with nickase activity. In some embodiments, an endonuclease domain comprises a meganuclease, or a functional fragment thereof, which naturally targets or is engineered to target a safe harbor site, e.g., an I-CreI targeting SH6 site (Rodriguez-Fornes et al., supra). In some embodiments, an endonuclease domain comprises a meganuclease, or a functional fragment thereof, with a sequence tolerant catalytic domain, e.g., I-TevI recognizing the minimal motif CNNNG (Kleinstiver et al. PNAS 2012). In some embodiments, a target sequence tolerant catalytic domain is fused to a DNA binding domain, e.g., to direct activity, e.g., by fusing I-TevI to: (i) zinc fingers to create Tev-ZFEs (Kleinstiver et al. PNAS 2012), (ii) other meganucleases to create MegaTevs (Wolfs et al. Nucleic Acids Res 2014), and/or (iii) Cas9 to create TevCas9 (Wolfs et al. PNAS 2016).


In some embodiments, the endonuclease domain comprises a restriction enzyme, e.g., a Type IIS or Type IIP restriction enzyme. In some embodiments, the endonuclease domain comprises a Type IIS restriction enzyme, e.g., FokI, or a fragment or variant thereof. In some embodiments, the endonuclease domain comprises a Type IIP restriction enzyme, e.g., PvuII, or a fragment or variant thereof. In some embodiments, a dimeric restriction enzyme is expressed as a fusion such that it functions as a single chain, e.g., a FokI dimer fusion (Minczuk et al. Nucleic Acids Res 36(12):3926-3938 (2008)).


The use of additional endonuclease domains is described, for example, in Guha and Edgell Int J Mol Sci 18(22):2565 (2017), which is incorporated herein by reference in its entirety.


In some embodiments, a gene modifying polypeptide comprises a modification to an endonuclease domain, e.g., relative to a wild-type Cas protein. In some embodiments, the endonuclease domain comprises an addition, deletion, replacement, or modification to the amino acid sequence of the wild-type Cas protein. In some embodiments, the endonuclease domain is modified to include a heterologous functional domain that binds specifically to and/or induces endonuclease cleavage of a target nucleic acid (e.g., DNA) sequence of interest. In some embodiments, the endonuclease domain comprises a zinc finger. In embodiments, the endonuclease domain comprising the Cas domain is associated with a guide RNA (gRNA), e.g., as described herein. In some embodiments, the endonuclease domain is modified to include a functional domain that does not target a specific target nucleic acid (e.g., DNA) sequence. In embodiments, the endonuclease domain comprises a FokI domain.


In some embodiments, the endonuclease domain is associated with the target dsDNA in vitro at a frequency at least about 5-fold or 10-fold higher than with a scrambled dsDNA. In some embodiments, the endonuclease domain is associated with the target dsDNA in a cell (e.g., a HEK293T cell) at a frequency at least about 5-fold or 10-fold higher than with a scrambled dsDNA. In some embodiments, the frequency of association between the endonuclease domain and the target DNA or scrambled DNA is measured by ChIP-seq, e.g., as described in He and Pu (2010) Curr. Protoc Mol Biol Chapter 21 (incorporated by reference herein in its entirety).


In some embodiments, the endonuclease domain can catalyze the formation of a nick at a target sequence, e.g., at a level at least about 5-fold or 10-fold higher than at a non-target sequence (e.g., relative to any other genomic sequence in the genome of the target cell). In some embodiments, the level of nick formation is determined using NickSeq, e.g., as described in Elacqua et al. (2019) bioRxiv doi.org/10.1101/867937 (incorporated herein by reference in its entirety).


In some embodiments, the endonuclease domain is capable of nicking DNA in vitro. In embodiments, the nick results in an exposed base. In embodiments, the exposed base can be detected using a nuclease sensitivity assay, e.g., as described in Chaudhry and Weinfeld (1995) Nucleic Acids Res 23(19):3805-3809 (incorporated by reference herein in its entirety). In embodiments, the level of exposed bases (e.g., detected by the nuclease sensitivity assay) is increased by at least 10%, 50%, or more relative to a reference endonuclease domain. In some embodiments, the reference endonuclease domain is an endonuclease domain from Cas9 of S. pyogenes.


In some embodiments, the endonuclease domain is capable of nicking DNA in a cell. In embodiments, the endonuclease domain is capable of nicking DNA in a HEK293T cell. In embodiments, an unrepaired nick that undergoes replication in the absence of Rad51 results in increased NHEJ rates at the site of the nick, which can be detected, e.g., by using a Rad51 inhibition assay, e.g., as described in Bothmer et al. (2017) Nat Commun 8:13905 (incorporated by reference herein in its entirety). In embodiments, NHEJ rates are increased above 0-5%. In embodiments, NHEJ rates are increased to 20-70% (e.g., between 30%-60% or 40-50%), e.g., upon Rad51 inhibition.


In some embodiments, the endonuclease domain releases the target after cleavage. In some embodiments, release of the target is indicated indirectly by assessing for multiple turnovers by the enzyme, e.g., as described in Yourik et al. RNA 25(1):35-44 (2019) (incorporated herein by reference in its entirety) and shown in FIG. 2. In some embodiments, the kexp of an endonuclease domain is 1×10⁻³ to 1×10⁻⁵ min⁻¹ as measured by such methods.


In some embodiments, the endonuclease domain has a catalytic efficiency (kcat/Km) greater than about 1×10⁸ s⁻¹ M⁻¹ in vitro. In embodiments, the endonuclease domain has a catalytic efficiency greater than about 1×10⁵, 1×10⁶, 1×10⁷, or 1×10⁸ s⁻¹ M⁻¹ in vitro. In embodiments, catalytic efficiency is determined as described in Chen et al. (2018) Science 360(6387):436-439 (incorporated herein by reference in its entirety). In some embodiments, the endonuclease domain has a catalytic efficiency (kcat/Km) greater than about 1×10⁸ s⁻¹ M⁻¹ in cells. In embodiments, the endonuclease domain has a catalytic efficiency greater than about 1×10⁵, 1×10⁶, 1×10⁷, or 1×10⁸ s⁻¹ M⁻¹ in cells.
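As an illustration of how a measured turnover number and Michaelis constant relate to the catalytic-efficiency thresholds recited above, the following Python sketch computes kcat/Km and reports which thresholds are exceeded. The function names and the example kinetic values are hypothetical and chosen only for illustration; they are not measurements from this disclosure.

```python
from typing import List

# Thresholds recited in the text, in units of s^-1 M^-1.
EFFICIENCY_THRESHOLDS = (1e5, 1e6, 1e7, 1e8)


def catalytic_efficiency(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/Km in s^-1 M^-1 from kcat (s^-1) and Km (M)."""
    if km_molar <= 0:
        raise ValueError("Km must be a positive concentration in molar units")
    return kcat_per_s / km_molar


def thresholds_exceeded(efficiency: float) -> List[float]:
    """Return every recited threshold that the given efficiency exceeds."""
    return [t for t in EFFICIENCY_THRESHOLDS if efficiency > t]


if __name__ == "__main__":
    # Hypothetical values: kcat = 1.0 s^-1 and Km = 5 nM.
    eff = catalytic_efficiency(1.0, 5e-9)
    print(f"kcat/Km = {eff:.2e} s^-1 M^-1")
    print("thresholds exceeded:", thresholds_exceeded(eff))
```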


Gene Modifying Polypeptides Comprising Cas Domains

In some embodiments, a gene modifying polypeptide described herein comprises a Cas domain. In some embodiments, the Cas domain can direct the gene modifying polypeptide to a target site specified by a gRNA spacer, thereby modifying a target nucleic acid sequence in “cis”. In some embodiments, a gene modifying polypeptide is fused to a Cas domain. In some embodiments, a gene modifying polypeptide comprises a CRISPR/Cas domain (also referred to herein as a CRISPR-associated protein). In some embodiments, a CRISPR/Cas domain comprises a protein involved in the clustered regularly interspaced short palindromic repeats (CRISPR) system, e.g., a Cas protein, and optionally binds a guide RNA, e.g., single guide RNA (sgRNA).


CRISPR systems are adaptive defense systems originally discovered in bacteria and archaea. CRISPR systems use RNA-guided nucleases termed CRISPR-associated or “Cas” endonucleases (e.g., Cas9 or Cpf1) to cleave foreign DNA. For example, in a typical CRISPR-Cas system, an endonuclease is directed to a target nucleotide sequence (e.g., a site in the genome that is to be sequence-edited) by sequence-specific, non-coding “guide RNAs” that target single- or double-stranded DNA sequences. Three classes (I-III) of CRISPR systems have been identified. The class II CRISPR systems use a single Cas endonuclease (rather than multiple Cas proteins). One class II CRISPR system includes a type II Cas endonuclease such as Cas9, a CRISPR RNA (“crRNA”), and a trans-activating crRNA (“tracrRNA”). The crRNA contains a “spacer” sequence, typically an RNA sequence of about 20 nucleotides that corresponds to a target DNA sequence (“protospacer”). In the wild-type system, and in some engineered systems, crRNA also contains a region that binds to the tracrRNA to form a partially double-stranded structure that is cleaved by RNase III, resulting in a crRNA/tracrRNA hybrid molecule. A crRNA/tracrRNA hybrid then directs the Cas endonuclease to recognize and cleave a target DNA sequence. A target DNA sequence is generally adjacent to a “protospacer adjacent motif” (“PAM”) that is specific for a given Cas endonuclease and required for cleavage activity at a target site matching the spacer of the crRNA. CRISPR endonucleases identified from various prokaryotic species have unique PAM sequence requirements, e.g., as listed for exemplary Cas enzymes in Table 7; examples of PAM sequences include 5′-NGG (Streptococcus pyogenes), 5′-NNAGAA (Streptococcus thermophilus CRISPR1), 5′-NGGNG (Streptococcus thermophilus CRISPR3), and 5′-NNNGATT (Neisseria meningitidis).
Some endonucleases, e.g., Cas9 endonucleases, are associated with G-rich PAM sites, e.g., 5′-NGG, and perform blunt-end cleaving of the target DNA at a location 3 nucleotides upstream from (5′ from) the PAM site. Another class II CRISPR system includes the type V endonuclease Cpf1, which is smaller than Cas9; examples include AsCpf1 (from Acidaminococcus sp.) and LbCpf1 (from Lachnospiraceae sp.). Cpf1-associated CRISPR arrays are processed into mature crRNAs without the requirement of a tracrRNA; in other words, a Cpf1 system, in some embodiments, comprises only Cpf1 nuclease and a crRNA to cleave a target DNA sequence. Cpf1 endonucleases are typically associated with T-rich PAM sites, e.g., 5′-TTN. Cpf1 can also recognize a 5′-CTA PAM motif. Cpf1 typically cleaves a target DNA by introducing an offset or staggered double-strand break with a 4- or 5-nucleotide 5′ overhang, for example, cleaving a target DNA with a 5-nucleotide offset or staggered cut located 18 nucleotides downstream from (3′ from) a PAM site on the coding strand and 23 nucleotides downstream from the PAM site on the complementary strand; the 5-nucleotide overhang that results from such offset cleavage allows more precise genome editing by DNA insertion by homologous recombination than by insertion at blunt-end cleaved DNA. See, e.g., Zetsche et al. (2015) Cell, 163:759-771.
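The cut-site geometry described above can be illustrated with a short Python sketch that scans a single DNA strand for PAM sites and reports the corresponding cut positions. The function names, the 0-based indexing, the restriction to the given strand only, and the default 20-nucleotide protospacer requirement are assumptions made for illustration; the offsets (3 nucleotides upstream of a 5′-NGG PAM for Cas9; 18 and 23 nucleotides downstream of a 5′-TTN PAM for Cpf1) are taken from the paragraph above.

```python
import re
from typing import List, Tuple


def cas9_cut_sites(seq: str, spacer_len: int = 20) -> List[Tuple[int, int]]:
    """Return (pam_start, cut_index) for each 5'-NGG PAM on the given strand.

    cut_index is the 0-based index of the first base 3' of the blunt cut,
    which lies 3 nt upstream of the PAM. PAMs without room for a
    spacer_len-nt protospacer upstream are skipped (an assumption).
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start >= spacer_len:
            sites.append((pam_start, pam_start - 3))
    return sites


def cpf1_cut_sites(seq: str) -> List[Tuple[int, int, int]]:
    """Return (pam_start, near_cut, far_cut) for each 5'-TTN PAM.

    near_cut and far_cut are 18 and 23 nt downstream of the 3' end of the
    PAM, giving a 5-nt stagger. Sites whose far cut would fall outside
    the sequence are skipped.
    """
    seq = seq.upper()
    sites = []
    for match in re.finditer(r"(?=TT[ACGT])", seq):
        pam_end = match.start() + 3
        near_cut, far_cut = pam_end + 18, pam_end + 23
        if far_cut <= len(seq):
            sites.append((match.start(), near_cut, far_cut))
    return sites


if __name__ == "__main__":
    print(cas9_cut_sites("A" * 20 + "TGG" + "A" * 5))
    print(cpf1_cut_sites("TTC" + "A" * 25))
```

A fuller treatment would also scan the reverse complement and handle other PAM variants; those are omitted here to keep the sketch short.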


A variety of CRISPR-associated (Cas) genes or proteins can be used in the technologies provided by the present disclosure and the choice of Cas protein will depend upon the particular conditions of the method. Specific examples of Cas proteins include class II systems including Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cpf1, C2C1, or C2C3. In some embodiments, a Cas protein, e.g., a Cas9 protein, may be from any of a variety of prokaryotic species. In some embodiments a particular Cas protein, e.g., a particular Cas9 protein, is selected to recognize a particular protospacer-adjacent motif (PAM) sequence. In some embodiments, a DNA-binding domain or endonuclease domain includes a sequence targeting polypeptide, such as a Cas protein, e.g., Cas9. In certain embodiments a Cas protein, e.g., a Cas9 protein, may be obtained from a bacterium or archaeon or synthesized using known methods. In certain embodiments, a Cas protein may be from a gram-positive bacterium or a gram-negative bacterium. In certain embodiments, a Cas protein may be from a Streptococcus (e.g., a S. pyogenes, or a S. thermophilus), a Francisella (e.g., an F. novicida), a Staphylococcus (e.g., an S. aureus), an Acidaminococcus (e.g., an Acidaminococcus sp. BV3L6), a Neisseria (e.g., an N. meningitidis), a Cryptococcus, a Corynebacterium, a Haemophilus, a Eubacterium, a Pasteurella, a Prevotella, a Veillonella, or a Marinobacter.


In some embodiments, a gene modifying polypeptide may comprise the amino acid sequence of SEQ ID NO: 4000 below, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto. In embodiments, the amino acid sequence of SEQ ID NO: 4000 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto, is positioned at the N-terminal end of the gene modifying polypeptide. In embodiments, the amino acid sequence of SEQ ID NO: 4000 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto, is positioned within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids of the N-terminal end of the gene modifying polypeptide.









Exemplary N-terminal NLS-Cas9 domain (SEQ ID NO: 4000)


MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTD





RHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNE





MAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLR





KKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLV





QTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG





NLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADL





FLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV





RQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEEL





LVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREK





IEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQ





SFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAF





LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA





SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTY





AHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFA





NRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQ





TVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIK





ELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVD





HIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNA





KLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRM





NTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYL





NAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFY





SNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSM





PQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTV





AYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKE





VKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLA





SHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK





VLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTST





KEVLDATLIHQSITGLYETRIDLSQLGGDGG






In some embodiments, a gene modifying polypeptide may comprise the amino acid sequence of SEQ ID NO: 4001 below, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto. In embodiments, the amino acid sequence of SEQ ID NO: 4001 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto, is positioned at the C-terminal end of the gene modifying polypeptide. In embodiments, the amino acid sequence of SEQ ID NO: 4001 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto, is positioned within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids of the C-terminal end of the gene modifying polypeptide.









Exemplary C-terminal sequence comprising an NLS (SEQ ID NO: 4001)


AGKRTADGSEFEKRTADGSEFESPKKKAKVE





Exemplary benchmarking sequence (SEQ ID NO: 4002)


MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTD





RHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNE





MAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLR





KKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLV





QTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG





NLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADL





FLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV





RQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEEL





LVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREK





IEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQ





SFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAF





LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA





SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTY





AHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFA





NRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQ





TVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIK





ELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVD





HIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNA





KLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRM





NTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYL





NAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFY





SNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSM





PQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTV





AYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKE





VKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLA





SHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK





VLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTST





KEVLDATLIHQSITGLYETRIDLSQLGGDGGSGGSSGGSSGSETPGTSES





ATPESSGGSSGGSSGGTLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWA





ETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQ





GILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYN





LLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLT





WTRLPQGFKNSPTLFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSEL





DCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEAR





KETVMGQPTPKTPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLF





NWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQK





LGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVI





LAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLL





PLPEEGLQHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQR





KAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDS





RYAFATAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIH





CPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPSGGSKRT





ADGSEFEAGKRTADGSEFEKRTADGSEFESPKKKAKVE






In some embodiments, a gene modifying polypeptide may comprise a Cas domain as listed in Table 7 or 8, or a functional fragment thereof, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto.
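To illustrate how a candidate sequence might be checked against the percent-identity levels recited above, the following Python sketch computes identity over a pre-computed pairwise alignment. It is a minimal sketch under stated assumptions: the two inputs are already aligned to equal length with "-" marking gaps, identity is counted as identical columns divided by all aligned columns (other conventions for the denominator exist), and the function names are hypothetical. It does not perform the alignment itself.

```python
from typing import Optional

# Identity levels recited in the text, in percent.
IDENTITY_LEVELS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over a pre-computed pairwise alignment.

    Both strings must have the same length, with '-' marking gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_level_met(identity: float) -> Optional[int]:
    """Return the highest recited identity level that is met, if any."""
    met = [level for level in IDENTITY_LEVELS if identity >= level]
    return max(met) if met else None


if __name__ == "__main__":
    # First ten residues of SEQ ID NO: 4000 versus a hypothetical variant.
    pid = percent_identity("MPAAKRVKLD", "MPAAKRAKLD")
    print(pid, highest_level_met(pid))
```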









TABLE 7

CRISPR/Cas Proteins, Species, and Mutations

Name | Enzyme | Species | # of AAs | PAM | Mutations to alter PAM recognition | Mutations to make catalytically dead
FnCas9 | Cas9 | Francisella novicida | 1629 | 5′-NGG-3′ | Wt | D11A/H969A/N995A
FnCas9 RHA | Cas9 | Francisella novicida | 1629 | 5′-YG-3′ | E1369R/E1449H/R1556A | D11A/H969A/N995A
SaCas9 | Cas9 | Staphylococcus aureus | 1053 | 5′-NNGRRT-3′ | Wt | D10A/H557A
SaCas9 KKH | Cas9 | Staphylococcus aureus | 1053 | 5′-NNNRRT-3′ | E782K/N968K/R1015H | D10A/H557A
SpCas9 | Cas9 | Streptococcus pyogenes | 1368 | 5′-NGG-3′ | Wt | D10A/D839A/H840A/N863A
SpCas9 VQR | Cas9 | Streptococcus pyogenes | 1368 | 5′-NGA-3′ | D1135V/R1335Q/T1337R | D10A/D839A/H840A/N863A
AsCpf1 RR | Cpf1 | Acidaminococcus sp. BV3L6 | 1307 | 5′-TYCV-3′ | S542R/K607R | E993A
AsCpf1 RVR | Cpf1 | Acidaminococcus sp. BV3L6 | 1307 | 5′-TATV-3′ | S542R/K548V/N552R | E993A
FnCpf1 | Cpf1 | Francisella novicida | 1300 | 5′-NTTN-3′ | Wt | D917A/E1006A/D1255A
NmCas9 | Cas9 | Neisseria meningitidis | 1082 | 5′-NNNGATT-3′ | Wt | D16A/D587A/H588A/N611A
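The PAM entries in Table 7 use IUPAC degenerate nucleotide codes (N = any base, R = A or G, Y = C or T, V = A, C, or G). The following Python sketch shows one way such PAM patterns could be matched against a candidate site. The dictionary and function names are hypothetical, and the sketch only compares a candidate string against the PAM pattern; it does not model strand, position relative to the protospacer, or cleavage activity.

```python
import re
from typing import List

# IUPAC degenerate codes used by the PAM entries of Table 7.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "V": "[ACG]", "N": "[ACGT]",
}

# PAM patterns transcribed from Table 7 (written 5' to 3').
TABLE7_PAMS = {
    "FnCas9": "NGG",
    "FnCas9 RHA": "YG",
    "SaCas9": "NNGRRT",
    "SaCas9 KKH": "NNNRRT",
    "SpCas9": "NGG",
    "SpCas9 VQR": "NGA",
    "AsCpf1 RR": "TYCV",
    "AsCpf1 RVR": "TATV",
    "FnCpf1": "NTTN",
    "NmCas9": "NNNGATT",
}


def matches_pam(pam: str, site: str) -> bool:
    """Return True if the candidate site matches the degenerate PAM exactly."""
    pattern = "".join(IUPAC[base] for base in pam.upper())
    return re.fullmatch(pattern, site.upper()) is not None


def variants_recognizing(site: str) -> List[str]:
    """Return the Table 7 entries whose PAM pattern matches the site."""
    return [name for name, pam in TABLE7_PAMS.items() if matches_pam(pam, site)]


if __name__ == "__main__":
    print(variants_recognizing("AGG"))
    print(matches_pam("NNGRRT", "TTCAAT"), matches_pam("NNNRRT", "TTCAAT"))
```

In this sketch, the site TTCAAT fails the wild-type SaCas9 pattern (NNGRRT) but matches the relaxed KKH pattern (NNNRRT), reflecting the altered PAM recognition listed in the table.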





















TABLE 8

Amino Acid Sequences of CRISPR/Cas Proteins, Species, and Mutations

Variant | Parental Host(s) | Protein Sequence | SEQ ID NO: | Nickase (HNH) | Nickase (HNH) | Nickase (RuvC)





Nme2Cas9

Neisseria

MAAFKPNPINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPK
9,001
N611A
H588A
D16A




meningitidis

TGDSLAMARRLARSVRRLTRRRAHRLLRARRLLKREGVLQAADFDENGLIKS








LPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELG








ALLKGVANNAHALQTGDFRTPAELALNKFEKESGHIRNQRGDYSHTFSRKD








LQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLGHCT








FEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRK








SKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEG








LKDKKSPLNLSSELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEALLKHISFDKF








VQISLKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRN








PVVLRALSQARKVINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENR








KDREKAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLVRLNE








KGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNSR








EWQEFKARVETSRFPRSKKQRILLQKFDEDGFKECNLNDTRYVNRFLCQFVA








DHILLTGKGKRRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACS








TVAMQQKITRFVRYKEMNAFDGKTIDKETGKVLHQKTHFPQPWEFFAQEV








MIRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNR








KMSGAHKDTLRSAKRFVKHNEKISVKRVWLTEIKLADLENMVNYKNGREIEL








YEALKARLEAYGGNAKQAFDPKDNPFYKKGGQLVKAVRVEKTQESGVLLNK








KNAYTIADNGDMVRVDVFCKVDKKGKNQYFIVPIYAWQVAENILPDIDCKG








YRIDDSYTFCFSLHKYDLIAFQKDEKSKVEFAYYINCDSSNGRFYLAWHDKGS








KEQQFRISTQNLVLIQKYQVNELGKEIRPCRLKKRPPVR









PpnCas9

Pasteurella

MQNNPLNYILGLDLGIASIGWAVVEIDEESSPIRLIDVGVRTFERAEVAKTGE
9,002
N605A
H582A
D13A




pneumotropica

SLALSRRLARSSRRLIKRRAERLKKAKRLLKAEKILHSIDEKLPINVWQLRVKGL








KEKLERQEWAAVLLHLSKHRGYLSQRKNEGKSDNKELGALLSGIASNHQML








QSSEYRTPAEIAVKKFQVEEGHIRNQRGSYTHTFSRLDLLAEMELLFQRQAEL








GNSYTSTTLLENLTALLMWQKPALAGDAILKMLGKCTFEPSEYKAAKNSYSA








ERFVWLTKLNNLRILENGTERALNDNERFALLEQPYEKSKLTYAQVRAMLAL








SDNAIFKGVRYLGEDKKTVESKTTLIEMKFYHQIRKTLGSAELKKEWNELKGN








SDLLDEIGTAFSLYKTDDDICRYLEGKLPERVLNALLENLNFDKFIQLSLKALHQ








ILPLMLQGQRYDEAVSAIYGDHYGKKSTETTRLLPTIPADEIRNPVVLRTLTQA








RKVINAVVRLYGSPARIHIETAREVGKSYQDRKKLEKQQEDNRKQRESAVKK








FKEMFPHFVGEPKGKDILKMRLYELQQAKCLYSGKSLELHRLLEKGYVEVDH








ALPFSRTWDDSFNNKVLVLANENQNKGNLTPYEWLDGKNNSERWQHFVV








RVQTSGFSYAKKQRILNHKLDEKGFIERNLNDTRYVARFLCNFIADNMLLVG








KGKRNVFASNGQITALLRHRWGLQKVREQNDRHHALDAVVVACSTVAMQ








QKITRFVRYNEGNVFSGERIDRETGEIIPLHFPSPWAFFKENVEIRIFSENPKLE








LENRLPDYPQYNHEWVQPLFVSRMPTRKMTGQGHMETVKSAKRLNEGLS








VLKVPLTQLKLSDLERMVNRDREIALYESLKARLEQFGNDPAKAFAEPFYKKG








GALVKAVRLEQTQKSGVLVRDGNGVADNASMVRVDVFTKGGKYFLVPIYT








WQVAKGILPNRAATQGKDENDWDIMDEMATFQFSLCQNDLIKLVTKKKTI








FGYFNGLNRATSNINIKEHDLDKSKGKLGIYLEVGVKLAISLEKYQVDELGKNI








RPCRPTKRQHVR









SauCas9

Staphylococcus

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA
9,003
N580A
H557A
D10A




aureus

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVN








NLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPL








YKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKL








SLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQA








EFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPP








RIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









SauCas9-KKH
Staphylococcus aureus
9,004
N580A
H557A
D10A

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRKLINDTLYSTRKDDKGNTLIV








NNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNP








LYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVK








LSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQ








AEFIASFYKNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRP








PHIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









SauriCas9
Staphylococcus auricularis
9,005
N588A
H565A
D15A

MQENQQKQNYILGLDIGITSVGYGLIDSKTREVIDAGVRLFPEADSENNSNR

RSKRGARRLKRRRIHRLNRVKDLLADYQMIDLNNVPKSTDPYTIRVKGLREPL








TKEEFAIALLHIAKRRGLHNISVSMGDEEQDNELSTKQQLQKNAQQLQDKY








VCELQLERLTNINKVRGEKNRFKTEDFVKEVKQLCETQRQYHNIDDQFIQQY








IDLVSTRREYFEGPGNGSPYGWDGDLLKWYEKLMGRCTYFPEELRSVKYAYS








ADLFNALNDLNNLVVTRDDNPKLEYYEKYHIIENVFKQKKNPTLKQIAKEIGV








QDYDIRGYRITKSGKPQFTSFKLYHDLKNIFEQAKYLEDVEMLDEIAKILTIYQ








DEISIKKALDQLPELLTESEKSQIAQLTGYTGTHRLSLKCIHIVIDELWESPENQ








MEIFTRLNLKPKKVEMSEIDSIPTTLVDEFILSPVVKRAFIQSIKVINAVINRFGL








PEDIIIELAREKNSKDRRKFINKLQKQNEATRKKIEQLLAKYGNTNAKYMIEKI








KLHDMQEGKCLYSLEAIPLEDLLSNPTHYEVDHIIPRSVSFDNSLNNKVLVKQ








SENSKKGNRTPYQYLSSNESKISYNQFKQHILNLSKAKDRISKKKRDMLLEER








DINKFEVQKEFINRNLVDTRYATRELSNLLKTYFSTHDYAVKVKTINGGFTNH








LRKVWDFKKHRNHGYKHHAEDALVIANADFLFKTHKALRRTDKILEQPGLE








VNDTTVKVDTEEKYQELFETPKQVKNIKQFRDFKYSHRVDKKPNRQLINDTL








YSTREIDGETYVVQTLKDLYAKDNEKVKKLFTERPQKILMYQHDPKTFEKLM








TILNQYAEAKNPLAAYYEDKGEYVTKYAKKGNGPAIHKIKYIDKKLGSYLDVS








NKYPETQNKLVKLSLKSFRFDIYKCEQGYKMVSIGYLDVLKKDNYYYIPKDKYE








AEKQKKKIKESDLFVGSFYYNDLIMYEDELFRVIGVNSDINNLVELNMVDITY








KDFCEVNNVTGEKRIKKTIGKRVVLIEKYTTDILGNLYKTPLPKKPQLIFKRGEL









SauriCas9-KKH
Staphylococcus auricularis
9,006
N588A
H565A
D15A

MQENQQKQNYILGLDIGITSVGYGLIDSKTREVIDAGVRLFPEADSENNSNR

RSKRGARRLKRRRIHRLNRVKDLLADYQMIDLNNVPKSTDPYTIRVKGLREPL








TKEEFAIALLHIAKRRGLHNISVSMGDEEQDNELSTKQQLQKNAQQLQDKY








VCELQLERLTNINKVRGEKNRFKTEDFVKEVKQLCETQRQYHNIDDQFIQQY








IDLVSTRREYFEGPGNGSPYGWDGDLLKWYEKLMGRCTYFPEELRSVKYAYS








ADLFNALNDLNNLVVTRDDNPKLEYYEKYHIIENVFKQKKNPTLKQIAKEIGV








QDYDIRGYRITKSGKPQFTSFKLYHDLKNIFEQAKYLEDVEMLDEIAKILTIYQ








DEISIKKALDQLPELLTESEKSQIAQLTGYTGTHRLSLKCIHIVIDELWESPENQ








MEIFTRLNLKPKKVEMSEIDSIPTTLVDEFILSPVVKRAFIQSIKVINAVINRFGL








PEDIIIELAREKNSKDRRKFINKLQKQNEATRKKIEQLLAKYGNTNAKYMIEKI








KLHDMQEGKCLYSLEAIPLEDLLSNPTHYEVDHIIPRSVSFDNSLNNKVLVKQ








SENSKKGNRTPYQYLSSNESKISYNQFKQHILNLSKAKDRISKKKRDMLLEER








DINKFEVQKEFINRNLVDTRYATRELSNLLKTYFSTHDYAVKVKTINGGFTNH








LRKVWDFKKHRNHGYKHHAEDALVIANADFLFKTHKALRRTDKILEQPGLE








VNDTTVKVDTEEKYQELFETPKQVKNIKQFRDFKYSHRVDKKPNRKLINDTL








YSTREIDGETYVVQTLKDLYAKDNEKVKKLFTERPQKILMYQHDPKTFEKLM








TILNQYAEAKNPLAAYYEDKGEYVTKYAKKGNGPAIHKIKYIDKKLGSYLDVS








NKYPETQNKLVKLSLKSFRFDIYKCEQGYKMVSIGYLDVLKKDNYYYIPKDKYE








AEKQKKKIKESDLFVGSFYKNDLIMYEDELFRVIGVNSDINNLVELNMVDITY








KDFCEVNNVTGEKHIKKTIGKRVVLIEKYTTDILGNLYKTPLPKKPQLIFKRGEL









ScaCas9-Sc++
Streptococcus canis
9,007
N872A
H849A
D10A

MEKKYSIGLDIGTNSVGWAVITDDYKVPSKKFKVLGNTNRKSIKKNLMGALL

FDSGETAEATRLKRTARRRYTRRKNRIRYLQEIFANEMAKLDDSFFQRLEESF








LVEEDKKNERHPIFGNLADEVAYHRNYPTIYHLRKKLADSPEKADLRLIYLALA








HIIKFRGHFLIEGKLNAENSDVAKLFYQLIQTYNQLFEESPLDEIEVDAKGILSA








RLSKSKRLEKLIAVFPNEKKNGLFGNIIALALGLTPNFKSNFDLTEDAKLQLSKD








TYDDDLDELLGQIGDQYADLFSAAKNLSDAILLSDILRSNSEVTKAPLSASMV








KRYDEHHQDLALLKTLVRQQFPEKYAEIFKDDTKNGYAGYVGADKKLRKRS








GKLATEEEFYKFIKPILEKMDGAEELLAKLNRDDLLRKQRTFDNGSIPHQIHLK








ELHAILRRQEEFYPFLKENREKIEKILTFRIPYYVGPLARGNSRFAWLTRKSEEA








ITPWNFEEVVDKGASAQSFIERMTNFDEQLPNKKVLPKHSLLYEYFTVYNEL








TKVKYVTERMRKPEFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDS








VEIIGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIE








ERLKTYAHLFDDKVMKQLKRRHYTGWGRLSRKMINGIRDKQSGKTILDFLKS








DGFSNRNFMQLIHDDSLTFKEEIEKAQVSGQGDSLHEQIADLAGSPAIKKGIL








QTVKIVDELVKVMGHKPENIVIEMARENQTTTKGLQQSRERKKRIEEGIKELE








SQILKENPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVP








QSFIKDDSIDNKVLTRSVENRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQ








RKFDNLTKAERGGLSEADKAGFIKRQLVETRQITKHVARILDSRMNTKRDKN








DKPIREVKVITLKSKLVSDFRKDFQLYKVRDINNYHHAHDAYLNAVVGTALIK








KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKRFFYSNIMNFFKTEVKL








ANGEIRKRPLIETNGETGEVVWNKEKDFATVRKVLAMPQVNIVKKTEVQTG








GFSKESILSKRESAKLIPRKKGWDTRKYGGFGSPTVAYSILVVAKVEKGKAKKL








KSVKVLVGITIMEKGSYEKDPIGFLEAKGYKDIKKELIFKLPKYSLFELENGRRR








MLASAKELQKANELVLPQHLVRLLYYTQNISATTGSNNLGYIEQHREEFKEIF








EKIIDFSEKYILKNKVNSNLKSSFDEQFAVSDSILLSNSFVSLLKYTSFGASGGFT








FLDLDVKQGRLRYQTVTEVLDATLIYQSITGLYETRTDLSQLGGD









SpyCas9
Streptococcus pyogenes
9,008
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-NG
Streptococcus pyogenes
9,009
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








IRPKRNSDKLIARKKDWDPKKYGGFVSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








RFLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPRAF








KYFDTTIDRKVYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-SpRY
Streptococcus pyogenes
9,010
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF

DSGETAERTRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








IRPKRNSDKLIARKKDWDPKKYGGFLWPTVAYSVLVVAKVEKGKSKKLKSVK








ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS








AKQLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE








IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTRLGAPRAF








KYFDTTIDPKQYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









St1Cas9
Streptococcus thermophilus
9,011
N622A
H599A
D9A

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNRQG

RRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDELSNEELFI








ALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKTPGQIQLER








YQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQQEFNPQITDEF








INRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDNIFGILIGKCTFYPDEF








RAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQKNQIINYVKNEKAMGPAK








LFKYIAKLLSCDVADIKGYRIDKSGKAEIHTFEAYRKMKTLETLDIEQMDRETL








DKLAYVLTLNTEREGIQEALEHEFADGSFSQKQVDELVQFRKANSSIFGKGW








HNFSVKLMMELIPELYETSEEQMTILTRLGKQKTTSSSNKTKYIDEKLLTEEIY








NPVVAKSVRQAIKIVNAAIKEYGDFDNIVIEMARETNEDDEKKAIQKIQKAN








KDEKDAAMLKAANQYNGKAELPHSVFHGHKQLATKIRLWHQQGERCLYT








GKTISIHDLINNSNQFEVDHILPLSITFDDSLANKVLVYATANQEKGQRTPYQ








ALDSMDDAWSFRELKAFVRESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLV








DTRYASRVVLNALQEHFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYH








HHAVDALIIAASSQLNLWKKQKNTLVSYSEDQLLDIETGELISDDEYKESVFK








APYQHFVDTLKSKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKADE








TYVLGKIKDIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPN








KQINEKGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLGNHIDIT








PKDSNNKVVLQSVSPWRADVYFNKTTGKYEILGLKYADLQFEKGTGTYKISQ








EKYNDIKKKEGVDSDSEFKFTLYKNDLLLVKDTETKEQQLFRFLSRTMPKQKH








YVELKPYDKQKFEGGEALIKVLGNVANSGQCKKGLGKSNISIYKVRTDVLGN








QHIIKNEGDKPKLDF









BlatCas9
Brevibacillus laterosporus
9,012
N607A
H584A
D8A

MAYTMGIDVGIASCGWAIVDLERQRIIDIGVRTFEKAENPKNGEALAVPRRE

ARSSRRRLRRKKHRIERLKHMFVRNGLAVDIQHLEQTLRSQNEIDVWQLRV








DGLDRMLTQKEWLRVLIHLAQRRGFQSNRKTDGSSEDGQVLVNVTENDRL








MEEKDYRTVAEMMVKDEKFSDHKRNKNGNYHGVVSRSSLLVEIHTLFETQ








RQHHNSLASKDFELEYVNIWSAQRPVATKDQIEKMIGTCTFLPKEKRAPKAS








WHFQYFMLLQTINHIRITNVQGTRSLNKEEIEQVVNMALTKSKVSYHDTRKI








LDLSEEYQFVGLDYGKEDEKKKVESKETIIKLDDYHKLNKIFNEVELAKGETWE








ADDYDTVAYALTFFKDDEDIRDYLQNKYKDSKNRLVKNLANKEYTNELIGKV








STLSFRKVGHLSLKALRKIIPFLEQGMTYDKACQAAGFDFQGISKKKRSVVLP








VIDQISNPVVNRALTQTRKVINALIKKYGSPETIHIETARELSKTFDERKNITKD








YKENRDKNEHAKKHLSELGIINPTGLDIVKYKLWCEQQGRCMYSNQPISFER








LKESGYTEVDHIIPYSRSMNDSYNNRVLVMTRENREKGNQTPFEYMGNDT








QRWYEFEQRVTTNPQIKKEKRQNLLLKGFTNRRELEMLERNLNDTRYITKYL








SHFISTNLEFSPSDKKKKVVNTSGRITSHLRSRWGLEKNRGQNDLHHAMDAI








VIAVTSDSFIQQVTNYYKRKERRELNGDDKFPLPWKFFREEVIARLSPNPKEQ








IEALPNHFYSEDELADLQPIFVSRMPKRSITGEAHQAQFRRVVGKTKEGKNIT








AKKTALVDISYDKNGDFNMYGRETDPATYEAIKERYLEFGGNVKKAFSTDLH








KPKKDGTKGPLIKSVRIMENKTLVHPVNKGKGVVYNSSIVRTDVFQRKEKYY








LLPVYVTDVTKGKLPNKVIVAKKGYHDWIEVDDSFTFLFSLYPNDLIFIRQNPK








KKISLKKRIESHSISDSKEVQEIHAYYKGVDSSTAAIEFIIHDGSYYAKGVGVQN








LDCFEKYQVDILGNYFKVKGEKRLELETSDSNHKGKDVNSIKSTSR









cCas9-v16
Staphylococcus aureus
9,013
N580A
H557A
D10A

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRKLINDTLYSTRKDDKGNTLIV








NNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNP








LYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVK








LSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQ








AEFIASFYKNDLIKINGELYRVIGVNSDKNNLIEVNMIDITYREYLENMNDKRP








PHIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









cCas9-v17
Staphylococcus aureus
9,014
N580A
H557A
D10A

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRKLINDTLYSTRKDDKGNTLIV








NNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNP








LYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVK








LSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQ








AEFIASFYKNDLIKINGELYRVIGVNNSTRNIVELNMIDITYREYLENMNDKRP








PHIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









cCas9-v21
Staphylococcus aureus
9,015
N580A
H557A
D10A

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRKLINDTLYSTRKDDKGNTLIV








NNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNP








LYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVK








LSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQ








AEFIASFYKNDLIKINGELYRVIGVNSDDRNIIELNMIDITYREYLENMNDKRP








PHIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









cCas9-v42
Staphylococcus aureus
9,016
N580A
H557A
D10A

MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGA

RRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSA








ALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKK








DGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYE








GPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLN








NLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST








GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSE








LTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKV








DLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK








DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS








LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQ








YLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNL








VDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKG








YKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQ








EYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRKLINDTLYSTRKDDKGNTLIV








NNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNP








LYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVK








LSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQ








AEFIASFYKNDLIKINGELYRVIGVNNNRLNKIELNMIDITYREYLENMNDKRP








PHIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG









CdiCas9
Corynebacterium diphtheriae
9,017
N597A
H573A
D8A

MKYHVGIDVGTFSVGLAAIEVDDAGMPIKTLSLVSHIHDSGLDPDEIKSAVT

RLASSGIARRTRRLYRRKRRRLQQLDKFIQRQGWPVIELEDYSDPLYPWKVR








AELAASYIADEKERGEKLSVALRHIARHRGWRNPYAKVSSLYLPDGPSDAFK








AIREEIKRASGQPVPETATVGQMVTLCELGTLKLRGEGGVLSARLQQSDYAR








EIQEICRMQEIGQELYRKIIDVVFAAESPKGSASSRVGKDPLQPGKNRALKAS








DAFQRYRIAALIGNLRVRVDGEKRILSVEEKNLVFDHLVNLTPKKEPEWVTIA








EILGIDRGQLIGTATMTDDGERAGARPPTHDTNRSIVNSRIAPLVDWWKTA








SALEQHAMVKALSNAEVDDFDSPEGAKVQAFFADLDDDVHAKLDSLHLPV








GRAAYSEDTLVRLTRRMLSDGVDLYTARLQEFGIEPSWTPPTPRIGEPVGNP








AVDRVLKTVSRWLESATKTWGAPERVIIEHVREGFVTEKRAREMDGDMRR








RAARNAKLFQEMQEKLNVQGKPSRADLWRYQSVQRQNCQCAYCGSPITF








SNSEMDHIVPRAGQGSTNTRENLVAVCHRCNQSKGNTPFAIWAKNTSIEG








VSVKEAVERTRHWVTDTGMRSTDFKKFTKAVVERFQRATMDEEIDARSME








SVAWMANELRSRVAQHFASHGTTVRVYRGSLTAEARRASGISGKLKFFDGV








GKSRLDRRHHAIDAAVIAFTSDYVAETLAVRSNLKQSQAHRQEAPQWREFT








GKDAEHRAAWRVWCQKMEKLSALLTEDLRDDRVVVMSNVRLRLGNGSA








HKETIGKLSKVKLSSQLSVSDIDKASSEALWCALTREPGFDPKEGLPANPERHI








RVNGTHVYAGDNIGLFPVSAGSIALRGGYAELGSSFHHARVYKITSGKKPAF








AMLRVYTIDLLPYRNQDLFSVELKPQTMSMRQAEKKLRDALATGNAEYLG








WLVVDDELVVDTSKIATDQVKAVEAELGTIRRWRVDGFFSPSKLRLRPLQM








SKEGIKKESAPELSKIIDRPGWLPAVNKLFSDGNVTVVRRDSLGRVRLESTAH








LPVTWKVQ









CjeCas9
Campylobacter jejuni
9,018
N582A
H559A
D8A

MARILAFDIGISSIGWAFSENDELKDCGVRIFTKVENPKTGESLALPRRLARSA

RKRLARRKARLNHLKHLIANEFKLNYEDYQSFDESLAKAYKGSLISPYELRFRA








LNELLSKQDFARVILHIAKRRGYDDIKNSDDKEKGAILKAIKQNEEKLANYQS








VGEYLYKEYFQKFKENSKEFTNVRNKKESYERCIAQSFLKDELKLIFKKQREFG








FSFSKKFEEEVLSVAFYKRALKDFSHLVGNCSFFTDEKRAPKNSPLAFMFVAL








TRIINLLNNLKNTEGILYTKDDLNALLNEVLKNGTLTYKQTKKLLGLSDDYEFK








GEKGTYFIEFKKYKEFIKALGEHNLSQDDLNEIAKDITLIKDEIKLKKALAKYDLN








QNQIDSLSKLEFKDHLNISFKALKLVTPLMLEGKKYDEACNELNLKVAINEDK








KDFLPAFNETYYKDEVTNPVVLRAIKEYRKVLNALLKKYGKVHKINIELAREVG








KNHSQRAKIEKEQNENYKAKKDAELECEKLGLKINSKNILKLRLFKEQKEFCAY








SGEKIKISDLQDEKMLEIDHIYPYSRSFDDSYMNKVLVFTKQNQEKLNQTPFE








AFGNDSAKWQKIEVLAKNLPTKKQKRILDKNYKDKEQKNFKDRNLNDTRYI








ARLVLNYTKDYLDFLPLSDDENTKLNDTQKGSKVHVEAKSGMLTSALRHTW








GFSAKDRNNHLHHAIDAVIIAYANNSIVKAFSDFKKEQESNSAELYAKKISELD








YKNKRKFFEPFSGFRQKVLDKIDEIFVSKPERKKPSGALHEETFRKEEEFYQSY








GGKEGVLKALELGKIRKVNGKIVKNGDMFRVDIFKHKKTNKFYAVPIYTMDF








ALKVLPNKAVARSKKGEIKDWILMDENYEFCFSLYKDSLILIQTKDMQEPEFV








YYNAFTSSTVSLIVSKHDNKFETLSKNQKILFKNANEKEVIAKSIGIQNLKVFEK








YIVSALGEVTKAEFRQREDFKK









GeoCas9
Geobacillus stearothermophilus
9,019
N605A
H582A
D8A

MRYKIGLDIGITSVGWAVMNLDIPRIEDLGVRIFDRAENPQTGESLALPRRLA

RSARRRLRRRKHRLERIRRLVIREGILTKEELDKLFEEKHEIDVWQLRVEALDR

KLNNDELARVLLHLAKRRGFKSNRKSERSNKENSTMLKHIEENRAILSSYRTV








GEMIVKDPKFALHKRNKGENYTNTIARDDLEREIRLIFSKQREFGNMSCTEEF








ENEYITIWASQRPVASKDDIEKKVGFCTFEPKEKRAPKATYTFQSFIAWEHIN








KLRLISPSGARGLTDEERRLLYEQAFQKNKITYHDIRTLLHLPDDTYFKGIVYDR








GESRKQNENIRFLELDAYHQIRKAVDKVYGKGKSSSFLPIDFDTFGYALTLFKD








DADIHSYLRNEYEQNGKRMPNLANKVYDNELIEELLNLSFTKFGHLSLKALRS








ILPYMEQGEVYSSACERAGYTFTGPKKKQKTMLLPNIPPIANPVVMRALTQA








RKVVNAIIKKYGSPVSIHIELARDLSQTFDERRKTKKEQDENRKKNETAIRQL








MEYGLTLNPTGHDIVKFKLWSEQNGRCAYSLQPIEIERLLEPGYVEVDHVIPY








SRSLDDSYTNKVLVLTRENREKGNRIPAEYLGVGTERWQQFETFVLTNKQFS








KKKRDRLLRLHYDENEETEFKNRNLNDTRYISRFFANFIREHLKFAESDDKQK








VYTVNGRVTAHLRSRWEFNKNREESDLHHAVDAVIVACTTPSDIAKVTAFY








QRREQNKELAKKTEPHFPQPWPHFADELRARLSKHPKESIKALNLGNYDDQ








KLESLQPVFVSRMPKRSVTGAAHQETLRRYVGIDERSGKIQTVVKTKLSEIKL








DASGHFPMYGKESDPRTYEAIRQRLLEHNNDPKKAFQEPLYKPKKNGEPGP








VIRTVKIIDTKNQVIPLNDGKTVAYNSNIVRVDVFEKDGKYYCVPVYTMDIM








KGILPNKAIEPNKPYSEWKEMTEDYTFRFSLYPNDLIRIELPREKTVKTAAGEE








INVKDVFVYYKTIDSANGGLELISHDHRFSLRGVGSRTLKRFEKYQVDVLGNI








YKVRGEKRVGLASSAHSKPGKTIRPLQSTRD









iSpyMacCas9
Streptococcus spp.
9,020
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRKLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLKREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEIQTVGQNGG








LFDDNPKSPLEVTPSKLVPLKKELNPKKYGGYQKPTTAYPVLLITDTKQLIPISV








MNKKQFEQNPVKFLRDRGYQQVGKNDFIKLPKYTLVDIGDGIKRLWASSKEI








HKGNQLVVSKKSQILLYHAHHLDSDLSNDYLQNHNQQFDVLFNEISFSKKC








KLGKEHIQKIENVYSNKKNSASIEELAESFIKLLGFTQLGATSPFNFLGVKLNQ








KQYKGKKDYILPCTEGTLIRQSITGLYETRVDLSKIGEDSGGSGGSKRTADGSE








FES









NmeCas9
Neisseria meningitidis
9,021
N611A
H588A
D16A

MAAFKPNSINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPK

TGDSLAMARRLARSVRRLTRRRAHRLLRTRRLLKREGVLQAANFDENGLIKS








LPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELG








ALLKGVAGNAHALQTGDFRTPAELALNKFEKESGHIRNQRSDYSHTFSRKDL








QAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLGHCTF








EPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKS








KLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGL








KDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLKDRIQPEILEALLKHISFDKFV








QISLKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNP








VVLRALSQARKVINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRK








DREKAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLGRLNEK








GYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNSRE








WQEFKARVETSRFPRSKKQRILLQKFDEDGFKERNLNDTRYVNRFLCQFVA








DRMRLTGKGKKRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVA








CSTVAMQQKITRFVRYKEMNAFDGKTIDKETGEVLHQKTHFPQPWEFFAQ








EVMIRVFGKPDGKPEFEEADTLEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAP








NRKMSGQGHMETVKSAKRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKL








YEALKARLEAHKDDPAKAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVW








VRNHNGIADNATMVRVDVFEKGDKYYLVPIYSWQVAKGILPDRAVVQGKD








EEDWQLIDDSFNFKFSLHPNDLVEVITKKARMFGYFASCHRGTGNINIRIHD








LDHKIGKNGILEGIGVKTALSFQKYQIDELGKEIRPCRLKKRPPVR









ScaCas9
Streptococcus canis
9,022
N872A
H849A
D10A

MEKKYSIGLDIGTNSVGWAVITDDYKVPSKKFKVLGNTNRKSIKKNLMGALL

FDSGETAEATRLKRTARRRYTRRKNRIRYLQEIFANEMAKLDDSFFQRLEESF








LVEEDKKNERHPIFGNLADEVAYHRNYPTIYHLRKKLADSPEKADLRLIYLALA








HIIKFRGHFLIEGKLNAENSDVAKLFYQLIQTYNQLFEESPLDEIEVDAKGILSA








RLSKSKRLEKLIAVFPNEKKNGLFGNIIALALGLTPNFKSNFDLTEDAKLQLSKD








TYDDDLDELLGQIGDQYADLFSAAKNLSDAILLSDILRSNSEVTKAPLSASMV








KRYDEHHQDLALLKTLVRQQFPEKYAEIFKDDTKNGYAGYVGIGIKHRKRTT








KLATQEEFYKFIKPILEKMDGAEELLAKLNRDDLLRKQRTFDNGSIPHQIHLKE








LHAILRRQEEFYPFLKENREKIEKILTFRIPYYVGPLARGNSRFAWLTRKSEEAI








TPWNFEEVVDKGASAQSFIERMTNFDEQLPNKKVLPKHSLLYEYFTVYNELT








KVKYVTERMRKPEFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSV








EIIGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEE








RLKTYAHLFDDKVMKQLKRRHYTGWGRLSRKMINGIRDKQSGKTILDFLKS








DGFSNRNFMQLIHDDSLTFKEEIEKAQVSGQGDSLHEQIADLAGSPAIKKGIL








QTVKIVDELVKVMGHKPENIVIEMARENQTTTKGLQQSRERKKRIEEGIKELE








SQILKENPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVP








QSFIKDDSIDNKVLTRSVENRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQ








RKFDNLTKAERGGLSEADKAGFIKRQLVETRQITKHVARILDSRMNTKRDKN








DKPIREVKVITLKSKLVSDFRKDFQLYKVRDINNYHHAHDAYLNAVVGTALIK








KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKRFFYSNIMNFFKTEVKL








ANGEIRKRPLIETNGETGEVVWNKEKDFATVRKVLAMPQVNIVKKTEVQTG








GFSKESILSKRESAKLIPRKKGWDTRKYGGFGSPTVAYSILVVAKVEKGKAKKL








KSVKVLVGITIMEKGSYEKDPIGFLEAKGYKDIKKELIFKLPKYSLFELENGRRR








MLASATELQKANELVLPQHLVRLLYYTQNISATTGSNNLGYIEQHREEFKEIF








EKIIDFSEKYILKNKVNSNLKSSFDEQFAVSDSILLSNSFVSLLKYTSFGASGGFT








FLDLDVKQGRLRYQTVTEVLDATLIYQSITGLYETRTDLSQLGGD









ScaCas9-HiFi-Sc++
Streptococcus canis
9,023
N872A
H849A
D10A

MEKKYSIGLDIGTNSVGWAVITDDYKVPSKKFKVLGNTNRKSIKKNLMGALL

FDSGETAEATRLKRTARRRYTRRKNRIRYLQEIFANEMAKLDDSFFQRLEESF








LVEEDKKNERHPIFGNLADEVAYHRNYPTIYHLRKKLADSPEKADLRLIYLALA








HIIKFRGHFLIEGKLNAENSDVAKLFYQLIQTYNQLFEESPLDEIEVDAKGILSA








RLSKSKRLEKLIAVFPNEKKNGLFGNIIALALGLTPNFKSNFDLTEDAKLQLSKD








TYDDDLDELLGQIGDQYADLFSAAKNLSDAILLSDILRSNSEVTKAPLSASMV








KRYDEHHQDLALLKTLVRQQFPEKYAEIFKDDTKNGYAGYVGADKKLRKRS








GKLATEEEFYKFIKPILEKMDGAEELLAKLNRDDLLRKQRTFDNGSIPHQIHLK








ELHAILRRQEEFYPFLKENREKIEKILTFRIPYYVGPLARGNSRFAWLTRKSEEA








ITPWNFEEVVDKGASAQSFIERMTNFDEQLPNKKVLPKHSLLYEYFTVYNEL








TKVKYVTERMRKPEFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDS








VEIIGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIE








ERLKTYAHLFDDKVMKQLKRRHYTGWGRLSRKMINGIRDKQSGKTILDFLKS








DGFSNANFMQLIHDDSLTFKEEIEKAQVSGQGDSLHEQIADLAGSPAIKKGIL








QTVKIVDELVKVMGHKPENIVIEMARENQTTTKGLQQSRERKKRIEEGIKELE








SQILKENPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVP








QSFIKDDSIDNKVLTRSVENRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQ








RKFDNLTKAERGGLSEADKAGFIKRQLVETRQITKHVARILDSRMNTKRDKN








DKPIREVKVITLKSKLVSDFRKDFQLYKVRDINNYHHAHDAYLNAVVGTALIK








KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKRFFYSNIMNFFKTEVKL








ANGEIRKRPLIETNGETGEVVWNKEKDFATVRKVLAMPQVNIVKKTEVQTG








GFSKESILSKRESAKLIPRKKGWDTRKYGGFGSPTVAYSILVVAKVEKGKAKKL








KSVKVLVGITIMEKGSYEKDPIGFLEAKGYKDIKKELIFKLPKYSLFELENGRRR








MLASAKELQKANELVLPQHLVRLLYYTQNISATTGSNNLGYIEQHREEFKEIF








EKIIDFSEKYILKNKVNSNLKSSFDEQFAVSDSILLSNSFVSLLKYTSFGASGGFT








FLDLDVKQGRLRYQTVTEVLDATLIYQSITGLYETRTDLSQLGGD









SpyCas9-3var-NRRH
Streptococcus pyogenes
9,024
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MVKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEE








FYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGIIPHQIHLGELHAILRRQ








GDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRLRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRN








FMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVV








DELVKVMGGHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI








LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKGNSDKLIARKKDWDPKKYGGFNSPTAAYSVLVVAKVEKGKSKKLKSVK








ELLGITIMERSSFEKNPIGFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS








AGVLHKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE








IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGVPAA








FKYFDTTIDKKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-3var-NRTH
Streptococcus pyogenes
9,025
N863A
H840A
D10A

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MVKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEE








FYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGIIPHQIHLGELHAILRRQ








GDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRLRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRN








FMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVV








DELVKVMGGHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI








LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKGNSDKLIARKKDWDPKKYGGFNSPTVAYSVLVVAKVEKGKSKKLKSVK








ELLGITIMERSSFEKNPIGFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS








ASVLHKGNELALPSKYVNFLYLASHYEKLKGSSEDNKQKQLFVEQHKHYLDEI








IEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGASAAF








KYFDTTIGRKLYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-3var-NRCH

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,026
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MVKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEE








FYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGIIPHQIHLGELHAILRRQ








GDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRLRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRN








FMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVV








DELVKVMGGHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI








LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKGNSDKLIARKKDWDPKKYGGFNSPTVAYSVLVVAKVEKGKSKKLKSVK








ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS








AGVLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE








IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAA








FKYFDTTINRKQYNTTKEVLDATLIRQSITGLYETRIDLSQLGGD









SpyCas9-HF1

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,027
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-QQR1

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,028
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








RELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADAQLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTFKQKQYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-QQR1

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,029
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFLWPTVAYSVLVVAKVEKGKSKKLKSVK








ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS








AKQLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE








IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAA








FKYFDTTIDRKQYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-VQR

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,030
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFVSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTIDRKQYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-VRER

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,031
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ








EDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE








EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV








VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQ








ILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFVSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








RELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTIDRKEYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-xCas

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,032
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDTKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKLYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGIIPHQIHLGELHAILRRQE








DFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEK








VVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGDQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFIQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVV








DELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI








LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








ILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








GVLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF








KYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD









SpyCas9-xCas-NG

Streptococcus pyogenes

MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLF
9,033
N863A
H840A
D10A

DSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL








VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH








MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS








ARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDTKLQLS








KDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS








MIKLYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF








YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGIIPHQIHLGELHAILRRQE








DFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEK








VVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTE








GMRKPAFLSGDQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED








RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA








HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR








NFIQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVV








DELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI








LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSF








LKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF








DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLI








REVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPK








LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEI








RKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKES








IRPKRNSDKLIARKKDWDPKKYGGFVSPTVAYSVLVVAKVEKGKSKKLKSVKE








LLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA








RFLQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII








EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPRAF








KYFDTTIDRKVYRSTKEVLDATLIHQSITGLYETRIDLSQLGGD









St1Cas9-CNRZ1066

Streptococcus thermophilus

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNRQG
9,034
N622A
H599A
D9A

RRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDELSNEELFI








ALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKTPGQIQLER








YQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQQEFNPQITDEF








INRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDNIFGILIGKCTFYPDEF








RAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQKNQIINYVKNEKAMGPAK








LFKYIAKLLSCDVADIKGYRIDKSGKAEIHTFEAYRKMKTLETLDIEQMDRETL








DKLAYVLTLNTEREGIQEALEHEFADGSFSQKQVDELVQFRKANSSIFGKGW








HNFSVKLMMELIPELYETSEEQMTILTRLGKQKTTSSSNKTKYIDEKLLTEEIY








NPVVAKSVRQAIKIVNAAIKEYGDFDNIVIEMARETNEDDEKKAIQKIQKAN








KDEKDAAMLKAANQYNGKAELPHSVFHGHKQLATKIRLWHQQGERCLYT








GKTISIHDLINNSNQFEVDHILPLSITFDDSLANKVLVYATANQEKGQRTPYQ








ALDSMDDAWSFRELKAFVRESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLV








DTRYASRVVLNALQEHFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYH








HHAVDALIIAASSQLNLWKKQKNTLVSYSEEQLLDIETGELISDDEYKESVFKA








PYQHFVDTLKSKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKKDET








YVLGKIKDIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPNK








QMNEKGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLLGNPIDI








TPENSKNKVVLQSLKPWRTDVYFNKATGKYEILGLKYADLQFEKGTGTYKIS








QEKYNDIKKKEGVDSDSEFKFTLYKNDLLLVKDTETKEQQLFRFLSRTLPKQK








HYVELKPYDKQKFEGGEALIKVLGNVANGGQCIKGLAKSNISIYKVRTDVLG








NQHIIKNEGDKPKLDF









St1Cas9-LMG1831

Streptococcus thermophilus

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNRQG
9,035
N622A
H599A
D9A

RRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDELSNEELFI








ALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKTPGQIQLER








YQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQQEFNPQITDEF








INRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDNIFGILIGKCTFYPDEF








RAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQKNQIINYVKNEKAMGPAK








LFKYIAKLLSCDVADIKGYRIDKSGKAEIHTFEAYRKMKTLETLDIEQMDRETL








DKLAYVLTLNTEREGIQEALEHEFADGSFSQKQVDELVQFRKANSSIFGKGW








HNFSVKLMMELIPELYETSEEQMTILTRLGKQKTTSSSNKTKYIDEKLLTEEIY








NPVVAKSVRQAIKIVNAAIKEYGDFDNIVIEMARETNEDDEKKAIQKIQKAN








KDEKDAAMLKAANQYNGKAELPHSVFHGHKQLATKIRLWHQQGERCLYT








GKTISIHDLINNSNQFEVDHILPLSITFDDSLANKVLVYATANQEKGQRTPYQ








ALDSMDDAWSFRELKAFVRESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLV








DTRYASRVVLNALQEHFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYH








HHAVDALIIAASSQLNLWKKQKNTLVSYSEEQLLDIETGELISDDEYKESVFKA








PYQHFVDTLKSKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKKDET








YVLGKIKDIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPNK








QMNEKGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLLGNPIDI








TPENSKNKVVLQSLKPWRTDVYFNKNTGKYEILGLKYADLQFEKKTGTYKISQ








EKYNGIMKEEGVDSDSEFKFTLYKNDLLLVKDTETKEQQLFRFLSRTMPNVK








YYVELKPYSKDKFEKNESLIEILGSADKSGRCIKGLGKSNISIYKVRTDVLGNQH








IIKNEGDKPKLDF









St1Cas9-CNRZ1066

Streptococcus thermophilus

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNRQG
9,036
N622A
H599A
D9A

RRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDELSNEELFI








ALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKTPGQIQLER








YQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQQEFNPQITDEF








INRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDNIFGILIGKCTFYPDEF








RAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQKNQIINYVKNEKAMGPAK








LFKYIAKLLSCDVADIKGYRIDKSGKAEIHTFEAYRKMKTLETLDIEQMDRETL








DKLAYVLTLNTEREGIQEALEHEFADGSFSQKQVDELVQFRKANSSIFGKGW








HNFSVKLMMELIPELYETSEEQMTILTRLGKQKTTSSSNKTKYIDEKLLTEEIY








NPVVAKSVRQAIKIVNAAIKEYGDFDNIVIEMARETNEDDEKKAIQKIQKAN








KDEKDAAMLKAANQYNGKAELPHSVFHGHKQLATKIRLWHQQGERCLYT








GKTISIHDLINNSNQFEVDHILPLSITFDDSLANKVLVYATANQEKGQRTPYQ








ALDSMDDAWSFRELKAFVRESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLV








DTRYASRVVLNALQEHFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYH








HHAVDALIIAASSQLNLWKKQKNTLVSYSEDQLLDIETGELISDDEYKESVFK








APYQHFVDTLKSKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKADE








TYVLGKIKDIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPN








KQINEKGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLGNHIDIT








PKDSNNKVVLQSLKPWRTDVYFNKNTGKYEILGLKYSDMQFEKGTGKYSISK








EQYENIKVREGVDENSEFKFTLYKNDLLLLKDSENGEQILLRFTSRNDTSKHYV








ELKPYNRQKFEGSEYLIKSLGTVAKGGQCIKGLGKSNISIYKVRTDVLGNQHII








KNEGDKPKLDF









St1Cas9-TH1477

Streptococcus thermophilus

MSDLVLGLDIGIGSVGVGILNKVTGEIIHKNSRIFPAAQAENNLVRRTNRQG
9,037
N622A
H599A
D9A

RRLARRKKHRRVRLNRLFEESGLITDFTKISINLNPYQLRVKGLTDELSNEELFI








ALKNMVKHRGISYLDDASDDGNSSVGDYAQIVKENSKQLETKTPGQIQLER








YQTYGQLRGDFTVEKDGKKHRLINVFPTSAYRSEALRILQTQQEFNPQITDEF








INRYLEILTGKRKYYHGPGNEKSRTDYGRYRTSGETLDNIFGILIGKCTFYPDEF








RAAKASYTAQEFNLLNDLNNLTVPTETKKLSKEQKNQIINYVKNEKAMGPAK








LFKYIAKLLSCDVADIKGYRIDKSGKAEIHTFEAYRKMKTLETLDIEQMDRETL








DKLAYVLTLNTEREGIQEALEHEFADGSFSQKQVDELVQFRKANSSIFGKGW








HNFSVKLMMELIPELYETSEEQMTILTRLGKQKTTSSSNKTKYIDEKLLTEEIY








NPVVAKSVRQAIKIVNAAIKEYGDFDNIVIEMARETNEDDEKKAIQKIQKAN








KDEKDAAMLKAANQYNGKAELPHSVFHGHKQLATKIRLWHQQGERCLYT








GKTISIHDLINNSNQFEVDHILPLSITFDDSLANKVLVYATANQEKGQRTPYQ








ALDSMDDAWSFRELKAFVRESKTLSNKKKEYLLTEEDISKFDVRKKFIERNLV








DTRYASRVVLNALQEHFRAHKIDTKVSVVRGQFTSQLRRHWGIEKTRDTYH








HHAVDALIIAASSQLNLWKKQKNTLVSYSEDQLLDIETGELISDDEYKESVFK








APYQHFVDTLKSKEFEDSILFSYQVDSKFNRKISDATIYATRQAKVGKDKADE








TYVLGKIKDIYTQDGYDAFMKIYKKDKSKFLMYRHDPQTFEKVIEPILENYPN








KQINEKGKEVPCNPFLKYKEEHGYIRKYSKKGNGPEIKSLKYYDSKLGNHIDIT








PKDSNNKVVLQSLKPWRTDVYFNKNTGKYEILGLKYSDMQFEKGTGKYSISK








EQYENIKVREGVDENSEFKFTLYKNDLLLLKDSENGEQILLRFTSRNDTSKHYV








ELKPYNRQKFEGSEYLIKSLGTVVKGGRCIKGLGKSNISIYKVRTDVLGNQHIIK








NEGDKPKLDF









sRGN3.1

Staphylococcus spp.

MNQKFILGLDIGITSVGYGLIDYETKNIIDAGVRLFPEANVENNEGRRSKRGS
9,038
N585A
H562A
D10A

RRLKRRRIHRLERVKLLLTEYDLINKEQIPTSNNPYQIRVKGLSEILSKDELAIAL








LHLAKRRGIHNVDVAADKEETASDSLSTKDQINKNAKFLESRYVCELQKERLE








NEGHVRGVENRFLTKDIVREAKKIIDTQMQYYPEIDETFKEKYISLVETRREYF








EGPGQGSPFGWNGDLKKWYEMLMGHCTYFPQELRSVKYAYSADLFNALN








DLNNLIIQRDNSEKLEYHEKYHIIENVFKQKKKPTLKQIAKEIGVNPEDIKGYRI








TKSGTPEFTSFKLFHDLKKVVKDHAILDDIDLLNQIAEILTIYQDKDSIVAELGQ








LEYLMSEADKQSISELTGYTGTHSLSLKCMNMIIDELWHSSMNQMEVFTYL








NMRPKKYELKGYQRIPTDMIDDAILSPVVKRTFIQSINVINKVIEKYGIPEDIIIE








LARENNSDDRKKFINNLQKKNEATRKRINEIIGQTGNQNAKRIVEKIRLHDQ








QEGKCLYSLESIPLEDLLNNPNHYEVDHIIPRSVSFDNSYHNKVLVKQSENSK








KSNLTPYQYFNSGKSKLSYNQFKQHILNLSKSQDRISKKKKEYLLEERDINKFE








VQKEFINRNLVDTRYATRELTNYLKAYFSANNMNVKVKTINGSFTDYLRKV








WKFKKERNHGYKHHAEDALIIANADFLFKENKKLKAVNSVLEKPEIETKQLDI








QVDSEDNYSEMFIIPKQVQDIKDFRNFKYSHRVDKKPNRQLINDTLYSTRKK








DNSTYIVQTIKDIYAKDNTTLKKQFDKSPEKFLMYQHDPRTFEKLEVIMKQYA








NEKNPLAKYHEETGEYLTKYSKKNNGPIVKSLKYIGNKLGSHLDVTHQFKSST








KKLVKLSIKNYRFDVYLTEKGYKFVTIAYLNVFKKDNYYYIPKDKYQELKEKKKI








KDTDQFIASFYKNDLIKLNGDLYKIIGVNSDDRNIIELDYYDIKYKDYCEINNIK








GEPRIKKTIGKKTESIEKFTTDVLGNLYLHSTEKAPQLIFKRGL









sRGN3.3

Staphylococcus spp.

MNQKFILGLDIGITSVGYGLIDYETKNIIDAGVRLFPEANVENNEGRRSKRGS
9,039
N585A
H562A
D10A

RRLKRRRIHRLERVKLLLTEYDLINKEQIPTSNNPYQIRVKGLSEILSKDELAIAL








LHLAKRRGIHNVDVAADKEETASDSLSTKDQINKNAKFLESRYVCELQKERLE








NEGHVRGVENRFLTKDIVREAKKIIDTQMQYYPEIDETFKEKYISLVETRREYF








EGPGQGSPFGWNGDLKKWYEMLMGHCTYFPQELRSVKYAYSADLFNALN








DLNNLIIQRDNSEKLEYHEKYHIIENVFKQKKKPTLKQIAKEIGVNPEDIKGYRI








TKSGTPEFTSFKLFHDLKKVVKDHAILDDIDLLNQIAEILTIYQDKDSIVAELGQ








LEYLMSEADKQSISELTGYTGTHSLSLKCMNMIIDELWHSSMNQMEVFTYL








NMRPKKYELKGYQRIPTDMIDDAILSPVVKRTFIQSINVINKVIEKYGIPEDIIIE








LARENNSDDRKKFINNLQKKNEATRKRINEIIGQTGNQNAKRIVEKIRLHDQ








QEGKCLYSLESIPLEDLLNNPNHYEVDHIIPRSVSFDNSYHNKVLVKQSENSK








KSNLTPYQYFNSGKSKLSYNQFKQHILNLSKSQDRISKKKKEYLLEERDINKFE








VQKEFINRNLVDTRYATRELTSYLKAYFSANNMDVKVKTINGSFTNHLRKV








WRFDKYRNHGYKHHAEDALIIANADFLFKENKKLQNTNKILEKPTIENNTKK








VTVEKEEDYNNVFETPKLVEDIKQYRDYKFSHRVDKKPNRQLINDTLYSTRM








KDEHDYIVQTITDIYGKDNTNLKKQFNKNPEKFLMYQNDPKTFEKLSIIMKQ








YSDEKNPLAKYYEETGEYLTKYSKKNNGPIVKKIKLLGNKVGNHLDVTNKYEN








STKKLVKLSIKNYRFDVYLTEKGYKFVTIAYLNVFKKDNYYYIPKDKYQELKEKK








KIKDTDQFIASFYKNDLIKLNGDLYKIIGVNSDDRNIIELDYYDIKYKDYCEINNI








KGEPRIKKTIGKKTESIEKFTTDVLGNLYLHSTEKAPQLIFKRGL













In some embodiments, a Cas protein requires a protospacer adjacent motif (PAM) to be present in or adjacent to a target DNA sequence for the Cas protein to bind and/or function. In some embodiments, the PAM is or comprises, from 5′ to 3′, NGG, YG, NNGRRT, NNNRRT, NGA, TYCV, TATV, NTTN, or NNNGATT, where N stands for any nucleotide, Y stands for C or T, R stands for A or G, and V stands for A or C or G. In some embodiments, a Cas protein is a protein listed in Table 7 or 8. In some embodiments, a Cas protein comprises one or more mutations altering its PAM. In some embodiments, a Cas protein comprises E1369R, E1449H, and R1556A mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises E782K, N968K, and R1015H mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises D1135V, R1335Q, and T1337R mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises S542R and K607R mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises S542R, K548V, and N552R mutations or analogous substitutions to the amino acids corresponding to said positions. Exemplary advances in the engineering of Cas enzymes to recognize altered PAM sequences are reviewed in Collias et al., Nature Communications 12:555 (2021), incorporated herein by reference in its entirety.
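By way of illustration only, the degenerate PAM notation above (N, Y, R, and V) can be expanded into a pattern and used to locate candidate PAM sites on one strand of a DNA sequence. The following Python sketch is not part of the disclosed compositions or methods; the function names `pam_to_regex` and `find_pam_sites` and the demonstration sequence are hypothetical and chosen only for this example.

```python
import re
from typing import List, Tuple

# Degenerate codes as defined in the text above; plain bases map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any nucleotide
    "Y": "[CT]",    # C or T
    "R": "[AG]",    # A or G
    "V": "[ACG]",   # A or C or G
}


def pam_to_regex(pam: str):
    """Compile a 5'-to-3' PAM motif into a regex.

    The motif is wrapped in a lookahead so that overlapping sites are reported.
    """
    body = "".join(IUPAC[base] for base in pam.upper())
    return re.compile("(?=(" + body + "))")


def find_pam_sites(sequence: str, pam: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched bases) for each PAM site on the given strand."""
    pattern = pam_to_regex(pam)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "ACGTTGGAGGCCTGA"  # hypothetical sequence, for illustration only
    print(find_pam_sites(demo, "NGG"))  # [(4, 'TGG'), (7, 'AGG')]
    print(find_pam_sites(demo, "NGA"))  # [(5, 'GGA'), (12, 'TGA')]
```

The sketch scans only the strand supplied; a search of the complementary strand would require passing the reverse complement separately.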


In some embodiments, the Cas protein is catalytically active and cuts one or both strands of the target DNA site. In some embodiments, cutting the target DNA site is followed by formation of an alteration, e.g., an insertion or deletion, e.g., by the cellular repair machinery.


In some embodiments, the Cas protein is modified to deactivate or partially deactivate the nuclease, e.g., nuclease-deficient Cas9. Whereas wild-type Cas9 generates double-strand breaks (DSBs) at specific DNA sequences targeted by a gRNA, a number of CRISPR endonucleases having modified functionalities are available, for example: a “nickase” version of Cas9 that has been partially deactivated generates only a single-strand break; a catalytically inactive Cas9 (“dCas9”) does not cut target DNA. In some embodiments, dCas9 binding to a DNA sequence may interfere with transcription at that site by steric hindrance. In some embodiments, dCas9 binding to an anchor sequence may interfere with (e.g., decrease or prevent) genomic complex (e.g., ASMC) formation and/or maintenance. In some embodiments, a DNA-binding domain comprises a catalytically inactive Cas9, e.g., dCas9. Many catalytically inactive Cas9 proteins are known in the art. In some embodiments, dCas9 comprises mutations in each endonuclease domain of the Cas protein, e.g., D10A and H840A or N863A mutations. In some embodiments, a catalytically inactive or partially inactive CRISPR/Cas domain comprises a Cas protein comprising one or more mutations, e.g., one or more of the mutations listed in Table 7. In some embodiments, a Cas protein described on a given row of Table 7 comprises one, two, three, or all of the mutations listed in the same row of Table 7. In some embodiments, a Cas protein, e.g., not described in Table 7, comprises one, two, three, or all of the mutations listed in a row of Table 7 or a corresponding mutation at a corresponding site in that Cas protein.
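By way of illustration only, the relationship described above between inactivating mutations and nuclease activity can be sketched as a simple lookup. The sketch assumes S. pyogenes Cas9 numbering, with D10 in the RuvC domain and H840 and N863 in the HNH domain, and assumes that any listed substitution fully inactivates its domain; the function name `classify_cas9` and the returned labels are hypothetical.

```python
from typing import Iterable, Set

# Assumption for this sketch: SpCas9 numbering.
RUVC_POSITIONS = {"D10"}           # RuvC endonuclease domain
HNH_POSITIONS = {"H840", "N863"}   # HNH endonuclease domain


def mutated_positions(mutations: Iterable[str]) -> Set[str]:
    """Drop the substituted residue from each mutation, e.g. 'D10A' -> 'D10'."""
    return {m[:-1] for m in mutations}


def classify_cas9(mutations: Iterable[str]) -> str:
    """Classify a Cas9 as 'active', 'nickase', or 'dCas9' from its mutations.

    Simplifying assumption: a substitution at a listed position fully
    inactivates the endonuclease domain containing that position.
    """
    positions = mutated_positions(mutations)
    ruvc_off = bool(positions & RUVC_POSITIONS)
    hnh_off = bool(positions & HNH_POSITIONS)
    if ruvc_off and hnh_off:
        return "dCas9"    # both domains inactivated: no cut
    if ruvc_off or hnh_off:
        return "nickase"  # one domain inactivated: single-strand break
    return "active"       # neither inactivated: double-strand break


if __name__ == "__main__":
    for combo in ([], ["D10A"], ["H840A"], ["D10A", "H840A"], ["D10A", "N863A"]):
        print(combo, "->", classify_cas9(combo))
```

Under these assumptions, a protein carrying only H840A and N863A is classified as a nickase, because both substitutions fall in the same domain.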


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D11 mutation (e.g., D11A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H969 mutation (e.g., H969A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N995 mutation (e.g., N995A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, comprises mutations at one, two, or three of positions D11, H969, and N995 (e.g., D11A, H969A, and N995A mutations) or analogous substitutions to the amino acids corresponding to said positions.


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D10 mutation (e.g., a D10A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H557 mutation (e.g., a H557A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D10 mutation (e.g., a D10A mutation) and a H557 mutation (e.g., a H557A mutation) or analogous substitutions to the amino acids corresponding to said positions.


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D839 mutation (e.g., a D839A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H840 mutation (e.g., a H840A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N863 mutation (e.g., a N863A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D10 mutation (e.g., D10A), a D839 mutation (e.g., D839A), a H840 mutation (e.g., H840A), and a N863 mutation (e.g., N863A) or analogous substitutions to the amino acids corresponding to said positions.
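By way of illustration only, a substitution written in the notation used above (wild-type residue, 1-based position, replacement residue, e.g., D10A) can be applied to an amino acid sequence after checking that the stated wild-type residue is in fact present at that position. The function name `apply_substitution` is hypothetical, and the 20-residue fragment is simply the N-terminus of the SpyCas9 sequences listed above, used here as a short test input.

```python
import re

# Wild-type residue, 1-based position, replacement residue (e.g. "D10A").
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` carrying the given point substitution.

    Raises ValueError if the notation is malformed, the position lies outside
    the sequence, or the residue at that position is not the stated wild type.
    """
    match = SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError("unrecognized substitution: " + mutation)
    wild_type, replacement = match.group(1), match.group(3)
    position = int(match.group(2))
    if not 1 <= position <= len(sequence):
        raise ValueError("position %d is outside the sequence" % position)
    index = position - 1  # convert 1-based residue number to 0-based index
    if sequence[index] != wild_type:
        raise ValueError(
            "expected %s at position %d, found %s"
            % (wild_type, position, sequence[index])
        )
    return sequence[:index] + replacement + sequence[index + 1:]


if __name__ == "__main__":
    n_terminal_fragment = "MDKKYSIGLDIGTNSVGWAV"  # first 20 residues only
    print(apply_substitution(n_terminal_fragment, "D10A"))  # MDKKYSIGLAIGTNSVGWAV
```

The wild-type check guards against applying a position from one Cas protein's numbering to a different protein, where the analogous residue may sit at a different position.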


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a E993 mutation (e.g., a E993A mutation) or an analogous substitution to the amino acid corresponding to said position.


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D917 mutation (e.g., a D917A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a E1006 mutation (e.g., a E1006A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D1255 mutation (e.g., a D1255A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D917 mutation (e.g., D917A), a E1006 mutation (e.g., E1006A), and a D1255 mutation (e.g., D1255A) or analogous substitutions to the amino acids corresponding to said positions.


In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D16 mutation (e.g., a D16A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D587 mutation (e.g., a D587A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a partially deactivated Cas domain has nickase activity. In some embodiments, a partially deactivated Cas9 domain is a Cas9 nickase domain. In some embodiments, the catalytically inactive Cas domain or dead Cas domain produces no detectable double strand break formation. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H588 mutation (e.g., a H588A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N611 mutation (e.g., a N611A mutation) or an analogous substitution to the amino acid corresponding to said position. In some embodiments, a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D16 mutation (e.g., D16A), a D587 mutation (e.g., D587A), a H588 mutation (e.g., H588A), and a N611 mutation (e.g., N611A) or analogous substitutions to the amino acids corresponding to said positions.


In some embodiments, a DNA-binding domain or endonuclease domain may comprise a Cas molecule comprising or linked (e.g., covalently) to a gRNA (e.g., a template nucleic acid, e.g., template RNA, comprising a gRNA).


In some embodiments, an endonuclease domain or DNA binding domain comprises a Streptococcus pyogenes Cas9 (SpCas9) or a functional fragment or variant thereof. In some embodiments, the endonuclease domain or DNA binding domain comprises a modified SpCas9. In embodiments, the modified SpCas9 comprises a modification that alters protospacer-adjacent motif (PAM) specificity. In embodiments, the PAM has specificity for the nucleic acid sequence 5′-NGT-3′. In embodiments, the modified SpCas9 comprises one or more amino acid substitutions, e.g., at one or more of positions L1111, D1135, G1218, E1219, A1322, or R1335, e.g., selected from L1111R, D1135V, G1218R, E1219F, A1322R, R1335V. In embodiments, the modified SpCas9 comprises the amino acid substitution T1337R and one or more additional amino acid substitutions, e.g., selected from L1111, D1135L, S1136R, G1218S, E1219V, D1332A, D1332S, D1332T, D1332V, D1332L, D1332K, D1332R, R1335Q, T1337, T1337L, T1337Q, T1337I, T1337V, T1337F, T1337S, T1337N, T1337K, T1337H, T1337Q, and T1337M, or corresponding amino acid substitutions thereto. In embodiments, the modified SpCas9 comprises: (i) one or more amino acid substitutions selected from D1135L, S1136R, G1218S, E1219V, A1322R, R1335Q, and T1337; and (ii) one or more amino acid substitutions selected from L1111R, G1218R, E1219F, D1332A, D1332S, D1332T, D1332V, D1332L, D1332K, D1332R, T1337L, T1337I, T1337V, T1337F, T1337S, T1337N, T1337K, T1337R, T1337H, T1337Q, and T1337M, or corresponding amino acid substitutions thereto.


In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas domain, e.g., a Cas9 domain. In embodiments, the endonuclease domain or DNA binding domain comprises a nuclease-active Cas domain, a Cas nickase (nCas) domain, or a nuclease-inactive Cas (dCas) domain. In embodiments, the endonuclease domain or DNA binding domain comprises a nuclease-active Cas9 domain, a Cas9 nickase (nCas9) domain, or a nuclease-inactive Cas9 (dCas9) domain. In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas9 domain of Cas9 (e.g., dCas9 and nCas9), Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12g, Cas12h, or Cas12i. In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas9 (e.g., dCas9 and nCas9), Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12g, Cas12h, or Cas12i. In some embodiments, the endonuclease domain or DNA binding domain comprises an S. pyogenes or an S. thermophilus Cas9, or a functional fragment thereof. In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas9 sequence, e.g., as described in Chylinski, Rhun, and Charpentier (2013) RNA Biology 10:5, 726-737; incorporated herein by reference. In some embodiments, the endonuclease domain or DNA binding domain comprises the HNH nuclease subdomain and/or the RuvCI subdomain of a Cas, e.g., Cas9, e.g., as described herein, or a variant thereof. In some embodiments, the endonuclease domain or DNA binding domain comprises Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12g, Cas12h, or Cas12i. In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas polypeptide (e.g., enzyme), or a functional fragment thereof. 
In embodiments, the Cas polypeptide (e.g., enzyme) is selected from Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas5d, Cas5t, Cas5h, Cas5a, Cas6, Cas7, Cas8, Cas8a, Cas8b, Cas8c, Cas9 (e.g., Csn1 or Csx12), Cas10, Cas10d, Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12g, Cas12h, Cas12i, Csy1, Csy2, Csy3, Csy4, Cse1, Cse2, Cse3, Cse4, Cse5e, Csc1, Csc2, Csa5, Csn1, Csn2, Csm1, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx1S, Csx11, Csf1, Csf2, CsO, Csf4, Csd1, Csd2, Cst1, Cst2, Csh1, Csh2, Csa1, Csa2, Csa3, Csa4, Csa5, Type II Cas effector proteins, Type V Cas effector proteins, Type VI Cas effector proteins, CARF, DinG, Cpf1, Cas12b/C2c1, Cas12c/C2c3, SpCas9(K855A), eSpCas9(1.1), SpCas9-HF1, hyper accurate Cas9 variant (HypaCas9), homologues thereof, modified or engineered versions thereof, and/or functional fragments thereof. In embodiments, the Cas9 comprises one or more substitutions, e.g., selected from H840A, D10A, P475A, W476A, N477A, D1125A, W1126A, and D1127A. In embodiments, the Cas9 comprises one or more mutations at positions selected from: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987, e.g., one or more substitutions selected from D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A. In some embodiments, the endonuclease domain or DNA binding domain comprises a Cas (e.g., Cas9) sequence from Corynebacterium ulcerans, Corynebacterium diphtheriae, Spiroplasma syrphidicola, Prevotella intermedia, Spiroplasma taiwanense, Streptococcus iniae, Belliella baltica, Psychroflexus torquis, Streptococcus thermophilus, Listeria innocua, Campylobacter jejuni, Neisseria meningitidis, Streptococcus pyogenes, or Staphylococcus aureus, or a fragment or variant thereof.


In some embodiments, the endonuclease domain or DNA binding domain comprises a Cpf1 domain, e.g., comprising one or more substitutions, e.g., at position D917, E1006, D1255, or any combination thereof, e.g., selected from D917A, E1006A, D1255A, D917A/E1006A, D917A/D1255A, E1006A/D1255A, and D917A/E1006A/D1255A.
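

By way of non-limiting illustration, substitution notation of the form used above (e.g., D917A, or a combination such as D917A/E1006A) may be applied to a polypeptide sequence as in the following Python sketch. The function names and the toy sequence are hypothetical and are not part of the disclosure; positions are assumed to be 1-based relative to the sequence supplied, so a sequence listed without an initiator methionine would need its numbering adjusted before use.

```python
import re


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply one point substitution such as 'D917A' to a protein sequence.

    The first letter is the expected wild-type residue, the number is the
    1-based position, and the last letter is the replacement residue.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


def apply_combination(seq: str, combination: str) -> str:
    """Apply a slash-separated combination such as 'D917A/E1006A'."""
    for mutation in combination.split("/"):
        seq = apply_substitution(seq, mutation)
    return seq


# Toy sequence for illustration only (not a Cas sequence).
toy = "MKDAEHD"
single = apply_substitution(toy, "D3A")
triple = apply_combination(toy, "D3A/E5A/D7A")
```

Checking the wild-type residue before substituting guards against numbering offsets between a reference sequence and the sequence actually being modified.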


In some embodiments, the endonuclease domain or DNA binding domain comprises spCas9, spCas9-VRQR, spCas9-VRER, xCas9 (sp), saCas9, saCas9-KKH, spCas9-MQKSER, spCas9-LRKIQK, or spCas9-LRVSQL.


In some embodiments, a gene modifying polypeptide has an endonuclease domain comprising a Cas9 nickase, e.g., Cas9 H840A. In embodiments, the Cas9 H840A has the following amino acid sequence:









Cas9 nickase (H840A):


(SEQ ID NO: 11,001)


DKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA





LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH





RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDK





ADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFE





ENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSL





GLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKN





LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLP





EKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKL





NREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK





ILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSF





IERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFL





SGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA





SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKT





YAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDG





FANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKG





ILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIE





EGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLS





DYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYW





RQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVA





QILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY





HHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIG





KATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRD





FATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDP





KKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKN





PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNEL





ALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISE





FSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF





KYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD






In some embodiments, a gene modifying polypeptide comprises a dCas9 sequence comprising a D10A and/or H840A mutation, e.g., the following sequence:









(SEQ ID NO: 5007)


SMDKKYSIGLAIGTNSVGWAVITDDYKVPSKKFKVLGNTDRHSIKKNLI





GALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSF





FHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDST





DKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQL





FEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIAL





SLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAA





KNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQ





LPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLV





KLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI





EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQ





SFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPA





FLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRF





NASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERL





KTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKS





DGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK





KGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKR





IEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINR





LSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKN





YWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKH





VAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREIN





NYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQE





IGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKG





RDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDW





DPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFE





KNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGN





ELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQI





SEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA





AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD






TAL Effectors and Zinc Finger Nucleases

In some embodiments, an endonuclease domain or DNA-binding domain comprises a TAL effector molecule. A TAL effector molecule, e.g., a TAL effector molecule that specifically binds a DNA sequence, typically comprises a plurality of TAL effector domains or fragments thereof, and optionally one or more additional portions of naturally occurring TAL effectors (e.g., N- and/or C-terminal of the plurality of TAL effector domains). Many TAL effectors are known to those of skill in the art and are commercially available, e.g., from Thermo Fisher Scientific.


Naturally occurring TALEs are natural effector proteins secreted by numerous species of bacterial pathogens including the plant pathogen Xanthomonas which modulates gene expression in host plants and facilitates bacterial colonization and survival. The specific binding of TAL effectors is based on a central repeat domain of tandemly arranged nearly identical repeats of typically 33 or 34 amino acids (the repeat-variable di-residues, RVD domain).


Members of the TAL effectors family differ mainly in the number and order of their repeats. The number of repeats typically ranges from 1.5 to 33.5 repeats and the C-terminal repeat is usually shorter in length (e.g., about 20 amino acids) and is generally referred to as a “half-repeat.” Each repeat of the TAL effector generally features a one-repeat-to-one-base-pair correlation with different repeat types exhibiting different base-pair specificity (one repeat recognizes one base-pair on the target gene sequence). Generally, the smaller the number of repeats, the weaker the protein-DNA interactions. A number of 6.5 repeats has been shown to be sufficient to activate transcription of a reporter gene (Scholze et al., 2010).


Repeat to repeat variations occur predominantly at amino acid positions 12 and 13, which have therefore been termed “hypervariable” and which are responsible for the specificity of the interaction with the target DNA promoter sequence, as shown in Table 9 listing exemplary repeat variable diresidues (RVD) and their correspondence to nucleic acid base targets.









TABLE 9
RVDs and Nucleic Acid Base Specificity

Target    Possible RVD Amino Acid Combinations
A         NI, NN, CI, HI, KI
G         NN, GN, SN, VN, LN, DN, QN, EN, HN, RH, NK, AN, FN
C         HD, RD, KD, ND, AD
T         NG, HG, VG, IG, EG, MG, YG, AA, EP, VA, QG, KG, RG









Accordingly, it is possible to modify the repeats of a TAL effector to target specific DNA sequences. Further studies have shown that the RVD NK can target G. Target sites of TAL effectors also tend to include a T flanking the 5′ base targeted by the first repeat, but the exact mechanism of this recognition is not known. More than 113 TAL effector sequences are known to date. Non-limiting examples of TAL effectors from Xanthomonas include Hax2, Hax3, Hax4, AvrXa7, AvrXa10 and AvrBs3.
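

By way of non-limiting illustration, the one-repeat-to-one-base-pair correspondence of Table 9 may be sketched in Python as follows. The choice of the first RVD listed for each base as a default, and the names DEFAULT_RVD, design_rvds, and has_five_prime_t, are assumptions made for the example only; Table 9 lists several alternatives per base, and NN appears under both A and G.

```python
# First RVD listed in Table 9 for each target base (illustrative default only).
DEFAULT_RVD = {"A": "NI", "G": "NN", "C": "HD", "T": "NG"}


def design_rvds(target: str) -> list:
    """Return one RVD per base of the target DNA sequence, 5' to 3'."""
    target = target.upper()
    unknown = set(target) - set(DEFAULT_RVD)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [DEFAULT_RVD[base] for base in target]


def has_five_prime_t(genomic: str, start: int) -> bool:
    """True if the base immediately 5' of the site is T.

    `start` is the 0-based index in `genomic` of the base targeted by the
    first repeat.
    """
    return start > 0 and genomic[start - 1].upper() == "T"


rvds = design_rvds("GATC")
flanked = has_five_prime_t("TGATC", 1)
```

The second helper reflects the observation above that target sites tend to include a T flanking the 5′ base targeted by the first repeat; it only reports whether that T is present and does not model the recognition mechanism.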


Accordingly, the TAL effector domain of a TAL effector molecule described herein may be derived from a TAL effector from any bacterial species (e.g., Xanthomonas species such as the African strain of Xanthomonas oryzae pv. oryzae (Yu et al. 2011), Xanthomonas campestris pv. raphani strain 756C and Xanthomonas oryzae pv. oryzicola strain BLS256 (Bogdanove et al. 2011)). In some embodiments, the TAL effector domain comprises an RVD domain as well as flanking sequence(s) (sequences on the N-terminal and/or C-terminal side of the RVD domain) also from the naturally occurring TAL effector. It may comprise more or fewer repeats than the RVD of the naturally occurring TAL effector. The TAL effector molecule can be designed to target a given DNA sequence based on the above code and others known in the art. The number of TAL effector domains (e.g., repeats (monomers or modules)) and their specific sequence can be selected based on the desired DNA target sequence. For example, TAL effector domains, e.g., repeats, may be removed or added in order to suit a specific target sequence. In an embodiment, the TAL effector molecule of the present invention comprises between 6.5 and 33.5 TAL effector domains, e.g., repeats. In an embodiment, the TAL effector molecule of the present invention comprises between 8 and 33.5 TAL effector domains, e.g., repeats, e.g., between 10 and 25 TAL effector domains, e.g., repeats, e.g., between 10 and 14 TAL effector domains, e.g., repeats.


In some embodiments, the TAL effector molecule comprises TAL effector domains that correspond to a perfect match to the DNA target sequence. In some embodiments, a mismatch between a repeat and a target base-pair on the DNA target sequence is permitted as long as it allows for the function of the polypeptide comprising the TAL effector molecule. In general, TALE binding is inversely correlated with the number of mismatches. In some embodiments, the TAL effector molecule of a polypeptide of the present invention comprises no more than 7 mismatches, 6 mismatches, 5 mismatches, 4 mismatches, 3 mismatches, 2 mismatches, or 1 mismatch, and optionally no mismatch, with the target DNA sequence. Without wishing to be bound by theory, in general the smaller the number of TAL effector domains in the TAL effector molecule, the fewer mismatches will be tolerated while still allowing for the function of the polypeptide comprising the TAL effector molecule. The binding affinity is thought to depend on the sum of matching repeat-DNA combinations. For example, TAL effector molecules having 25 TAL effector domains or more may be able to tolerate up to 7 mismatches.
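

By way of non-limiting illustration, the number of mismatches between a series of repeats and a target sequence may be counted against the RVD code of Table 9 as in the following Python sketch. The names TABLE_9, count_mismatches, and within_tolerance are hypothetical, and treating every RVD listed for a base in Table 9 as an equally good match is a simplifying assumption of the example.

```python
# RVDs listed in Table 9 for each target base.
TABLE_9 = {
    "A": {"NI", "NN", "CI", "HI", "KI"},
    "G": {"NN", "GN", "SN", "VN", "LN", "DN", "QN", "EN", "HN", "RH", "NK", "AN", "FN"},
    "C": {"HD", "RD", "KD", "ND", "AD"},
    "T": {"NG", "HG", "VG", "IG", "EG", "MG", "YG", "AA", "EP", "VA", "QG", "KG", "RG"},
}


def count_mismatches(rvds: list, target: str) -> int:
    """Count repeats whose RVD is not listed in Table 9 for the aligned base."""
    if len(rvds) != len(target):
        raise ValueError("one RVD is required per target base")
    return sum(
        1
        for rvd, base in zip(rvds, target.upper())
        if rvd not in TABLE_9.get(base, set())
    )


def within_tolerance(rvds: list, target: str, max_mismatches: int = 7) -> bool:
    """True if the mismatch count does not exceed the chosen limit."""
    return count_mismatches(rvds, target) <= max_mismatches


repeats = ["NN", "NI", "NG", "HD"]
perfect = count_mismatches(repeats, "GATC")
one_off = count_mismatches(repeats, "GACC")
```

The limit of 7 is the largest value recited above; a smaller limit may be more appropriate for TAL effector molecules having fewer repeats.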


In addition to the TAL effector domains, the TAL effector molecule of the present invention may comprise additional sequences derived from a naturally occurring TAL effector. The length of the C-terminal and/or N-terminal sequence(s) included on each side of the TAL effector domain portion of the TAL effector molecule can vary and be selected by one skilled in the art, for example based on the studies of Zhang et al. (2011). Zhang et al. have characterized a number of C-terminal and N-terminal truncation mutants in Hax3-derived TAL effector-based proteins and have identified key elements that contribute to optimal binding to the target sequence and thus activation of transcription. Generally, it was found that transcriptional activity is inversely correlated with the length of the N-terminus. Regarding the C-terminus, an important element for DNA binding was identified within the first 68 amino acids of the Hax3 sequence. Accordingly, in some embodiments, the first 68 amino acids on the C-terminal side of the TAL effector domains of the naturally occurring TAL effector are included in the TAL effector molecule. Accordingly, in an embodiment, a TAL effector molecule comprises 1) one or more TAL effector domains derived from a naturally occurring TAL effector; 2) at least 70, 80, 90, 100, 110, 120, 130, 140, 150, 170, 180, 190, 200, 220, 230, 240, 250, 260, 270, 280 or more amino acids from the naturally occurring TAL effector on the N-terminal side of the TAL effector domains; and/or 3) at least 68, 80, 90, 100, 110, 120, 130, 140, 150, 170, 180, 190, 200, 220, 230, 240, 250, 260 or more amino acids from the naturally occurring TAL effector on the C-terminal side of the TAL effector domains.


In some embodiments, an endonuclease domain or DNA-binding domain is or comprises a Zn finger molecule. A Zn finger molecule comprises a Zn finger protein, e.g., a naturally occurring Zn finger protein or engineered Zn finger protein, or fragment thereof. Many Zn finger proteins are known to those of skill in the art and are commercially available, e.g., from Sigma-Aldrich.


In some embodiments, a Zn finger molecule comprises a non-naturally occurring Zn finger protein that is engineered to bind to a target DNA sequence of choice. See, for example, Beerli, et al. (2002) Nature Biotechnol. 20:135-141; Pabo, et al. (2001) Ann. Rev. Biochem. 70:313-340; Isalan, et al. (2001) Nature Biotechnol. 19:656-660; Segal, et al. (2001) Curr. Opin. Biotechnol. 12:632-637; Choo, et al. (2000) Curr. Opin. Struct. Biol. 10:411-416; U.S. Pat. Nos. 6,453,242; 6,534,261; 6,599,692; 6,503,717; 6,689,558; 7,030,215; 6,794,136; 7,067,317; 7,262,054; 7,070,934; 7,361,635; 7,253,273; and U.S. Patent Publication Nos. 2005/0064474; 2007/0218528; 2005/0267061, all incorporated herein by reference in their entireties.


An engineered Zn finger protein may have a novel binding specificity, compared to a naturally-occurring Zn finger protein. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising triplet (or quadruplet) nucleotide sequences and individual Zn finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, U.S. Pat. Nos. 6,453,242 and 6,534,261, incorporated by reference herein in their entireties.


Exemplary selection methods, including phage display and two-hybrid systems, are disclosed in U.S. Pat. Nos. 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; as well as International Patent Publication Nos. WO 98/37186; WO 98/53057; WO 00/27878; and WO 01/88197 and GB 2,338,237. In addition, enhancement of binding specificity for zinc finger proteins has been described, for example, in International Patent Publication No. WO 02/077227.


In addition, as disclosed in these and other references, zinc finger domains and/or multi-fingered zinc finger proteins may be linked together using any suitable linker sequences, including for example, linkers of 5 or more amino acids in length. See, also, U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual zinc fingers of the protein. In addition, enhancement of binding specificity for zinc finger binding domains has been described, for example, in co-owned International Patent Publication No. WO 02/077227.


Zn finger proteins and methods for design and construction of fusion proteins (and polynucleotides encoding same) are known to those of skill in the art and described in detail in U.S. Pat. Nos. 6,140,081; 5,789,538; 6,453,242; 6,534,261; 5,925,523; 6,007,988; 6,013,453; and 6,200,759; International Patent Publication Nos. WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197; WO 02/099084; WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536; and WO 03/016496.


In addition, as disclosed in these and other references, Zn finger proteins and/or multi-fingered Zn finger proteins may be linked together, e.g., as a fusion protein, using any suitable linker sequences, including for example, linkers of 5 or more amino acids in length. See, also, U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The Zn finger molecules described herein may include any combination of suitable linkers between the individual zinc finger proteins and/or multi-fingered Zn finger proteins of the Zn finger molecule.


In certain embodiments, the DNA-binding domain or endonuclease domain comprises a Zn finger molecule comprising an engineered zinc finger protein that binds (in a sequence-specific manner) to a target DNA sequence. In some embodiments, the Zn finger molecule comprises one Zn finger protein or fragment thereof. In other embodiments, the Zn finger molecule comprises a plurality of Zn finger proteins (or fragments thereof), e.g., 2, 3, 4, 5, 6 or more Zn finger proteins (and optionally no more than 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 Zn finger proteins). In some embodiments, the Zn finger molecule comprises at least three Zn finger proteins. In some embodiments, the Zn finger molecule comprises four, five or six fingers. In some embodiments, the Zn finger molecule comprises 8, 9, 10, 11 or 12 fingers. In some embodiments, a Zn finger molecule comprising three Zn finger proteins recognizes a target DNA sequence comprising 9 or 10 nucleotides. In some embodiments, a Zn finger molecule comprising four Zn finger proteins recognizes a target DNA sequence comprising 12 to 14 nucleotides. In some embodiments, a Zn finger molecule comprising six Zn finger proteins recognizes a target DNA sequence comprising 18 to 21 nucleotides.


In some embodiments, a Zn finger molecule comprises a two-handed Zn finger protein. Two-handed zinc finger proteins are those proteins in which two clusters of zinc finger proteins are separated by intervening amino acids so that the two zinc finger domains bind to two discontinuous target DNA sequences. An example of a two-handed type of zinc finger binding protein is SIP1, where a cluster of four zinc finger proteins is located at the amino terminus of the protein and a cluster of three Zn finger proteins is located at the carboxyl terminus (see Remacle, et al. (1999) EMBO Journal 18(18):5073-5084). Each cluster of zinc fingers in these proteins is able to bind to a unique target sequence and the spacing between the two target sequences can comprise many nucleotides.


Linkers

In some embodiments, a gene modifying polypeptide may comprise a linker, e.g., a peptide linker, e.g., a linker as described in Table 10. In some embodiments, a gene modifying polypeptide comprises, in an N-terminal to C-terminal direction, a Cas domain (e.g., a Cas domain of Table 8), a linker of Table 10 (or a sequence having at least 70%, 80%, 85%, 90%, 95%, or 99% identity thereto), and an RT domain (e.g., an RT domain of Table 6). In some embodiments, a gene modifying polypeptide comprises a flexible linker between the endonuclease and the RT domain, e.g., a linker comprising the amino acid sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGSS (SEQ ID NO: 11,002). In some embodiments, an RT domain of a gene modifying polypeptide may be located C-terminal to the endonuclease domain. In some embodiments, an RT domain of a gene modifying polypeptide may be located N-terminal to the endonuclease domain.









TABLE 10
Exemplary linker sequences

Amino Acid Sequence    SEQ ID NO
GGS
GGSGGS    5102
GGSGGSGGS    5103
GGSGGSGGSGGS    5104
GGSGGSGGSGGSGGS    5105
GGSGGSGGSGGSGGSGGS    5106
GGGGS    5107
GGGGSGGGGS    5108
GGGGSGGGGSGGGGS    5109
GGGGSGGGGSGGGGSGGGGS    5110
GGGGSGGGGSGGGGSGGGGSGGGGS    5111
GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS    5112
GGG
GGGG    5114
GGGGG    5115
GGGGGG    5116
GGGGGGG    5117
GGGGGGGG    5118
GSS
GSSGSS    5120
GSSGSSGSS    5121
GSSGSSGSSGSS    5122
GSSGSSGSSGSSGSS    5123
GSSGSSGSSGSSGSSGSS    5124
EAAAK    5125
EAAAKEAAAK    5126
EAAAKEAAAKEAAAK    5127
EAAAKEAAAKEAAAKEAAAK    5128
EAAAKEAAAKEAAAKEAAAKEAAAK    5129
EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK    5130
PAP
PAPAP    5132
PAPAPAP    5133
PAPAPAPAP    5134
PAPAPAPAPAP    5135
PAPAPAPAPAPAP    5136
GGSGGG    5137
GGGGGS    5138
GGSGSS    5139
GSSGGS    5140
GGSEAAAK    5141
EAAAKGGS    5142
GGSPAP    5143
PAPGGS    5144
GGGGSS    5145
GSSGGG    5146
GGGEAAAK    5147
EAAAKGGG    5148
GGGPAP    5149
PAPGGG    5150
GSSEAAAK    5151
EAAAKGSS    5152
GSSPAP    5153
PAPGSS    5154
EAAAKPAP    5155
PAPEAAAK    5156
GGSGGGGSS    5157
GGSGSSGGG    5158
GGGGGSGSS    5159
GGGGSSGGS    5160
GSSGGSGGG    5161
GSSGGGGGS    5162
GGSGGGEAAAK    5163
GGSEAAAKGGG    5164
GGGGGSEAAAK    5165
GGGEAAAKGGS    5166
EAAAKGGSGGG    5167
EAAAKGGGGGS    5168
GGSGGGPAP    5169
GGSPAPGGG    5170
GGGGGSPAP    5171
GGGPAPGGS    5172
PAPGGSGGG    5173
PAPGGGGGS    5174
GGSGSSEAAAK    5175
GGSEAAAKGSS    5176
GSSGGSEAAAK    5177
GSSEAAAKGGS    5178
EAAAKGGSGSS    5179
EAAAKGSSGGS    5180
GGSGSSPAP    5181
GGSPAPGSS    5182
GSSGGSPAP    5183
GSSPAPGGS    5184
PAPGGSGSS    5185
PAPGSSGGS    5186
GGSEAAAKPAP    5187
GGSPAPEAAAK    5188
EAAAKGGSPAP    5189
EAAAKPAPGGS    5190
PAPGGSEAAAK    5191
PAPEAAAKGGS    5192
GGGGSSEAAAK    5193
GGGEAAAKGSS    5194
GSSGGGEAAAK    5195
GSSEAAAKGGG    5196
EAAAKGGGGSS    5197
EAAAKGSSGGG    5198
GGGGSSPAP    5199
GGGPAPGSS    5200
GSSGGGPAP    5201
GSSPAPGGG    5202
PAPGGGGSS    5203
PAPGSSGGG    5204
GGGEAAAKPAP    5205
GGGPAPEAAAK    5206
EAAAKGGGPAP    5207
EAAAKPAPGGG    5208
PAPGGGEAAAK    5209
PAPEAAAKGGG    5210
GSSEAAAKPAP    5211
GSSPAPEAAAK    5212
EAAAKGSSPAP    5213
EAAAKPAPGSS    5214
PAPGSSEAAAK    5215
PAPEAAAKGSS    5216
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA    5217
GGGGSEAAAKGGGGS    5218
EAAAKGGGGSEAAAK    5219
SGSETPGTSESATPES    5220
GSAGSAAGSGEF    5221
SGGSSGGSSGSETPGTSESATPESSGGSSGGSS    5222









In some embodiments, a linker of a gene modifying polypeptide comprises a motif chosen from: (SGGS)n (SEQ ID NO: 5025), (GGGS)n (SEQ ID NO: 5026), (GGGGS)n (SEQ ID NO: 5027), (G)n, (EAAAK)n (SEQ ID NO: 5028), (GGS)n, or (XP)n.
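

By way of non-limiting illustration, linkers built from n tandem copies of a motif may be generated as in the following Python sketch. The names MOTIFS, repeat_linker, and linker_panel are hypothetical; the (XP)n motif is omitted from the example because the identity of X is not specified in the passage above.

```python
# Repeat motifs recited above, excluding (XP)n.
MOTIFS = ("SGGS", "GGGS", "GGGGS", "G", "EAAAK", "GGS")


def repeat_linker(motif: str, n: int) -> str:
    """Return n tandem copies of a linker motif, e.g. (GGS)3 -> GGSGGSGGS."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return motif * n


def linker_panel(motifs=MOTIFS, max_repeats: int = 6) -> dict:
    """Map labels such as '(GGS)3' to the corresponding linker sequences."""
    return {
        f"({motif}){n}": repeat_linker(motif, n)
        for motif in motifs
        for n in range(1, max_repeats + 1)
    }


panel = linker_panel()
ggs3 = panel["(GGS)3"]        # same sequence as the Table 10 entry GGSGGSGGS
eaaak2 = panel["(EAAAK)2"]    # same sequence as the Table 10 entry EAAAKEAAAK
```

A panel of this kind may be used to enumerate linkers of graded length from a single motif, as several rows of Table 10 do.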


Gene Modifying Polypeptide Selection by Pooled Screening

Candidate gene modifying polypeptides may be screened to evaluate a candidate's gene editing ability. For example, an RNA gene modifying system designed for the targeted editing of a coding sequence in the human genome may be used. In certain embodiments, such a gene modifying system may be used in conjunction with a pooled screening approach.


For example, a library of gene modifying polypeptide candidates and a template guide RNA (tgRNA) may be introduced into mammalian cells to test the candidates' gene editing abilities by a pooled screening approach. In specific embodiments, a library of gene modifying polypeptide candidates is introduced into mammalian cells followed by introduction of the tgRNA into the cells.


Representative, non-limiting examples of mammalian cells that may be used in screening include HEK293T cells, U2OS cells, HeLa cells, HepG2 cells, Huh7 cells, K562 cells, or iPS cells.


A gene modifying polypeptide candidate may comprise: 1) a Cas nuclease, for example a wild-type Cas nuclease, e.g., a wild-type Cas9 nuclease, a mutant Cas nuclease, e.g., a Cas nickase, for example, a Cas9 nickase such as a Cas9 N863A nickase, or a Cas nuclease selected from Table 7 or Table 8; 2) a peptide linker, e.g., a sequence from Table D or Table 10, that may exhibit varying degrees of length, flexibility, hydrophobicity, and/or secondary structure; and 3) a reverse transcriptase (RT), e.g., an RT domain from Table D or Table 6. A gene modifying polypeptide candidate library comprises: a plurality of different gene modifying polypeptide candidates that differ from each other with respect to one, two or all three of the Cas nuclease, peptide linker or RT domain components, or a plurality of nucleic acid expression vectors that encode such gene modifying polypeptide candidates.
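

By way of non-limiting illustration, a candidate library in which members differ in one, two, or all three components may be enumerated as a full combinatorial product, as in the following Python sketch. The component labels (e.g., "RT_1") are hypothetical placeholders rather than entries of Table 6, Table 8, or Table D, and a full product is only one possible way to assemble a library.

```python
from itertools import product
from typing import NamedTuple


class Candidate(NamedTuple):
    cas: str
    linker: str
    rt: str


def build_library(cas_domains, linkers, rt_domains) -> list:
    """Enumerate every Cas-linker-RT combination, N-terminal to C-terminal."""
    return [Candidate(c, l, r) for c, l, r in product(cas_domains, linkers, rt_domains)]


def components_differing(a: Candidate, b: Candidate) -> int:
    """Number of components (0 to 3) at which two candidates differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical labels for illustration only.
cas_domains = ["Cas9_N863A", "Cas9_H840A"]
linkers = ["GGS", "EAAAK", "SGSETPGTSESATPES"]
rt_domains = ["RT_1", "RT_2"]

library = build_library(cas_domains, linkers, rt_domains)
library_size = len(library)  # 2 * 3 * 2 combinations
```

The size of the resulting library determines how many cells are needed to maintain a given coverage per candidate during screening.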


For screening of gene modifying polypeptide candidates, a two-component system may be used that comprises a gene modifying polypeptide component and a tgRNA component. A gene modifying component may comprise, for example, an expression vector, e.g., an expression plasmid or lentiviral vector, that encodes a gene modifying polypeptide candidate, for example, comprises a human codon-optimized nucleic acid that encodes a gene modifying polypeptide candidate, e.g., a Cas-linker-RT fusion as described above. In a particular embodiment, a lentiviral cassette is utilized that comprises: (i) a promoter for expression in mammalian cells, e.g., a CMV promoter; (ii) a gene modifying library candidate, e.g. a Cas-linker-RT fusion comprising a Cas nuclease of Table 7 or Table 8, a peptide linker of Table 10, and an RT of Table 6, for example a Cas-linker-RT fusion as in Table D; (iii) a self-cleaving polypeptide, e.g., a T2A peptide; (iv) a marker enabling selection in mammalian cells, e.g., a puromycin resistance gene; and (v) a termination signal, e.g., a poly A tail.


The tgRNA component may comprise a tgRNA or expression vector, e.g., an expression plasmid, that produces the tgRNA, for example, utilizes a U6 promoter to drive expression of the tgRNA, wherein the tgRNA is a non-coding RNA sequence that is recognized by Cas and localizes it to the genomic locus of interest, and that also templates reverse transcription of the desired edit into the genome by the RT domain.


To prepare a pool of cells expressing gene modifying polypeptide library candidates, mammalian cells, e.g., HEK293T or U2OS cells, may be transduced with pooled gene modifying polypeptide candidate expression vector preparations, e.g., lentiviral preparations, of the gene modifying candidate polypeptide library. In a particular embodiment, lentiviral plasmids are utilized, and HEK293 Lenti-X cells are seeded in 15 cm plates (~12×10^6 cells) prior to lentiviral plasmid transfection. In such an embodiment, lentiviral plasmid transfection may be performed using the Lentiviral Packaging Mix (Biosettia) and transfection of the plasmid DNA for the gene modifying candidate library is performed the following day using Lipofectamine 2000 and Opti-MEM media according to the manufacturer's protocol. In such an embodiment, extracellular DNA may be removed by a full media change the next day and virus-containing media may be harvested 48 hours after. Lentiviral media may be concentrated using Lenti-X Concentrator (TaKaRa Biosciences) and 5 mL lentiviral aliquots may be made and stored at −80° C. Lentiviral titering is performed by enumerating colony forming units post-selection, e.g., post Puromycin selection.


For monitoring gene editing of a target DNA, mammalian cells, e.g., HEK293T or U2OS cells, carrying a target DNA may be utilized. In other embodiments for monitoring gene editing of a target DNA, mammalian cells, e.g., HEK293T or U2OS cells, carrying a target DNA genomic landing pad may be utilized. In particular embodiments, the target DNA genomic landing pad may comprise a gene to be edited for treatment of a disease or disorder of interest. In other particular embodiments, the target DNA is a gene sequence that expresses a protein that exhibits detectable characteristics that may be monitored to determine whether gene editing has occurred. For example, in certain embodiments, a blue fluorescence protein (BFP)- or green fluorescence protein (GFP)-expressing genomic landing pad is utilized. In certain embodiments, mammalian cells, e.g., HEK293T or U2OS cells, comprising a target DNA, e.g., a target DNA genomic landing pad, are seeded in culture plates at 500x-3000x cells per gene modifying library candidate and transduced at a 0.2-0.3 multiplicity of infection (MOI) to minimize multiple infections per cell. Puromycin (2.5 µg/mL) may be added 48 hours post infection to allow for selection of infected cells. In such an embodiment, cells may be kept under puromycin selection for at least 7 days and then scaled up for tgRNA introduction, e.g., tgRNA electroporation.
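

By way of non-limiting illustration, the seeding arithmetic and the effect of a low MOI may be sketched in Python as follows. Modeling the number of integrations per cell as Poisson-distributed with mean equal to the MOI is a common simplifying assumption and is not stated in the disclosure; the function names are hypothetical.

```python
import math


def cells_to_seed(library_size: int, coverage: int) -> int:
    """Cells to seed for a given coverage, e.g. 500 to 3000 cells per candidate."""
    return library_size * coverage


def poisson_infection_fractions(moi: float) -> dict:
    """Expected fractions of cells by infection status under a Poisson model."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    uninfected = math.exp(-moi)      # P(0 integrations)
    single = moi * uninfected        # P(exactly 1 integration)
    infected = 1.0 - uninfected      # P(1 or more integrations)
    multiple = infected - single     # P(2 or more integrations)
    return {
        "infected": infected,
        "single": single,
        "multiple": multiple,
        "multiple_among_infected": multiple / infected,
    }


seeded = cells_to_seed(library_size=1000, coverage=500)
low_moi = poisson_infection_fractions(0.2)
high_moi = poisson_infection_fractions(0.3)
```

Under this model, roughly 10% of infected cells at an MOI of 0.2, and roughly 14% at an MOI of 0.3, would be expected to carry more than one candidate, which is consistent with choosing a low MOI to minimize multiple infections per cell.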


To ascertain whether gene editing occurs, mammalian cells containing a target DNA to be edited may be infected with gene modifying polypeptide library candidates then transfected with tgRNA designed for use in editing of the target DNA. Subsequently, the cells may be analyzed to determine whether editing of the target locus has occurred according to the designed outcome, or whether no editing or imperfect editing has occurred, e.g., by using cell sorting and sequence analysis.


In a particular embodiment, to ascertain whether genome editing occurs, BFP- or GFP-expressing mammalian cells, e.g., HEK293T or U2OS cells, may be infected with gene modifying library candidates and then transfected or electroporated with tgRNA plasmid or RNA, e.g., by electroporation of 250,000 cells/well with 200 ng of a tgRNA plasmid designed to convert BFP-to-GFP or GFP-to-BFP, at a cell count ensuring >250x-1000x coverage per library candidate. In such an embodiment, the genome-editing capacity of the various constructs in this assay may be assessed by sorting the cells by Fluorescence-Activated Cell Sorting (FACS) for expression of the color-converted fluorescent protein (FP) at 4-10 days post-electroporation. Cells are sorted and harvested as distinct populations of unedited cells (exhibiting the original fluorescence protein signal), edited cells (exhibiting the converted fluorescence protein signal), and imperfectly edited cells (exhibiting no fluorescence protein signal). A sample of unsorted cells may also be harvested as the input population to determine candidate enrichment during analysis.


To determine which gene modifying library candidates exhibit genome-editing capacity in an assay, genomic DNA (gDNA) is harvested from the sorted cell populations, and analyzed by sequencing the gene modifying library candidates in each population. Briefly, gene modifying candidates may be amplified from the genome using primers specific to the gene modifying polypeptide expression vector, e.g., the lentiviral cassette, amplified in a second round of PCR to dilute genomic DNA, and then sequenced, for example, sequenced by a next-generation sequencing platform. After quality control of sequencing reads, reads of at least about 1500 nucleotides and generally no more than about 3200 nucleotides are mapped to the gene modifying polypeptide library sequences and those containing a minimum of about an 80% match to a library sequence are considered to be successfully aligned to a given candidate for purposes of this pooled screen. In order to identify candidates capable of performing gene editing in the assay, e.g., the BFP-to-GFP or GFP-to-BFP edit, the read count of each library candidate in the edited population is compared to its read count in the initial, unsorted population.
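
As a non-limiting sketch, the read filtering and counting step described above may be expressed as follows. The `AlignedRead` record and its field names are hypothetical stand-ins for the output of whatever aligner is used. Only the thresholds (about 1500 to about 3200 nucleotides, and about an 80% match) are taken from the text.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical record for one quality-controlled, aligned read."""
    candidate_id: str   # library candidate with the best alignment
    read_length: int    # read length in nucleotides
    identity: float     # fraction of matching positions, 0.0 to 1.0


MIN_LENGTH = 1500     # "at least about 1500 nucleotides"
MAX_LENGTH = 3200     # "generally no more than about 3200 nucleotides"
MIN_IDENTITY = 0.80   # "a minimum of about an 80% match"


def count_candidates(reads: Iterable[AlignedRead]) -> Counter:
    """Count reads per candidate, keeping only reads that pass both filters."""
    counts: Counter = Counter()
    for read in reads:
        length_ok = MIN_LENGTH <= read.read_length <= MAX_LENGTH
        if length_ok and read.identity >= MIN_IDENTITY:
            counts[read.candidate_id] += 1
    return counts


if __name__ == "__main__":
    example_reads = [
        AlignedRead("candidate_A", 2100, 0.97),
        AlignedRead("candidate_A", 1200, 0.99),  # too short, discarded
        AlignedRead("candidate_B", 3000, 0.75),  # below 80% match, discarded
        AlignedRead("candidate_B", 2800, 0.88),
    ]
    print(count_candidates(example_reads))
```

The same function can be applied separately to the edited, unedited, imperfectly edited, and unsorted populations to obtain per-population read counts.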


For purposes of pooled screening, gene modifying candidates with genome-editing capacity are identified based on enrichment in the edited (converted FP) population relative to unsorted (input) cells. In some embodiments, an enrichment of at least 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or at least 100-fold over the input indicates potentially useful gene editing activity, e.g., at least 2-fold enrichment. In some embodiments, the enrichment is converted to a log-value by taking the log base 2 of the enrichment ratio. In some embodiments, a log 2 enrichment score of at least 0, 1, 2, 3, 4, 5, 5.5, 6.0, 6.2, 6.3, 6.4, 6.5, or at least 6.6 indicates potentially useful gene editing activity, e.g., a log 2 enrichment score of at least 1.0. In particular embodiments, enrichment values observed for gene modifying candidates may be compared to enrichment values observed under similar conditions utilizing a reference, e.g., Element ID No: 17380.
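
For illustration only, the enrichment and log 2 enrichment values described above can be computed as sketched below. The sketch normalizes each read count to the total reads of its population before taking the ratio. That normalization is an assumption made here so that populations sequenced to different depths are comparable, and the candidate names and counts are invented for the example.

```python
import math
from typing import Dict, Tuple


def enrichment_scores(
    edited_counts: Dict[str, int],
    input_counts: Dict[str, int],
) -> Dict[str, Tuple[float, float]]:
    """Return {candidate: (fold_enrichment, log2_enrichment)}.

    Fold enrichment is the candidate's read fraction in the edited
    population divided by its read fraction in the unsorted input.
    Candidates absent from the input are skipped (ratio undefined).
    """
    edited_total = sum(edited_counts.values())
    input_total = sum(input_counts.values())
    scores: Dict[str, Tuple[float, float]] = {}
    for candidate, n_input in input_counts.items():
        if n_input == 0:
            continue
        input_fraction = n_input / input_total
        edited_fraction = edited_counts.get(candidate, 0) / edited_total
        fold = edited_fraction / input_fraction
        log2_fold = math.log2(fold) if fold > 0 else float("-inf")
        scores[candidate] = (fold, log2_fold)
    return scores


def candidates_above(scores: Dict[str, Tuple[float, float]], min_log2: float = 1.0):
    """Candidates whose log2 enrichment meets the threshold (e.g., 1.0)."""
    return sorted(c for c, (_, log2_fold) in scores.items() if log2_fold >= min_log2)


if __name__ == "__main__":
    edited = {"candidate_A": 200, "candidate_B": 800}   # invented counts
    unsorted = {"candidate_A": 50, "candidate_B": 950}  # invented counts
    scores = enrichment_scores(edited, unsorted)
    print(scores)
    print(candidates_above(scores, min_log2=1.0))
```

A log 2 enrichment score of 1.0 corresponds to a 2-fold enrichment over input, so the two example thresholds given in the text describe the same cutoff.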


In some embodiments, multiple tgRNAs may be used to screen the gene modifying candidate library. In particular embodiments, a plurality of tgRNAs may be utilized to optimize template/Cas-linker-RT fusion pairs, e.g., for gene editing of particular target genes, for example, gene targets for the treatment of disease. In specific embodiments, a pooled approach to screening gene modifying candidates may be performed using a multiplicity of different tgRNAs in an arrayed format.


In some embodiments, multiple types of edits, e.g., insertions, substitutions, and/or deletions of different lengths, may be used to screen the gene modifying candidate library.


In some embodiments, multiple target sequences, e.g., different fluorescent proteins, may be used to screen the gene modifying candidate library. In some embodiments, multiple cell types, e.g., HEK293T or U2OS, may be used to screen the gene modifying candidate library. The person of ordinary skill in the art will appreciate that a given candidate may exhibit altered editing capacity or even the gain or loss of any observable or useful activity across different conditions, including tgRNA sequence (e.g., nucleotide modifications, PBS length, RT template length), target sequence, target location, type of edit, location of mutation relative to the first-strand nick of the gene modifying polypeptide, or cell type. Thus, in some embodiments, gene modifying library candidates are screened across multiple parameters, e.g., with at least two distinct tgRNAs in at least two cell types, and gene editing activity is identified by enrichment in any single condition. In other embodiments, a candidate with more robust activity across different tgRNA and cell types is identified by enrichment in at least two conditions, e.g., in all conditions screened. For clarity, candidates found to exhibit little to no enrichment under any given condition are not assumed to be inactive across all conditions and may be screened with different parameters or reconfigured at the polypeptide level, e.g., by swapping, shuffling, or evolving domains (e.g., RT domain), linkers, or other signals (e.g., NLS).
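
The distinction drawn above between activity identified in any single condition and more robust activity identified across conditions can be sketched as set operations. The condition labels, candidate names, and scores below are hypothetical, and the log 2 enrichment threshold of 1.0 is one of the example thresholds given in the text.

```python
from typing import Dict, Set, Tuple

# {condition: {candidate: log2 enrichment}}; all values are invented.
ConditionScores = Dict[Tuple[str, str], Dict[str, float]]


def call_hits(scores: ConditionScores, min_log2: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Return (hits in any condition, hits in all conditions)."""
    per_condition = [
        {cand for cand, value in by_candidate.items() if value >= min_log2}
        for by_candidate in scores.values()
    ]
    if not per_condition:
        return set(), set()
    any_hits = set().union(*per_condition)
    all_hits = set.intersection(*per_condition)
    return any_hits, all_hits


if __name__ == "__main__":
    example: ConditionScores = {
        ("tgRNA_1", "HEK293T"): {"cand_A": 3.2, "cand_B": 0.4, "cand_C": 1.5},
        ("tgRNA_1", "U2OS"):    {"cand_A": 2.1, "cand_B": 1.2, "cand_C": 0.1},
        ("tgRNA_2", "HEK293T"): {"cand_A": 1.8, "cand_B": 0.0, "cand_C": 0.3},
        ("tgRNA_2", "U2OS"):    {"cand_A": 1.1, "cand_B": 0.2, "cand_C": 0.2},
    }
    any_hits, all_hits = call_hits(example)
    print("enriched in any condition:", sorted(any_hits))
    print("enriched in all conditions:", sorted(all_hits))
```

Consistent with the text, a candidate missing from both sets is not thereby shown to be inactive; it has only failed to reach the threshold under the conditions screened.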


Sequences of Exemplary Cas9-Linker-RT Fusions

In some embodiments, a gene modifying polypeptide comprises a linker sequence and an RT sequence. In some embodiments, a gene modifying polypeptide comprises a linker sequence as listed in Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain as listed in Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises a linker sequence as listed in Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto; and the amino acid sequence of an RT domain as listed in Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises: (i) a linker sequence as listed in a row of Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto; and (ii) the amino acid sequence of an RT domain as listed in the same row of Table D, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.


Exemplary Gene Modifying Polypeptides

In some embodiments, a gene modifying polypeptide (e.g., a gene modifying polypeptide that is part of a system described herein) comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 80% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 90% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 95% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 1-7743. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises an amino acid sequence of any one of SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
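
Percent identity thresholds such as those recited above depend on how two sequences are aligned and on which denominator is used. The following sketch is offered only as one simple convention, not as the method used in this disclosure: a global (Needleman-Wunsch style) alignment with hypothetical unit scores, reporting identical columns divided by total alignment columns. The example inputs are linker sequences from Table T2.

```python
def global_alignment_identity(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> float:
    """Percent identity over a global alignment of sequences a and b."""
    n, m = len(a), len(b)
    # Dynamic programming table of best alignment scores for prefixes.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back one optimal alignment, counting identical columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * identical / columns if columns else 100.0


if __name__ == "__main__":
    print(global_alignment_identity("EAAAKGGG", "EAAAKGGG"))
    print(global_alignment_identity("PAPAPAPAPAP", "PAPAPAPAPAPAP"))
```

Other conventions (for example, dividing by the length of the shorter sequence, or using a substitution matrix with affine gap penalties) can give different percentages for the same pair of sequences.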


In some embodiments, a gene modifying polypeptide comprises an amino acid sequence as listed in Table A1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In some embodiments, a gene modifying polypeptide comprises an amino acid sequence as listed in Table T1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises a linker comprising a linker sequence as listed in Table T1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises an RT domain comprising an RT domain sequence as listed in Table T1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises: (i) a linker comprising a linker sequence as listed in a row of Table T1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto; and (ii) an RT domain comprising an RT domain sequence as listed in the same row of Table T1, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.









TABLE T1
Selection of exemplary gene modifying polypeptides

SEQ ID NO: for Full Polypeptide Sequence | Linker Sequence | SEQ ID NO: of linker | RT name
1372 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,401 | AVIRE_P03360_3mutA
1197 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,402 | FLV_P10273_3mutA
2784 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,403 | MLVMS_P03355_3mutA_WS
647 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,404 | SFV3L_P27401_2mutA









In some embodiments, a gene modifying polypeptide comprises an amino acid sequence as listed in Table T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises a linker comprising a linker sequence as listed in Table T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises an RT domain comprising an RT domain sequence as listed in Table T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, a gene modifying polypeptide comprises: (i) a linker comprising a linker sequence as listed in a row of Table T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto; and (ii) an RT domain comprising an RT domain sequence as listed in the same row of Table T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.









TABLE T2
Selection of exemplary gene modifying polypeptides

SEQ ID NO: for Full Polypeptide Sequence | Linker Sequence | SEQ ID NO: of linker | RT name
2311 | GGGGSGGGGSGGGGSGGGGS | 15,405 | MLVCB_P08361_3mutA
1373 | GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS | 15,406 | AVIRE_P03360_3mutA
2644 | GGGGGGGGSGGGGSGGGGSGGGGSGGGGS | 15,407 | MLVMS_P03355_PLV919
2304 | GSSGSSGSSGSSGSSGSS | 15,408 | MLVCB_P08361_3mutA
2325 | EAAAKEAAAKEAAAKEAAAK | 15,409 | MLVCB_P08361_3mutA
2322 | EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK | 15,410 | MLVCB_P08361_3mutA
2187 | PAPAPAPAPAP | 15,411 | MLVBM_Q7SVK7_3mut
2309 | PAPAPAPAPAPAP | 15,412 | MLVCB_P08361_3mutA
2534 | PAPAPAPAPAPAP | 15,413 | MLVFF_P26809_3mutA
2797 | PAPAPAPAPAPAP | 15,414 | MLVMS_P03355_3mutA_WS
3084 | PAPAPAPAPAPAP | 15,415 | MLVMS_P03355_3mutA_WS
2868 | PAPAPAPAPAPAP | 15,416 | MLVMS_P03355_PLV919
126 | EAAAKGGG | 15,417 | PERV_Q4VFZ2_3mut
306 | EAAAKGGG | 15,418 | PERV_Q4VFZ2_3mut
1410 | PAPGGG | 15,419 | AVIRE_P03360_3mutA
804 | GGGGSSGGS | 15,420 | WMSV_P03359_3mut
1937 | GGGGGSEAAAK | 15,421 | BAEVM_P10272_3mutA
2721 | GGGEAAAKGGS | 15,422 | MLVMS_P03355_3mut
3018 | GGGEAAAKGGS | 15,423 | MLVMS_P03355_3mut
1018 | GGGEAAAKGGS | 15,424 | XMRV6_A1Z651_3mutA
2317 | GGSGGGPAP | 15,425 | MLVCB_P08361_3mutA
2649 | PAPGGSGGG | 15,426 | MLVMS_P03355_PLV919
2878 | PAPGGSGGG | 15,427 | MLVMS_P03355_PLV919
912 | GGSEAAAKPAP | 15,428 | WMSV_P03359_3mutA
2338 | GGSPAPEAAAK | 15,429 | MLVCB_P08361_3mutA
2527 | GGSPAPEAAAK | 15,430 | MLVFF_P26809_3mutA
141 | EAAAKGGSPAP | 15,431 | PERV_Q4VFZ2_3mut
341 | EAAAKGGSPAP | 15,432 | PERV_Q4VFZ2_3mut
2315 | EAAAKPAPGGS | 15,433 | MLVCB_P08361_3mutA
3080 | EAAAKPAPGGS | 15,434 | MLVMS_P03355_3mutA_WS
2688 | GGGGSSEAAAK | 15,435 | MLVMS_P03355_PLV919
2885 | GGGGSSEAAAK | 15,436 | MLVMS_P03355_PLV919
2810 | GSSGGGEAAAK | 15,437 | MLVMS_P03355_3mutA_WS
3057 | GSSGGGEAAAK | 15,438 | MLVMS_P03355_3mutA_WS
1861 | GSSEAAAKGGG | 15,439 | MLVAV_P03356_3mutA
3056 | GSSGGGPAP | 15,440 | MLVMS_P03355_3mutA_WS
1038 | GSSPAPGGG | 15,441 | XMRV6_A1Z651_3mutA
2308 | PAPGGGGSS | 15,442 | MLVCB_P08361_3mutA
1672 | GGGEAAAKPAP | 15,443 | KORV_Q9TTC1-Pro_3mutA
2526 | GGGEAAAKPAP | 15,444 | MLVFF_P26809_3mutA
1938 | GGGPAPEAAAK | 15,445 | BAEVM_P10272_3mutA
2641 | GSSEAAAKPAP | 15,446 | MLVMS_P03355_PLV919
2891 | GSSEAAAKPAP | 15,447 | MLVMS_P03355_PLV919
1225 | GSSPAPEAAAK | 15,448 | FLV_P10273_3mutA
2839 | GSSPAPEAAAK | 15,449 | MLVMS_P03355_3mutA_WS
3127 | GSSPAPEAAAK | 15,450 | MLVMS_P03355_3mutA_WS
2798 | PAPGSSEAAAK | 15,451 | MLVMS_P03355_3mutA_WS
3091 | PAPGSSEAAAK | 15,452 | MLVMS_P03355_3mutA_WS
1372 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,453 | AVIRE_P03360_3mutA
1197 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,454 | FLV_P10273_3mutA
2611 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,455 | MLVMS_P03355_PLV919
2784 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,456 | MLVMS_P03355_3mutA_WS
480 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,457 | SFV1_P23074_2mutA
647 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,458 | SFV3L_P27401_2mutA
1006 | AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 15,459 | XMRV6_A1Z651_3mutA
2518 | SGSETPGTSESATPES | 15,460 | MLVFF_P26809_3mutA









Subsequences of Exemplary Gene Modifying Polypeptides

In some embodiments, the gene modifying polypeptide comprises, in N-terminal to C-terminal order, one or more (e.g., 1, 2, 3, 4, 5, or all 6) of an N-terminal methionine residue, a first nuclear localization signal (NLS), a DNA binding domain, a linker, an RT domain, and/or a second NLS. In some embodiments, a gene modifying polypeptide comprises, in N-terminal to C-terminal order, a NLS (e.g., a first NLS), a DNA binding domain, a linker, and an RT domain, wherein the linker and RT domain are the linker and RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker and RT domain. In some embodiments, a gene modifying polypeptide comprises, in N-terminal to C-terminal order, a DNA binding domain, a linker, an RT domain, and an NLS (e.g., a second NLS) wherein the linker and RT domain are the linker and RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker and RT domain. In some embodiments, a gene modifying polypeptide comprises, in N-terminal to C-terminal order, a first NLS, a DNA binding domain, a linker, an RT domain, and a second NLS, wherein the linker and RT domain are the linker and RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker and RT domain. In some embodiments, the gene modifying polypeptide further comprises an N-terminal methionine residue.


In some embodiments, the gene modifying polypeptide comprises, in N-terminal to C-terminal order, one or more (e.g., 1, 2, 3, 4, 5, or all 6) of an N-terminal methionine residue, a first nuclear localization signal (NLS) (e.g., of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto), a DNA binding domain (e.g., a Cas domain, e.g., a SpyCas9 domain, e.g., as listed in Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto; or a DNA binding domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto), a linker (e.g., of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto), an RT domain (e.g., of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto), and a second NLS (e.g., of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto). In some embodiments, the gene modifying polypeptide further comprises (e.g., C-terminal to the second NLS) a T2A sequence and/or a puromycin sequence (e.g., of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743 and/or as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto). 
In some embodiments, a nucleic acid encoding a gene modifying polypeptide (e.g., as described herein) encodes a T2A sequence, e.g., wherein the T2A sequence is situated between a region encoding the gene modifying polypeptide and a second region, wherein the second region optionally encodes a selectable marker, e.g., puromycin.


In certain embodiments, the first NLS comprises a first NLS sequence of a gene modifying polypeptide having an amino acid sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the first NLS comprises a first NLS sequence of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the first NLS sequence comprises a C-myc NLS. In certain embodiments, the first NLS comprises the amino acid sequence PAAKRVKLD (SEQ ID NO: 11,095), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the first NLS and the DNA binding domain. In certain embodiments, the spacer sequence between the first NLS and the DNA binding domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the first NLS and the DNA binding domain comprises the amino acid sequence GG.


In certain embodiments, the DNA binding domain comprises a DNA binding domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the DNA binding domain comprises a DNA binding domain of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the DNA binding domain comprises a Cas domain (e.g., as listed in Table 8). In certain embodiments, the DNA binding domain comprises the amino acid sequence of a SpyCas9 polypeptide (e.g., as listed in Table 8, e.g., a Cas9 N863A polypeptide), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the DNA binding domain comprises the amino acid sequence:









(SEQ ID NO: 11,096)
DKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA
LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH
RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDK
ADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFE
ENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSL
GLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKN
LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLP
EKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKL
NREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK
ILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSF
IERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFL
SGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA
SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKT
YAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDG
FANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKG
ILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIE
EGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLS
DYDVDHIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYW
RQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVA
QILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY
HHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIG
KATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRD
FATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDP
KKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKN
PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNEL
ALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISE
FSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAF
KYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD,







or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the DNA binding domain and the linker. In certain embodiments, the spacer sequence between the DNA binding domain and the linker comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the DNA binding domain and the linker comprises the amino acid sequence GG.


In certain embodiments, the linker comprises a linker sequence of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises a linker sequence of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises an amino acid sequence as listed in Table D or 10, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the linker and the RT domain. In certain embodiments, the spacer sequence between the linker and the RT domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the linker and the RT domain comprises the amino acid sequence GG.


In certain embodiments, the RT domain comprises a RT domain sequence of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the RT domain comprises a RT domain sequence of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the RT domain comprises an amino acid sequence as listed in Table D or 6, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain has a length of about 400-500, 500-600, 600-700, 700-800, 800-900, or 900-1000 amino acids.


In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the RT domain and the second NLS. In certain embodiments, the spacer sequence between the RT domain and the second NLS comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the RT domain and the second NLS comprises the amino acid sequence AG.


In certain embodiments, the second NLS comprises a second NLS sequence of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743. In certain embodiments, the second NLS comprises a second NLS sequence of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2. In certain embodiments, the second NLS sequence comprises a plurality of partial NLS sequences. In embodiments, the NLS sequence, e.g., the second NLS sequence, comprises a first partial NLS sequence, e.g., comprising the amino acid sequence KRTADGSEFE (SEQ ID NO: 11,097), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In embodiments, the NLS sequence, e.g., the second NLS sequence, comprises a second partial NLS sequence. In embodiments, the NLS sequence, e.g., the second NLS sequence, comprises an SV40A5 NLS, e.g., a bipartite SV40A5 NLS, e.g., comprising the amino acid sequence KRTADGSEFESPKKKAKVE (SEQ ID NO: 11,098), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the NLS sequence, e.g., the second NLS sequence, comprises the amino acid sequence KRTADGSEFEKRTADGSEFESPKKKAKVE (SEQ ID NO: 11,099), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence. In certain embodiments, the spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence comprises the amino acid sequence GSG.
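
By way of a non-limiting sketch, the N-terminal to C-terminal arrangement described in this section can be illustrated by concatenating the named parts. The sketch combines several of the "certain embodiments" above (the GG, GG, GG, and AG spacers and the two recited NLS sequences) into a single construct purely for illustration. The DNA binding domain and RT domain strings are hypothetical placeholders, not real sequences, and the T2A and puromycin portions are omitted.

```python
FIRST_NLS = "PAAKRVKLD"             # SEQ ID NO: 11,095
SECOND_NLS = "KRTADGSEFESPKKKAKVE"  # SEQ ID NO: 11,098 (bipartite SV40A5 NLS)


def assemble_polypeptide(
    dna_binding_domain: str,
    linker: str,
    rt_domain: str,
    n_terminal_met: bool = True,
) -> str:
    """Concatenate parts in N-terminal to C-terminal order with spacers."""
    parts = [
        "M" if n_terminal_met else "",
        FIRST_NLS,
        "GG",                # spacer: first NLS / DNA binding domain
        dna_binding_domain,
        "GG",                # spacer: DNA binding domain / linker
        linker,
        "GG",                # spacer: linker / RT domain
        rt_domain,
        "AG",                # spacer: RT domain / second NLS
        SECOND_NLS,
    ]
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder strings only; substitute real domain sequences as needed.
    construct = assemble_polypeptide(
        dna_binding_domain="DBDPLACEHOLDER",
        linker="EAAAKGGG",   # a linker sequence listed in Table T2
        rt_domain="RTPLACEHOLDER",
    )
    print(construct)
    print(len(construct))
```

Keeping the spacers as explicit list entries makes it straightforward to drop or replace any one of them when modeling a different embodiment.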


Linkers and RT Domains

In some embodiments, the gene modifying polypeptide comprises a linker (e.g., as described herein) and an RT domain (e.g., as described herein). In certain embodiments, the gene modifying polypeptide comprises, in N-terminal to C-terminal order, a linker (e.g., as described herein) and an RT domain (e.g., as described herein).


In certain embodiments, the linker comprises a linker sequence as listed in Table 10, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises a linker sequence of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises a linker sequence of any one of SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises a linker sequence of any one of SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the linker comprises a linker sequence of an exemplary gene modifying polypeptide listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the RT domain comprises an RT domain sequence as listed in Table 6, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the RT domain comprises an RT domain sequence of an exemplary gene modifying polypeptide listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In some embodiments, a gene modifying polypeptide comprises a portion of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion.


In some embodiments, a gene modifying polypeptide comprises a linker of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker. In some embodiments, a gene modifying polypeptide comprises a linker of a gene modifying polypeptide of any one of SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker. In some embodiments, a gene modifying polypeptide comprises a linker of a gene modifying polypeptide of any one of SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said linker. In some embodiments, a gene modifying polypeptide comprises a linker of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or a linker comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In some embodiments, a gene modifying polypeptide comprises an RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said RT domain. In some embodiments, a gene modifying polypeptide comprises an RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said RT domain. In some embodiments, a gene modifying polypeptide comprises an RT domain of a gene modifying polypeptide of any one of SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said RT domain. In some embodiments, a gene modifying polypeptide comprises an RT domain of a gene modifying polypeptide as listed in any of Tables A1, T1, or T2, or an RT domain comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) of a gene modifying polypeptide having the amino acid sequence of any one of SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise amino acid sequences of a linker and RT domain having at least 80% identity to the linker and RT domains of any one of SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise amino acid sequences of a linker and RT domain having at least 90% identity to the linker and RT domains of any one of SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise amino acid sequences of a linker and RT domain having at least 95% identity to the linker and RT domains of any one of SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise amino acid sequences of a linker and RT domain having at least 99% identity to the linker and RT domains of any one of SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) of a gene modifying polypeptide having the amino acid sequence of any one of SEQ ID NOs: 6001-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) of a gene modifying polypeptide having the amino acid sequence of any one of SEQ ID NOs: 4501-4541. 
In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) from a single row of any of Tables A1, T1, or T2 (e.g., from a single exemplary gene modifying polypeptide as listed in any of Tables A1, T1, or T2).


In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) from two different amino acid sequences selected from SEQ ID NOs: 1-7743. In certain embodiments, the linker and the RT domain of a gene modifying polypeptide comprise the amino acid sequences of a linker and RT domain (or amino acid sequences having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto) from different rows of any of Tables A1, T1, or T2.
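The identity thresholds recited above (at least 70%, 75%, 80%, 85%, 90%, 95%, or 99%) can be illustrated with a short sketch. The Python below is illustrative only. It assumes the two amino acid sequences have already been aligned (gaps written as "-") and uses one common convention, identical positions divided by the number of aligned columns; identity may be defined differently elsewhere in the disclosure. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue fragments with a single substitution (Y -> H).
reference = "MTLNIEDEYR"
candidate = "MTLNIEDEHR"
identity = percent_identity(reference, candidate)
```

Under this convention the toy pair satisfies the 70% through 90% thresholds but not the 95% or 99% thresholds.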


In certain embodiments, the gene modifying polypeptide further comprises a first NLS (e.g., a 5′ NLS), e.g., as described herein. In certain embodiments, the gene modifying polypeptide further comprises a second NLS (e.g., a 3′ NLS), e.g., as described herein. In certain embodiments, the gene modifying polypeptide further comprises an N-terminal methionine residue.


RT Families and Mutants

In certain embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from a family selected from: AVIRE, BAEVM, FFV, FLV, FOAMV, GALV, KORV, MLVAV, MLVBM, MLVCB, MLVFF, MLVMS, PERV, SFV1, SFV3L, WMSV, XMRV6, BLVAU, BLVJ, HTLIA, HTLIC, HTLIL, HTL32, HTL3P, HTLV2, JSRV, MLVF5, MLVRD, MMTVB, MPMV, SFVCP, SMRVH, SRV1, SRV2, and WDSV. In certain embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from a family selected from: AVIRE, BAEVM, FFV, FLV, FOAMV, GALV, KORV, MLVAV, MLVBM, MLVCB, MLVFF, MLVMS, PERV, SFV1, SFV3L, WMSV, and XMRV6.


In certain embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from an MLVMS RT domain. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 1 of Table M1, or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 3 of Table M1 (Gen1 MLVMS), or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations at an amino acid position of the RT domain as listed in columns 1 and 2 of Table M2, or an amino acid position corresponding thereto.


In certain embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from an AVIRE RT domain. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 2 of Table M1, or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 4 of Table M1 (Gen2 AVIRE), or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations at an amino acid position of the RT domain as listed in columns 3 and 4 of Table M2, or an amino acid position corresponding thereto. In certain embodiments, the RT domain comprises an IENSSP (SEQ ID NO: 22003) sequence (e.g., at the C-terminus).









TABLE M1
Exemplary point mutations in MLVMS and AVIRE RT domains

RT-linker filing (MLVMS) | Corresponding AVIRE | Gen1 MLVMS (PLV4921) | Gen2 AVIRE (PLV10990)
 | | H8Y |
P51L | Q51L | |
S67R | T67R | |
E67K | E67K | |
E69K | E69K | |
T197A | T197A | |
D200N | D200N | D200N | D200N
H204R | N204R | |
E302K | E302K | |
 | | T306K | T306K
F309N | Y309N | |
W313F | W313F | W313F | W313F
T330P | G330P | T330P | G330P
L435G | T436G | |
N454K | N455K | |
D524G | D526G | |
E562Q | E564Q | |
D583N | D585N | |
H594Q | H596Q | |
L603W | L605W | L603W | L605W
D653N | D655N | |
L671P | L673P | |
 | | | IENSSP (SEQ ID NO: 22003) at C-term
















TABLE M2
Positions that can be mutated in exemplary MLVMS and AVIRE RT domains

WT residue & position
MLVMS aa | MLVMS position # * | AVIRE aa | AVIRE position # *
H | 8 | Y | 8
P | 51 | Q | 51
S | 67 | T | 67
E | 69 | E | 69
T | 197 | T | 197
D | 200 | D | 200
H | 204 | N | 204
E | 302 | E | 302
T | 306 | T | 306
F | 309 | Y | 309
W | 313 | W | 313
T | 330 | G | 330
L | 435 | T | 436
N | 454 | N | 455
D | 524 | D | 526
E | 562 | E | 564
D | 583 | D | 585
H | 594 | H | 596
L | 603 | L | 605
D | 653 | D | 655
L | 671 | S | 673
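The correspondence between MLVMS and AVIRE positions in Table M2 can be read as a lookup table. The following Python sketch is illustrative only: it encodes a subset of Table M2 rows and rewrites an MLVMS point mutation (wild-type residue, position, substituted residue) into the corresponding AVIRE notation. The names `TABLE_M2_SUBSET` and `mlvms_to_avire` are hypothetical, and the sketch assumes the substituted residue is carried over unchanged.

```python
import re

# Subset of Table M2 rows:
# (MLVMS WT residue, MLVMS position, AVIRE WT residue, AVIRE position)
TABLE_M2_SUBSET = [
    ("D", 200, "D", 200),
    ("T", 330, "G", 330),
    ("L", 435, "T", 436),
    ("D", 524, "D", 526),
    ("L", 603, "L", 605),
]

# Index the rows by MLVMS position for direct lookup.
_BY_MLVMS_POSITION = {row[1]: row for row in TABLE_M2_SUBSET}


def mlvms_to_avire(mutation: str) -> str:
    """Rewrite an MLVMS mutation such as 'T330P' in AVIRE numbering."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a point mutation: {mutation!r}")
    wild_type, position, substitute = match.group(1), int(match.group(2)), match.group(3)
    if position not in _BY_MLVMS_POSITION:
        raise KeyError(f"position {position} is not in the table subset")
    mlvms_wt, _, avire_wt, avire_position = _BY_MLVMS_POSITION[position]
    if wild_type != mlvms_wt:
        raise ValueError(f"expected {mlvms_wt} at MLVMS position {position}, got {wild_type}")
    return f"{avire_wt}{avire_position}{substitute}"


avire_mutations = [mlvms_to_avire(m) for m in ("D200N", "T330P", "L603W")]
```

For the rows used here, the results agree with the MLVMS and AVIRE entries paired in Table M1 (e.g., T330P with G330P, and L603W with L605W).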









In certain embodiments, a gene modifying polypeptide comprises a gamma retrovirus-derived RT domain. In certain embodiments, the gamma retrovirus-derived RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from a family selected from: AVIRE, BAEVM, FFV, FLV, FOAMV, GALV, KORV, MLVAV, MLVBM, MLVCB, MLVFF, MLVMS, PERV, SFV1, SFV3L, WMSV, and XMRV6. In some embodiments, the gamma retrovirus-derived RT domain of a gene modifying polypeptide is not derived from PERV. In some embodiments, said RT includes one, two, three, four, five, six or more mutations shown in Table 2A and corresponding to mutations D200N, L603W, T330P, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, W313F, L435G, N454K, H594Q, L671P, E69K, or D653N in the RT domain of murine leukemia virus reverse transcriptase. In some embodiments, the gene modifying polypeptide further comprises a linker having at least 99% identity to a linker domain of any one of SEQ ID NOs: 1-7743. In some embodiments, the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217 or SEQ ID NO: 11,041.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an AVIRE RT (e.g., an AVIRE_P03360 sequence, e.g., SEQ ID NO: 8001), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an AVIRE RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, G330P, L605W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an AVIRE RT further comprising one, two, or three mutations selected from the group consisting of D200N, G330P, and L605W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a BAEVM RT (e.g., a BAEVM_P10272 sequence, e.g., SEQ ID NO: 8004), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a BAEVM RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L602W, T304K, and W311F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a BAEVM RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L602W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an FFV RT (e.g., an FFV_O93209 sequence, e.g., SEQ ID NO: 8012), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, three, or four mutations selected from the group consisting of D21N, T293N, T419P, and L393K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, or three mutations selected from the group consisting of D21N, T293N, and T419P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising the mutation D21N. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, or three mutations selected from the group consisting of T207N, T333P, and L307K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising one or two mutations selected from the group consisting of T207N and T333P, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an FLV RT (e.g., an FLV_P10273 sequence, e.g., SEQ ID NO: 8019), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an FLV RT further comprising one, two, three, or four mutations selected from the group consisting of D199N, L602W, T305K, and W312F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FLV RT further comprising one or two mutations selected from the group consisting of D199N and L602W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a FOAMV RT (e.g., a FOAMV_P14350 sequence, e.g., SEQ ID NO: 8021), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a FOAMV RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, S420P, and L396K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a FOAMV RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and S420P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a FOAMV RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a FOAMV RT further comprising one, two, or three mutations selected from the group consisting of T207N, S331P, and L307K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a FOAMV RT further comprising one or two mutations selected from the group consisting of T207N and S331P, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a GALV RT (e.g., a GALV_P21414 sequence, e.g., SEQ ID NO: 8027), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L600W, T304K, and W311F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L600W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a KORV RT (e.g., a KORV_Q9TTC1 sequence, e.g., SEQ ID NO: 8047), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, four, five, or six mutations selected from the group consisting of D32N, D322N, E452P, L274W, T428K, and W435F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, or four mutations selected from the group consisting of D32N, D322N, E452P, and L274W, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a GALV RT further comprising the mutation D32N. In some embodiments, the RT domain comprises the amino acid sequence of a KORV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D231N, E361P, L633W, T337K, and W344F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a KORV RT further comprising one, two, or three mutations selected from the group consisting of D231N, E361P, and L633W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an MLVAV RT (e.g., an MLVAV_P03356 sequence, e.g., SEQ ID NO: 8053), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an MLVAV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVAV RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an MLVBM RT (e.g., an MLVBM_Q7SVK7 sequence, e.g., SEQ ID NO: 8056), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an MLVBM RT further comprising one, two, three, four, or five mutations selected from the group consisting of D199N, T329P, L602W, T305K, and W312F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVBM RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an MLVCB RT (e.g., an MLVCB_P08361 sequence, e.g., SEQ ID NO: 8062), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an MLVCB RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVCB RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an MLVFF RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an MLVFF RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVFF RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an MLVMS RT (e.g., an MLVMS_reference sequence, e.g., SEQ ID NO: 8137; or an MLVMS_P03355 sequence, e.g., SEQ ID NO: 8070), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an MLVMS RT further comprising one, two, three, four, five, or six mutations selected from the group consisting of D200N, T330P, L603W, T306K, W313F, and H8Y, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVMS RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an MLVMS RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a PERV RT (e.g., a PERV_Q4VFZ2 sequence, e.g., SEQ ID NO: 8099), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a PERV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D196N, E326P, L599W, T302K, and W309F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a PERV RT further comprising one, two, or three mutations selected from the group consisting of D196N, E326P, and L599W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an SFV1 RT (e.g., an SFV1_P23074 sequence, e.g., SEQ ID NO: 8105), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an SFV1 RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, N420P, and L396K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV1 RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and N420P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV1 RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an SFV3L RT (e.g., an SFV3L_P27401 sequence, e.g., SEQ ID NO: 8111), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an SFV3L RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, N422P, and L396K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV3L RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and N422P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV3L RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV3L RT further comprising one, two, or three mutations selected from the group consisting of T307N, N333P, and L307K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an SFV3L RT further comprising one or two mutations selected from the group consisting of T307N and N333P, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of a WMSV RT (e.g., a WMSV_P03359 sequence, e.g., SEQ ID NO: 8131), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of a WMSV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L600W, T304K, and W311F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a WMSV RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L600W, or a corresponding position in a homologous RT domain.


In embodiments, the RT domain comprises the amino acid sequence of an RT domain of an XMRV6 RT (e.g., an XMRV6_A1Z651 sequence, e.g., SEQ ID NO: 8134), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the RT domain comprises the amino acid sequence of an XMRV6 RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an XMRV6 RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
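The point mutations recited in the preceding paragraphs follow the notation wild-type residue, 1-based position, substituted residue (e.g., D200N). The Python sketch below shows one way such mutations could be applied to an amino acid sequence while checking that the stated wild-type residue is actually present. It is illustrative only: the function name `apply_point_mutations` and the six-residue toy sequence are hypothetical, and no sequence from the Sequence Listing is reproduced.

```python
import re


def apply_point_mutations(sequence: str, mutations: list[str]) -> str:
    """Return a copy of `sequence` with each point mutation applied.

    Each mutation is written as <WT residue><1-based position><new residue>.
    A ValueError is raised if the residue found at the position does not
    match the stated wild-type residue, or if the position is out of range.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"not a point mutation: {mutation!r}")
        wild_type, substitute = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: found {residues[index]} instead of {wild_type}"
            )
        residues[index] = substitute
    return "".join(residues)


# Hypothetical six-residue toy sequence: M1 D2 T3 W4 L5 G6.
toy_domain = "MDTWLG"
mutant = apply_point_mutations(toy_domain, ["D2N", "W4F"])
```

The wild-type check is a design choice that catches numbering mismatches, which matter here because the same mutation is numbered differently across RT families (e.g., L603W in MLVMS and L605W in AVIRE).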


In certain embodiments, the RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain of an AVIRE RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In embodiments, the RT domain comprises the amino acid sequence of an RT domain comprised in a sequence listed in column 1 of Table A5, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217 or SEQ ID NO: 11,041.


In certain embodiments, the RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain of an MLVMS RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In embodiments, the RT domain comprises the amino acid sequence of an RT domain comprised in a sequence listed in any of columns 2-6 of Table A5, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217 or SEQ ID NO: 11,041.









TABLE A5
Exemplary gene modifying polypeptides comprising an AVIRE RT domain or an MLVMS RT domain.

AVIRE SEQ ID NOS: | MLVMS SEQ ID NOS:
1 | 2704 | 3007 | 3038 | 2638 | 2930
2 | 2706 | 3007 | 3038 | 2639 | 2930
3 | 2708 | 3008 | 3039 | 2639 | 2931
4 | 2709 | 3008 | 3039 | 2640 | 2931
5 | 2709 | 3009 | 3040 | 2640 | 2932
6 | 2710 | 3010 | 3040 | 2641 | 2932
7 | 2957 | 3010 | 3041 | 2641 | 2933
9 | 2957 | 3011 | 3041 | 2642 | 2933
10 | 2958 | 3012 | 3042 | 2642 | 2934
12 | 2959 | 3012 | 3042 | 2643 | 2934
13 | 2960 | 3013 | 3043 | 2643 | 2935
14 | 2962 | 3013 | 3043 | 2644 | 2935
6076 | 6042 | 3014 | 3044 | 2644 | 2936
6143 | 6068 | 3014 | 3044 | 2645 | 2936
6200 | 6097 | 3015 | 3045 | 2645 | 2937
6254 | 6136 | 3015 | 3045 | 2646 | 2937
6274 | 6156 | 3016 | 3046 | 2646 | 2938
6315 | 6215 | 3016 | 3046 | 2647 | 2938
6328 | 6216 | 3017 | 3047 | 2647 | 2939
6337 | 6301 | 3018 | 3047 | 2648 | 2939
6403 | 6352 | 3018 | 3048 | 2648 | 2940
6420 | 6365 | 3019 | 3048 | 2649 | 2940
6440 | 6411 | 3019 | 3049 | 2649 | 2941
6513 | 6436 | 3020 | 3049 | 2650 | 2941
6552 | 6458 | 3020 | 3050 | 2650 | 2942
6613 | 6459 | 3021 | 3051 | 2651 | 2942
6671 | 6524 | 3021 | 3051 | 2651 | 2943
6822 | 6562 | 3022 | 3052 | 2652 | 2943
6840 | 6563 | 3023 | 3052 | 2652 | 2944
6884 | 6699 | 3023 | 3053 | 2653 | 2945
6907 | 6865 | 3024 | 3053 | 2653 | 2945
6970 | 7022 | 3024 | 3054 | 2654 | 2946
7025 | 7037 | 3025 | 3054 | 2655 | 2946
7052 | 7088 | 3025 | 3055 | 2655 | 2947
7078 | 7116 | 3026 | 3055 | 2656 | 2947
7243 | 7175 | 3026 | 3056 | 2656 | 2948
7253 | 7200 | 3027 | 3056 | 2657 | 2948
7318 | 7206 | 3027 | 3057 | 2657 | 2949
7379 | 7277 | 3028 | 3057 | 2658 | 2949
7486 | 7294 | 3028 | 3058 | 2658 | 2950
7524 | 7330 | 3029 | 3058 | 2659 | 2950
7668 | 7411 | 3030 | 3059 | 2659 | 2951
7680 | 7455 | 3030 | 3059 | 2660 | 2951
7720 | 7477 | 3031 | 3060 | 2660 | 2952
1137 | 7511 | 3031 | 3060 | 2661 | 2952
1138 | 7538 | 3032 | 3061 | 2661 | 2953
1139 | 7559 | 3032 | 3061 | 2662 | 2953
1140 | 7560 | 3033 | 3062 | 2662 | 2954
1141 | 7593 | 3033 | 3062 | 2663 | 2954
1142 | 7594 | 3034 | 3063 | 2663 | 2955
1143 | 7607 | 3034 | 3063 | 2664 | 2955
1144 | 7623 | 6025 | 3064 | 2664 | 6485
1145 | 7638 | 6041 | 3064 | 2665 | 6486
1146 | 7717 | 6043 | 3065 | 2665 | 6504
1147 | 7731 | 6098 | 3065 | 2666 | 6505
1148 | 7732 | 6099 | 3066 | 2666 | 6595
1149 | 2711 | 6180 | 3066 | 2667 | 6596
1150 | 2711 | 6182 | 3067 | 2667 | 6751
1151 | 2712 | 6237 | 3067 | 2668 | 6752
1152 | 2712 | 6238 | 3068 | 2668 | 6777
1153 | 2713 | 6311 | 3068 | 2669 | 6778
1154 | 2713 | 6312 | 3069 | 2669 | 7172
1155 | 2714 | 6578 | 3069 | 2670 | 7174
1156 | 2714 | 6579 | 3070 | 2670 | 7313
1157 | 2715 | 6663 | 3070 | 2671 | 7314
1158 | 2715 | 6664 | 3071 | 2671 |
1159 | 2716 | 6708 | 3071 | 2672 |
1160 | 2716 | 6709 | 3072 | 2672 |
1161 | 2717 | 6809 | 3072 | 2673 |
1162 | 2717 | 6831 | 3073 | 2673 |
1163 | 2718 | 6832 | 3073 | 2674 |
1164 | 2718 | 6864 | 3074 | 2674 |
1165 | 2719 | 6866 | 3074 | 2675 |
1166 | 2719 | 7089 | 3075 | 2675 |
1167 | 2720 | 7157 | 3075 | 2676 |
6015 | 2720 | 7159 | 3076 | 2676 |
6029 | 2721 | 7173 | 3076 | 2677 |
6045 | 2721 | 7176 | 3077 | 2677 |
6077 | 2722 | 7293 | 3077 | 2678 |
6129 | 2722 | 7295 | 3078 | 2678 |
6144 | 2723 | 7343 | 3078 | 2679 |
6164 | 2723 | 7393 | 3079 | 2680 |
6201 | 2724 | 7394 | 3079 | 2680 |
6227 | 2724 | 7425 | 3080 | 2681 |
6244 | 2725 | 7426 | 3080 | 2681 |
6250 | 2725 | 7444 | 3081 | 2682 |
6264 | 2726 | 7445 | 3081 | 2682 |
6289 | 2726 | 7476 | 3082 | 2683 |
6304 | 2727 | 7478 | 3082 | 2683 |
6316 | 2727 | 7496 | 3083 | 2684 |
6384 | 2728 | 7497 | 3083 | 2684 |
6421 | 2728 | 7537 | 3084 | 2685 |
6441 | 2729 | 7539 | 3084 | 2685 |
6492 | 2729 | 2780 | 3085 | 2686 |
6514 | 2730 | 2780 | 3085 | 2686 |
6530 | 2730 | 2781 | 3086 | 2687 |
6569 | 2731 | 2781 | 3086 | 2687 |
6584 | 2731 | 2782 | 3087 | 2688 |
6621 | 2732 | 2782 | 3087 | 2688 |
6651 | 2732 | 2783 | 3088 | 2689 |
6659 | 2733 | 2783 | 3088 | 2689 |
6683 | 2734 | 2784 | 3089 | 2690 |
6703 | 2734 | 2784 | 3089 | 2690 |
6727 | 2735 | 2785 | 3090 | 2691 |
6732 | 2735 | 2785 | 3090 | 2692 |
6745 | 2736 | 2786 | 3091 | 2692 |
6755 | 2736 | 2786 | 3091 | 2693 |
6784 | 2737 | 2787 | 3092 | 2693 |
6817 | 2737 | 2787 | 3092 | 2694 |
6823 | 2738 | 2788 | 3093 | 2694 |
6841 | 2739 | 2788 | 3093 | 2695 |
6871 | 2740 | 2789 | 3094 | 2695 |
6885 | 2740 | 2789 | 3095 | 2696 |
6898 | 2741 | 2790 | 3095 | 2696 |
6908 | 2741 | 2790 | 3096 | 2697 |
6933 | 2742 | 2791 | 3096 | 2697 |
6971 | 2742 | 2791 | 3097 | 2698 |
7009 | 2743 | 2792 | 3097 | 2698 |
7018 | 2743 | 2792 | 3098 | 2699 |
7045 | 2744 | 2793 | 3098 | 2699 |
7053 | 2744 | 2793 | 3099 | 2700 |
7068 | 2745 | 2794 | 3099 | 2700 |
7079 | 2745 | 2794 | 3100 | 2701 |
7096 | 2746 | 2795 | 3100 | 2701 |
7104 | 2746 | 2795 | 3101 | 2702 |
7122 | 2747 | 2796 | 3101 | 2702 |
7151 | 2747 | 2796 | 3102 | 2703 |
7163 | 2748 | 2797 | 3102 | 2703 |
7181 | 2748 | 2797 | 3103 | 2862 |
7244 | 2749 | 2798 | 3103 | 2862 |
7273 | 2750 | 2798 | 3104 | 2863 |
7319 | 2750 | 2799 | 3104 | 2863 |
7336 | 2751 | 2799 | 3105 | 2864 |
7380 | 2751 | 2800 | 3105 | 2864 |
7402 | 2752 | 2800 | 3106 | 2865 |
7462 | 2752 | 2801 | 3106 | 2865 |
7487 | 2753 | 2801 | 3107 | 2866 |
7525 | 2753 | 2802 | 3107 | 2866 |
7569 | 2754 | 2802 | 3108 | 2867 |
7626 | 2754 | 2803 | 3108 | 2867 |
7689 | 2755 | 2803 | 3109 | 2868 |
7707 | 2755 | 2804 | 3109 | 2868 |
7721 | 2756 | 2804 | 3110 | 2869 |
1371 | 2756 | 2805 | 3110 | 2869 |
1372 | 2757 | 2805 | 3111 | 2870 |
1373 | 2758 | 2806 | 3111 | 2870 |
1374 | 2758 | 2806 | 3112 | 2871 |
1375 | 2759 | 2807 | 3112 | 2871 |
1376 | 2759 | 2807 | 3113 | 2872 |
1377 | 2760 | 2808 | 3113 | 2872 |
1378 | 2760 | 2808 | 3114 | 2873 |
1379 | 2761 | 2809 | 3114 | 2873 |
1380 | 2761 | 2809 | 3115 | 2874 |
1381 | 2762 | 2810 | 3115 | 2874 |
1382 | 2762 | 2810 | 3116 | 2875 |
1383 | 2763 | 2811 | 3116 | 2875 |
1384 | 2763 | 2811 | 3117 | 2876 |
1385 | 2764 | 2812 | 3117 | 2876 |
1386 | 2764 | 2812 | 3118 | 2877 |
1387 | 2765 | 2813 | 3118 | 2877 |
1388 | 2765 | 2813 | 3119 | 2878 |
1389 | 2766 | 2814 | 3119 | 2878 |
1390 | 2766 | 2814 | 3120 | 2879 |
1391 | 2767 | 2815 | 3120 | 2879 |
1392 | 2767 | 2815 | 3121 | 2880 |
1393 | 2768 | 2816 | 3121 | 2880 |
1394 | 2768 | 2816 | 3122 | 2881 |
1395 | 2769 | 2817 | 3122 | 2881 |
1396 | 2769 | 2817 | 3123 | 2882 |
1397 | 2770 | 2818 | 3123 | 2882 |
1398 | 2770 | 2818 | 3124 | 2883 |
1399 | 2771 | 2819 | 3124 | 2883 |
1400 | 2771 | 2819 | 3125 | 2884 |
1401 | 2772 | 2820 | 3125 | 2884 |
1402 | 2773 | 2820 | 3126 | 2885 |
1403 | 2773 | 2821 | 3126 | 2885 |
1404 | 2774 | 2821 | 3127 | 2886 |
1405 | 2774 | 2822 | 3127 | 2886 |
1406 | 2775 | 2822 | 3128 | 2887 |
1407 | 2775 | 2823 | 3128 | 2887 |
1408 | 2776 | 2823 | 3129 | 2888 |
1409 | 2776 | 2824 | 3129 | 2888 |
1410 | 2777 | 2824 | 3130 | 2889 |
1411 | 2777 | 2825 | 3130 | 2889 |
1412 | 2778 | 2825 | 3131 | 2890 |
1413 | 2779 | 2826 | 3131 | 2890 |
1414 | 2779 | 2826 | 3132 | 2891 |
1415 | 2965 | 2827 | 3133 | 2891 |
1416 | 2965 | 2827 | 3133 | 2892 |
1417 | 2966 | 2828 | 3134 | 2893 |
1418 | 2966 | 2828 | 3134 | 2893 |
1419 | 2967 | 2829 | 3135 | 2894 |
1420 | 2968 | 2829 | 3135 | 2894 |
1421 | 2968 | 2830 | 3136 | 2895 |
1422 | 2969 | 2830 | 3136 | 2895 |
1423 | 2969 | 2831 | 6181 | 2896 |
1424 | 2970 | 2831 | 6183 | 2896 |
1425 | 2970 | 2832 | 6284 | 2897 |
1426 | 2971 | 2832 | 6285 | 289 |
1427 | 2971 | 2833 | 6760 | 2898 |
1428 | 2972 | 2833 | 6761 | 2898 |
1429 | 2972 | 2834 | 7036 | 2899 |
1430 | 2973 | 2834 | 7038 | 2899 |
1431 | 2974 | 2835 | 7158 | 2900 |
1432 | 2974 | 2835 | 7160 | 2900 |
1433 | 2975 | 2836 | 2610 | 2901 |
1434 | 2976 | 2836 | 2610 | 2901 |
1435 | 2976 | 2837 | 2611 | 2902 |
1436 | 2977 | 2837 | 2611 | 2902 |
1437 | 2977 | 2838 | 2612 | 2903 |
1439 | 2978 | 2838 | 2612 | 2903 |
1440 | 2978 | 2839 | 2613 | 2904 |
1441 | 2979 | 2839 | 2613 | 2904 |
1442 | 2979 | 2840 | 2614 | 2905 |
1443 | 2980 | 2840 | 2614 | 2905 |
1444 | 2980 | 2841 | 2615 | 2906 |
1445 | 2981 | 2841 | 2615 | 2906 |
1446 | 2981 | 2842 | 2616 | 2907 |
1447 | 2982 | 2842 | 2616 | 2907 |
6001 | 2982 | 2843 | 2617 | 2908 |
6030 | 2983 | 2843 | 2617 | 2908 |
6078 | 2983 | 2844 | 2618 | 2909 |
6108 | 2984 | 2844 | 2618 | 2909 |
6130 | 2985 | 2845 | 2619 | 2910 |
6165 | 2985 | 2845 | 2619 | 2910 |
6265 | 2986 | 2846 | 2620 | 2911 |
6275 | 2987 | 2846 | 2620 | 2911 |
6305 | 2987 | 2847 | 2621 | 2912 |
6329 | 2988 | 2847 | 2621 | 2912 |
6370 | 2988 | 2848 | 2622 | 2913 |
6385 | 2989 | 2848 | 2622 | 2913 |
6404 | 2989 | 2849 | 2623 | 2914 |
6531 | 2990 | 2849 | 2623 | 2914 |
6585 | 2990 | 2850 | 2624 | 2915 |
6622 | 2991 | 2850 | 2624 | 2915 |
6652 | 2991 | 2851 | 2625 | 2916 |
6733 | 2992 | 2851 | 2625 | 2916 |
6756 | 2992 | 2852 | 2626 | 2917 |
6765 | 2993 | 2852 | 2626 | 2917 |
6798 | 2993 | 2853 | 2627 | 2918 |
6824 | 2994 | 2853 | 2627 | 2919 |
6972 | 2994 | 2854 | 2628 | 2919 |
7046 | 2995 | 2854 | 2628 | 2920 |
7054 | 2995 | 2855 | 2629 | 2920 |
7069 | 2996 | 2855 | 2629 | 2921 |
7080 | 2996 | 2856 | 2630 | 2921 |
7105 | 2997 | 2856 | 2630 | 2922 |
7123 | 2998 | 2857 | 2631 | 2922 |
7143 | 2998 | 2857 | 2631 | 2923 |
7152 | 2999 | 2858 | 2632 | 2923 |
7204 | 2999 | 2858 | 2632 | 2924 |
7320 | 3001 | 2859 | 2633 | 2924 |
7351 | 3001 | 2859 | 2633 | 2925 |
7381 | 3002 | 2860 | 2634 | 2925 |
7403 | 3002 | 2860 | 2634 | 2926 |
7438 | 3003 | 2861 | 2635 | 2926 |
7488 | 3003 | 2861 | 2635 | 2927 |
7500 | 3004 | 3035 | 2636 | 2927 |
7526 | 3004 | 3036 | 2636 | 2928 |
7588 | 3005 | 3036 | 2637 | 2928 |
7612 | 3005 | 3037 | 2637 | 2929 |
7627 | 3006 | 3037 | 2638 | 2929 |








Systems

In an aspect, the disclosure relates to a system comprising a nucleic acid molecule encoding a gene modifying polypeptide (e.g., as described herein) and a template nucleic acid (e.g., a template RNA, e.g., as described herein). In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises one or more silent mutations in the coding region (e.g., in the sequence encoding the RT domain) relative to a nucleic acid molecule as described herein. In certain embodiments, the system further comprises a gRNA (e.g., a gRNA that binds to a polypeptide that induces a nick, e.g., in the opposite strand of the target DNA bound by the gene modifying polypeptide).
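A silent mutation changes the nucleotide sequence of a coding region without changing the encoded amino acid sequence. The Python sketch below illustrates one way to check that two coding sequences differ only by silent mutations, using the standard genetic code. It is illustrative only: the function names and the short example sequences are hypothetical, and the sketch assumes both inputs are in-frame coding sequences of equal length.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def normalize(cds: str) -> str:
    """Upper-case the sequence and express RNA (U) as DNA (T)."""
    return cds.upper().replace("U", "T")


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    cds = normalize(cds)
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def differs_only_by_silent_mutations(cds_a: str, cds_b: str) -> bool:
    """True if the sequences differ at the nucleotide level but encode
    the same amino acid sequence."""
    a, b = normalize(cds_a), normalize(cds_b)
    return len(a) == len(b) and a != b and translate(a) == translate(b)


# Hypothetical two-codon examples (both encode Asp-Leu).
silent = differs_only_by_silent_mutations("GATCTG", "GACTTA")
not_silent = differs_only_by_silent_mutations("GATCTG", "AATCTG")
```

In the first pair both codons change (GAT to GAC, CTG to TTA) while the encoded residues stay the same; in the second pair GAT to AAT replaces aspartate with asparagine, so the change is not silent.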


In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide encodes a polypeptide having an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide encodes a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide encodes a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide encodes a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding a portion of an amino acid sequence selected from SEQ ID NOs: 1-7743, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding a portion of an amino acid sequence selected from SEQ ID NOs: 6001-7743, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding a portion of an amino acid sequence selected from SEQ ID NOs: 4501-4541, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding a portion of a polypeptide listed in any of Tables A1, T1, or T2, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion.


In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the linker of an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the linker of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the linker of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the linker of a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the RT domain of an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the RT domain of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the RT domain of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the nucleic acid molecule encoding the gene modifying polypeptide comprises a sequence encoding the RT domain of a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In an aspect, the disclosure relates to a system comprising a gene modifying polypeptide (e.g., as described herein) and a template nucleic acid (e.g., a template RNA, e.g., as described herein).


In certain embodiments, the gene modifying polypeptide comprises a polypeptide having an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide comprises a portion of an amino acid sequence selected from SEQ ID NOs: 1-7743, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the gene modifying polypeptide comprises a portion of an amino acid sequence selected from SEQ ID NOs: 6001-7743, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the gene modifying polypeptide comprises a portion of an amino acid sequence selected from SEQ ID NOs: 4501-4541, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion. In certain embodiments, the gene modifying polypeptide comprises a portion of a polypeptide listed in any of Tables A1, T1, or T2, wherein the portion comprises a linker and RT domain, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity to said portion.


In certain embodiments, the gene modifying polypeptide comprises the linker of an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the linker of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the linker of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the linker of a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.


In certain embodiments, the gene modifying polypeptide comprises the RT domain of an amino acid sequence selected from SEQ ID NOs: 1-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the RT domain of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 6001-7743, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the RT domain of a polypeptide having an amino acid sequence selected from SEQ ID NOs: 4501-4541, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto. In certain embodiments, the gene modifying polypeptide comprises the RT domain of a polypeptide as listed in any of Tables A1, T1, or T2, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
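The identity thresholds recited in the embodiments above (at least 70%, 75%, 80%, 85%, 90%, 95%, or 99%) can be illustrated with a minimal Python sketch. It assumes that percent identity is computed as identical columns divided by alignment length over a pairwise alignment that has already been computed; the specification may define percent identity differently elsewhere, and the function names and the one-substitution candidate sequence are hypothetical, chosen only for illustration.

```python
from typing import List

# Identity thresholds recited in the embodiments above.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both strings are already aligned (equal length, '-' marks a gap)
    and identity is counted as identical columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> List[int]:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    reference = "PKKKRKV"  # short sequence used purely for illustration
    candidate = "PKKKAKV"  # hypothetical variant with one substitution (R -> A)
    identity = percent_identity(reference, candidate)
    print(round(identity, 1), thresholds_met(identity))
```

In this sketch a single substitution in a seven-residue sequence gives six of seven identical positions, so the candidate satisfies the lower thresholds but not the 90%, 95%, or 99% thresholds. For full-length polypeptides the alignment itself would normally come from a dedicated alignment tool.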










Lengthy table referenced here




US20240252682A1-20240801-T00001


Please refer to the end of the specification for access instructions.






Localization Sequences for Gene Modifying Systems

In certain embodiments, a gene editor system RNA further comprises an intracellular localization sequence, e.g., a nuclear localization sequence (NLS). In some embodiments, a gene modifying polypeptide comprises an NLS as comprised in SEQ ID NO: 4000 and/or SEQ ID NO: 4001, or an NLS having an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.


The nuclear localization sequence may be an RNA sequence that promotes the import of the RNA into the nucleus. In certain embodiments the nuclear localization signal is located on the template RNA. In certain embodiments, the gene modifying polypeptide is encoded on a first RNA, and the template RNA is a second, separate, RNA, and the nuclear localization signal is located on the template RNA and not on an RNA encoding the gene modifying polypeptide. While not wishing to be bound by theory, in some embodiments, the RNA encoding the gene modifying polypeptide is targeted primarily to the cytoplasm to promote its translation, while the template RNA is targeted primarily to the nucleus to promote insertion into the genome. In some embodiments the nuclear localization signal is at the 3′ end, 5′ end, or in an internal region of the template RNA. In some embodiments the nuclear localization signal is 3′ of the heterologous sequence (e.g., is directly 3′ of the heterologous sequence) or is 5′ of the heterologous sequence (e.g., is directly 5′ of the heterologous sequence). In some embodiments the nuclear localization signal is placed outside of the 5′ UTR or outside of the 3′ UTR of the template RNA. In some embodiments the nuclear localization signal is placed between the 5′ UTR and the 3′ UTR, wherein optionally the nuclear localization signal is not transcribed with the transgene (e.g., the nuclear localization signal is in an anti-sense orientation or is downstream of a transcriptional termination signal or polyadenylation signal). In some embodiments the nuclear localization sequence is situated inside of an intron. In some embodiments a plurality of the same or different nuclear localization signals are in the RNA, e.g., in the template RNA. In some embodiments the nuclear localization signal is less than 5, 10, 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900 or 1000 bp in length. Various RNA nuclear localization sequences can be used. 
For example, Lubelsky and Ulitsky, Nature 555 (107-111), 2018 describe RNA sequences which drive RNA localization into the nucleus. In some embodiments, the nuclear localization signal is a SINE-derived nuclear RNA localization (SIRLOIN) signal. In some embodiments the nuclear localization signal binds a nuclear-enriched protein. In some embodiments the nuclear localization signal binds the HNRNPK protein. In some embodiments the nuclear localization signal is rich in pyrimidines, e.g., is a C/T rich, C/U rich, C rich, T rich, or U rich region. In some embodiments the nuclear localization signal is derived from a long non-coding RNA. In some embodiments the nuclear localization signal is derived from MALAT1 long non-coding RNA or is the 600 nucleotide M region of MALAT1 (described in Miyagawa et al., RNA 18, (738-751), 2012). In some embodiments the nuclear localization signal is derived from BORG long non-coding RNA or is an AGCCC motif (described in Zhang et al., Molecular and Cellular Biology 34, 2318-2329 (2014)). In some embodiments the nuclear localization sequence is described in Shukla et al., The EMBO Journal e98452 (2018). In some embodiments the nuclear localization signal is derived from a retrovirus.
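Two of the RNA features mentioned above, pyrimidine richness and the presence of an AGCCC motif, are simple sequence properties that can be checked computationally. The following Python sketch is illustrative only: the function names and the example RNA are hypothetical, and the text does not specify any numerical cutoff for what counts as pyrimidine rich, so none is applied here.

```python
from typing import List


def pyrimidine_fraction(rna: str) -> float:
    """Fraction of positions that are pyrimidines (C, U, or T)."""
    seq = rna.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(seq.count(base) for base in "CUT") / len(seq)


def motif_positions(rna: str, motif: str = "AGCCC") -> List[int]:
    """0-based start positions of every occurrence of the motif."""
    seq = rna.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    example = "GGAGCCCUUCCUCUCC"  # hypothetical RNA segment, not from the disclosure
    print(pyrimidine_fraction(example))
    print(motif_positions(example))
```

A screen of this kind only flags candidate regions; whether a given region actually drives nuclear localization would need to be established experimentally.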


In some embodiments, a polypeptide described herein comprises one or more (e.g., 2, 3, 4, 5) nuclear targeting sequences, for example a nuclear localization sequence (NLS). In some embodiments, the NLS is a bipartite NLS. In some embodiments, an NLS facilitates the import of a protein comprising an NLS into the cell nucleus. In some embodiments, the NLS is fused to the N-terminus of a gene modifying polypeptide as described herein. In some embodiments, the NLS is fused to the C-terminus of the gene modifying polypeptide. In some embodiments, the NLS is fused to the N-terminus or the C-terminus of a Cas domain. In some embodiments, a linker sequence is disposed between the NLS and the neighboring domain of the gene modifying polypeptide.


In some embodiments, an NLS comprises the amino acid sequence MDSLLMNRRKFLYQFKNVRWAKGRRETYLC (SEQ ID NO: 5009), PKKRKVEGADKRTADGSEFESPKKKRKV (SEQ ID NO: 5010), RKSGKIAAIWKRPRKPKKKRKV (SEQ ID NO: 5011), KRTADGSEFESPKKKRKV (SEQ ID NO: 5012), KKTELQTTNAENKTKKL (SEQ ID NO: 5013), KRGINDRNFWRGENGRKTR (SEQ ID NO: 5014), KRPAATKKAGQAKKKK (SEQ ID NO: 5015), PAAKRVKLD (SEQ ID NO: 4644), KRTADGSEFEKRTADGSEFESPKKKAKVE (SEQ ID NO: 4649), KRTADGSEFE (SEQ ID NO: 4650), KRTADGSEFESPKKKAKVE (SEQ ID NO: 4651), AGKRTADGSEFEKRTADGSEFESPKKKAKVE (SEQ ID NO: 4001), or a functional fragment or variant thereof. Exemplary NLS sequences are also described in PCT/EP2000/011690, the contents of which are incorporated herein by reference for their disclosure of exemplary nuclear localization sequences. In some embodiments, an NLS comprises an amino acid sequence as disclosed in Table 11. An NLS of this table may be present in one or more copies at one or more locations in a polypeptide, e.g., 1, 2, 3 or more copies of an NLS in an N-terminal domain, between peptide domains, in a C-terminal domain, or in a combination of locations, in order to improve subcellular localization to the nucleus. Multiple unique sequences may be used within a single polypeptide. Sequences may be naturally monopartite or bipartite, e.g., having one or two stretches of basic amino acids, or may be used as chimeric bipartite sequences. Sequence references correspond to UniProt accession numbers, except where indicated as SeqNLS for sequences mined using a subcellular localization prediction algorithm (Lin et al., BMC Bioinformatics 13:157 (2012), incorporated herein by reference in its entirety).









TABLE 11
Exemplary nuclear localization signals for use in gene modifying systems

Sequence | Sequence References | SEQ ID No.
AHFKISGEKRPSTDPGKKAKNPKKKKKKDP | Q76IQ7 | 5223
AHRAKKMSKTHA | P21827 | 5224
ASPEYVNLPINGNG | SeqNLS | 5225
CTKRPRW | O88622, Q86W56, Q9QYM2, O02776 | 5226
DKAKRVSRNKSEKKRR | O15516, Q5RAK8, Q91YB2, Q91YB0, Q8QGQ6, O08785, Q9WVS9, Q6YGZ4 | 5227
EELRLKEELLKGIYA | Q9QY16, Q9UHL0, Q2TBP1, Q9QY15 | 5228
EEQLRRRKNSRLNNTG | G5EFF5 | 5229
EVLKVIRTGKRKKKAWKRMVTKVC | SeqNLS | 5230
HHHHHHHHHHHHQPH | Q63934, G3V7L5, Q12837 | 5231
HKKKHPDASVNFSEFSK | P10103, Q4R844, P12682, B0CM99, A9RA84, Q6YKA4, P09429, P63159, Q08IE6, P63158, Q9YH06, B1MTB0 | 5232
HKRTKK | Q2R2D5 | 5233
IINGRKLKLKKSRRRSSQTSNNSFTSRRS | SeqNLS | 5234
KAEQERRK | Q8LH59 | 5235
KEKRKRREELFIEQKKRK | SeqNLS | 5236
KKGKDEWFSRGKKP | P30999 | 5237
KKGPSVQKRKKT | Q6ZN17 | 5238
KKKTVINDLLHYKKEK | SeqNLS, P32354 | 5239
KKNGGKGKNKPSAKIKK | SeqNLS | 5240
KKPKWDDFKKKKK | Q15397, Q8BKS9, Q562C7 | 5241
KKRKKD | SeqNLS, Q91Z62, Q1A730, Q969P5, Q2KHT6, Q9CPU7 | 5242
KKRRKRRRK | SeqNLS | 5243
KKRRRRARK | Q9UMS6, D4A702, Q91YE8 | 5244
KKSKRGR | Q9UBS0 | 5245
KKSRKRGS | B4FG96 | 5246
KKSTALSRELGKIMRRR | SeqNLS, P32354 | 5247
KKSYQDPEIIAHSRPRK | Q9U7C9 | 5248
KKTGKNRKLKSKRVKTR | Q9Z301, O54943, Q8K3T2 | 5249
KKVSIAGQSGKLWRWKR | Q6YUL8 | 5250
KKYENVVIKRSPRKRGRPRK | SeqNLS | 5251
KNKKRK | SeqNLS | 5252
KPKKKR | SeqNLS | 5253
KRAMKDDSHGNSTSPKRRK | Q0E671 | 5254
KRANSNLVAAYEKAKKK | P23508 | 5255
KRASEDTTSGSPPKKSSAGPKR | Q9BZZ5, Q5R644 | 5256
KRFKRRWMVRKMKTKK | SeqNLS | 5257
KRGLNSSFETSPKKVK | Q8IV63 | 5258
KRGNSSIGPNDLSKRKQRKK | SeqNLS | 5259
KRIHSVSLSQSQIDPSKKVKRAK | SeqNLS | 5260
KRKGKLKNKGSKRKK | O15381 | 5261
KRRRRRRREKRKR | Q96GM8 | 5262
KRSNDRTYSPEEEKQRRA | Q91ZF2 | 5263
KRTVATNGDASGAHRAKKMSK | SeqNLS | 5264
KRVYNKGEDEQEHLPKGKKR | SeqNLS | 5265
KSGKAPRRRAVSMDNSNK | Q9WVH4, O43524 | 5266
KVNFLDMSLDDIIIYKELE | Q9P127 | 5267
KVQHRIAKKTTRRRR | Q9DXE6 | 5268
LSPSLSPL | Q9Y261, P32182, P35583 | 5269
MDSLLMNRRKFLYQFKNVRWAKGRRETYLC | Q9GZX7 | 5270
MPQNEYIELHRKRYGYRLDYHEKKRKKESREAHERSKKAKKMIGLKAKLYHK | SeqNLS | 5271
MVQLRPRASR | SeqNLS | 5272
NNKLLAKRRKGGASPKDDPMDDIK | Q965G5 | 5273
NYKRPMDGTYGPPAKRHEGE | O14497, A2BH40 | 5274
PDTKRAKLDSSETTMVKKK | SeqNLS | 5275
PEKRTKI | SeqNLS | 5276
PGGRGKKK | Q719N1, Q9UBP0, A2VDN5 | 5277
PGKMDKGEHRQERRDRPY | Q01844, Q61545 | 5278
PKKGDKYDKTD | Q45FA5 | 5279
PKKKSRK | O35914, Q01954 | 5280
PKKNKPE | Q22663 | 5281
PKKRAKV | P04295, P89438 | 5282
PKPKKLKVE | P55263, P55262, P55264, Q64640 | 5283
PKRGRGR | Q9FYS5, Q43386 | 5284
PKRRLVDDA | P0C797 | 5285
PKRRRTY | SeqNLS | 5286
PLFKRR | A8X6H4, Q9TXJ0 | 5287
PLRKAKR | Q86WB0, Q5R8V9 | 5288
PPAKRKCIF | Q6AZ28, O75928, Q8C5D8 | 5289
PPARRRRL | Q8NAG6 | 5290
PPKKKRKV | Q3L6L5, P03070, P14999, P03071 | 5291
PPNKRMKVKH | Q8BN78 | 5292
PPRIYPQLPSAPT | P0C799 | 5293
PQRSPFPKSSVKR | SeqNLS | 5294
PRPRKVPR | P0C799 | 5295
PRRRVQRKR | SeqNLS, Q5R448, Q5TAQ9 | 5296
PRRVRLK | Q58DJ0, P56477, Q13568 | 5297
PSRKRPR | Q62315, Q5F363, Q92833 | 5298
PSSKKRKV | SeqNLS | 5299
PTKKRVK | P07664 | 5300
QRPGPYDRP | SeqNLS | 5301
RGKGGKGLGKGGAKRHRK | SeqNLS | 5302
RKAGKGGGGHKTTKKRSAKDEKVP | B4FG96 | 5303
RKIKLKRAK | A1L3G9 | 5304
RKIKRKRAK | B9X187 | 5305
RKKEAPGPREELRSRGR | O35126, P54258, Q5IS70, P54259 | 5306
RKKRKGK | SeqNLS, Q29243, Q62165, Q28685, O18738, Q9TSZ6, Q14118 | 5307
RKKRRQRRR | P04326, P69697, P69698, P05907, P20879, P04613, P19553, P0C1J9, P20893, P12506, P04612, Q73370, P0C1K0, P05906, P35965, P04609, P04610, P04614, P04608, P05905 | 5308
RKKSIPLSIKNLKRKHKRKKNKITR | Q9C0C9 | 5309
RKLVKPKNTKMKTKLRTNPY | Q14190 | 5310
RKRLILSDKGQLDWKK | SeqNLS, Q91Z62, Q1A730, Q2KHT6, Q9CPU7 | 5311
RKRLKSK | Q13309 | 5312
RKRRVRDNM | Q8QPH4, Q809M7, A8C8X1, Q2VNC5, Q38SQ0, O89749, Q6DNQ9, Q809L9, Q0A429, Q20NV3, P16509, P16505, Q6DNQ5, P16506, Q6XT06, P26118, Q2ICQ2, Q2RCG8, Q0A2D0, Q0A2H9, Q9IQ46, Q809M3, Q6J847, Q6J856, B4URE4, A4GCM7, Q0A440, P26120, P16511, | 5313
RKRSPKDKKEKDLDGAGKRRKT | Q7RTP6 | 5314
RKRTPRVDGQTGENDMNKRRRK | O94851 | 5315
RLPVRRRRRR | P04499, P12541, P03269, P48313, P03270 | 5316
RLRFRKPKSK | P69469 | 5317
RQQRKR | Q14980 | 5318
RRDLNSSFETSPKKVK | Q8K3G5 | 5319
RRDRAKLR | Q9SLB8 | 5320
RRGDGRRR | Q80WE1, Q5R9B4, Q06787, P35922 | 5321
RRGRKRKAEKQ | Q812D1, Q5XXA9, Q99JF8, Q8MJG1, Q66T72, O75475 | 5322
RRKKRR | Q0VD86, Q58DS6, Q5R6G2, Q9ERI5, Q6AYK2, Q6NYC1 | 5323
RRKRSKSEDMDSVESKRRR | Q7TT18 | 5324
RRKRSR | Q99PU7, D3ZHS6, Q92560, A2VDM8 | 5325
RRPKGKTLQKRKPK | Q6ZN17 | 5326
RRRGFERFGPDNMGRKRK | Q63014, Q9DBR0 | 5327
RRRGKNKVAAQNCRK | SeqNLS | 5328
RRRKRR | Q5FVH8, Q6MZT1, Q08DH5, Q8BQP9 | 5329
RRRQKQKGGASRRR | SeqNLS | 5330
RRRREGPRARRRR | P08313, P10231 | 5331
RRTIRLKLVYDKCDRSCKIQKKNRNKCQYCRFHKCLSVGMSHNAIRFGRMPRSEKAKLKAE | SeqNLS | 5332
RRVPQRKEVSRCRKCRK | Q5RJN4, Q32L09, Q8CAK3, Q9NUL5 | 5333
RVGGRRQAVECIEDLLNEPGQPLDLSCKRPRP | P03255 | 5334
RVVKLRIAP | P52639, Q8JMN0 | 5335
RVVRRR | P70278 | 5336
SKRKTKISRKTR | Q5RAY1, O00443 | 5337
SYVKTVPNRTRTYIKL | P21935 | 5338
TGKNEAKKRKIA | P52739, Q8K3J5, Q5RAU9 | 5339
TLSPASSPSSVSCPVIPASTDESPGSALNI | SeqNLS | 5340
VSKKQRTGKKIH | P52739, Q8K3J5, Q5RAU9 | 5341
SPKKKRKVE |  | 5342
KRTADGSEFESPKKKRKVE |  | 5343
PAAKRVKLD |  | 5344
PKKKRKV |  | 5345
MDSLLMNRRKFLYQFKNVRWAKGRRETYLC |  | 5346
SPKKKRKVEAS |  | 5347
MAPKKKRKVGIHRGVP |  | 5348
KRTADGSEFEKRTADGSEFESPKKKAKVE |  | 5349
KRTADGSEFE |  | 5350
KRTADGSEFESPKKKAKVE |  | 5351
AGKRTADGSEFEKRTADGSEFESPKKKAKVE |  | 4001









In some embodiments, the NLS is a bipartite NLS. A bipartite NLS typically comprises two basic amino acid clusters separated by a spacer sequence (which may be, e.g., about 10 amino acids in length). A monopartite NLS typically lacks a spacer. An example of a bipartite NLS is the nucleoplasmin NLS, having the sequence KR[PAATKKAGQA]KKKK (SEQ ID NO: 5015), wherein the spacer is bracketed. Another exemplary bipartite NLS has the sequence PKKKRKVEGADKRTADGSEFESPKKKRKV (SEQ ID NO: 5016). Exemplary NLSs are described in International Application WO2020051561, which is herein incorporated by reference in its entirety, including for its disclosures regarding nuclear localization sequences.
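The bipartite layout described above (a basic cluster, a spacer of about 10 amino acids, and a second basic cluster) can be sketched as a simple pattern match in Python. This is a simplified heuristic under stated assumptions, not a validated NLS predictor: the function name is hypothetical, the first cluster is assumed to be exactly two K/R residues, the second cluster at least three, and the spacer length is fixed by a parameter that defaults to 10.

```python
import re
from typing import Optional, Tuple


def split_bipartite_nls(seq: str, spacer_len: int = 10) -> Optional[Tuple[str, str, str]]:
    """Return (first cluster, spacer, second cluster) for the first match, else None.

    Assumed pattern: two basic residues (K/R), then `spacer_len` residues of
    any kind, then three or more basic residues.
    """
    pattern = re.compile(r"([KR]{2})([A-Z]{%d})([KR]{3,})" % spacer_len)
    match = pattern.search(seq.upper())
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


if __name__ == "__main__":
    # Nucleoplasmin NLS from the text, written there as KR[PAATKKAGQA]KKKK.
    print(split_bipartite_nls("KRPAATKKAGQAKKKK"))
    # A short monopartite sequence has no spacer and is not matched.
    print(split_bipartite_nls("PKKKRKV"))
```

For the nucleoplasmin sequence the sketch recovers the same spacer that is bracketed in the text. Real bipartite signals tolerate variation in spacer length and cluster composition, which a fixed pattern like this does not capture.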


In certain embodiments, a gene editor system polypeptide (e.g., a gene modifying polypeptide as described herein) further comprises an intracellular localization sequence, e.g., a nuclear localization sequence and/or a nucleolar localization sequence. The nuclear localization sequence and/or nucleolar localization sequence may be amino acid sequences that promote the import of the protein into the nucleus and/or nucleolus, where it can promote integration of a heterologous sequence into the genome. In certain embodiments, a gene editor system polypeptide (e.g., a gene modifying polypeptide as described herein) further comprises a nucleolar localization sequence. In certain embodiments, the gene modifying polypeptide is encoded on a first RNA, and the template RNA is a second, separate, RNA, and the nucleolar localization signal is encoded on the RNA encoding the gene modifying polypeptide and not on the template RNA. In some embodiments, the nucleolar localization signal is located at the N-terminus, C-terminus, or in an internal region of the polypeptide. In some embodiments, a plurality of the same or different nucleolar localization signals are used. In some embodiments, the nuclear localization signal is less than 5, 10, 25, 50, 75, or 100 amino acids in length. Various polypeptide nucleolar localization signals can be used. For example, Yang et al., Journal of Biomedical Science 22, 33 (2015), describe a nuclear localization signal that also functions as a nucleolar localization signal. In some embodiments, the nucleolar localization signal may also be a nuclear localization signal. In some embodiments, the nucleolar localization signal may overlap with a nuclear localization signal. In some embodiments, the nucleolar localization signal may comprise a stretch of basic residues. In some embodiments, the nucleolar localization signal may be rich in arginine and lysine residues. 
In some embodiments, the nucleolar localization signal may be derived from a protein that is enriched in the nucleolus. In some embodiments, the nucleolar localization signal may be derived from a protein enriched at ribosomal RNA loci. In some embodiments, the nucleolar localization signal may be derived from a protein that binds rRNA. In some embodiments, the nucleolar localization signal may be derived from MSP58. In some embodiments, the nucleolar localization signal may be a monopartite motif. In some embodiments, the nucleolar localization signal may be a bipartite motif. In some embodiments, the nucleolar localization signal may consist of multiple monopartite or bipartite motifs. In some embodiments, the nucleolar localization signal may consist of a mix of monopartite and bipartite motifs. In some embodiments, the nucleolar localization signal may be a dual bipartite motif. In some embodiments, the nucleolar localization motif may be KRASSQALGTIPKRRSSSRFIKRKK (SEQ ID NO: 5017). In some embodiments, the nucleolar localization signal may be derived from nuclear factor-κB-inducing kinase. In some embodiments, the nucleolar localization signal may be an RKKRKKK motif (SEQ ID NO: 5018) (described in Birbach et al., Journal of Cell Science, 117 (3615-3624), 2004).


Evolved Variants of Gene Modifying Polypeptides and Systems

In some embodiments, the invention provides evolved variants of gene modifying polypeptides as described herein. Evolved variants can, in some embodiments, be produced by mutagenizing a reference gene modifying polypeptide, or one of the fragments or domains comprised therein. In some embodiments, one or more of the domains (e.g., the reverse transcriptase domain) is evolved. One or more of such evolved variant domains can, in some embodiments, be evolved alone or together with other domains. An evolved variant domain or domains may, in some embodiments, be combined with unevolved cognate component(s) or evolved variants of the cognate component(s), e.g., which may have been evolved in either a parallel or serial manner.


In some embodiments, the process of mutagenizing a reference gene modifying polypeptide, or fragment or domain thereof, comprises mutagenizing the reference gene modifying polypeptide or fragment or domain thereof. In embodiments, the mutagenesis comprises a continuous evolution method (e.g., PACE) or non-continuous evolution method (e.g., PANCE), e.g., as described herein. In some embodiments, the evolved gene modifying polypeptide, or a fragment or domain thereof, comprises one or more amino acid variations introduced into its amino acid sequence relative to the amino acid sequence of the reference gene modifying polypeptide, or fragment or domain thereof. In embodiments, amino acid sequence variations may include one or more mutated residues (e.g., conservative substitutions, non-conservative substitutions, or a combination thereof) within the amino acid sequence of a reference gene modifying polypeptide, e.g., as a result of a change in the nucleotide sequence encoding the gene modifying polypeptide that results in, e.g., a change in the codon at any particular position in the coding sequence, the deletion of one or more amino acids (e.g., a truncated protein), the insertion of one or more amino acids, or any combination of the foregoing. The evolved variant gene modifying polypeptide may include variants in one or more components or domains of the gene modifying polypeptide (e.g., variants introduced into a reverse transcriptase domain).


In some aspects, the disclosure provides gene modifying polypeptides, systems, kits, and methods using or comprising an evolved variant of a gene modifying polypeptide, e.g., an evolved variant of a gene modifying polypeptide or a gene modifying polypeptide produced or producible by PACE or PANCE. In embodiments, the unevolved reference gene modifying polypeptide is a gene modifying polypeptide as disclosed herein.


The term “phage-assisted continuous evolution (PACE),” as used herein, generally refers to continuous evolution that employs phage as viral vectors. Examples of PACE technology have been described, for example, in International PCT Application No. PCT/US2009/056194, filed Sep. 8, 2009, published as WO 2010/028347 on Mar. 11, 2010; International PCT Application, PCT/US2011/066747, filed Dec. 22, 2011, published as WO 2012/088381 on Jun. 28, 2012; U.S. Pat. No. 9,023,594, issued May 5, 2015; U.S. Pat. No. 9,771,574, issued Sep. 26, 2017; U.S. Pat. No. 9,394,537, issued Jul. 19, 2016; International PCT Application, PCT/US2015/012022, filed Jan. 20, 2015, published as WO 2015/134121 on Sep. 11, 2015; U.S. Pat. No. 10,179,911, issued Jan. 15, 2019; and International PCT Application, PCT/US2016/027795, filed Apr. 15, 2016, published as WO 2016/168631 on Oct. 20, 2016, the entire contents of each of which are incorporated herein by reference.


The term “phage-assisted non-continuous evolution (PANCE),” as used herein, generally refers to non-continuous evolution that employs phage as viral vectors. Examples of PANCE technology have been described, for example, in Suzuki T. et al, Crystal structures reveal an elusive functional domain of pyrrolysyl-tRNA synthetase, Nat Chem Biol. 13(12): 1261-1266 (2017), incorporated herein by reference in its entirety. Briefly, PANCE is a technique for rapid in vivo directed evolution using serial flask transfers of evolving selection phage (SP), which contain a gene of interest to be evolved, across fresh host cells (e.g., E. coli cells). Genes inside the host cell may be held constant while genes contained in the SP continuously evolve. Following phage growth, an aliquot of infected cells may be used to transfect a subsequent flask containing host E. coli. This process can be repeated and/or continued until the desired phenotype is evolved, e.g., for as many transfers as desired.


Methods of applying PACE and PANCE to gene modifying polypeptides may be readily appreciated by the skilled artisan by reference to, inter alia, the foregoing references. Additional exemplary methods for directing continuous evolution of genome-modifying proteins or systems, e.g., in a population of host cells, e.g., using phage particles, can be applied to generate evolved variants of gene modifying polypeptides, or fragments or subdomains thereof. Non-limiting examples of such methods are described in International PCT Application, PCT/US2009/056194, filed Sep. 8, 2009, published as WO 2010/028347 on Mar. 11, 2010; International PCT Application, PCT/US2011/066747, filed Dec. 22, 2011, published as WO 2012/088381 on Jun. 28, 2012; U.S. Pat. No. 9,023,594, issued May 5, 2015; U.S. Pat. No. 9,771,574, issued Sep. 26, 2017; U.S. Pat. No. 9,394,537, issued Jul. 19, 2016; International PCT Application, PCT/US2015/012022, filed Jan. 20, 2015, published as WO 2015/134121 on Sep. 11, 2015; U.S. Pat. No. 10,179,911, issued Jan. 15, 2019; International Application No. PCT/US2019/37216, filed Jun. 14, 2019, International Patent Publication WO 2019/023680, published Jan. 31, 2019, International PCT Application, PCT/US2016/027795, filed Apr. 15, 2016, published as WO 2016/168631 on Oct. 20, 2016, and International Patent Publication No. PCT/US2019/47996, filed Aug. 23, 2019, each of which is incorporated herein by reference in its entirety.


In some non-limiting illustrative embodiments, a method of evolution of an evolved variant gene modifying polypeptide, or a fragment or domain thereof, comprises: (a) contacting a population of host cells with a population of viral vectors comprising the gene of interest (the starting gene modifying polypeptide or fragment or domain thereof), wherein: (1) the host cell is amenable to infection by the viral vector; (2) the host cell expresses viral genes required for the generation of viral particles; (3) the expression of at least one viral gene required for the production of an infectious viral particle is dependent on a function of the gene of interest; and/or (4) the viral vector allows for expression of the protein in the host cell, and can be replicated and packaged into a viral particle by the host cell. In some embodiments, the method comprises (b) contacting the host cells with a mutagen, using host cells with mutations that elevate mutation rate (e.g., either by carrying a mutation plasmid or some genome modification, e.g., a proofreading-impaired DNA polymerase or SOS genes, such as UmuC, UmuD′, and/or RecA, which mutations, if plasmid-bound, may be under control of an inducible promoter), or a combination thereof. In some embodiments, the method comprises (c) incubating the population of host cells under conditions allowing for viral replication and the production of viral particles, wherein host cells are removed from the host cell population, and fresh, uninfected host cells are introduced into the population of host cells, thus replenishing the population of host cells and creating a flow of host cells. In some embodiments, the cells are incubated under conditions allowing for the gene of interest to acquire a mutation. 
In some embodiments, the method further comprises (d) isolating a mutated version of the viral vector, encoding an evolved gene product (e.g., an evolved variant gene modifying polypeptide, or fragment or domain thereof), from the population of host cells.


The skilled artisan will appreciate a variety of features employable within the above-described framework. For example, in some embodiments, the viral vector or the phage is a filamentous phage, for example, an M13 phage, e.g., an M13 selection phage. In certain embodiments, the gene required for the production of infectious viral particles is the M13 gene III (gIII). In embodiments, the phage may lack a functional gIII, but otherwise comprise gI, gII, gIV, gV, gVI, gVII, gVIII, gIX, and gX. In some embodiments, the generation of infectious VSV particles involves the envelope protein VSV-G. Various embodiments can use different retroviral vectors, for example, Murine Leukemia Virus vectors, or Lentiviral vectors. In embodiments, the retroviral vectors can efficiently be packaged with VSV-G envelope protein, e.g., as a substitute for the native envelope protein of the virus.


In some embodiments, host cells are incubated according to a suitable number of viral life cycles, e.g., at least 10, at least 20, at least 30, at least 40, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, at least 1250, at least 1500, at least 1750, at least 2000, at least 2500, at least 3000, at least 4000, at least 5000, at least 7500, at least 10000, or more consecutive viral life cycles, which in one illustrative and non-limiting example of M13 phage is 10-20 minutes per virus life cycle. Similarly, conditions can be modulated to adjust the time a host cell remains in a population of host cells, e.g., about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 70, about 80, about 90, about 100, about 120, about 150, or about 180 minutes. Host cell populations can be controlled in part by density of the host cells, or, in some embodiments, the host cell density in an inflow, e.g., 10³ cells/ml, about 10⁴ cells/ml, about 10⁵ cells/ml, about 5·10⁵ cells/ml, about 10⁶ cells/ml, about 5·10⁶ cells/ml, about 10⁷ cells/ml, about 5·10⁷ cells/ml, about 10⁸ cells/ml, about 5·10⁸ cells/ml, about 10⁹ cells/ml, about 5·10⁹ cells/ml, about 10¹⁰ cells/ml, or about 5·10¹⁰ cells/ml.
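
As a rough illustration of the time scales involved, the following sketch converts a number of consecutive viral life cycles into an approximate run time, assuming the illustrative M13 figure of 10-20 minutes per life cycle given above. The function name and its default argument are hypothetical and chosen only for this example.

```python
def run_time_hours(cycles, minutes_per_cycle=(10, 20)):
    """Return (shortest, longest) run time in hours for `cycles` consecutive life cycles.

    `minutes_per_cycle` defaults to the illustrative M13 range of 10-20 minutes.
    """
    low, high = minutes_per_cycle
    return cycles * low / 60, cycles * high / 60


if __name__ == "__main__":
    for cycles in (100, 1000):
        shortest, longest = run_time_hours(cycles)
        print(f"{cycles} life cycles: about {shortest:.1f} to {longest:.1f} hours")
```

Under these assumptions, several hundred life cycles correspond to a run of a few days, which is one reason a continuous flow of fresh host cells is used.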


Inteins

In some embodiments, as described in more detail below, an intein-N (intN) domain may be fused to the N-terminal portion of a first domain of a gene modifying polypeptide described herein, and an intein-C (intC) domain may be fused to the C-terminal portion of a second domain of a gene modifying polypeptide described herein for the joining of the N-terminal portion to the C-terminal portion, thereby joining the first and second domains. In some embodiments, the first and second domains are each independently chosen from a DNA binding domain, an RNA binding domain, an RT domain, and an endonuclease domain.


Inteins can occur as a self-splicing protein intron (e.g., peptide), e.g., which ligates flanking N-terminal and C-terminal exteins (e.g., fragments to be joined). An intein may, in some instances, comprise a fragment of a protein that is able to excise itself and join the remaining fragments (the exteins) with a peptide bond in a process known as protein splicing. Inteins are also referred to as “protein introns.” The process of an intein excising itself and joining the remaining portions of the protein is herein termed “protein splicing” or “intein-mediated protein splicing.”


In some embodiments, an intein of a precursor protein (an intein containing protein prior to intein-mediated protein splicing) comes from two genes. Such an intein is referred to herein as a split intein (e.g., split intein-N and split intein-C). Accordingly, an intein-based approach may be used to join a first polypeptide sequence and a second polypeptide sequence together. For example, in cyanobacteria, DnaE, the catalytic subunit α of DNA polymerase III, is encoded by two separate genes, dnaE-n and dnaE-c. An intein-N domain, such as that encoded by the dnaE-n gene, when situated as part of a first polypeptide sequence, may join the first polypeptide sequence with a second polypeptide sequence, wherein the second polypeptide sequence comprises an intein-C domain, such as that encoded by the dnaE-c gene. Accordingly, in some embodiments, a protein can be made by providing nucleic acid encoding the first and second polypeptide sequences (e.g., wherein a first nucleic acid molecule encodes the first polypeptide sequence and a second nucleic acid molecule encodes the second polypeptide sequence), and the nucleic acid is introduced into the cell under conditions that allow for production of the first and second polypeptide sequences, and for joining of the first to the second polypeptide sequence via an intein-based mechanism.


Use of inteins for joining heterologous protein fragments is described, for example, in Wood et al., J. Biol. Chem. 289(21):14512-9 (2014) (incorporated herein by reference in its entirety). For example, when fused to separate protein fragments, the inteins IntN and IntC may recognize each other, splice themselves out, and/or simultaneously ligate the flanking N- and C-terminal exteins of the protein fragments to which they were fused, thereby reconstituting a full-length protein from the two protein fragments.


In some embodiments, a synthetic intein based on the dnaE intein, the Cfa-N (e.g., split intein-N) and Cfa-C (e.g., split intein-C) intein pair, is used. Examples of such inteins have been described, e.g., in Stevens et al., J Am Chem Soc. 2016 Feb. 24; 138(7):2162-5 (incorporated herein by reference in its entirety). Non-limiting examples of intein pairs that may be used in accordance with the present disclosure include: Cfa DnaE intein, Ssp GyrB intein, Ssp DnaX intein, Ter DnaE3 intein, Ter ThyX intein, Rma DnaB intein and Cne Prp8 intein (e.g., as described in U.S. Pat. No. 8,394,604, incorporated herein by reference).


In some embodiments involving a split Cas9, an intein-N domain and an intein-C domain may be fused to the N-terminal portion of the split Cas9 and the C-terminal portion of a split Cas9, respectively, for the joining of the N-terminal portion of the split Cas9 and the C-terminal portion of the split Cas9. For example, in some embodiments, an intein-N is fused to the C-terminus of the N-terminal portion of the split Cas9, i.e., to form a structure of N-[N-terminal portion of the split Cas9]-[intein-N]-C. In some embodiments, an intein-C is fused to the N-terminus of the C-terminal portion of the split Cas9, i.e., to form a structure of N-[intein-C]-[C-terminal portion of the split Cas9]-C. The mechanism of intein-mediated protein splicing for joining the proteins the inteins are fused to (e.g., split Cas9) is described in Shah et al., Chem Sci. 2014; 5(1):446-461, incorporated herein by reference. Methods for designing and using inteins are known in the art and described, for example, by WO2020051561, WO2014004336, WO2017132580, US20150344549, and US20180127780, each of which is incorporated herein by reference in its entirety.


In some embodiments, a split refers to a division into two or more fragments. In some embodiments, a split Cas9 protein or split Cas9 comprises a Cas9 protein that is provided as an N-terminal fragment and a C-terminal fragment encoded by two separate nucleotide sequences. The polypeptides corresponding to the N-terminal portion and the C-terminal portion of the Cas9 protein may be spliced to form a reconstituted Cas9 protein. In embodiments, the Cas9 protein is divided into two fragments within a disordered region of the protein, e.g., as described in Nishimasu et al., Cell, Volume 156, Issue 5, pp. 935-949, 2014, or as described in Jiang et al. (2016) Science 351: 867-871 and PDB file: 5F9R (each of which is incorporated herein by reference in its entirety). A disordered region may be determined by one or more protein structure determination techniques known in the art, including, without limitation, X-ray crystallography, NMR spectroscopy, electron microscopy (e.g., cryoEM), and/or in silico protein modeling. In some embodiments, the protein is divided into two fragments at any C, T, A, or S, e.g., within a region of SpCas9 between amino acids A292-G364, F445-K483, or E565-T637, or at corresponding positions in any other Cas9, Cas9 variant (e.g., nCas9, dCas9), or other napDNAbp. In some embodiments, the protein is divided into two fragments at SpCas9 T310, T313, A456, S469, or C574. In some embodiments, the process of dividing the protein into two fragments is referred to as splitting the protein.
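
The split-site criteria above (a C, T, A, or S residue within SpCas9 regions A292-G364, F445-K483, or E565-T637) can be expressed as a simple check. The following is a minimal sketch: the function names are hypothetical, and it assumes, as an illustrative convention not stated in the text, that the named residue becomes the first residue of the C-terminal fragment.

```python
# SpCas9 residue ranges (1-based, inclusive) and residue types taken from the text above.
SPLIT_REGIONS = [(292, 364), (445, 483), (565, 637)]
SPLIT_RESIDUES = set("CTAS")


def is_candidate_split_site(residue, position):
    """Return True if `residue` at 1-based `position` satisfies both criteria."""
    in_region = any(start <= position <= end for start, end in SPLIT_REGIONS)
    return residue in SPLIT_RESIDUES and in_region


def split_protein(sequence, position):
    """Split a protein so that the residue at 1-based `position` begins the
    C-terminal fragment (an assumed convention for this sketch)."""
    if not 1 < position <= len(sequence):
        raise ValueError("position must leave two non-empty fragments")
    return sequence[:position - 1], sequence[position - 1:]


if __name__ == "__main__":
    # The specific sites named in the text all satisfy the general criteria.
    for residue, position in [("T", 310), ("T", 313), ("A", 456), ("S", 469), ("C", 574)]:
        print(residue, position, is_candidate_split_site(residue, position))
```

In use, `split_protein` would be applied to the full-length Cas9 sequence of interest, and the two fragments would then be fused to intein-N and intein-C as described above.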


In some embodiments, a protein fragment ranges from about 2-1000 amino acids (e.g., between 2-10, 10-50, 50-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, or 900-1000 amino acids) in length. In some embodiments, a protein fragment ranges from about 5-500 amino acids (e.g., between 5-10, 10-50, 50-100, 100-200, 200-300, 300-400, or 400-500 amino acids) in length. In some embodiments, a protein fragment ranges from about 20-200 amino acids (e.g., between 20-30, 30-40, 40-50, 50-100, or 100-200 amino acids) in length.


In some embodiments, a portion or fragment of a gene modifying polypeptide is fused to an intein. The nuclease can be fused to the N-terminus or the C-terminus of the intein. In some embodiments, a portion or fragment of a fusion protein is fused to an intein and fused to an AAV capsid protein. The intein, nuclease and capsid protein can be fused together in any arrangement (e.g., nuclease-intein-capsid, intein-nuclease-capsid, capsid-intein-nuclease, etc.). In some embodiments, the N-terminus of an intein is fused to the C-terminus of a fusion protein and the C-terminus of the intein is fused to the N-terminus of an AAV capsid protein.


In some embodiments, an endonuclease domain (e.g., a nickase Cas9 domain) is fused to intein-N and a polypeptide comprising an RT domain is fused to an intein-C.


Exemplary nucleotide and amino acid sequences of intein-N domains and compatible intein-C domains are provided below:

DnaE Intein-N DNA:
(SEQ ID NO: 5029)
TGCCTGTCATACGAAACCGAGATACTGACAGTAGAATATGGCCTT
CTGCCAATCGGGAAGATTGTGGAGAAACGGATAGAATGCACAGTT
TACTCTGTCGATAACAATGGTAACATTTATACTCAGCCAGTTGCC
CAGTGGCACGACCGGGGAGAGCAGGAAGTATTCGAATACTGTCTG
GAGGATGGAAGTCTCATTAGGGCCACTAAGGACCACAAATTTATG
ACAGTCGATGGCCAGATGCTGCCTATAGACGAAATCTTTGAGCGA
GAGTTGGACCTCATGCGAGTTGACAACCTTCCTAAT

DnaE Intein-N Protein:
(SEQ ID NO: 5030)
CLSYETEILTVEYGLLPIGKIVEKRIECTVYSVDNNGNIYTQPVA
QWHDRGEQEVFEYCLEDGSLIRATKDHKFMTVDGQMLPIDEIFER
ELDLMRVDNLPN

DnaE Intein-C DNA:
(SEQ ID NO: 5031)
ATGATCAAGATAGCTACAAGGAAGTATCTTGGCAAACAAAACGTT
TATGATATTGGAGTCGAAAGAGATCACAACTTTGCTCTGAAGAAC
GGATTCATAGCTTCTAAT

DnaE Intein-C Protein:
(SEQ ID NO: 5032)
MIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN

Cfa-N DNA:
(SEQ ID NO: 5033)
TGCCTGTCTTATGATACCGAGATACTTACCGTTGAATATGGCTTC
TTGCCTATTGGAAAGATTGTCGAAGAGAGAATTGAATGCACAGTA
TATACTGTAGACAAGAATGGTTTCGTTTACACACAGCCCATTGCT
CAATGGCACAATCGCGGCGAACAAGAAGTATTTGAGTACTGTCTC
GAGGATGGAAGCATCATACGAGCAACTAAAGATCATAAATTCATG
ACCACTGACGGGCAGATGTTGCCAATAGATGAGATATTCGAGCGG
GGCTTGGATCTCAAACAAGTGGATGGATTGCCA

Cfa-N Protein:
(SEQ ID NO: 5034)
CLSYDTEILTVEYGFLPIGKIVEERIECTVYTVDKNGFVYTQPIA
QWHNRGEQEVFEYCLEDGSIIRATKDHKFMTTDGQMLPIDEIFER
GLDLKQVDGLP

Cfa-C DNA:
(SEQ ID NO: 5035)
ATGAAGAGGACTGCCGATGGATCAGAGTTTGAATCTCCCAAGAAG
AAGAGGAAAGTAAAGATAATATCTCGAAAAAGTCTTGGTACCCAA
AATGTCTATGATATTGGAGTGGAGAAAGATCACAACTTCCTTCTC
AAGAACGGTCTCGTAGCCAGCAAC

Cfa-C Protein:
(SEQ ID NO: 5036)
MKRTADGSEFESPKKKRKVKIISRKSLGTQNVYDIGVEKDHNFLL
KNGLVASN
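
The nucleotide and amino acid listings above can be cross-checked against each other by translating each DNA sequence with the standard genetic code. The following is a minimal sketch using the DnaE Intein-C pair (SEQ ID NOs: 5031 and 5032); the helper name `translate` and the constant names are hypothetical, and the codon table is the standard one, built here without external libraries.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (whitespace ignored) into one-letter protein code."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# DnaE Intein-C DNA (SEQ ID NO: 5031) and protein (SEQ ID NO: 5032) as listed above.
DNAE_INTEIN_C_DNA = (
    "ATGATCAAGATAGCTACAAGGAAGTATCTTGGCAAACAAAACGTT"
    "TATGATATTGGAGTCGAAAGAGATCACAACTTTGCTCTGAAGAAC"
    "GGATTCATAGCTTCTAAT"
)
DNAE_INTEIN_C_PROTEIN = "MIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN"

if __name__ == "__main__":
    print(translate(DNAE_INTEIN_C_DNA) == DNAE_INTEIN_C_PROTEIN)
```

The same check can be applied to the other DNA and protein pairs in this listing.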






Additional Domains

The gene modifying polypeptide can bind a target DNA sequence and template nucleic acid (e.g., template RNA), nick the target site, and write (e.g., reverse transcribe) the template into DNA, resulting in a modification of the target site. In some embodiments, additional domains may be added to the polypeptide to enhance the efficiency of the process. In some embodiments, the gene modifying polypeptide may contain an additional DNA ligation domain to join reverse transcribed DNA to the DNA of the target site. In some embodiments, the polypeptide may comprise a heterologous RNA-binding domain. In some embodiments, the polypeptide may comprise a domain having 5′ to 3′ exonuclease activity (e.g., wherein the 5′ to 3′ exonuclease activity increases repair of the alteration of the target site, e.g., in favor of alteration over the original genomic sequence). In some embodiments, the polypeptide may comprise a domain having 3′ to 5′ exonuclease activity, e.g., proof-reading activity. In some embodiments, the writing domain, e.g., RT domain, has 3′ to 5′ exonuclease activity, e.g., proof-reading activity.


Template Nucleic Acids

The gene modifying systems described herein can modify a host target DNA site using a template nucleic acid sequence. In some embodiments, the gene modifying systems described herein transcribe an RNA sequence template into host target DNA sites by target-primed reverse transcription (TPRT). By modifying DNA sequence(s) via reverse transcription of the RNA sequence template directly into the host genome, the gene modifying system can insert an object sequence into a target genome without the need for exogenous DNA sequences to be introduced into the host cell (unlike, for example, CRISPR systems), as well as eliminate an exogenous DNA insertion step. The gene modifying system can also delete a sequence from the target genome or introduce a substitution using an object sequence. Therefore, the gene modifying system provides a platform for the use of customized RNA sequence templates containing object sequences, e.g., sequences comprising heterologous gene coding and/or function information.


In some embodiments, the template nucleic acid comprises one or more sequences (e.g., 2 sequences) that bind the gene modifying polypeptide.


In some embodiments a system or method described herein comprises a single template nucleic acid (e.g., template RNA). In some embodiments a system or method described herein comprises a plurality of template nucleic acids (e.g., template RNAs). For example, a system described herein comprises a first RNA comprising (e.g., from 5′ to 3′) a sequence that binds the gene modifying polypeptide (e.g., the DNA-binding domain and/or the endonuclease domain, e.g., a gRNA) and a sequence that binds a target site (e.g., a second strand of a site in a target genome), and a second RNA (e.g., a template RNA) comprising (e.g., from 5′ to 3′) optionally a sequence that binds the gene modifying polypeptide (e.g., that specifically binds the RT domain), a heterologous object sequence, and a PBS sequence. In some embodiments, when the system comprises a plurality of nucleic acids, each nucleic acid comprises a conjugating domain. In some embodiments, a conjugating domain enables association of nucleic acid molecules, e.g., by hybridization of complementary sequences. For example, in some embodiments a first RNA comprises a first conjugating domain and a second RNA comprises a second conjugating domain, and the first and second conjugating domains are capable of hybridizing to one another, e.g., under stringent conditions. In some embodiments, the stringent conditions for hybridization include hybridization in 4× sodium chloride/sodium citrate (SSC), at about 65° C., followed by a wash in 1×SSC, at about 65° C.
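
The pairing of two conjugating domains by hybridization of complementary sequences can be illustrated with a reverse-complement check. This is a minimal sketch only: the function names and example sequences are hypothetical, and the check tests perfect Watson-Crick complementarity rather than modeling hybridization under the stringent conditions described above.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(sequence):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return sequence.upper().translate(RNA_COMPLEMENT)[::-1]


def are_fully_complementary(first_domain, second_domain):
    """True if the two domains (each written 5' to 3') can pair along their full length."""
    return reverse_complement_rna(first_domain) == second_domain.upper()


if __name__ == "__main__":
    first = "GGCAUCGA"                       # hypothetical first conjugating domain
    second = reverse_complement_rna(first)   # its perfectly complementary partner
    print(second, are_fully_complementary(first, second))
```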


In some embodiments, the template nucleic acid comprises RNA. In some embodiments, the template nucleic acid comprises DNA (e.g., single stranded or double stranded DNA).


In some embodiments, the template nucleic acid comprises one or more (e.g., 2) homology domains that have homology to the target sequence. In some embodiments, the homology domains are about 10-20, 20-50, or 50-100 nucleotides in length.


In some embodiments, a template RNA can comprise a gRNA sequence, e.g., to direct the gene modifying polypeptide to a target site of interest. In some embodiments, a template RNA comprises (e.g., from 5′ to 3′) (i) optionally a gRNA spacer that binds a target site (e.g., a second strand of a site in a target genome), (ii) optionally a gRNA scaffold that binds a polypeptide described herein (e.g., a gene modifying polypeptide or a Cas polypeptide), (iii) a heterologous object sequence comprising a mutation region (optionally the heterologous object sequence comprises, from 5′ to 3′, a first homology region, a mutation region, and a second homology region), and (iv) a primer binding site (PBS) sequence comprising a 3′ target homology domain.


The template nucleic acid (e.g., template RNA) component of a genome editing system described herein typically is able to bind the gene modifying polypeptide of the system. In some embodiments the template nucleic acid (e.g., template RNA) has a 3′ region that is capable of binding a gene modifying polypeptide. The binding region, e.g., 3′ region, may be a structured RNA region, e.g., having at least 1, 2 or 3 hairpin loops, capable of binding the gene modifying polypeptide of the system. The binding region may associate the template nucleic acid (e.g., template RNA) with any of the polypeptide modules. In some embodiments, the binding region of the template nucleic acid (e.g., template RNA) may associate with an RNA-binding domain in the polypeptide. In some embodiments, the binding region of the template nucleic acid (e.g., template RNA) may associate with the reverse transcription domain of the gene modifying polypeptide (e.g., specifically bind to the RT domain). In some embodiments, the template nucleic acid (e.g., template RNA) may associate with the DNA binding domain of the polypeptide, e.g., a gRNA associating with a Cas9-derived DNA binding domain. In some embodiments, the binding region may also provide DNA target recognition, e.g., a gRNA hybridizing to the target DNA sequence and binding the polypeptide, e.g., a Cas9 domain. In some embodiments, the template nucleic acid (e.g., template RNA) may associate with multiple components of the polypeptide, e.g., DNA binding domain and reverse transcription domain.


In some embodiments the template RNA has a poly-A tail at the 3′ end. In some embodiments the template RNA does not have a poly-A tail at the 3′ end.


In some embodiments, the template nucleic acid is a template RNA. In some embodiments, the template RNA comprises one or more modified nucleotides. For example, in some embodiments, the template RNA comprises one or more deoxyribonucleotides. In some embodiments, regions of the template RNA are replaced by DNA nucleotides, e.g., to enhance stability of the molecule. For example, the 3′ end of the template may comprise DNA nucleotides, while the rest of the template comprises RNA nucleotides that can be reverse transcribed. For instance, in some embodiments, the heterologous object sequence is primarily or wholly made up of RNA nucleotides (e.g., at least 90%, 95%, 98%, or 99% RNA nucleotides). In some embodiments, the PBS sequence is primarily or wholly made up of DNA nucleotides (e.g., at least 90%, 95%, 98%, or 99% DNA nucleotides). In other embodiments, the heterologous object sequence for writing into the genome may comprise DNA nucleotides. In some embodiments, the DNA nucleotides in the template are copied into the genome by a domain capable of DNA-dependent DNA polymerase activity. In some embodiments, the DNA-dependent DNA polymerase activity is provided by a DNA polymerase domain in the polypeptide. In some embodiments, the DNA-dependent DNA polymerase activity is provided by a reverse transcriptase domain that is also capable of DNA-dependent DNA polymerization, e.g., second strand synthesis. In some embodiments, the template molecule is composed of only DNA nucleotides.


In some embodiments, a system described herein comprises two nucleic acids which together comprise the sequences of a template RNA described herein. In some embodiments, the two nucleic acids are associated with each other non-covalently, e.g., directly associated with each other (e.g., via base pairing), or indirectly associated as part of a complex comprising one or more additional molecules.


A template RNA described herein may comprise, from 5′ to 3′: (1) a gRNA spacer; (2) a gRNA scaffold; (3) a heterologous object sequence; and (4) a primer binding site (PBS) sequence. Each of these components is now described in more detail.
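
The 5′ to 3′ ordering of the four components can be sketched as a small data structure. The class name, field names, and placeholder sequences below are hypothetical and serve only to illustrate the order of concatenation; they are not sequences from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateRNA:
    """Four template RNA components, stored in 5' to 3' order."""
    grna_spacer: str
    grna_scaffold: str
    heterologous_object_sequence: str
    pbs: str

    def sequence(self):
        """Concatenate the components from 5' to 3'."""
        return (
            self.grna_spacer
            + self.grna_scaffold
            + self.heterologous_object_sequence
            + self.pbs
        )


if __name__ == "__main__":
    # Short placeholder sequences, for illustration only.
    template = TemplateRNA("AAAA", "CC", "GGG", "UU")
    print(template.sequence())
```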


gRNA Spacer and gRNA Scaffold


A template RNA described herein may comprise a gRNA spacer that directs the gene modifying system to a target nucleic acid, and a gRNA scaffold that promotes association of the template RNA with the Cas domain of the gene modifying polypeptide. The systems described herein can also comprise a gRNA that is not part of a template nucleic acid. For example, a gRNA that comprises a gRNA spacer and gRNA scaffold, but not a heterologous object sequence or a PBS sequence, can be used, e.g., to induce second strand nicking, e.g., as described in the section herein entitled “Second Strand Nicking”.


In some embodiments, the gRNA is a short synthetic RNA composed of a scaffold sequence that participates in CRISPR-associated protein binding and a user-defined ˜20 nucleotide targeting sequence for a genomic target. The structure of a complete gRNA was described by Nishimasu et al. Cell 156, P935-949 (2014). The gRNA (also referred to as sgRNA for single-guide RNA) consists of crRNA- and tracrRNA-derived sequences connected by an artificial tetraloop. The crRNA sequence can be divided into guide (20 nt) and repeat (12 nt) regions, whereas the tracrRNA sequence can be divided into anti-repeat (14 nt) and three tracrRNA stem loops (Nishimasu et al. Cell 156, P935-949 (2014)). In practice, guide RNA sequences are generally designed to have a length of between 17-24 nucleotides (e.g., 19, 20, or 21 nucleotides) and be complementary to a targeted nucleic acid sequence. Custom gRNA generators and algorithms are available commercially for use in the design of effective guide RNAs. In some embodiments, the gRNA comprises two RNA components from the native CRISPR system, e.g., crRNA and tracrRNA. As is well known in the art, the gRNA may also comprise a chimeric, single guide RNA (sgRNA) containing sequence from both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing/binding). Chemically modified sgRNAs have also been demonstrated to be effective for use with CRISPR-associated proteins; see, for example, Hendel et al. (2015) Nature Biotechnol., 985-991. In some embodiments, a gRNA spacer comprises a nucleic acid sequence that is complementary to a DNA sequence associated with a target gene.


In some embodiments, the region of the template nucleic acid, e.g., template RNA, comprising the gRNA adopts an underwound ribbon-like structure of gRNA bound to target DNA (e.g., as described in Mulepati et al. Science 19 Sep. 2014: Vol. 345, Issue 6203, pp. 1479-1484). Without wishing to be bound by theory, this non-canonical structure is thought to be facilitated by rotation of every sixth nucleotide out of the RNA-DNA hybrid. Thus, in some embodiments, the region of the template nucleic acid, e.g., template RNA, comprising the gRNA may tolerate increased mismatching with the target site at some interval, e.g., every sixth base. In some embodiments, the region of the template nucleic acid, e.g., template RNA, comprising the gRNA comprising homology to the target site may possess wobble positions at a regular interval, e.g., every sixth base, that do not need to base pair with the target site.
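
The idea of tolerating mismatches at a regular interval can be sketched as a comparison that ignores every sixth position. The function name is hypothetical, and the sketch assumes, purely for illustration, that the tolerated positions are 6, 12, 18, and so on, counted from the 5′ end; the text does not specify the register.

```python
def mismatches_outside_wobble(spacer, perfect_match, interval=6):
    """Return 1-based positions where `spacer` differs from `perfect_match`,
    ignoring every `interval`-th position (assumed wobble positions)."""
    if len(spacer) != len(perfect_match):
        raise ValueError("sequences must have the same length")
    return [
        position
        for position, (a, b) in enumerate(zip(spacer, perfect_match), start=1)
        if a != b and position % interval != 0
    ]


if __name__ == "__main__":
    perfect = "ACGUACGUACGU"
    # Differs from `perfect` only at positions 6 and 12, so no mismatch is counted.
    print(mismatches_outside_wobble("ACGUAGGUACGA", perfect))
    # Differs at position 3, which is not an assumed wobble position.
    print(mismatches_outside_wobble("ACCUACGUACGU", perfect))
```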


In some embodiments, the template nucleic acid (e.g., template RNA) has at least 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 bases of at least 80%, 85%, 90%, 95%, 99%, or 100% homology to the target site, e.g., at the 5′ end, e.g., comprising a gRNA spacer sequence of length appropriate to the Cas9 domain of the gene modifying polypeptide (Table 8).


In some embodiments, a Cas9 derivative with enhanced activity may be used in the gene modification polypeptide. In some embodiments, a Cas9 derivative may comprise mutations that improve activity of the HNH endonuclease domain, e.g., SpyCas9 R221K, N394K, or mutations that improve R-loop formation, e.g., SpyCas9 L1245V, or comprise a combination of such mutations, e.g., SpyCas9 R221K/N394K, SpyCas9 N394K/L1245V, SpyCas9 R221K/L1245V, or SpyCas9 R221K/N394K/L1245V (see, e.g., Spencer and Zhang Sci Rep 7:16836 (2017), the Cas9 derivatives and mutations of which are incorporated herein by reference). In some embodiments, a Cas9 derivative may comprise one or more types of mutations described herein, e.g., PAM-modifying mutations, protein stabilizing mutations, activity enhancing mutations, and/or mutations partially or fully inactivating one or two endonuclease domains relative to the parental enzyme (e.g., one or more mutations to abolish endonuclease activity towards one or both strands of a target DNA, e.g., a nickase or catalytically dead enzyme). In some embodiments, a Cas9 enzyme used in a system described herein may comprise mutations that confer nickase activity on the enzyme (e.g., SpyCas9 N863A or H840A) in addition to mutations improving catalytic efficiency (e.g., SpyCas9 R221K, N394K, and/or L1245V). In some embodiments, a Cas9 enzyme used in a system described herein is a SpyCas9 enzyme or derivative that further comprises an N863A mutation to confer nickase activity in addition to R221K and N394K mutations to improve catalytic efficiency.


Table 12 provides parameters to define components for designing gRNA and/or Template RNAs to apply Cas variants listed in Table 8 for gene modifying. The table indicates the validated or predicted protospacer adjacent motif (PAM) requirements and the validated or predicted location of the cut site (relative to the most upstream base of the PAM site). The gRNA for a given enzyme can be assembled by concatenating the crRNA, Tetraloop, and tracrRNA sequences, and further adding a 5′ spacer of a length within Spacer (min) and Spacer (max) that matches a protospacer at a target site. Further, the predicted location of the ssDNA nick at the target is important for designing a PBS sequence of a Template RNA that can anneal to the sequence immediately 5′ of the nick in order to initiate target primed reverse transcription. In some embodiments, a gRNA scaffold described herein comprises a nucleic acid sequence comprising, in the 5′ to 3′ direction, a crRNA of Table 12, a tetraloop from the same row of Table 12, and a tracrRNA from the same row of Table 12, or a sequence having at least 70%, 80%, 85%, 90%, 95%, or 99% identity thereto. In some embodiments, the gRNA or template RNA comprising the scaffold further comprises a gRNA spacer having a length within the Spacer (min) and Spacer (max) indicated in the same row of Table 12. In some embodiments, the gRNA or template RNA having a sequence according to Table 12 is comprised by a system that further comprises a gene modifying polypeptide, wherein the gene modifying polypeptide comprises a Cas domain described in the same row of Table 12.
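
The assembly rule described above (a 5′ spacer followed by the crRNA, tetraloop, and tracrRNA of one row) and the use of the PAM and cut-site columns can be sketched as follows. The values in `SPYCAS9_ROW` are taken from the SpyCas9 row of Table 12; the function names, the dictionary layout, and the example target sequence are hypothetical. The sketch also assumes one particular reading of the Cut column: a value of -3 places the nick three bases 5′ of the first PAM base on the strand shown.

```python
import re

# IUPAC nucleotide codes used in the PAM column, as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "M": "[AC]",
    "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# SpyCas9 row of Table 12 (sequences written in the DNA alphabet, as in the table).
SPYCAS9_ROW = {
    "pam": "NGG", "cut": -3, "spacer_min": 20, "spacer_max": 20,
    "crRNA": "GTTTTAGAGCTA", "tetraloop": "GAAA",
    "tracrRNA": "TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC",
}


def assemble_grna(spacer, row):
    """Concatenate spacer, crRNA, tetraloop, and tracrRNA from 5' to 3'."""
    if not row["spacer_min"] <= len(spacer) <= row["spacer_max"]:
        raise ValueError("spacer length outside Spacer (min)/Spacer (max)")
    return spacer + row["crRNA"] + row["tetraloop"] + row["tracrRNA"]


def find_sites(target, row):
    """Yield (protospacer, pam, nick_index) for each PAM found on the given strand.

    nick_index is the 0-based index of the first base 3' of the assumed nick.
    """
    pattern = re.compile("".join(IUPAC[base] for base in row["pam"]))
    spacer_length, pam_length = row["spacer_max"], len(row["pam"])
    for pam_start in range(spacer_length, len(target) - pam_length + 1):
        pam = target[pam_start:pam_start + pam_length]
        if pattern.fullmatch(pam):
            yield target[pam_start - spacer_length:pam_start], pam, pam_start + row["cut"]


if __name__ == "__main__":
    example_spacer = "GACGTTACCGGATCAAGCTT"      # hypothetical 20-nt protospacer
    example_target = example_spacer + "TGG" + "ACGT"
    for protospacer, pam, nick_index in find_sites(example_target, SPYCAS9_ROW):
        print(protospacer, pam, nick_index)
        print(assemble_grna(protospacer, SPYCAS9_ROW))
```

A PBS for a template RNA would then be designed against the bases immediately 5′ of `nick_index`, consistent with the description above; only one strand is scanned here for brevity.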









TABLE 12
Parameters to define components for designing gRNA and/or Template RNAs to apply Cas variants listed in Table 8 in gene modifying systems

Variant | PAM(s) | Cut | Tier | Spacer (min) | Spacer (max) | crRNA | SEQ ID NO: | Tetraloop | tracrRNA | SEQ ID NO:
Nme2Cas9 | NNNNCC | -3 | 1 | 22 | 24 | GTTGTAGCTCCCTTTCTCATTTCG | 10,051 | GAAA | CGAAATGAGAACCGTTGCTACAATAAGGCCGTCTGAAAAGATGTGCCGCAACGCTCTGCCCCTTAAAGCTTCTGCTTTAAGGGGCATCGTTTA | 10,151
PpnCas9 | NNNNRTT | | 1 | 21 | 24 | GTTGTAGCTCCCTTTTTCATTTCGC | 10,052 | GAAA | GCGAAATGAAAAACGTTGTTACAATAAGAGATGAATTTCTCGCAAAGCTCTGCCTCTTGAAATTTCGGTTTCAAGAGGCATC | 10,152
SauCas9 | NNGRR; NNGRRT | -3 | 1 | 21 | 23 | GTTTTAGTACTCTG | 10,053 | GAAA | CAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,153
SauCas9-KKH | NNNRR; NNNRRT | -3 | 1 | 21 | 21 | GTTTTAGTACTCTG | 10,054 | GAAA | CAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,154
SauriCas9 | NNGG | -3 | 1 | 21 | 21 | GTTTTAGTACTCTG | 10,055 | GAAA | CAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,155
SauriCas9-KKH | NNRG | -3 | 1 | 21 | 21 | GTTTTAGTACTCTG | 10,056 | GAAA | CAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,156
ScaCas9-Sc++ | NNG | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,057 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,157
SpyCas9 | NGG | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,058 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,158
SpyCas9_i_v1 | NGG | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,058 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGGACTTCGGTCCAAGTGGCACCGAGTCGGTGC | 10,193
SpyCas9_i_v2 | NGG | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,058 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGGAGCTTGCTCCAAGTGGCACCGAGTCGGTGC | 10,194
SpyCas9_i_v3 | NGG | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,058 | GAAA | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCGACTTGAAAAAGTCGCACCGAGTCGGTGC | 10,195
SpyCas9-NG | NG (NGG = NGA = NGT > NGC) | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,059 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,159
SpyCas9-SpRY | NRN > NYN | -3 | 1 | 20 | 20 | GTTTTAGAGCTA | 10,060 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,160
St1Cas9 | NNAGAAW > NNAGGAW = NNGGAAW | -3 | 1 | 20 | 20 | GTCTTTGTACTCTG | 10,061 | GTAC | CAGAAGCTACAAAGATAAGGCTTCATGCCGAAATCAACACCCTGTCATTTTATGGCAGGGTGTTTT | 10,161
BlatCas9 | NNNNCNAA > NNNNCNDD > NNNNC | -3 | 1 | 19 | 23 | GCTATAGTTCCTTACT | 10,062 | GAAA | GGTAAGTTGCTATAGTAAGGGCAACAGACCCGAGGCGTTGGGGATCGCCTAGCCCGTGTTTACGGGCTCTCCCCATATTCAAAATAATGACAGACGAGCACCTTGGAGCATTTATCTCCGAGGTGCT | 10,162
cCas9-v16 | NNVACT; NNVATGM; NNVATT; NNVGCT; NNVGTG; NNVGTT | -3 | 2 | 21 | 21 | GTCTTAGTACTCTG | 10,063 | GAAA | CAGAATCTACTAAGACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,163
cCas9-v17 | NNVRRN | -3 | 2 | 21 | 21 | GTCTTAGTACTCTG | 10,064 | GAAA | CAGAATCTACTAAGACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,164
cCas9-v21 | NNVACT; NNVATGM; NNVATT; NNVGCT; NNVGTG; NNVGTT | -3 | 2 | 21 | 21 | GTCTTAGTACTCTG | 10,065 | GAAA | CAGAATCTACTAAGACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,165
cCas9-v42 | NNVRRN | -3 | 2 | 21 | 21 | GTCTTAGTACTCTG | 10,066 | GAAA | CAGAATCTACTAAGACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGA | 10,166
CdiCas9 | NNRHHHY; NNRAAAY | | 2 | 22 | 22 | ACTGGGGTTCAG | 10,067 | GAAA | CTGAACCTCAGTAAGCATTGGCTCGTTTCCAATGTTGATTGCTCCGCCGGTGCTCCTTATTTTTAAGGGCGCCGGC | 10,167
CjeCas9 | NNNNRYAC | -3 | 2 | 21 | 23 | GTTTTAGTCCCT | 10,068 | GAAA | AGGGACTAAAATAAAGAGTTTGCGGGACTCTGCGGGGTTACAATCCCCTAAAACCGC | 10,168
GeoCas9 | NNNNCRAA | | 2 | 21 | 23 | GTCATAGTTCCCCTGA | 10,069 | GAAA | TCAGGGTTACTATGATAAGGGCTTTCTGCCTAAGGCAGACTGACCCGCGGCGTTGGGGATCGCCTGTCGCCCGCTTTTGGCGGGCATTCCCCATCCTT | 10,169
iSpyMacCas9 | NAAN | -3 | 2 | 19 | 21 | GTTTTAGAGCTA | 10,070 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,170
NmeCas9 | NNNNGAYT; NNNNGYTT; NNNNGAYA; NNNNGTCT | -3 | 2 | 20 | 24 | GTTGTAGCTCCCTTTCTCATTTCG | 10,071 | GAAA | CGAAATGAGAACCGTTGCTACAATAAGGCCGTCTGAAAAGATGTGCCGCAACGCTCTGCCCCTTAAAGCTTCTGCTTTAAGGGGCATCGTTTA | 10,171
ScaCas9 | NNG | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,072 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,172
ScaCas9-HiFi-Sc++ | NNG | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,073 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,173
SpyCas9-3var-NRRH | NRRH | -3 | 2 | 20 | 20 | GTTTAAGAGCTATGCTG | 10,074 | GAAA | CAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,174
SpyCas9-3var-NRTH | NRTH | -3 | 2 | 20 | 20 | GTTTAAGAGCTATGCTG | 10,075 | GAAA | CAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,175
SpyCas9-3var-NRCH | NRCH | -3 | 2 | 20 | 20 | GTTTAAGAGCTATGCTG | 10,076 | GAAA | CAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,176
SpyCas9-HF1 | NGG | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,077 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,177
SpyCas9-QQR1 | NAAG | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,078 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,178
SpyCas9-SpG | NGN | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,079 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,179
SpyCas9-VQR | NGAN | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,080 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,180
SpyCas9-VRER | NGCG | -3 | 2 | 20 | 20 | GTTTTAGAGCTA | 10,081 | GAAA | TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 10,181






SpyCas9-xCas
NG;GAA;
-3
2
20
20
GTTTAAGA
10,082
GAAA
CAGCATAGCAAGTTTAAATAAGGCTAGTC
10,182



GAT




GCTATGCT


CGTTATCAACTTGAAAAAGTGGCACCGAG









G


TCGGTGC






SpyCas9-
NG
-3
2
20
20
GTTTAAGA
10,083
GAAA
CAGCATAGCAAGTTTAAATAAGGCTAGTC
10,183


xCas-NG





GCTATGCT


CGTTATCAACTTGAAAAAGTGGCACCGAG









G


TCGGTGC






St1Cas9-
NNACAA
-3
2
20
20
GTCTTTGT
10,084
GTAC
CAGAAGCTACAAAGATAAGGCTTCATGCC
10,184


CNRZ1066





ACTCTG


GAAATCAACACCCTGTCATTTTATGGCAG












GGTGTTTT






St1Cas9-
NNGCAA
-3
2
20
20
GTCTTTGT
10,085
GTAC
CAGAAGCTACAAAGATAAGGCTTCATGCC
10,185


LMG1831





ACTCTG


GAAATCAACACCCTGTCATTTTATGGCAG












GGTGTTTT






St1Cas9-
NNAAAA
-3
2
20
20
GTCTTTGT
10,086
GTAC
CAGAAGCTACAAAGATAAGGCTTCATGCC
10,186


MTH17CL396





ACTCTG


GAAATCAACACCCTGTCATTTTATGGCAG












GGTGTTTT






St1Cas9-
NNGAAA
-3
2
20
20
GTCTTTGT
10,087
GTAC
CAGAAGCTACAAAGATAAGGCTTCATGCC
10,187


TH1477





ACTCTG


GAAATCAACACCCTGTCATTTTATGGCAG












GGTGTTTT






sRGN3.1
NNGG

1
21
23
GTTTTAGT
10,088
GAAA
CAGAATCTACTGAAACAAGACAATATGTC
10,188








ACTCTG


GTGTTTATCCCATCAATTTATTGGTGGGA












TTTT






sRGN3.3
NNGG

1
21
23
GTTTTAGT
10,089
GAAA
CAGAATCTACTGAAACAAGACAATATGTC
10,189








ACTCTG


GTGTTTATCCCATCAATTTATTGGTGGGA












TTTT









Herein, when an RNA sequence (e.g., a template RNA sequence) is said to comprise a particular sequence (e.g., a sequence of Table 12 or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table 12. More specifically, the present disclosure provides an RNA sequence according to every gRNA scaffold sequence of Table 12, wherein the RNA sequence has a U in place of each T in the sequence in Table 12. Additionally, it is understood that terminal Us and Ts may optionally be added or removed from tracrRNA sequences and may be modified or unmodified when provided as RNA. Without wishing to be bound by example, versions of gRNA scaffold sequences alternative to those exemplified in Table 12 may also function with the different Cas9 enzymes or derivatives thereof exemplified in Table 8, e.g., alternate gRNA scaffold sequences with nucleotide additions, substitutions, or deletions, e.g., sequences with stem-loop structures added or removed. It is contemplated herein that the gRNA scaffold sequences represent a component of gene modifying systems that can be similarly optimized for a given system, Cas-RT fusion polypeptide, indication, target mutation, template RNA, or delivery vehicle.
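The T-to-U convention described above can be illustrated with a minimal sketch. The helper name dna_to_rna is hypothetical, and the three fragments are those listed for SpyCas9-SpRY in Table 12, joined here in table order (treating that order as the scaffold order is an assumption made for illustration).

```python
def dna_to_rna(seq: str) -> str:
    """Return the RNA form of a sequence written with T, replacing every T with U."""
    return seq.translate(str.maketrans("Tt", "Uu"))


# Fragments as listed for SpyCas9-SpRY in Table 12 (joining order is an assumption).
crrna_part = "GTTTTAGA" + "GCTA"
tetraloop = "GAAA"
tracrrna_part = (
    "TAGCAAGTTAAAATAAGGCTAGTCCGTTA"
    + "TCAACTTGAAAAAGTGGCACCGAGTCGGT"
    + "GC"
)

# The same scaffold written as RNA, with U at every position shown as T.
scaffold_rna = dna_to_rna(crrna_part + tetraloop + tracrrna_part)
print(scaffold_rna)
```

The conversion does not add or remove terminal Us; any such trimming would be a separate, optional step.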


Heterologous Object Sequence

A template RNA described herein may comprise a heterologous object sequence that the gene modifying polypeptide can use as a template for reverse transcription, to write a desired sequence into the target nucleic acid. In some embodiments, the heterologous object sequence comprises, from 5′ to 3′, a post-edit homology region, the mutation region, and a pre-edit homology region. Without wishing to be bound by theory, an RT performing reverse transcription on the template RNA first reverse transcribes the pre-edit homology region, then the mutation region, and then the post-edit homology region, thereby creating a DNA strand comprising the desired mutation with a homology region on either side.
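The order of synthesis described above can be sketched as follows. This is an illustrative model only: the three region sequences are hypothetical toy values, and reverse_transcribe is a hypothetical helper that simply returns the complementary DNA strand written 5′ to 3′.

```python
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(template_rna: str) -> str:
    """Return the cDNA (written 5'->3') obtained by copying the RNA from its 3' end."""
    return template_rna.translate(RNA_TO_CDNA)[::-1]


# Hypothetical toy regions, listed 5'->3' as they sit in the template RNA.
post_edit_homology = "GGAC"
mutation_region = "U"
pre_edit_homology = "CACCU"
heterologous_object = post_edit_homology + mutation_region + pre_edit_homology

cdna = reverse_transcribe(heterologous_object)

# The RT starts at the 3' end of the template, so the 5' end of the new DNA
# corresponds to the pre-edit homology region, followed by the mutation
# region, and finally the post-edit homology region.
first = cdna[: len(pre_edit_homology)]
middle = cdna[len(pre_edit_homology) : len(pre_edit_homology) + len(mutation_region)]
last = cdna[len(pre_edit_homology) + len(mutation_region) :]
print(first, middle, last)
```

In this sketch the resulting DNA strand carries the copied mutation flanked by a homology region on either side, mirroring the arrangement described in the paragraph above.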


In some embodiments, the heterologous object sequence is at least 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 120, 140, 160, 180, 200, 500, or 1,000 nucleotides (nts) in length, or at least 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, or 10 kilobases in length. In some embodiments, the heterologous object sequence is no more than 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 120, 140, 160, 180, 200, 500, 1,000, or 2,000 nucleotides (nts) in length, or no more than 20, 15, 10, 9, 8, 7, 6, 5, 4, or 3 kilobases in length.
In some embodiments, the heterologous object sequence is 30-1000, 40-1000, 50-1000, 60-1000, 70-1000, 74-1000, 75-1000, 76-1000, 77-1000, 78-1000, 79-1000, 80-1000, 85-1000, 90-1000, 100-1000, 120-1000, 140-1000, 160-1000, 180-1000, 200-1000, 500-1000, 30-500, 40-500, 50-500, 60-500, 70-500, 74-500, 75-500, 76-500, 77-500, 78-500, 79-500, 80-500, 85-500, 90-500, 100-500, 120-500, 140-500, 160-500, 180-500, 200-500, 30-200, 40-200, 50-200, 60-200, 70-200, 74-200, 75-200, 76-200, 77-200, 78-200, 79-200, 80-200, 85-200, 90-200, 100-200, 120-200, 140-200, 160-200, 180-200, 30-100, 40-100, 50-100, 60-100, 70-100, 74-100, 75-100, 76-100, 77-100, 78-100, 79-100, 80-100, 85-100, or 90-100 nucleotides (nts) in length, or 1-20, 1-15, 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, 1-2, 2-20, 2-15, 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, 2-4, 2-3, 3-20, 3-15, 3-10, 3-9, 3-8, 3-7, 3-6, 3-5, 3-4, 4-20, 4-15, 4-10, 4-9, 4-8, 4-7, 4-6, 4-5, 5-20, 5-15, 5-10, 5-9, 5-8, 5-7, 5-6, 6-20, 6-15, 6-10, 6-9, 6-8, 6-7, 7-20, 7-15, 7-10, 7-9, 7-8, 8-20, 8-15, 8-10, 8-9, 9-20, 9-15, 9-10, 10-15, 10-20, or 15-20 kilobases in length. In some embodiments, the heterologous object sequence is 10-100, 10-90, 10-80, 10-70, 10-60, 10-50, 10-40, 10-30, or 10-20 nt in length, e.g., 10-80, 10-50, or 10-20 nt in length, e.g., about 10-20 nt in length. In some embodiments, the heterologous object sequence is 8-30, 9-25, 10-20, 11-16, or 12-15 nucleotides in length, e.g., is 11-16 nt in length. Without wishing to be bound by theory, in some embodiments, a larger insertion size, larger region of editing (e.g., the distance between a first edit/substitution and a second edit/substitution in the target region), and/or greater number of desired edits (e.g., mismatches of the heterologous object sequence to the target genome), may result in a longer optimal heterologous object sequence.


In certain embodiments, the template nucleic acid comprises a customized RNA sequence template which can be identified, designed, engineered and constructed to contain sequences altering or specifying host genome function, for example by introducing a heterologous coding region into a genome; affecting or causing exon structure/alternative splicing, e.g., leading to exon skipping of one or more exons; causing disruption of an endogenous gene, e.g., creating a genetic knockout; causing transcriptional activation of an endogenous gene; causing epigenetic regulation of an endogenous DNA; causing up-regulation of one or more operably linked genes, e.g., leading to gene activation or overexpression; causing down-regulation of one or more operably linked genes, e.g., creating a genetic knock-down; etc. In certain embodiments, a customized RNA sequence template can be engineered to contain sequences coding for exons and/or transgenes, provide binding sites for transcription factor activators, repressors, enhancers, etc., and combinations thereof. In some embodiments, a customized template can be engineered to encode a nucleic acid or peptide tag to be expressed in an endogenous RNA transcript or endogenous protein operably linked to the target site. In other embodiments, the coding sequence can be further customized with splice donor sites, splice acceptor sites, or poly-A tails.


The template nucleic acid (e.g., template RNA) of the system typically comprises an object sequence (e.g., a heterologous object sequence) for writing a desired sequence into a target DNA. The object sequence may be coding or non-coding. The template nucleic acid (e.g., template RNA) can be designed to result in insertions, mutations, or deletions at the target DNA locus. In some embodiments, the template nucleic acid (e.g., template RNA) may be designed to cause an insertion in the target DNA. For example, the template nucleic acid (e.g., template RNA) may contain a heterologous sequence, wherein the reverse transcription will result in insertion of the heterologous sequence into the target DNA. In other embodiments, the RNA template may be designed to introduce a deletion into the target DNA. For example, the template nucleic acid (e.g., template RNA) may match the target DNA upstream and downstream of the desired deletion, wherein the reverse transcription will result in the copying of the upstream and downstream sequences from the template nucleic acid (e.g., template RNA) without the intervening sequence, e.g., causing deletion of the intervening sequence. In other embodiments, the template nucleic acid (e.g., template RNA) may be designed to introduce an edit into the target DNA. For example, the template RNA may match the target DNA sequence with the exception of one or more nucleotides, wherein the reverse transcription will result in the copying of these edits into the target DNA, e.g., resulting in mutations, e.g., transition or transversion mutations.
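The three design outcomes described above (insertion, deletion, and substitution) can be sketched at the level of the desired DNA product. This is a minimal illustration, not a design tool: apply_edit and mutation_class are hypothetical helper names, the target fragment and coordinates are toy values, and the transition/transversion rule used is the standard purine/pyrimidine definition.

```python
PURINES = {"A", "G"}


def apply_edit(target: str, start: int, end: int, replacement: str) -> str:
    """Return the edited sequence: target[start:end] is replaced by `replacement`.

    Insertion: start == end. Deletion: replacement == "".
    Substitution: len(replacement) == end - start.
    """
    return target[:start] + replacement + target[end:]


def mutation_class(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt are identical")
    same_ring_type = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring_type else "transversion"


target = "CTCCTGTGGAGAAG"  # toy target fragment (illustrative only)

# Substitution: the template matches the target except at one position.
substituted = apply_edit(target, 6, 7, "A")
# Insertion: a heterologous sequence is written between two target positions.
inserted = apply_edit(target, 6, 6, "AAA")
# Deletion: upstream and downstream sequence is copied without the intervening bases.
deleted = apply_edit(target, 4, 7, "")

print(substituted, inserted, deleted, mutation_class(target[6], "A"))
```

Combined alterations (e.g., a simultaneous deletion and substitution) follow from the same function by choosing a replacement whose length differs from the replaced span.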


In some embodiments, writing of an object sequence into a target site results in the substitution of nucleotides, e.g., where the full length of the object sequence corresponds to a matching length of the target site with one or more mismatched bases. In some embodiments, a heterologous object sequence may be designed such that a combination of sequence alterations may occur, e.g., a simultaneous addition and deletion, addition and substitution, or deletion and substitution.


In some embodiments, the heterologous object sequence may contain an open reading frame or a fragment of an open reading frame. In some embodiments the heterologous object sequence has a Kozak sequence. In some embodiments the heterologous object sequence has an internal ribosome entry site. In some embodiments the heterologous object sequence has a self-cleaving peptide such as a T2A or P2A site. In some embodiments the heterologous object sequence has a start codon. In some embodiments the template RNA has a splice acceptor site. In some embodiments the template RNA has a splice donor site. Exemplary splice acceptor and splice donor sites are described in WO2016044416, incorporated herein by reference in its entirety. Exemplary splice acceptor site sequences are known to those of skill in the art. In some embodiments the template RNA has a microRNA binding site downstream of the stop codon. In some embodiments the template RNA has a poly A tail downstream of the stop codon of an open reading frame. In some embodiments the template RNA comprises one or more exons. In some embodiments the template RNA comprises one or more introns. In some embodiments the template RNA comprises a eukaryotic transcriptional terminator. In some embodiments the template RNA comprises an enhanced translation element or a translation enhancing element. In some embodiments the RNA comprises the human T-cell leukemia virus (HTLV-1) R region. In some embodiments the RNA comprises a posttranscriptional regulatory element that enhances nuclear export, such as that of Hepatitis B Virus (HPRE) or Woodchuck Hepatitis Virus (WPRE).


In some embodiments, the heterologous object sequence may contain a non-coding sequence. For example, the template nucleic acid (e.g., template RNA) may comprise a regulatory element, e.g., a promoter or enhancer sequence or miRNA binding site. In some embodiments, integration of the object sequence at a target site will result in upregulation of an endogenous gene. In some embodiments, integration of the object sequence at a target site will result in downregulation of an endogenous gene. In some embodiments the template nucleic acid (e.g., template RNA) comprises a tissue specific promoter or enhancer, each of which may be unidirectional or bidirectional. In some embodiments the promoter is an RNA polymerase I promoter, RNA polymerase II promoter, or RNA polymerase III promoter. In some embodiments the promoter comprises a TATA element. In some embodiments the promoter comprises a B recognition element. In some embodiments the promoter has one or more binding sites for transcription factors.


In some embodiments, the template nucleic acid (e.g., template RNA) comprises a site that coordinates epigenetic modification. In some embodiments, the template nucleic acid (e.g., template RNA) comprises a chromatin insulator. For example, the template nucleic acid (e.g., template RNA) comprises a CTCF site or a site targeted for DNA methylation.


In some embodiments, the template nucleic acid (e.g., template RNA) comprises a gene expression unit composed of at least one regulatory region operably linked to an effector sequence. The effector sequence may be a sequence that is transcribed into RNA (e.g., a coding sequence or a non-coding sequence such as a sequence encoding a micro RNA).


In some embodiments, the heterologous object sequence of the template nucleic acid (e.g., template RNA) is inserted into a target genome in an endogenous intron. In some embodiments, the heterologous object sequence of the template nucleic acid (e.g., template RNA) is inserted into a target genome and thereby acts as a new exon. In some embodiments, the insertion of the heterologous object sequence into the target genome results in replacement of a natural exon or the skipping of a natural exon.


The template nucleic acid (e.g., template RNA) can be designed to result in insertions, mutations, or deletions at the target DNA locus. In some embodiments, the template nucleic acid (e.g., template RNA) may be designed to cause an insertion in the target DNA. For example, the template nucleic acid (e.g., template RNA) may contain a heterologous object sequence, wherein the reverse transcription will result in insertion of the heterologous object sequence into the target DNA. In other embodiments, the RNA template may be designed to write a deletion into the target DNA. For example, the template nucleic acid (e.g., template RNA) may match the target DNA upstream and downstream of the desired deletion, wherein the reverse transcription will result in the copying of the upstream and downstream sequences from the template nucleic acid (e.g., template RNA) without the intervening sequence, e.g., causing deletion of the intervening sequence. In other embodiments, the template nucleic acid (e.g., template RNA) may be designed to write an edit into the target DNA. For example, the template RNA may match the target DNA sequence with the exception of one or more nucleotides, wherein the reverse transcription will result in the copying of these edits into the target DNA, e.g., resulting in mutations, e.g., transition or transversion mutations.


In some embodiments, the pre-edit homology domain comprises a nucleic acid sequence having 100% sequence identity with a nucleic acid sequence comprised in a target nucleic acid molecule.


In some embodiments, the post-edit homology domain comprises a nucleic acid sequence having 100% sequence identity with a nucleic acid sequence comprised in a target nucleic acid molecule.


PBS Sequence

In some embodiments, a template nucleic acid (e.g., template RNA) comprises a PBS sequence. In some embodiments, a PBS sequence is disposed 3′ of the heterologous object sequence and is complementary to a sequence adjacent to a site to be modified by a system described herein, or comprises no more than 1, 2, 3, 4, or 5 mismatches to a sequence complementary to the sequence adjacent to a site to be modified by the system/gene modifying polypeptide. In some embodiments, the PBS sequence binds within 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of a nick site in the target nucleic acid molecule. In some embodiments, binding of the PBS sequence to the target nucleic acid molecule permits initiation of target-primed reverse transcription (TPRT), e.g., with the 3′ homology domain acting as a primer for TPRT. In some embodiments, the PBS sequence is 3-5, 5-10, 10-30, 10-25, 10-20, 10-19, 10-18, 10-17, 10-16, 10-15, 10-14, 10-13, 10-12, 10-11, 11-30, 11-25, 11-20, 11-19, 11-18, 11-17, 11-16, 11-15, 11-14, 11-13, 11-12, 12-30, 12-25, 12-20, 12-19, 12-18, 12-17, 12-16, 12-15, 12-14, 12-13, 13-30, 13-25, 13-20, 13-19, 13-18, 13-17, 13-16, 13-15, 13-14, 14-30, 14-25, 14-20, 14-19, 14-18, 14-17, 14-16, 14-15, 15-30, 15-25, 15-20, 15-19, 15-18, 15-17, 15-16, 16-30, 16-25, 16-20, 16-19, 16-18, 16-17, 17-30, 17-25, 17-20, 17-19, 17-18, 18-30, 18-25, 18-20, 18-19, 19-30, 19-25, 19-20, 20-30, 20-25, or 25-30 nucleotides in length, e.g., 10-17, 12-16, or 12-14 nucleotides in length. In some embodiments, the PBS sequence is 5-20, 8-16, 8-14, 8-13, 9-13, 9-12, or 10-12 nucleotides in length, e.g., 9-12 nucleotides in length.
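The complementarity and mismatch limits described above can be checked with a short sketch. The function names, the example primer-site sequence, and the default thresholds are hypothetical choices for illustration; the 8-16 nucleotide window and the five-mismatch ceiling are taken from the ranges recited in the paragraph above.

```python
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def reverse_complement_rna(dna: str) -> str:
    """Return the RNA that is perfectly complementary to `dna`, written 5'->3'."""
    return dna.translate(DNA_TO_RNA_COMPLEMENT)[::-1]


def pbs_mismatches(pbs_rna: str, primer_site_dna: str) -> int:
    """Count positions where the PBS differs from a perfectly complementary PBS."""
    ideal = reverse_complement_rna(primer_site_dna)
    if len(ideal) != len(pbs_rna):
        raise ValueError("PBS and primer site must have the same length")
    return sum(a != b for a, b in zip(pbs_rna, ideal))


def pbs_is_acceptable(pbs_rna: str, primer_site_dna: str,
                      min_len: int = 8, max_len: int = 16,
                      max_mismatches: int = 5) -> bool:
    """Apply one length window and mismatch ceiling (assumed defaults)."""
    if not min_len <= len(pbs_rna) <= max_len:
        return False
    return pbs_mismatches(pbs_rna, primer_site_dna) <= max_mismatches


# Hypothetical 12-nt stretch of target DNA adjacent to a nick site.
primer_site = "GCATCTGACTCC"
perfect_pbs = reverse_complement_rna(primer_site)
print(perfect_pbs, pbs_mismatches(perfect_pbs, primer_site))
```

A stricter design could lower max_mismatches to 0 or narrow the length window, e.g., to the 9-12 nucleotide range mentioned above.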


The template nucleic acid (e.g., template RNA) may have some homology to the target DNA. In some embodiments, the template nucleic acid (e.g., template RNA) PBS sequence domain may serve as an annealing region to the target DNA, such that the target DNA is positioned to prime the reverse transcription of the template nucleic acid (e.g., template RNA). In some embodiments the template nucleic acid (e.g., template RNA) has at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 175, 200 or more bases of exact homology to the target DNA at the 3′ end of the RNA. In some embodiments the template nucleic acid (e.g., template RNA) has at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 175, 200 or more bases of at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% homology to the target DNA, e.g., at the 5′ end of the template nucleic acid (e.g., template RNA).


Exemplary Template Sequences

In some embodiments of the systems and methods herein, the template RNA comprises a gRNA spacer comprising the core nucleotides of a gRNA spacer sequence of Table 1. In some embodiments, the gRNA spacer additionally comprises one or more (e.g., 2, 3, or all) consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the gRNA spacer. In some embodiments, the template RNA comprising a sequence of Table 1 is comprised by a system that further comprises a gene modifying polypeptide having an RT domain listed in the same line of Table 1. RT domain amino acid sequences can be found, e.g., in Table 6 herein.









TABLE 1

Exemplary gRNA spacer Cas pairs

Table 1 provides a gRNA database for correcting the pathogenic E6V mutation in HBB. It lists spacers, PAMs, and Cas variants for generating a nick at an appropriate position to enable installation of a desired genomic edit with a gene modifying system. The spacers in this table are designed to be used with a gene modifying polypeptide comprising a nickase variant of the Cas species indicated in the table. Tables 2, 3, and 4 detail the other components of the system and are organized such that the ID number shown here in Column 1 (“ID”) is meant to correspond to the same ID number in the subsequent tables.














PAM

SEQ
Cas

Overlaps


ID
sequence
gRNA spacer
ID NO
species
distance
mutation
















1
AGAAG
TGGTGCATCTGACTCCTGTGG
16917
SauCas9KKH
0
0





2
AGAAGT
TGGTGCATCTGACTCCTGTGG
16918
SauCas9KKH
0
0





3
AGAAGT
TGGTGCATCTGACTCCTGTGG
16919
cCas9-v17
0
0





4
AGAAGT
TGGTGCATCTGACTCCTGTGG
16920
cCas9-v42
0
0





5
AG
GGTGCATCTGACTCCTGTGG
16921
SpyCas9-NG
0
0





6
AG
GGTGCATCTGACTCCTGTGG
16922
SpyCas9-
0
0






xCas







7
AG
GGTGCATCTGACTCCTGTGG
16923
SpyCas9-
0
0






xCas-NG







8
AGA
GGTGCATCTGACTCCTGTGG
16924
SpyCas9-
0
0






SpG







9
AGA
GGTGCATCTGACTCCTGTGG
16925
SpyCas9-
0
0






SpRY







10
AGAAGTCT
ccatGGTGCATCTGACTCCTGTGG
16926
NmeCas9
0
0





11
AGAA
GGTGCATCTGACTCCTGTGG
16927
SpyCas9-
0
0






3var-NRRH







12
AGAA
GGTGCATCTGACTCCTGTGG
16928
SpyCas9-
0
0






VQR







13
GAGAA
ccATGGTGCATCTGACTCCTGTG
16929
SauCas9
1
0





14
GAGAA
ATGGTGCATCTGACTCCTGTG
16930
SauCas9KKH
1
0





15
AGGAG
gcAGTAACGGCAGACTTCTCCAC
16931
SauCas9
1
0





16
AGGAG
AGTAACGGCAGACTTCTCCAC
16932
SauCas9KKH
1
0





17
AGGAGT
gcAGTAACGGCAGACTTCTCCAC
16933
SauCas9
1
0





18
AGGAGT
AGTAACGGCAGACTTCTCCAC
16934
SauCas9KKH
1
0





19
AGGAGT
AGTAACGGCAGACTTCTCCAC
16935
cCas9-v17
1
0





20
AGGAGT
AGTAACGGCAGACTTCTCCAC
16936
cCas9-v42
1
0





21
GAG
TGGTGCATCTGACTCCTGTG
16937
ScaCas9
1
0





22
GAG
TGGTGCATCTGACTCCTGTG
16938
ScaCas9-
1
0






HiFi-Sc++







23
GAG
TGGTGCATCTGACTCCTGTG
16939
ScaCas9-
1
0






Sc++







24
GAG
TGGTGCATCTGACTCCTGTG
16940
SpyCas9-
1
0






SpRY







25
AGG
GTAACGGCAGACTTCTCCAC
16941
ScaCas9
1
0





26
AGG
GTAACGGCAGACTTCTCCAC
16942
ScaCas9-
1
0






HiFi-Sc++







27
AGG
GTAACGGCAGACTTCTCCAC
16943
ScaCas9-
1
0






Sc++







28
AGG
GTAACGGCAGACTTCTCCAC
16944
SpyCas9
1
0





29
AGG
GTAACGGCAGACTTCTCCAC
16945
SpyCas9-
1
0






HF1







30
AGG
GTAACGGCAGACTTCTCCAC
16946
SpyCas9-
1
0






SpG







31
AGG
GTAACGGCAGACTTCTCCAC
16947
SpyCas9-
1
0






SpRY







32
AG
GTAACGGCAGACTTCTCCAC
16948
SpyCas9-NG
1
0





33
AG
GTAACGGCAGACTTCTCCAC
16949
SpyCas9-
1
0






xCas







34
AG
GTAACGGCAGACTTCTCCAC
16950
SpyCas9-
1
0






xCas-NG







35
GAGAAG
ATGGTGCATCTGACTCCTGTG
16951
cCas9-v17
1
0





36
GAGAAG
ATGGTGCATCTGACTCCTGTG
16952
cCas9-v42
1
0





37
GAGA
TGGTGCATCTGACTCCTGTG
16953
SpyCas9-
1
0






3var-NRRH







38
AGGA
GTAACGGCAGACTTCTCCAC
16954
SpyCas9-
1
0






3var-NRRH







39
CAGGA
ggCAGTAACGGCAGACTTCTCCA
16955
SauCas9
2
0





40
CAGGA
CAGTAACGGCAGACTTCTCCA
16956
SauCas9KKH
2
0





41
GGAGA
CATGGTGCATCTGACTCCTGT
16957
SauCas9KKH
2
0





42
CAGG
CAGTAACGGCAGACTTCTCCA
16958
SauriCas9
2
0





43
CAGG
CAGTAACGGCAGACTTCTCCA
16959
SauriCas9-
2
0






KKH







44
GGAG
CATGGTGCATCTGACTCCTGT
16960
SauriCas9-
2
0






KKH







45
GGAG
ATGGTGCATCTGACTCCTGT
16961
SpyCas9-
2
0






VQR







46
CAG
AGTAACGGCAGACTTCTCCA
16962
ScaCas9
2
0





47
CAG
AGTAACGGCAGACTTCTCCA
16963
ScaCas9-
2
0






HiFi-Sc++







48
CAG
AGTAACGGCAGACTTCTCCA
16964
ScaCas9-
2
0






Sc++







49
CAG
AGTAACGGCAGACTTCTCCA
16965
SpyCas9-
2
0






SpRY







50
GG
ATGGTGCATCTGACTCCTGT
16966
SpyCas9-NG
2
0





51
GG
ATGGTGCATCTGACTCCTGT
16967
SpyCas9-
2
0






xCas







52
GG
ATGGTGCATCTGACTCCTGT
16968
SpyCas9-
2
0






xCas-NG







53
GGA
ATGGTGCATCTGACTCCTGT
16969
SpyCas9-
2
0






SpG







54
GGA
ATGGTGCATCTGACTCCTGT
16970
SpyCas9-
2
0






SpRY







55
GGAGAA
CATGGTGCATCTGACTCCTGT
16971
cCas9-v17
2
0





56
GGAGAA
CATGGTGCATCTGACTCCTGT
16972
cCas9-v42
2
0





57
CAGGAG
CAGTAACGGCAGACTTCTCCA
16973
cCas9-v17
2
0





58
CAGGAG
CAGTAACGGCAGACTTCTCCA
16974
cCas9-v42
2
0





59
tGGAG
caCCATGGTGCATCTGACTCCTG
16975
SauCas9
3
1





60
tGGAG
CCATGGTGCATCTGACTCCTG
16976
SauCas9KKH
3
1





61
aCAGG
GCAGTAACGGCAGACTTCTCC
16977
SauCas9KKH
3
1





62
aCAG
GCAGTAACGGCAGACTTCTCC
16978
SauriCas9-
3
1






KKH







63
tGG
CATGGTGCATCTGACTCCTG
16979
ScaCas9
3
1





64
tGG
CATGGTGCATCTGACTCCTG
16980
ScaCas9-
3
1






HiFi-Sc++







65
tGG
CATGGTGCATCTGACTCCTG
16981
ScaCas9-
3
1






Sc++







66
tGG
CATGGTGCATCTGACTCCTG
16982
SpyCas9
3
1





67
tGG
CATGGTGCATCTGACTCCTG
16983
SpyCas9-
3
1






HF1







68
tGG
CATGGTGCATCTGACTCCTG
16984
SpyCas9-
3
1






SpG







69
tGG
CATGGTGCATCTGACTCCTG
16985
SpyCas9-
3
1






SpRY







70
tG
CATGGTGCATCTGACTCCTG
16986
SpyCas9-NG
3
1





71
tG
CATGGTGCATCTGACTCCTG
16987
SpyCas9-
3
1






xCas







72
tG
CATGGTGCATCTGACTCCTG
16988
SpyCas9-
3
1






xCas-NG







73
aCA
CAGTAACGGCAGACTTCTCC
16989
SpyCas9-
3
1






SpRY







74
tGGAGA
CCATGGTGCATCTGACTCCTG
16990
cCas9-v17
3
1





75
tGGAGA
CCATGGTGCATCTGACTCCTG
16991
cCas9-v42
3
1





76
aCAGGA
GCAGTAACGGCAGACTTCTCC
16992
cCas9-v17
3
1





77
aCAGGA
GCAGTAACGGCAGACTTCTCC
16993
cCas9-v42
3
1





78
tGGA
CATGGTGCATCTGACTCCTG
16994
SpyCas9-
3
1






3var-NRRH







79
GtGGA
acACCATGGTGCATCTGACTCCT
16995
SauCas9
4
1





80
GtGGA
ACCATGGTGCATCTGACTCCT
16996
SauCas9KKH
4
1





81
CaCAG
GGCAGTAACGGCAGACTTCTC
16997
SauCas9KKH
4
1





82
GtGG
ACCATGGTGCATCTGACTCCT
16998
SauriCas9
4
1





83
GtGG
ACCATGGTGCATCTGACTCCT
16999
SauriCas9-
4
1






KKH







84
GtG
CCATGGTGCATCTGACTCCT
17000
ScaCas9
4
1





85
GtG
CCATGGTGCATCTGACTCCT
17001
ScaCas9-
4
1






HiFi-Sc++







86
GtG
CCATGGTGCATCTGACTCCT
17002
ScaCas9-
4
1






Sc++







87
GtG
CCATGGTGCATCTGACTCCT
17003
SpyCas9-
4
1






SpRY







88
CaC
GCAGTAACGGCAGACTTCTC
17004
SpyCas9-
4
1






SpRY







89
GtGGAG
ACCATGGTGCATCTGACTCCT
17005
cCas9-v17
4
1





90
GtGGAG
ACCATGGTGCATCTGACTCCT
17006
cCas9-v42
4
1





91
CaCAGG
GGCAGTAACGGCAGACTTCTC
17007
cCas9-v17
4
1





92
CaCAGG
GGCAGTAACGGCAGACTTCTC
17008
cCas9-v42
4
1





93
CaCA
GCAGTAACGGCAGACTTCTC
17009
SpyCas9-
4
1






3var-NRCH







94
TGtGG
CACCATGGTGCATCTGACTCC
17010
SauCas9KKH
5
1





95
TG
ACCATGGTGCATCTGACTCC
17011
SpyCas9-NG
5
0





96
TG
ACCATGGTGCATCTGACTCC
17012
SpyCas9-
5
0






xCas







97
TG
ACCATGGTGCATCTGACTCC
17013
SpyCas9-
5
0






xCas-NG







98
TGt
ACCATGGTGCATCTGACTCC
17014
SpyCas9-
5
1






SpG







99
TGt
ACCATGGTGCATCTGACTCC
17015
SpyCas9-
5
1






SpRY







100
CCa
GGCAGTAACGGCAGACTTCT
17016
SpyCas9-
5
1






SpRY







101
CTG
CACCATGGTGCATCTGACTC
17017
ScaCas9
6
0





102
CTG
CACCATGGTGCATCTGACTC
17018
ScaCas9-
6
0






HiFi-Sc++







103
CTG
CACCATGGTGCATCTGACTC
17019
ScaCas9-
6
0






Sc++







104
CTG
CACCATGGTGCATCTGACTC
17020
SpyCas9-
6
0






SpRY







105
TCC
GGGCAGTAACGGCAGACTTC
17021
SpyCas9-
6
0






SpRY







106
TCCaCAGG
acagGGCAGTAACGGCAGACTTC
17022
BlatCas9
6
1





107
TCCaC
acagGGCAGTAACGGCAGACTTC
17023
BlatCas9
6
1





108
CTC
AGGGCAGTAACGGCAGACTT
17024
SpyCas9-
7
0






SpRY







109
CCT
ACACCATGGTGCATCTGACT
17025
SpyCas9-
7
0






SpRY







110
TCT
CAGGGCAGTAACGGCAGACT
17026
SpyCas9-
8
0






SpRY







111
TCC
GACACCATGGTGCATCTGAC
17027
SpyCas9-
8
0






SpRY







112
TCTCC
ccacAGGGCAGTAACGGCAGACT
17028
BlatCas9
8
0





113
TTCTCC
ccCCACAGGGCAGTAACGGCAG
17029
Nme2Cas9
9
0




AC









114
TTC
ACAGGGCAGTAACGGCAGAC
17030
SpyCas9-
9
0






SpRY







115
CTC
AGACACCATGGTGCATCTGA
17031
SpyCas9-
9
0






SpRY







116
TTCTC
cccaCAGGGCAGTAACGGCAGAC
17032
BlatCas9
9
0





117
CTT
CACAGGGCAGTAACGGCAGA
17033
SpyCas9-
10
0






SpRY







118
ACT
CAGACACCATGGTGCATCTG
17034
SpyCas9-
10
0






SpRY







119
ACTCCTGt
aaacAGACACCATGGTGCATCTG
17035
BlatCas9
10
1





120
ACTCC
aaacAGACACCATGGTGCATCTG
17036
BlatCas9
10
0





121
GACTCC
tcAAACAGACACCATGGTGCATCT
17037
Nme2Cas9
11
0





122
GAC
ACAGACACCATGGTGCATCT
17038
SpyCas9-
11
0






SpRY







123
ACT
CCACAGGGCAGTAACGGCAG
17039
SpyCas9-
11
0






SpRY







124
GACTCCTG
caaaCAGACACCATGGTGCATCT
17040
BlatCas9
11
0





125
ACTTC
gcccCACAGGGCAGTAACGGCAG
17041
BlatCas9
11
0





126
GACTC
caaaCAGACACCATGGTGCATCT
17042
BlatCas9
11
0





127
GACT
ACAGACACCATGGTGCATCT
17043
SpyCas9-
11
0






3var-NRCH







128
TG
AACAGACACCATGGTGCATC
17044
SpyCas9-NG
12
0





129
TG
AACAGACACCATGGTGCATC
17045
SpyCas9-
12
0






xCas







130
TG
AACAGACACCATGGTGCATC
17046
SpyCas9-
12
0






xCas-NG







131
GAC
CCCACAGGGCAGTAACGGCA
17047
SpyCas9-
12
0






SpRY







132
TGA
AACAGACACCATGGTGCATC
17048
SpyCas9-
12
0






SpG







133
TGA
AACAGACACCATGGTGCATC
17049
SpyCas9-
12
0






SpRY







134
TGACTCC
CAAACAGACACCATGGTGCATC
17050
CdiCas9
12
0





135
TGAC
AACAGACACCATGGTGCATC
17051
SpyCas9-
12
0






3var-NRRH







136
TGAC
AACAGACACCATGGTGCATC
17052
SpyCas9-
12
0






VQR







137
GACT
CCCACAGGGCAGTAACGGCA
17053
SpyCas9-
12
0






3var-NRCH







138
CTG
AAACAGACACCATGGTGCAT
17054
ScaCas9
13
0





139
CTG
AAACAGACACCATGGTGCAT
17055
ScaCas9-
13
0






HiFi-Sc++







140
CTG
AAACAGACACCATGGTGCAT
17056
ScaCas9-
13
0






Sc++







141
CTG
AAACAGACACCATGGTGCAT
17057
SpyCas9-
13
0






SpRY







142
AG
CCCCACAGGGCAGTAACGGC
17058
SpyCas9-NG
13
0





143
AG
CCCCACAGGGCAGTAACGGC
17059
SpyCas9-
13
0






xCas







144
AG
CCCCACAGGGCAGTAACGGC
17060
SpyCas9-
13
0






xCas-NG







145
AGA
CCCCACAGGGCAGTAACGGC
17061
SpyCas9-
13
0






SpG







146
AGA
CCCCACAGGGCAGTAACGGC
17062
SpyCas9-
13
0






SpRY







147
CTGAC
ctcaAACAGACACCATGGTGCAT
17063
BlatCas9
13
0





148
CTGACT
CAAACAGACACCATGGTGCAT
17064
cCas9-v16
13
0





149
CTGACT
CAAACAGACACCATGGTGCAT
17065
cCas9-v21
13
0





150
AGACTTC
TGCCCCACAGGGCAGTAACGGC
17066
CdiCas9
13
0





151
CTGACTC
TCAAACAGACACCATGGTGCAT
17067
CdiCas9
13
0





152
AGAC
CCCCACAGGGCAGTAACGGC
17068
SpyCas9-
13
0






3var-NRRH







153
AGAC
CCCCACAGGGCAGTAACGGC
17069
SpyCas9-
13
0






VQR







154
TCTGA
TCAAACAGACACCATGGTGCA
17070
SauCas9KKH
14
0





155
CAG
GCCCCACAGGGCAGTAACGG
17071
ScaCas9
14
0





156
CAG
GCCCCACAGGGCAGTAACGG
17072
ScaCas9-
14
0






HiFi-Sc++







157
CAG
GCCCCACAGGGCAGTAACGG
17073
ScaCas9-
14
0






Sc++







158
CAG
GCCCCACAGGGCAGTAACGG
17074
SpyCas9-
14
0






SpRY







159
TCT
CAAACAGACACCATGGTGCA
17075
SpyCas9-
14
0






SpRY







160
CAGAC
cttgCCCCACAGGGCAGTAACGG
17076
BlatCas9
14
0





161
CAGACT
TGCCCCACAGGGCAGTAACGG
17077
cCas9-v16
14
0





162
CAGACT
TGCCCCACAGGGCAGTAACGG
17078
cCas9-v21
14
0





163
CAGACTT
TTGCCCCACAGGGCAGTAACGG
17079
CdiCas9
14
0





164
CAGA
GCCCCACAGGGCAGTAACGG
17080
SpyCas9-
14
0






3var-NRRH







165
GCAGA
TTGCCCCACAGGGCAGTAACG
17081
SauCas9KKH
15
0





166
GCAG
TTGCCCCACAGGGCAGTAACG
17082
SauriCas9-
15
0






KKH







167
GCA
TGCCCCACAGGGCAGTAACG
17083
SpyCas9-
15
0






SpRY







168
ATC
TCAAACAGACACCATGGTGC
17084
SpyCas9-
15
0






SpRY







169
GCAGAC
TTGCCCCACAGGGCAGTAACG
17085
cCas9-v17
15
0





170
GCAGAC
TTGCCCCACAGGGCAGTAACG
17086
cCas9-v42
15
0





171
ATCTGACT
aaccTCAAACAGACACCATGGTGC
17087
NmeCas9
15
0





172
GGCAG
CTTGCCCCACAGGGCAGTAAC
17088
SauCas9KKH
16
0





173
GG
TTGCCCCACAGGGCAGTAAC
17089
SpyCas9-NG
16
0





174
GG
TTGCCCCACAGGGCAGTAAC
17090
SpyCas9-xCas
16
0







175
GG
TTGCCCCACAGGGCAGTAAC
17091
SpyCas9-xCas-NG
16
0







176
GGC
TTGCCCCACAGGGCAGTAAC
17092
SpyCas9-SpG
16
0







177
GGC
TTGCCCCACAGGGCAGTAAC
17093
SpyCas9-SpRY
16
0







178
CAT
CTCAAACAGACACCATGGTG
17094
SpyCas9-SpRY
16
0







179
GGCAGA
CTTGCCCCACAGGGCAGTAAC
17095
cCas9-v17
16
0





180
GGCAGA
CTTGCCCCACAGGGCAGTAAC
17096
cCas9-v42
16
0





181
GGCAGACT
caccTTGCCCCACAGGGCAGTAAC
17097
NmeCas9
16
0





182
CATC
CTCAAACAGACACCATGGTG
17098
SpyCas9-3var-NRTH
16
0







183
GGCA
TTGCCCCACAGGGCAGTAAC
17099
SpyCas9-3var-NRCH
16
0







184
CGG
CTTGCCCCACAGGGCAGTAA
17100
ScaCas9
17
0





185
CGG
CTTGCCCCACAGGGCAGTAA
17101
ScaCas9-HiFi-Sc++
17
0







186
CGG
CTTGCCCCACAGGGCAGTAA
17102
ScaCas9-Sc++
17
0







187
CGG
CTTGCCCCACAGGGCAGTAA
17103
SpyCas9
17
0





188
CGG
CTTGCCCCACAGGGCAGTAA
17104
SpyCas9-HF1
17
0







189
CGG
CTTGCCCCACAGGGCAGTAA
17105
SpyCas9-SpG
17
0







190
CGG
CTTGCCCCACAGGGCAGTAA
17106
SpyCas9-SpRY
17
0







191
CG
CTTGCCCCACAGGGCAGTAA
17107
SpyCas9-NG
17
0





192
CG
CTTGCCCCACAGGGCAGTAA
17108
SpyCas9-xCas
17
0







193
CG
CTTGCCCCACAGGGCAGTAA
17109
SpyCas9-xCas-NG
17
0







194
GCA
CCTCAAACAGACACCATGGT
17110
SpyCas9-SpRY
17
0







195
GCATCTGA
caacCTCAAACAGACACCATGGT
17111
BlatCas9
17
0





196
GCATC
caacCTCAAACAGACACCATGGT
17112
BlatCas9
17
0





197
CGGC
CTTGCCCCACAGGGCAGTAA
17113
SpyCas9-3var-NRRH
17
0







198
ACGG
ACCTTGCCCCACAGGGCAGTA
17114
SauriCas9
18
0





199
ACGG
ACCTTGCCCCACAGGGCAGTA
17115
SauriCas9-KKH
18
0







200
ACG
CCTTGCCCCACAGGGCAGTA
17116
ScaCas9
18
0





201
ACG
CCTTGCCCCACAGGGCAGTA
17117
ScaCas9-HiFi-Sc++
18
0







202
ACG
CCTTGCCCCACAGGGCAGTA
17118
ScaCas9-Sc++
18
0







203
ACG
CCTTGCCCCACAGGGCAGTA
17119
SpyCas9-SpRY
18
0







204
TG
ACCTCAAACAGACACCATGG
17120
SpyCas9-NG
18
0





205
TG
ACCTCAAACAGACACCATGG
17121
SpyCas9-xCas
18
0







206
TG
ACCTCAAACAGACACCATGG
17122
SpyCas9-xCas-NG
18
0







207
TGC
ACCTCAAACAGACACCATGG
17123
SpyCas9-SpG
18
0







208
TGC
ACCTCAAACAGACACCATGG
17124
SpyCas9-SpRY
18
0







209
ACGGCAGA
tcacCTTGCCCCACAGGGCAGTA
17125
BlatCas9
18
0





210
ACGGC
tcacCTTGCCCCACAGGGCAGTA
17126
BlatCas9
18
0





211
TGCA
ACCTCAAACAGACACCATGG
17127
SpyCas9-3var-NRCH
18
0







212
AACGG
CACCTTGCCCCACAGGGCAGT
17128
SauCas9KKH
19
0





213
GTG
AACCTCAAACAGACACCATG
17129
ScaCas9
19
0





214
GTG
AACCTCAAACAGACACCATG
17130
ScaCas9-HiFi-Sc++
19
0







215
GTG
AACCTCAAACAGACACCATG
17131
ScaCas9-Sc++
19
0







216
GTG
AACCTCAAACAGACACCATG
17132
SpyCas9-SpRY
19
0







217
AAC
ACCTTGCCCCACAGGGCAGT
17133
SpyCas9-SpRY
19
0







218
AACGGC
CACCTTGCCCCACAGGGCAGT
17134
cCas9-v17
19
0





219
AACGGC
CACCTTGCCCCACAGGGCAGT
17135
cCas9-v42
19
0





220
GTGCATC
GCAACCTCAAACAGACACCATG
17136
CdiCas9
19
0





221
GG
CAACCTCAAACAGACACCAT
17137
SpyCas9-NG
20
0





222
GG
CAACCTCAAACAGACACCAT
17138
SpyCas9-xCas
20
0







223
GG
CAACCTCAAACAGACACCAT
17139
SpyCas9-xCas-NG
20
0







224
TAA
CACCTTGCCCCACAGGGCAG
17140
SpyCas9-SpRY
20
0







225
GGT
CAACCTCAAACAGACACCAT
17141
SpyCas9-SpG
20
0







226
GGT
CAACCTCAAACAGACACCAT
17142
SpyCas9-SpRY
20
0







227
GGTGC
tagcAACCTCAAACAGACACCAT
17143
BlatCas9
20
0





228
TAAC
CACCTTGCCCCACAGGGCAG
17144
SpyCas9-3var-NRRH
20
0







229
TAAC
tcACCTTGCCCCACAGGGCAG
17145
iSpyMacCas9
20
0





230
TGG
GCAACCTCAAACAGACACCA
17146
ScaCas9
21
0





231
TGG
GCAACCTCAAACAGACACCA
17147
ScaCas9-HiFi-Sc++
21
0







232
TGG
GCAACCTCAAACAGACACCA
17148
ScaCas9-Sc++
21
0







233
TGG
GCAACCTCAAACAGACACCA
17149
SpyCas9
21
0





234
TGG
GCAACCTCAAACAGACACCA
17150
SpyCas9-HF1
21
0







235
TGG
GCAACCTCAAACAGACACCA
17151
SpyCas9-SpG
21
0







236
TGG
GCAACCTCAAACAGACACCA
17152
SpyCas9-SpRY
21
0







237
TG
GCAACCTCAAACAGACACCA
17153
SpyCas9-NG
21
0





238
TG
GCAACCTCAAACAGACACCA
17154
SpyCas9-xCas
21
0







239
TG
GCAACCTCAAACAGACACCA
17155
SpyCas9-xCas-NG
21
0







240
GTA
TCACCTTGCCCCACAGGGCA
17156
SpyCas9-SpRY
21
0







241
GTAAC
cgttCACCTTGCCCCACAGGGCA
17157
BlatCas9
21
0





242
TGGT
GCAACCTCAAACAGACACCA
17158
SpyCas9-3var-NRRH
21
0







243
AGTAA
GTTCACCTTGCCCCACAGGGC
17159
SauCas9KKH
22
0





244
ATGG
TAGCAACCTCAAACAGACACC
17160
SauriCas9
22
0





245
ATGG
TAGCAACCTCAAACAGACACC
17161
SauriCas9-KKH
22
0







246
ATG
AGCAACCTCAAACAGACACC
17162
ScaCas9
22
0





247
ATG
AGCAACCTCAAACAGACACC
17163
ScaCas9-HiFi-Sc++
22
0







248
ATG
AGCAACCTCAAACAGACACC
17164
ScaCas9-Sc++
22
0







249
ATG
AGCAACCTCAAACAGACACC
17165
SpyCas9-SpRY
22
0







250
AG
TTCACCTTGCCCCACAGGGC
17166
SpyCas9-NG
22
0





251
AG
TTCACCTTGCCCCACAGGGC
17167
SpyCas9-xCas
22
0







252
AG
TTCACCTTGCCCCACAGGGC
17168
SpyCas9-xCas-NG
22
0







253
AGT
TTCACCTTGCCCCACAGGGC
17169
SpyCas9-SpG
22
0







254
AGT
TTCACCTTGCCCCACAGGGC
17170
SpyCas9-SpRY
22
0







255
ATGGTG
TAGCAACCTCAAACAGACACC
17171
cCas9-v16
22
0





256
ATGGTG
TAGCAACCTCAAACAGACACC
17172
cCas9-v21
22
0





257
AGTA
TTCACCTTGCCCCACAGGGC
17173
SpyCas9-3var-NRTH
22
0







258
CATGG
CTAGCAACCTCAAACAGACAC
17174
SauCas9KKH
23
0





259
CATGGT
CTAGCAACCTCAAACAGACAC
17175
SauCas9KKH
23
0





260
CAG
GTTCACCTTGCCCCACAGGG
17176
ScaCas9
23
0





261
CAG
GTTCACCTTGCCCCACAGGG
17177
ScaCas9-HiFi-Sc++
23
0







262
CAG
GTTCACCTTGCCCCACAGGG
17178
ScaCas9-Sc++
23
0







263
CAG
GTTCACCTTGCCCCACAGGG
17179
SpyCas9-SpRY
23
0







264
CAT
TAGCAACCTCAAACAGACAC
17180
SpyCas9-SpRY
23
0







265
CAGTAAC
ACGTTCACCTTGCCCCACAGGG
17181
CdiCas9
23
0





266
CAGT
GTTCACCTTGCCCCACAGGG
17182
SpyCas9-3var-NRRH
23
0







267
GCAG
ACGTTCACCTTGCCCCACAGG
17183
SauriCas9-KKH
24
0







268
GCA
CGTTCACCTTGCCCCACAGG
17184
SpyCas9-SpRY
24
0







269
CCA
CTAGCAACCTCAAACAGACA
17185
SpyCas9-SpRY
24
0







270
GGCAG
CACGTTCACCTTGCCCCACAG
17186
SauCas9KKH
25
0





271
GGCAGT
CACGTTCACCTTGCCCCACAG
17187
SauCas9KKH
25
0





272
GGCAGT
CACGTTCACCTTGCCCCACAG
17188
cCas9-v17
25
0





273
GGCAGT
CACGTTCACCTTGCCCCACAG
17189
cCas9-v42
25
0





274
GG
ACGTTCACCTTGCCCCACAG
17190
SpyCas9-NG
25
0





275
GG
ACGTTCACCTTGCCCCACAG
17191
SpyCas9-xCas
25
0







276
GG
ACGTTCACCTTGCCCCACAG
17192
SpyCas9-xCas-NG
25
0







277
GGC
ACGTTCACCTTGCCCCACAG
17193
SpyCas9-SpG
25
0







278
GGC
ACGTTCACCTTGCCCCACAG
17194
SpyCas9-SpRY
25
0







279
ACC
ACTAGCAACCTCAAACAGAC
17195
SpyCas9-SpRY
25
0







280
GGCA
ACGTTCACCTTGCCCCACAG
17196
SpyCas9-3var-NRCH
25
0







281
GGG
CACGTTCACCTTGCCCCACA
17197
ScaCas9
26
0





282
GGG
CACGTTCACCTTGCCCCACA
17198
ScaCas9-HiFi-Sc++
26
0







283
GGG
CACGTTCACCTTGCCCCACA
17199
ScaCas9-Sc++
26
0







284
GGG
CACGTTCACCTTGCCCCACA
17200
SpyCas9
26
0





285
GGG
CACGTTCACCTTGCCCCACA
17201
SpyCas9-HF1
26
0







286
GGG
CACGTTCACCTTGCCCCACA
17202
SpyCas9-SpG
26
0







287
GGG
CACGTTCACCTTGCCCCACA
17203
SpyCas9-SpRY
26
0







288
GG
CACGTTCACCTTGCCCCACA
17204
SpyCas9-NG
26
0





289
GG
CACGTTCACCTTGCCCCACA
17205
SpyCas9-xCas
26
0







290
GG
CACGTTCACCTTGCCCCACA
17206
SpyCas9-xCas-NG
26
0







291
CAC
CACTAGCAACCTCAAACAGA
17207
SpyCas9-SpRY
26
0







292
GGGC
CACGTTCACCTTGCCCCACA
17208
SpyCas9-3var-NRRH
26
0







293
CACC
CACTAGCAACCTCAAACAGA
17209
SpyCas9-3var-NRCH
26
0







294
AGGG
TCCACGTTCACCTTGCCCCAC
17210
SauriCas9
27
0





295
AGGG
TCCACGTTCACCTTGCCCCAC
17211
SauriCas9-KKH
27
0







296
AGG
CCACGTTCACCTTGCCCCAC
17212
ScaCas9
27
0





297
AGG
CCACGTTCACCTTGCCCCAC
17213
ScaCas9-HiFi-Sc++
27
0







298
AGG
CCACGTTCACCTTGCCCCAC
17214
ScaCas9-Sc++
27
0







299
AGG
CCACGTTCACCTTGCCCCAC
17215
SpyCas9
27
0





300
AGG
CCACGTTCACCTTGCCCCAC
17216
SpyCas9-HF1
27
0







301
AGG
CCACGTTCACCTTGCCCCAC
17217
SpyCas9-SpG
27
0







302
AGG
CCACGTTCACCTTGCCCCAC
17218
SpyCas9-SpRY
27
0







303
AG
CCACGTTCACCTTGCCCCAC
17219
SpyCas9-NG
27
0





304
AG
CCACGTTCACCTTGCCCCAC
17220
SpyCas9-xCas
27
0







305
AG
CCACGTTCACCTTGCCCCAC
17221
SpyCas9-xCas-NG
27
0







306
ACA
TCACTAGCAACCTCAAACAG
17222
SpyCas9-SpRY
27
0







307
AGGGCAGT
catcCACGTTCACCTTGCCCCAC
17223
BlatCas9
27
0





308
ACACCATG
tgttCACTAGCAACCTCAAACAG
17224
BlatCas9
27
0





309
AGGGC
catcCACGTTCACCTTGCCCCAC
17225
BlatCas9
27
0





310
ACACC
tgttCACTAGCAACCTCAAACAG
17226
BlatCas9
27
0





311
ACACCAT
GTTCACTAGCAACCTCAAACAG
17227
CdiCas9
27
0





312
GACACC
tgTGTTCACTAGCAACCTCAAACA
17228
Nme2Cas9
28
0





313
CAGGG
tcATCCACGTTCACCTTGCCCCA
17229
SauCas9
28
0





314
CAGGG
ATCCACGTTCACCTTGCCCCA
17230
SauCas9KKH
28
0





315
CAGG
ATCCACGTTCACCTTGCCCCA
17231
SauriCas9
28
0





316
CAGG
ATCCACGTTCACCTTGCCCCA
17232
SauriCas9-KKH
28
0







317
CAG
TCCACGTTCACCTTGCCCCA
17233
ScaCas9
28
0





318
CAG
TCCACGTTCACCTTGCCCCA
17234
ScaCas9-HiFi-Sc++
28
0







319
CAG
TCCACGTTCACCTTGCCCCA
17235
ScaCas9-Sc++
28
0







320
CAG
TCCACGTTCACCTTGCCCCA
17236
SpyCas9-SpRY
28
0







321
GAC
TTCACTAGCAACCTCAAACA
17237
SpyCas9-SpRY
28
0







322
GACACCAT
gtgtTCACTAGCAACCTCAAACA
17238
BlatCas9
28
0





323
GACAC
gtgtTCACTAGCAACCTCAAACA
17239
BlatCas9
28
0





324
CAGGGC
ATCCACGTTCACCTTGCCCCA
17240
cCas9-v17
28
0





325
CAGGGC
ATCCACGTTCACCTTGCCCCA
17241
cCas9-v42
28
0





326
GACA
TTCACTAGCAACCTCAAACA
17242
SpyCas9-3var-NRCH
28
0







327
ACAGG
CATCCACGTTCACCTTGCCCC
17243
SauCas9KKH
29
0





328
ACAG
CATCCACGTTCACCTTGCCCC
17244
SauriCas9-KKH
29
0







329
AG
GTTCACTAGCAACCTCAAAC
17245
SpyCas9-NG
29
0





330
AG
GTTCACTAGCAACCTCAAAC
17246
SpyCas9-xCas
29
0







331
AG
GTTCACTAGCAACCTCAAAC
17247
SpyCas9-xCas-NG
29
0







332
AGA
GTTCACTAGCAACCTCAAAC
17248
SpyCas9-SpG
29
0







333
AGA
GTTCACTAGCAACCTCAAAC
17249
SpyCas9-SpRY
29
0







334
ACA
ATCCACGTTCACCTTGCCCC
17250
SpyCas9-SpRY
29
0







335
ACAGGG
CATCCACGTTCACCTTGCCCC
17251
cCas9-v17
29
0





336
ACAGGG
CATCCACGTTCACCTTGCCCC
17252
cCas9-v42
29
0





337
AGACACC
GTGTTCACTAGCAACCTCAAAC
17253
CdiCas9
29
0





338
AGAC
GTTCACTAGCAACCTCAAAC
17254
SpyCas9-3var-NRRH
29
0







339
AGAC
GTTCACTAGCAACCTCAAAC
17255
SpyCas9-VQR
29
0







340
CACAG
TCATCCACGTTCACCTTGCCC
17256
SauCas9KKH
30
0





341
CAG
TGTTCACTAGCAACCTCAAA
17257
ScaCas9
30
0





342
CAG
TGTTCACTAGCAACCTCAAA
17258
ScaCas9-HiFi-Sc++
30
0







343
CAG
TGTTCACTAGCAACCTCAAA
17259
ScaCas9-Sc++
30
0







344
CAG
TGTTCACTAGCAACCTCAAA
17260
SpyCas9-SpRY
30
0







345
CAC
CATCCACGTTCACCTTGCCC
17261
SpyCas9-SpRY
30
0







346
CAGAC
ctgtGTTCACTAGCAACCTCAAA
17262
BlatCas9
30
0





347
CACAGG
TCATCCACGTTCACCTTGCCC
17263
cCas9-v17
30
0





348
CACAGG
TCATCCACGTTCACCTTGCCC
17264
cCas9-v42
30
0





349
CAGACAC
TGTGTTCACTAGCAACCTCAAA
17265
CdiCas9
30
0





350
CAGA
TGTTCACTAGCAACCTCAAA
17266
SpyCas9-3var-NRRH
30
0







351
CACA
CATCCACGTTCACCTTGCCC
17267
SpyCas9-3var-NRCH
30
0







352
ACAGA
TGTGTTCACTAGCAACCTCAA
17268
SauCas9KKH
31
0





353
ACAG
TGTGTTCACTAGCAACCTCAA
17269
SauriCas9-KKH
31
0







354
CCA
TCATCCACGTTCACCTTGCC
17270
SpyCas9-SpRY
31
0







355
ACA
GTGTTCACTAGCAACCTCAA
17271
SpyCas9-SpRY
31
0







356
ACAGAC
TGTGTTCACTAGCAACCTCAA
17272
cCas9-v17
31
0





357
ACAGAC
TGTGTTCACTAGCAACCTCAA
17273
cCas9-v42
31
0





358
ACAGACAC
acTGTGTTCACTAGCAACCTCAA
17274
CjeCas9
31
0





359
AACAG
CTGTGTTCACTAGCAACCTCA
17275
SauCas9KKH
32
0





360
AAC
TGTGTTCACTAGCAACCTCA
17276
SpyCas9-SpRY
32
0







361
CCC
TTCATCCACGTTCACCTTGC
17277
SpyCas9-SpRY
32
0







362
CCCACAGG
aactTCATCCACGTTCACCTTGC
17278
BlatCas9
32
0





363
CCCAC
aactTCATCCACGTTCACCTTGC
17279
BlatCas9
32
0





364
AACAGA
CTGTGTTCACTAGCAACCTCA
17280
cCas9-v17
32
0





365
AACAGA
CTGTGTTCACTAGCAACCTCA
17281
cCas9-v42
32
0





366
AACAGACA
caacTGTGTTCACTAGCAACCTCA
17282
NmeCas9
32
0





367
AACA
TGTGTTCACTAGCAACCTCA
17283
SpyCas9-3var-NRCH
32
0







368
AAA
CTGTGTTCACTAGCAACCTC
17284
SpyCas9-SpRY
33
0







369
CCC
CTTCATCCACGTTCACCTTG
17285
SpyCas9-SpRY
33
0







370
AAAC
CTGTGTTCACTAGCAACCTC
17286
SpyCas9-3var-NRRH
33
0







371
AAAC
acTGTGTTCACTAGCAACCTC
17287
iSpyMacCas9
33
0





372
CAA
ACTGTGTTCACTAGCAACCT
17288
SpyCas9-SpRY
34
0







373
GCC
ACTTCATCCACGTTCACCTT
17289
SpyCas9-SpRY
34
0







374
CAAACAGA
acaaCTGTGTTCACTAGCAACCT
17290
BlatCas9
34
0





375
GCCCC
ccaaCTTCATCCACGTTCACCTT
17291
BlatCas9
34
0





376
CAAAC
acaaCTGTGTTCACTAGCAACCT
17292
BlatCas9
34
0





377
CAAA
ACTGTGTTCACTAGCAACCT
17293
SpyCas9-3var-NRRH
34
0







378
CAAA
aaCTGTGTTCACTAGCAACCT
17294
iSpyMacCas9
34
0





379
TGCCCC
caCCAACTTCATCCACGTTCACCT
17295
Nme2Cas9
35
0





380
TCAAA
CAACTGTGTTCACTAGCAACC
17296
SauCas9KKH
35
0





381
TG
AACTTCATCCACGTTCACCT
17297
SpyCas9-NG
35
0





382
TG
AACTTCATCCACGTTCACCT
17298
SpyCas9-xCas
35
0







383
TG
AACTTCATCCACGTTCACCT
17299
SpyCas9-xCas-NG
35
0







384
TGC
AACTTCATCCACGTTCACCT
17300
SpyCas9-SpG
35
0







385
TGC
AACTTCATCCACGTTCACCT
17301
SpyCas9-SpRY
35
0







386
TCA
AACTGTGTTCACTAGCAACC
17302
SpyCas9-SpRY
35
0







387
TGCCC
accaACTTCATCCACGTTCACCT
17303
BlatCas9
35
0





388
TCAAAC
CAACTGTGTTCACTAGCAACC
17304
cCas9-v17
35
0





389
TCAAAC
CAACTGTGTTCACTAGCAACC
17305
cCas9-v42
35
0





390
TGCC
AACTTCATCCACGTTCACCT
17306
SpyCas9-3var-NRCH
35
0







391
TTGCCC
ccACCAACTTCATCCACGTTCACC
17307
Nme2Cas9
36
0





392
CTCAA
ACAACTGTGTTCACTAGCAAC
17308
SauCas9KKH
36
0





393
TTG
CAACTTCATCCACGTTCACC
17309
ScaCas9
36
0





394
TTG
CAACTTCATCCACGTTCACC
17310
ScaCas9-HiFi-Sc++
36
0







395
TTG
CAACTTCATCCACGTTCACC
17311
ScaCas9-Sc++
36
0







396
TTG
CAACTTCATCCACGTTCACC
17312
SpyCas9-SpRY
36
0







397
CTC
CAACTGTGTTCACTAGCAAC
17313
SpyCas9-SpRY
36
0







398
TTGCC
caccAACTTCATCCACGTTCACC
17314
BlatCas9
36
0





399
CTCAAA
ACAACTGTGTTCACTAGCAAC
17315
cCas9-v17
36
0





400
CTCAAA
ACAACTGTGTTCACTAGCAAC
17316
cCas9-v42
36
0





401
TTGCCCC
ACCAACTTCATCCACGTTCACC
17317
CdiCas9
36
0





402
CTTGCC
acCACCAACTTCATCCACGTTCAC
17318
Nme2Cas9
37
0





403
CTT
CCAACTTCATCCACGTTCAC
17319
SpyCas9-SpRY
37
0







404
CCT
ACAACTGTGTTCACTAGCAA
17320
SpyCas9-SpRY
37
0







405
CTTGC
ccacCAACTTCATCCACGTTCAC
17321
BlatCas9
37
0





406
CCT
ACCAACTTCATCCACGTTCA
17322
SpyCas9-SpRY
38
0







407
ACC
CACAACTGTGTTCACTAGCA
17323
SpyCas9-SpRY
38
0







408
ACCTCAAA
tgacACAACTGTGTTCACTAGCA
17324
BlatCas9
38
0





409
ACCTCAAA
tgacACAACTGTGTTCACTAGCA
17325
BlatCas9
38
0





410
ACCTCAAA
tgACACAACTGTGTTCACTAGCA
17326
GeoCas9
38
0





411
ACCTC
tgacACAACTGTGTTCACTAGCA
17327
BlatCas9
38
0





412
AAC
ACACAACTGTGTTCACTAGC
17328
SpyCas9-SpRY
39
0







413
ACC
CACCAACTTCATCCACGTTC
17329
SpyCas9-SpRY
39
0







414
AACC
ACACAACTGTGTTCACTAGC
17330
SpyCas9-3var-NRCH
39
0







415
CAC
CCACCAACTTCATCCACGTT
17331
SpyCas9-SpRY
40
0







416
CAA
GACACAACTGTGTTCACTAG
17332
SpyCas9-SpRY
40
0







417
CAACC
tctgACACAACTGTGTTCACTAG
17333
BlatCas9
40
0





418
CAACCTC
CTGACACAACTGTGTTCACTAG
17334
CdiCas9
40
0





419
CAAC
GACACAACTGTGTTCACTAG
17335
SpyCas9-3var-NRRH
40
0







420
CAAC
tgACACAACTGTGTTCACTAG
17336
iSpyMacCas9
40
0





421
CACC
CCACCAACTTCATCCACGTT
17337
SpyCas9-3var-NRCH
40
0







422
GCAACC
ctTCTGACACAACTGTGTTCACTA
17338
Nme2Cas9
41
0





423
TCA
ACCACCAACTTCATCCACGT
17339
SpyCas9-SpRY
41
0







424
GCA
TGACACAACTGTGTTCACTA
17340
SpyCas9-SpRY
41
0







425
TCACCTTG
ctcaCCACCAACTTCATCCACGT
17341
BlatCas9
41
0





426
TCACC
ctcaCCACCAACTTCATCCACGT
17342
BlatCas9
41
0





427
GCAAC
ttctGACACAACTGTGTTCACTA
17343
BlatCas9
41
0





428
TCACCTT
TCACCACCAACTTCATCCACGT
17344
CdiCas9
41
0





429
GCAACCT
TCTGACACAACTGTGTTCACTA
17345
CdiCas9
41
0





430
TTCACC
gcCTCACCACCAACTTCATCCACG
17346
Nme2Cas9
42
0





431
AGCAA
TCTGACACAACTGTGTTCACT
17347
SauCas9KKH
42
0





432
AG
CTGACACAACTGTGTTCACT
17348
SpyCas9-NG
42
0





433
AG
CTGACACAACTGTGTTCACT
17349
SpyCas9-xCas
42
0







434
AG
CTGACACAACTGTGTTCACT
17350
SpyCas9-xCas-NG
42
0







435
AGC
CTGACACAACTGTGTTCACT
17351
SpyCas9-SpG
42
0







436
AGC
CTGACACAACTGTGTTCACT
17352
SpyCas9-SpRY
42
0







437
TTC
CACCACCAACTTCATCCACG
17353
SpyCas9-SpRY
42
0







438
TTCACCTT
cctcACCACCAACTTCATCCACG
17354
BlatCas9
42
0





439
TTCAC
cctcACCACCAACTTCATCCACG
17355
BlatCas9
42
0





440
AGCAAC
TCTGACACAACTGTGTTCACT
17356
cCas9-v17
42
0





441
AGCAAC
TCTGACACAACTGTGTTCACT
17357
cCas9-v42
42
0





442
AGCA
CTGACACAACTGTGTTCACT
17358
SpyCas9-3var-NRCH
42
0







443
TAG
TCTGACACAACTGTGTTCAC
17359
ScaCas9
43
0





444
TAG
TCTGACACAACTGTGTTCAC
17360
ScaCas9-HiFi-Sc++
43
0







445
TAG
TCTGACACAACTGTGTTCAC
17361
ScaCas9-Sc++
43
0







446
TAG
TCTGACACAACTGTGTTCAC
17362
SpyCas9-SpRY
43
0







447
GTT
TCACCACCAACTTCATCCAC
17363
SpyCas9-SpRY
43
0







448
TAGCAAC
CTTCTGACACAACTGTGTTCAC
17364
CdiCas9
43
0





449
TAGC
TCTGACACAACTGTGTTCAC
17365
SpyCas9-3var-NRRH
43
0







450
TAGCAA
TCTGACACAACTGTGTTCAC
17366
St1Cas9-LMG1831
43
0







451
CTAG
CTTCTGACACAACTGTGTTCA
17367
SauriCas9-KKH
44
0







452
CG
CTCACCACCAACTTCATCCA
17368
SpyCas9-NG
44
0





453
CG
CTCACCACCAACTTCATCCA
17369
SpyCas9-xCas
44
0







454
CG
CTCACCACCAACTTCATCCA
17370
SpyCas9-xCas-NG
44
0







455
CGT
CTCACCACCAACTTCATCCA
17371
SpyCas9-SpG
44
0







456
CGT
CTCACCACCAACTTCATCCA
17372
SpyCas9-SpRY
44
0







457
CTA
TTCTGACACAACTGTGTTCA
17373
SpyCas9-SpRY
44
0







458
CGTTC
ggccTCACCACCAACTTCATCCA
17374
BlatCas9
44
0





459
CTAGC
tgctTCTGACACAACTGTGTTCA
17375
BlatCas9
44
0





460
CGTT
CTCACCACCAACTTCATCCA
17376
SpyCas9-3var-NRTH
44
0







461
ACTAG
GCTTCTGACACAACTGTGTTC
17377
SauCas9KKH
45
0





462
ACG
CCTCACCACCAACTTCATCC
17378
ScaCas9
45
0





463
ACG
CCTCACCACCAACTTCATCC
17379
ScaCas9-HiFi-Sc++
45
0







464
ACG
CCTCACCACCAACTTCATCC
17380
ScaCas9-Sc++
45
0







465
ACG
CCTCACCACCAACTTCATCC
17381
SpyCas9-SpRY
45
0







466
ACT
CTTCTGACACAACTGTGTTC
17382
SpyCas9-SpRY
45
0







467
CAC
GCCTCACCACCAACTTCATC
17383
SpyCas9-SpRY
46
0







468
CAC
GCTTCTGACACAACTGTGTT
17384
SpyCas9-SpRY
46
0







469
CACGTT
GGCCTCACCACCAACTTCATC
17385
cCas9-v16
46
0





470
CACGTT
GGCCTCACCACCAACTTCATC
17386
cCas9-v21
46
0





471
CACT
GCTTCTGACACAACTGTGTT
17387
SpyCas9-3var-NRCH
46
0







472
CCACGTT
ccaGGGCCTCACCACCAACTTCAT
17388
PpnCas9
47
0





473
CCA
GGCCTCACCACCAACTTCAT
17389
SpyCas9-SpRY
47
0







474
TCA
TGCTTCTGACACAACTGTGT
17390
SpyCas9-SpRY
47
0







475
TCC
GGGCCTCACCACCAACTTCA
17391
SpyCas9-SpRY
48
0







476
TTC
TTGCTTCTGACACAACTGTG
17392
SpyCas9-SpRY
48
0







477
TCCACGTT
ccagGGCCTCACCACCAACTTCA
17393
BlatCas9
48
0





478
TTCACTAG
cattTGCTTCTGACACAACTGTG
17394
BlatCas9
48
0





479
TCCAC
ccagGGCCTCACCACCAACTTCA
17395
BlatCas9
48
0





480
TTCAC
cattTGCTTCTGACACAACTGTG
17396
BlatCas9
48
0





481
TTCACT
TTTGCTTCTGACACAACTGTG
17397
cCas9-v16
48
0





482
TTCACT
TTTGCTTCTGACACAACTGTG
17398
cCas9-v21
48
0





483
ATC
AGGGCCTCACCACCAACTTC
17399
SpyCas9-SpRY
49
0







484
GTT
TTTGCTTCTGACACAACTGT
17400
SpyCas9-SpRY
49
0







485
TG
ATTTGCTTCTGACACAACTG
17401
SpyCas9-NG
50
0





486
TG
ATTTGCTTCTGACACAACTG
17402
SpyCas9-xCas
50
0







487
TG
ATTTGCTTCTGACACAACTG
17403
SpyCas9-xCas-NG
50
0







488
CAT
CAGGGCCTCACCACCAACTT
17404
SpyCas9-SpRY
50
0







489
TGT
ATTTGCTTCTGACACAACTG
17405
SpyCas9-SpG
50
0







490
TGT
ATTTGCTTCTGACACAACTG
17406
SpyCas9-SpRY
50
0







491
CATCC
gcccAGGGCCTCACCACCAACTT
17407
BlatCas9
50
0





492
TGTTC
tacaTTTGCTTCTGACACAACTG
17408
BlatCas9
50
0





493
CATC
CAGGGCCTCACCACCAACTT
17409
SpyCas9-3var-NRTH
50
0







494
TGTT
ATTTGCTTCTGACACAACTG
17410
SpyCas9-3var-NRTH
50
0







495
TCATCC
ctGCCCAGGGCCTCACCACCAACT
17411
Nme2Cas9
51
0





496
GTG
CATTTGCTTCTGACACAACT
17412
ScaCas9
51
0





497
GTG
CATTTGCTTCTGACACAACT
17413
ScaCas9-HiFi-Sc++
51
0







498
GTG
CATTTGCTTCTGACACAACT
17414
ScaCas9-Sc++
51
0







499
GTG
CATTTGCTTCTGACACAACT
17415
SpyCas9-SpRY
51
0







500
TCA
CCAGGGCCTCACCACCAACT
17416
SpyCas9-SpRY
51
0







501
TCATC
tgccCAGGGCCTCACCACCAACT
17417
BlatCas9
51
0





502
TG
ACATTTGCTTCTGACACAAC
17418
SpyCas9-NG
52
0





503
TG
ACATTTGCTTCTGACACAAC
17419
SpyCas9-xCas
52
0







504
TG
ACATTTGCTTCTGACACAAC
17420
SpyCas9-xCas-NG
52
0







505
TGT
ACATTTGCTTCTGACACAAC
17421
SpyCas9-SpG
52
0







506
TGT
ACATTTGCTTCTGACACAAC
17422
SpyCas9-SpRY
52
0







507
TTC
CCCAGGGCCTCACCACCAAC
17423
SpyCas9-SpRY
52
0







508
CTGTGTT
tgcTTACATTTGCTTCTGACACAA
17424
PpnCas9
53
0





509
CTG
TACATTTGCTTCTGACACAA
17425
ScaCas9
53
0





510
CTG
TACATTTGCTTCTGACACAA
17426
ScaCas9-HiFi-Sc++
53
0







511
CTG
TACATTTGCTTCTGACACAA
17427
ScaCas9-Sc++
53
0







512
CTG
TACATTTGCTTCTGACACAA
17428
SpyCas9-SpRY
53
0







513
CTT
GCCCAGGGCCTCACCACCAA
17429
SpyCas9-SpRY
53
0







514
ACT
TGCCCAGGGCCTCACCACCA
17430
SpyCas9-SpRY
54
0







515
ACT
TTACATTTGCTTCTGACACA
17431
SpyCas9-SpRY
54
0







516
ACTTC
acctGCCCAGGGCCTCACCACCA
17432
BlatCas9
54
0





517
AAC
CTGCCCAGGGCCTCACCACC
17433
SpyCas9-SpRY
55
0







518
AAC
CTTACATTTGCTTCTGACAC
17434
SpyCas9-SpRY
55
0







519
AACT
CTGCCCAGGGCCTCACCACC
17435
SpyCas9-3var-NRCH
55
0







520
AACT
CTTACATTTGCTTCTGACAC
17436
SpyCas9-3var-NRCH
55
0







521
CAA
CCTGCCCAGGGCCTCACCAC
17437
SpyCas9-SpRY
56
0







522
CAA
GCTTACATTTGCTTCTGACA
17438
SpyCas9-SpRY
56
0







523
CAACTTC
AACCTGCCCAGGGCCTCACCAC
17439
CdiCas9
56
0





524
CAAC
CCTGCCCAGGGCCTCACCAC
17440
SpyCas9-3var-NRRH
56
0







525
CAAC
acCTGCCCAGGGCCTCACCAC
17441
iSpyMacCas9
56
0





526
CAAC
GCTTACATTTGCTTCTGACA
17442
SpyCas9-3var-NRRH
56
0







527
CAAC
tgCTTACATTTGCTTCTGACA
17443
iSpyMacCas9
56
0





528
CCA
ACCTGCCCAGGGCCTCACCA
17444
SpyCas9-SpRY
57
0







529
ACA
TGCTTACATTTGCTTCTGAC
17445
SpyCas9-SpRY
57
0







530
ACAACTGT
tattGCTTACATTTGCTTCTGAC
17446
BlatCas9
57
0





531
CCAAC
ccaaCCTGCCCAGGGCCTCACCA
17447
BlatCas9
57
0





532
ACAAC
tattGCTTACATTTGCTTCTGAC
17448
BlatCas9
57
0





533
CCAACT
AACCTGCCCAGGGCCTCACCA
17449
cCas9-v16
57
0





534
CCAACT
AACCTGCCCAGGGCCTCACCA
17450
cCas9-v21
57
0





535
ACAACT
TTGCTTACATTTGCTTCTGAC
17451
cCas9-v16
57
0





536
ACAACT
TTGCTTACATTTGCTTCTGAC
17452
cCas9-v21
57
0





537
CCAACTT
CAACCTGCCCAGGGCCTCACCA
17453
CdiCas9
57
0





538
ACCAA
CAACCTGCCCAGGGCCTCACC
17454
SauCas9KKH
58
0





539
CACAA
ATTGCTTACATTTGCTTCTGA
17455
SauCas9KKH
58
0





540
CAC
TTGCTTACATTTGCTTCTGA
17456
SpyCas9-SpRY
58
0







541
ACC
AACCTGCCCAGGGCCTCACC
17457
SpyCas9-SpRY
58
0







542
ACCAAC
CAACCTGCCCAGGGCCTCACC
17458
cCas9-v17
58
0





543
ACCAAC
CAACCTGCCCAGGGCCTCACC
17459
cCas9-v42
58
0





544
CACAAC
ATTGCTTACATTTGCTTCTGA
17460
cCas9-v17
58
0





545
CACAAC
ATTGCTTACATTTGCTTCTGA
17461
cCas9-v42
58
0





546
CACA
TTGCTTACATTTGCTTCTGA
17462
SpyCas9-3var-NRCH
58
0







547
CAC
CAACCTGCCCAGGGCCTCAC
17463
SpyCas9-SpRY
59
0







548
ACA
ATTGCTTACATTTGCTTCTG
17464
SpyCas9-SpRY
59
0







549
ACACAAC
CTATTGCTTACATTTGCTTCTG
17465
CdiCas9
59
0





550
CACC
CAACCTGCCCAGGGCCTCAC
17466
SpyCas9-3var-NRCH
59
0







551
ACACAA
ATTGCTTACATTTGCTTCTG
17467
St1Cas9-CNRZ1066
59
0







552
GAC
TATTGCTTACATTTGCTTCT
17468
SpyCas9-SpRY
60
0







553
CCA
CCAACCTGCCCAGGGCCTCA
17469
SpyCas9-SpRY
60
0







554
CCACC
atacCAACCTGCCCAGGGCCTCA
17470
BlatCas9
60
0





555
GACAC
atctATTGCTTACATTTGCTTCT
17471
BlatCas9
60
0





556
GACA
TATTGCTTACATTTGCTTCT
17472
SpyCas9-3var-NRCH
60
0







557
ACCACC
tgATACCAACCTGCCCAGGGCCTC
17473
Nme2Cas9
61
0





558
TG
CTATTGCTTACATTTGCTTC
17474
SpyCas9-NG
61
0





559
TG
CTATTGCTTACATTTGCTTC
17475
SpyCas9-xCas
61
0







560
TG
CTATTGCTTACATTTGCTTC
17476
SpyCas9-xCas-NG
61
0







561
TGA
CTATTGCTTACATTTGCTTC
17477
SpyCas9-SpG
61
0







562
TGA
CTATTGCTTACATTTGCTTC
17478
SpyCas9-SpRY
61
0







563
ACC
ACCAACCTGCCCAGGGCCTC
17479
SpyCas9-SpRY
61
0







564
ACCACCAA
gataCCAACCTGCCCAGGGCCTC
17480
BlatCas9
61
0





565
ACCACCAA
gataCCAACCTGCCCAGGGCCTC
17481
BlatCas9
61
0





566
ACCAC
gataCCAACCTGCCCAGGGCCTC
17482
BlatCas9
61
0





567
TGAC
CTATTGCTTACATTTGCTTC
17483
SpyCas9-3var-NRRH
61
0







568
TGAC
CTATTGCTTACATTTGCTTC
17484
SpyCas9-VQR
61
0







569
CTG
TCTATTGCTTACATTTGCTT
17485
ScaCas9
62
0





570
CTG
TCTATTGCTTACATTTGCTT
17486
ScaCas9-HiFi-Sc++
62
0







571
CTG
TCTATTGCTTACATTTGCTT
17487
ScaCas9-Sc++
62
0







572
CTG
TCTATTGCTTACATTTGCTT
17488
SpyCas9-SpRY
62
0







573
CAC
TACCAACCTGCCCAGGGCCT
17489
SpyCas9-SpRY
62
0







574
CTGAC
ccatCTATTGCTTACATTTGCTT
17490
BlatCas9
62
0





575
CTGACAC
CATCTATTGCTTACATTTGCTT
17491
CdiCas9
62
0





576
CACC
TACCAACCTGCCCAGGGCCT
17492
SpyCas9-3var-NRCH
62
0







577
TCTGA
CATCTATTGCTTACATTTGCT
17493
SauCas9KKH
63
0





578
TCA
ATACCAACCTGCCCAGGGCC
17494
SpyCas9-SpRY
63
0







579
TCT
ATCTATTGCTTACATTTGCT
17495
SpyCas9-SpRY
63
0







580
TCACC
ttgaTACCAACCTGCCCAGGGCC
17496
BlatCas9
63
0





581
TCACCAC
TGATACCAACCTGCCCAGGGCC
17497
CdiCas9
63
0





582
TCTGACAC
gcCATCTATTGCTTACATTTGCT
17498
CjeCas9
63
0





583
CTCACC
ccTTGATACCAACCTGCCCAGGGC
17499
Nme2Cas9
64
0





584
CTC
GATACCAACCTGCCCAGGGC
17500
SpyCas9-SpRY
64
0







585
TTC
CATCTATTGCTTACATTTGC
17501
SpyCas9-SpRY
64
0







586
CTCAC
cttgATACCAACCTGCCCAGGGC
17502
BlatCas9
64
0





587
TTCTGACA
gagcCATCTATTGCTTACATTTGC
17503
NmeCas9
64
0





588
CCT
TGATACCAACCTGCCCAGGG
17504
SpyCas9-SpRY
65
0







589
CTT
CCATCTATTGCTTACATTTG
17505
SpyCas9-SpRY
65
0







590
GCC
TTGATACCAACCTGCCCAGG
17506
SpyCas9-SpRY
66
0







591
GCT
GCCATCTATTGCTTACATTT
17507
SpyCas9-SpRY
66
0







592
GCTTCTGA
agagCCATCTATTGCTTACATTT
17508
BlatCas9
66
0





593
GCCTC
acctTGATACCAACCTGCCCAGG
17509
BlatCas9
66
0





594
GCTTC
agagCCATCTATTGCTTACATTT
17510
BlatCas9
66
0





595
GG
CTTGATACCAACCTGCCCAG
17511
SpyCas9-NG
67
0





596
GG
CTTGATACCAACCTGCCCAG
17512
SpyCas9-xCas
67
0







597
GG
CTTGATACCAACCTGCCCAG
17513
SpyCas9-xCas-NG
67
0







598
TG
AGCCATCTATTGCTTACATT
17514
SpyCas9-NG
67
0





599
TG
AGCCATCTATTGCTTACATT
17515
SpyCas9-xCas
67
0







600
TG
AGCCATCTATTGCTTACATT
17516
SpyCas9-xCas-NG
67
0







601
GGC
CTTGATACCAACCTGCCCAG
17517
SpyCas9-SpG
67
0







602
GGC
CTTGATACCAACCTGCCCAG
17518
SpyCas9-SpRY
67
0







603
TGC
AGCCATCTATTGCTTACATT
17519
SpyCas9-SpG
67
0







604
TGC
AGCCATCTATTGCTTACATT
17520
SpyCas9-SpRY
67
0







605
GGCC
CTTGATACCAACCTGCCCAG
17521
SpyCas9-3var-NRCH
67
0







606
TGCT
AGCCATCTATTGCTTACATT
17522
SpyCas9-3var-NRCH
67
0







607
GGG
CCTTGATACCAACCTGCCCA
17523
ScaCas9
68
0





608
GGG
CCTTGATACCAACCTGCCCA
17524
ScaCas9-HiFi-Sc++
68
0







609
GGG
CCTTGATACCAACCTGCCCA
17525
ScaCas9-Sc++
68
0







610
GGG
CCTTGATACCAACCTGCCCA
17526
SpyCas9
68
0





611
GGG
CCTTGATACCAACCTGCCCA
17527
SpyCas9-HF1
68
0







612
GGG
CCTTGATACCAACCTGCCCA
17528
SpyCas9-SpG
68
0







613
GGG
CCTTGATACCAACCTGCCCA
17529
SpyCas9-SpRY
68
0







614
TTG
GAGCCATCTATTGCTTACAT
17530
ScaCas9
68
0





615
TTG
GAGCCATCTATTGCTTACAT
17531
ScaCas9-HiFi-Sc++
68
0







616
TTG
GAGCCATCTATTGCTTACAT
17532
ScaCas9-Sc++
68
0







617
TTG
GAGCCATCTATTGCTTACAT
17533
SpyCas9-SpRY
68
0







618
GG
CCTTGATACCAACCTGCCCA
17534
SpyCas9-NG
68
0





619
GG
CCTTGATACCAACCTGCCCA
17535
SpyCas9-xCas
68
0







620
GG
CCTTGATACCAACCTGCCCA
17536
SpyCas9-xCas-NG
68
0







621
GGGCC
taacCTTGATACCAACCTGCCCA
17537
BlatCas9
68
0





622
GGGCCTC
AACCTTGATACCAACCTGCCCA
17538
CdiCas9
68
0





623
TTGCTTC
CAGAGCCATCTATTGCTTACAT
17539
CdiCas9
68
0





624
GGGC
CCTTGATACCAACCTGCCCA
17540
SpyCas9-3var-NRRH
68
0







625
AGGGCC
tgTAACCTTGATACCAACCTGCCC
17541
Nme2Cas9
69
0





626
AGGG
AACCTTGATACCAACCTGCCC
17542
SauriCas9
69
0





627
AGGG
AACCTTGATACCAACCTGCCC
17543
SauriCas9-KKH
69
0







628
AGG
ACCTTGATACCAACCTGCCC
17544
ScaCas9
69
0





629
AGG
ACCTTGATACCAACCTGCCC
17545
ScaCas9-HiFi-Sc++
69
0







630
AGG
ACCTTGATACCAACCTGCCC
17546
ScaCas9-Sc++
69
0







631
AGG
ACCTTGATACCAACCTGCCC
17547
SpyCas9
69
0





632
AGG
ACCTTGATACCAACCTGCCC
17548
SpyCas9-HF1
69
0







633
AGG
ACCTTGATACCAACCTGCCC
17549
SpyCas9-SpG
69
0







634
AGG
ACCTTGATACCAACCTGCCC
17550
SpyCas9-SpRY
69
0







635
AG
ACCTTGATACCAACCTGCCC
17551
SpyCas9-NG
69
0





636
AG
ACCTTGATACCAACCTGCCC
17552
SpyCas9-xCas
69
0







637
AG
ACCTTGATACCAACCTGCCC
17553
SpyCas9-xCas-NG
69
0







638
TTT
AGAGCCATCTATTGCTTACA
17554
SpyCas9-SpRY
69
0







639
AGGGC
gtaaCCTTGATACCAACCTGCCC
17555
BlatCas9
69
0





640
TTTGC
ggcaGAGCCATCTATTGCTTACA
17556
BlatCas9
69
0





641
CAGGG
tgTAACCTTGATACCAACCTGCC
17557
SauCas9
70
0





642
CAGGG
TAACCTTGATACCAACCTGCC
17558
SauCas9KKH
70
0





643
CAGG
TAACCTTGATACCAACCTGCC
17559
SauriCas9
70
0





644
CAGG
TAACCTTGATACCAACCTGCC
17560
SauriCas9-KKH
70
0







645
CAG
AACCTTGATACCAACCTGCC
17561
ScaCas9
70
0





646
CAG
AACCTTGATACCAACCTGCC
17562
ScaCas9-HiFi-Sc++
70
0







647
CAG
AACCTTGATACCAACCTGCC
17563
ScaCas9-Sc++
70
0







648
CAG
AACCTTGATACCAACCTGCC
17564
SpyCas9-SpRY
70
0







649
ATT
CAGAGCCATCTATTGCTTAC
17565
SpyCas9-SpRY
70
0







650
CAGGGC
TAACCTTGATACCAACCTGCC
17566
cCas9-v17
70
0





651
CAGGGC
TAACCTTGATACCAACCTGCC
17567
cCas9-v42
70
0





652
ATTTGCTT
agggCAGAGCCATCTATTGCTTAC
17568
NmeCas9
70
0





653
CCAGG
GTAACCTTGATACCAACCTGC
17569
SauCas9KKH
71
0





654
CCAG
GTAACCTTGATACCAACCTGC
17570
SauriCas9-KKH
71
0







655
CAT
GCAGAGCCATCTATTGCTTA
17571
SpyCas9-SpRY
71
0







656
CCA
TAACCTTGATACCAACCTGC
17572
SpyCas9-SpRY
71
0







657
CCAGGG
GTAACCTTGATACCAACCTGC
17573
cCas9-v17
71
0





658
CCAGGG
GTAACCTTGATACCAACCTGC
17574
cCas9-v42
71
0





659
CATT
GCAGAGCCATCTATTGCTTA
17575
SpyCas9-3var-NRTH
71
0







660
CCCAG
TGTAACCTTGATACCAACCTG
17576
SauCas9KKH
72
0





661
CCC
GTAACCTTGATACCAACCTG
17577
SpyCas9-SpRY
72
0







662
ACA
GGCAGAGCCATCTATTGCTT
17578
SpyCas9-SpRY
72
0







663
CCCAGG
TGTAACCTTGATACCAACCTG
17579
cCas9-v17
72
0





664
CCCAGG
TGTAACCTTGATACCAACCTG
17580
cCas9-v42
72
0





665
TAC
GGGCAGAGCCATCTATTGCT
17581
SpyCas9-SpRY
73
0







666
GCC
TGTAACCTTGATACCAACCT
17582
SpyCas9-SpRY
73
0







667
TACATT
AGGGCAGAGCCATCTATTGCT
17583
cCas9-v16
73
0





668
TACATT
AGGGCAGAGCCATCTATTGCT
17584
cCas9-v21
73
0





669
TACA
GGGCAGAGCCATCTATTGCT
17585
SpyCas9-3var-NRCH
73
0







670
TTACATT
TCAGGGCAGAGCCATCTATTGC
17586
CdiCas9
74
0





671
TTACATT
agtCAGGGCAGAGCCATCTATTGC
17587
PpnCas9
74
0





672
TG
TTGTAACCTTGATACCAACC
17588
SpyCas9-NG
74
0





673
TG
TTGTAACCTTGATACCAACC
17589
SpyCas9-
74
0






xCas







674
TG
TTGTAACCTTGATACCAACC
17590
SpyCas9-
74
0






xCas-NG







675
TGC
TTGTAACCTTGATACCAACC
17591
SpyCas9-
74
0






SpG







676
TGC
TTGTAACCTTGATACCAACC
17592
SpyCas9-
74
0






SpRY







677
TTA
AGGGCAGAGCCATCTATTGC
17593
SpyCas9-
74
0






SpRY







678
TGCCCAGG
gtctTGTAACCTTGATACCAACC
17594
BlatCas9
74
0





679
TGCCC
gtctTGTAACCTTGATACCAACC
17595
BlatCas9
74
0





680
TGCC
TTGTAACCTTGATACCAACC
17596
SpyCas9-
74
0






3var-NRCH







681
CTGCCC
ctGTCTTGTAACCTTGATACCAAC
17597
Nme2Cas9
75
0





682
CTG
CTTGTAACCTTGATACCAAC
17598
ScaCas9
75
0





683
CTG
CTTGTAACCTTGATACCAAC
17599
ScaCas9-
75
0






HiFi-Sc++







684
CTG
CTTGTAACCTTGATACCAAC
17600
ScaCas9-
75
0






Sc++







685
CTG
CTTGTAACCTTGATACCAAC
17601
SpyCas9-
75
0






SpRY







686
CTT
CAGGGCAGAGCCATCTATTG
17602
SpyCas9-
75
0






SpRY







687
CTGCCCAG
tgtcTTGTAACCTTGATACCAAC
17603
BlatCas9
75
0





688
CTTACATT
agtcAGGGCAGAGCCATCTATTG
17604
BlatCas9
75
0





689
CTGCC
tgtcTTGTAACCTTGATACCAAC
17605
BlatCas9
75
0





690
CTTAC
agtcAGGGCAGAGCCATCTATTG
17606
BlatCas9
75
0





691
CCTGCC
ccTGTCTTGTAACCTTGATACCAA
17607
Nme2Cas9
76
0





692
CCT
TCTTGTAACCTTGATACCAA
17608
SpyCas9-
76
0






SpRY







693
GCT
TCAGGGCAGAGCCATCTATT
17609
SpyCas9-
76
0






SpRY







694
CCTGC
ctgtCTTGTAACCTTGATACCAA
17610
BlatCas9
76
0





695
TG
GTCAGGGCAGAGCCATCTAT
17611
SpyCas9-NG
77
0





696
TG
GTCAGGGCAGAGCCATCTAT
17612
SpyCas9-
77
0






xCas







697
TG
GTCAGGGCAGAGCCATCTAT
17613
SpyCas9-
77
0






xCas-NG







698
TGC
GTCAGGGCAGAGCCATCTAT
17614
SpyCas9-
77
0






SpG







699
TGC
GTCAGGGCAGAGCCATCTAT
17615
SpyCas9-
77
0






SpRY







700
ACC
GTCTTGTAACCTTGATACCA
17616
SpyCas9-
77
0






SpRY







701
TGCT
GTCAGGGCAGAGCCATCTAT
17617
SpyCas9-
77
0






3var-NRCH







702
TTG
AGTCAGGGCAGAGCCATCTA
17618
ScaCas9
78
0





703
TTG
AGTCAGGGCAGAGCCATCTA
17619
ScaCas9-
78
0






HiFi-Sc++







704
TTG
AGTCAGGGCAGAGCCATCTA
17620
ScaCas9-
78
0






Sc++







705
TTG
AGTCAGGGCAGAGCCATCTA
17621
SpyCas9-
78
0






SpRY







706
AAC
TGTCTTGTAACCTTGATACC
17622
SpyCas9-
78
0






SpRY







707
AACC
TGTCTTGTAACCTTGATACC
17623
SpyCas9-
78
0






3var-NRCH







708
CAA
CTGTCTTGTAACCTTGATAC
17624
SpyCas9-
79
0






SpRY







709
ATT
AAGTCAGGGCAGAGCCATCT
17625
SpyCas9-
79
0






SpRY







710
ATTGCTTA
taaaAGTCAGGGCAGAGCCATCT
17626
BlatCas9
79
0





711
CAACC
aaccTGTCTTGTAACCTTGATAC
17627
BlatCas9
79
0





712
ATTGC
taaaAGTCAGGGCAGAGCCATCT
17628
BlatCas9
79
0





713
CAAC
CTGTCTTGTAACCTTGATAC
17629
SpyCas9-
79
0






3var-NRRH







714
CAAC
ccTGTCTTGTAACCTTGATAC
17630
iSpyMacCas9
79
0





715
CCAACC
taAACCTGTCTTGTAACCTTGATA
17631
Nme2Cas9
80
0





716
TAT
AAAGTCAGGGCAGAGCCATC
17632
SpyCas9-
80
0






SpRY







717
CCA
CCTGTCTTGTAACCTTGATA
17633
SpyCas9-
80
0






SpRY







718
CCAACCTG
aaacCTGTCTTGTAACCTTGATA
17634
BlatCas9
80
0





719
CCAAC
aaacCTGTCTTGTAACCTTGATA
17635
BlatCas9
80
0





720
CCAACCT
AACCTGTCTTGTAACCTTGATA
17636
CdiCas9
80
0





721
TATTGCTT
cataAAAGTCAGGGCAGAGCCATC
17637
NmeCas9
80
0





722
TATT
AAAGTCAGGGCAGAGCCATC
17638
SpyCas9-
80
0






3var-NRTH







723
ACCAA
AACCTGTCTTGTAACCTTGAT
17639
SauCas9KKH
81
0





724
ACC
ACCTGTCTTGTAACCTTGAT
17640
SpyCas9-
81
0






SpRY







725
CTA
AAAAGTCAGGGCAGAGCCAT
17641
SpyCas9-
81
0






SpRY







726
ACCAAC
AACCTGTCTTGTAACCTTGAT
17642
cCas9-v17
81
0





727
ACCAAC
AACCTGTCTTGTAACCTTGAT
17643
cCas9-v42
81
0





728
TAC
AACCTGTCTTGTAACCTTGA
17644
SpyCas9-
82
0






SpRY







729
TCT
TAAAAGTCAGGGCAGAGCCA
17645
SpyCas9-
82
0






SpRY







730
TACC
AACCTGTCTTGTAACCTTGA
17646
SpyCas9-
82
0






3var-NRCH







731
ATCTATT
gggCATAAAAGTCAGGGCAGAGCC
17647
PpnCas9
83
0





732
ATA
AAACCTGTCTTGTAACCTTG
17648
SpyCas9-
83
0






SpRY







733
ATC
ATAAAAGTCAGGGCAGAGCC
17649
SpyCas9-
83
0






SpRY







734
ATACC
cttaAACCTGTCTTGTAACCTTG
17650
BlatCas9
83
0





735
GATACC
tcCTTAAACCTGTCTTGTAACCTT
17651
Nme2Cas9
84
0





736
GAT
TAAACCTGTCTTGTAACCTT
17652
SpyCas9-
84
0






SpRY







737
GAT
TAAACCTGTCTTGTAACCTT
17653
SpyCas9-
84
0






xCas







738
CAT
CATAAAAGTCAGGGCAGAGC
17654
SpyCas9-
84
0






SpRY







739
GATACCAA
ccttAAACCTGTCTTGTAACCTT
17655
BlatCas9
84
0





740
GATACCAA
ccttAAACCTGTCTTGTAACCTT
17656
BlatCas9
84
0





741
GATAC
ccttAAACCTGTCTTGTAACCTT
17657
BlatCas9
84
0





742
GATA
TAAACCTGTCTTGTAACCTT
17658
SpyCas9-
84
0






3var-NRTH







743
CATC
CATAAAAGTCAGGGCAGAGC
17659
SpyCas9-
84
0






3var-NRTH







744
TG
TTAAACCTGTCTTGTAACCT
17660
SpyCas9-NG
85
0





745
TG
TTAAACCTGTCTTGTAACCT
17661
SpyCas9-
85
0






xCas







746
TG
TTAAACCTGTCTTGTAACCT
17662
SpyCas9-
85
0






xCas-NG







747
TGA
TTAAACCTGTCTTGTAACCT
17663
SpyCas9-
85
0






SpG







748
TGA
TTAAACCTGTCTTGTAACCT
17664
SpyCas9-
85
0






SpRY







749
CCA
GCATAAAAGTCAGGGCAGAG
17665
SpyCas9-
85
0






SpRY







750
CCATCTAT
tgggCATAAAAGTCAGGGCAGAG
17666
BlatCas9
85
0





751
CCATC
tgggCATAAAAGTCAGGGCAGAG
17667
BlatCas9
85
0





752
TGATACC
CCTTAAACCTGTCTTGTAACCT
17668
CdiCas9
85
0





753
TGAT
TTAAACCTGTCTTGTAACCT
17669
SpyCas9-
85
0






3var-NRRH







754
TGAT
TTAAACCTGTCTTGTAACCT
17670
SpyCas9-
85
0






VQR







755
TTG
CTTAAACCTGTCTTGTAACC
17671
ScaCas9
86
0





756
TTG
CTTAAACCTGTCTTGTAACC
17672
ScaCas9-
86
0






HiFi-Sc++







757
TTG
CTTAAACCTGTCTTGTAACC
17673
ScaCas9-
86
0






Sc++







758
TTG
CTTAAACCTGTCTTGTAACC
17674
SpyCas9-
86
0






SpRY







759
GCC
GGCATAAAAGTCAGGGCAGA
17675
SpyCas9-
86
0






SpRY







760
TTGATAC
TCCTTAAACCTGTCTTGTAACC
17676
CdiCas9
86
0





761
CTTGA
TCCTTAAACCTGTCTTGTAAC
17677
SauCas9KKH
87
0





762
CTTGAT
TCCTTAAACCTGTCTTGTAAC
17678
SauCas9KKH
87
0





763
AG
GGGCATAAAAGTCAGGGCAG
17679
SpyCas9-NG
87
0





764
AG
GGGCATAAAAGTCAGGGCAG
17680
SpyCas9-
87
0






xCas







765
AG
GGGCATAAAAGTCAGGGCAG
17681
SpyCas9-
87
0






xCas-NG







766
AGC
GGGCATAAAAGTCAGGGCAG
17682
SpyCas9-
87
0






SpG







767
AGC
GGGCATAAAAGTCAGGGCAG
17683
SpyCas9-
87
0






SpRY







768
CTT
CCTTAAACCTGTCTTGTAAC
17684
SpyCas9-
87
0






SpRY







769
CTTGATAC
tcTCCTTAAACCTGTCTTGTAAC
17685
CjeCas9
87
0





770
AGCC
GGGCATAAAAGTCAGGGCAG
17686
SpyCas9-
87
0






3var-NRCH







771
GAG
TGGGCATAAAAGTCAGGGCA
17687
ScaCas9
88
0





772
GAG
TGGGCATAAAAGTCAGGGCA
17688
ScaCas9-
88
0






HiFi-Sc++







773
GAG
TGGGCATAAAAGTCAGGGCA
17689
ScaCas9-
88
0






Sc++







774
GAG
TGGGCATAAAAGTCAGGGCA
17690
SpyCas9-
88
0






SpRY







775
CCT
TCCTTAAACCTGTCTTGTAA
17691
SpyCas9-
88
0






SpRY







776
GAGCC
ggctGGGCATAAAAGTCAGGGCA
17692
BlatCas9
88
0





777
GAGCCAT
GCTGGGCATAAAAGTCAGGGCA
17693
CdiCas9
88
0





778
CCTTGATA
ggtcTCCTTAAACCTGTCTTGTAA
17694
NmeCas9
88
0





779
GAGC
TGGGCATAAAAGTCAGGGCA
17695
SpyCas9-
88
0






3var-NRRH







780
AGAGCC
agGGCTGGGCATAAAAGTCAGGGC
17696
Nme2Cas9
89
0





781
AGAG
GCTGGGCATAAAAGTCAGGGC
17697
SauriCas9-
89
0






KKH







782
AGAG
CTGGGCATAAAAGTCAGGGC
17698
SpyCas9-
89
0






VQR







783
AG
CTGGGCATAAAAGTCAGGGC
17699
SpyCas9-NG
89
0





784
AG
CTGGGCATAAAAGTCAGGGC
17700
SpyCas9-
89
0






xCas







785
AG
CTGGGCATAAAAGTCAGGGC
17701
SpyCas9-
89
0






xCas-NG







786
AGA
CTGGGCATAAAAGTCAGGGC
17702
SpyCas9-
89
0






SpG







787
AGA
CTGGGCATAAAAGTCAGGGC
17703
SpyCas9-
89
0






SpRY







788
ACC
CTCCTTAAACCTGTCTTGTA
17704
SpyCas9-
89
0






SpRY







789
AGAGCCAT
gggcTGGGCATAAAAGTCAGGGC
17705
BlatCas9
89
0





790
AGAGC
gggcTGGGCATAAAAGTCAGGGC
17706
BlatCas9
89
0





791
CAGAG
agGGCTGGGCATAAAAGTCAGGG
17707
SauCas9
90
0





792
CAGAG
GGCTGGGCATAAAAGTCAGGG
17708
SauCas9KKH
90
0





793
CAG
GCTGGGCATAAAAGTCAGGG
17709
ScaCas9
90
0





794
CAG
GCTGGGCATAAAAGTCAGGG
17710
ScaCas9-
90
0






HiFi-Sc++







795
CAG
GCTGGGCATAAAAGTCAGGG
17711
ScaCas9-
90
0






Sc++







796
CAG
GCTGGGCATAAAAGTCAGGG
17712
SpyCas9-
90
0






SpRY







797
AAC
TCTCCTTAAACCTGTCTTGT
17713
SpyCas9-
90
0






SpRY







798
CAGAGC
GGCTGGGCATAAAAGTCAGGG
17714
cCas9-v17
90
0





799
CAGAGC
GGCTGGGCATAAAAGTCAGGG
17715
cCas9-v42
90
0





800
CAGA
GCTGGGCATAAAAGTCAGGG
17716
SpyCas9-
90
0






3var-NRRH







801
AACC
TCTCCTTAAACCTGTCTTGT
17717
SpyCas9-
90
0






3var-NRCH







802
GCAGA
GGGCTGGGCATAAAAGTCAGG
17718
SauCas9KKH
91
0





803
GCAG
GGGCTGGGCATAAAAGTCAGG
17719
SauriCas9-
91
0






KKH







804
TAA
GTCTCCTTAAACCTGTCTTG
17720
SpyCas9-
91
0






SpRY







805
GCA
GGCTGGGCATAAAAGTCAGG
17721
SpyCas9-
91
0






SpRY







806
TAACCTTG
ttggTCTCCTTAAACCTGTCTTG
17722
BlatCas9
91
0





807
TAACC
ttggTCTCCTTAAACCTGTCTTG
17723
BlatCas9
91
0





808
GCAGAG
GGGCTGGGCATAAAAGTCAGG
17724
cCas9-v17
91
0





809
GCAGAG
GGGCTGGGCATAAAAGTCAGG
17725
cCas9-v42
91
0





810
TAACCTT
TGGTCTCCTTAAACCTGTCTTG
17726
CdiCas9
91
0





811
TAAC
GTCTCCTTAAACCTGTCTTG
17727
SpyCas9-
91
0






3var-NRRH







812
TAAC
ggTCTCCTTAAACCTGTCTTG
17728
iSpyMacCas9
91
0





813
GTAACC
taTTGGTCTCCTTAAACCTGTCTT
17729
Nme2Cas9
92
0





814
GGCAG
AGGGCTGGGCATAAAAGTCAG
17730
SauCas9KKH
92
0





815
GG
GGGCTGGGCATAAAAGTCAG
17731
SpyCas9-NG
92
0





816
GG
GGGCTGGGCATAAAAGTCAG
17732
SpyCas9-
92
0






xCas







817
GG
GGGCTGGGCATAAAAGTCAG
17733
SpyCas9-
92
0






xCas-NG







818
GGC
GGGCTGGGCATAAAAGTCAG
17734
SpyCas9-
92
0






SpG







819
GGC
GGGCTGGGCATAAAAGTCAG
17735
SpyCas9-
92
0






SpRY







820
GTA
GGTCTCCTTAAACCTGTCTT
17736
SpyCas9-
92
0






SpRY







821
GTAACCTT
attgGTCTCCTTAAACCTGTCTT
17737
BlatCas9
92
0





822
GTAAC
attgGTCTCCTTAAACCTGTCTT
17738
BlatCas9
92
0





823
GGCAGA
AGGGCTGGGCATAAAAGTCAG
17739
cCas9-v17
92
0





824
GGCAGA
AGGGCTGGGCATAAAAGTCAG
17740
cCas9-v42
92
0





825
GTAACCT
TTGGTCTCCTTAAACCTGTCTT
17741
CdiCas9
92
0





826
GGCA
GGGCTGGGCATAAAAGTCAG
17742
SpyCas9-
92
0






3var-NRCH







827
TGTAA
TTGGTCTCCTTAAACCTGTCT
17743
SauCas9KKH
93
0





828
GGG
AGGGCTGGGCATAAAAGTCA
17744
ScaCas9
93
0





829
GGG
AGGGCTGGGCATAAAAGTCA
17745
ScaCas9-
93
0






HiFi-Sc++







830
GGG
AGGGCTGGGCATAAAAGTCA
17746
ScaCas9-
93
0






Sc++







831
GGG
AGGGCTGGGCATAAAAGTCA
17747
SpyCas9
93
0





832
GGG
AGGGCTGGGCATAAAAGTCA
17748
SpyCas9-
93
0






HF1







833
GGG
AGGGCTGGGCATAAAAGTCA
17749
SpyCas9-
93
0






SpG







834
GGG
AGGGCTGGGCATAAAAGTCA
17750
SpyCas9-
93
0






SpRY







835
TG
TGGTCTCCTTAAACCTGTCT
17751
SpyCas9-NG
93
0





836
TG
TGGTCTCCTTAAACCTGTCT
17752
SpyCas9-
93
0






xCas







837
TG
TGGTCTCCTTAAACCTGTCT
17753
SpyCas9-
93
0






xCas-NG







838
GG
AGGGCTGGGCATAAAAGTCA
17754
SpyCas9-NG
93
0





839
GG
AGGGCTGGGCATAAAAGTCA
17755
SpyCas9-
93
0






xCas







840
GG
AGGGCTGGGCATAAAAGTCA
17756
SpyCas9-
93
0






xCas-NG







841
TGT
TGGTCTCCTTAAACCTGTCT
17757
SpyCas9-
93
0






SpG







842
TGT
TGGTCTCCTTAAACCTGTCT
17758
SpyCas9-
93
0






SpRY







843
GGGC
AGGGCTGGGCATAAAAGTCA
17759
SpyCas9-
93
0






3var-NRRH







844
TGTA
TGGTCTCCTTAAACCTGTCT
17760
SpyCas9-
93
0






3var-NRTH







845
AGGG
CCAGGGCTGGGCATAAAAGTC
17761
SauriCas9
94
0





846
AGGG
CCAGGGCTGGGCATAAAAGTC
17762
SauriCas9-
94
0






KKH







847
TTG
TTGGTCTCCTTAAACCTGTC
17763
ScaCas9
94
0





848
TTG
TTGGTCTCCTTAAACCTGTC
17764
ScaCas9-
94
0






HiFi-Sc++







849
TTG
TTGGTCTCCTTAAACCTGTC
17765
ScaCas9-
94
0






Sc++







850
TTG
TTGGTCTCCTTAAACCTGTC
17766
SpyCas9-
94
0






SpRY







851
AGG
CAGGGCTGGGCATAAAAGTC
17767
ScaCas9
94
0





852
AGG
CAGGGCTGGGCATAAAAGTC
17768
ScaCas9-
94
0






HiFi-Sc++







853
AGG
CAGGGCTGGGCATAAAAGTC
17769
ScaCas9-
94
0






Sc++







854
AGG
CAGGGCTGGGCATAAAAGTC
17770
SpyCas9
94
0





855
AGG
CAGGGCTGGGCATAAAAGTC
17771
SpyCas9-
94
0






HF1







856
AGG
CAGGGCTGGGCATAAAAGTC
17772
SpyCas9-
94
0






SpG







857
AGG
CAGGGCTGGGCATAAAAGTC
17773
SpyCas9-
94
0






SpRY







858
AG
CAGGGCTGGGCATAAAAGTC
17774
SpyCas9-NG
94
0





859
AG
CAGGGCTGGGCATAAAAGTC
17775
SpyCas9-
94
0






xCas







860
AG
CAGGGCTGGGCATAAAAGTC
17776
SpyCas9-
94
0






xCas-NG







861
AGGGCAGA
agccAGGGCTGGGCATAAAAGTC
17777
BlatCas9
94
0





862
AGGGC
agccAGGGCTGGGCATAAAAGTC
17778
BlatCas9
94
0





863
TTGTAAC
TATTGGTCTCCTTAAACCTGTC
17779
CdiCas9
94
0





864
CAGGG
gaGCCAGGGCTGGGCATAAAAGT
17780
SauCas9
95
0





865
CAGGG
GCCAGGGCTGGGCATAAAAGT
17781
SauCas9KKH
95
0





866
CAGG
GCCAGGGCTGGGCATAAAAGT
17782
SauriCas9
95
0





867
CAGG
GCCAGGGCTGGGCATAAAAGT
17783
SauriCas9-
95
0






KKH







868
CAG
CCAGGGCTGGGCATAAAAGT
17784
ScaCas9
95
0





869
CAG
CCAGGGCTGGGCATAAAAGT
17785
ScaCas9-
95
0






HiFi-Sc++







870
CAG
CCAGGGCTGGGCATAAAAGT
17786
ScaCas9-
95
0






Sc++







871
CAG
CCAGGGCTGGGCATAAAAGT
17787
SpyCas9-
95
0






SpRY







872
CTT
ATTGGTCTCCTTAAACCTGT
17788
SpyCas9-
95
0






SpRY







873
CAGGGC
GCCAGGGCTGGGCATAAAAGT
17789
cCas9-v17
95
0





874
CAGGGC
GCCAGGGCTGGGCATAAAAGT
17790
cCas9-v42
95
0





875
TCAGG
AGCCAGGGCTGGGCATAAAAG
17791
SauCas9KKH
96
0





876
TCAG
AGCCAGGGCTGGGCATAAAAG
17792
SauriCas9-
96
0






KKH







877
TCT
TATTGGTCTCCTTAAACCTG
17793
SpyCas9-
96
0






SpRY







878
TCA
GCCAGGGCTGGGCATAAAAG
17794
SpyCas9-
96
0






SpRY







879
TCAGGG
AGCCAGGGCTGGGCATAAAAG
17795
cCas9-v17
96
0





880
TCAGGG
AGCCAGGGCTGGGCATAAAAG
17796
cCas9-v42
96
0





881
GTCAG
GAGCCAGGGCTGGGCATAAAA
17797
SauCas9KKH
97
0





882
GTC
CTATTGGTCTCCTTAAACCT
17798
SpyCas9-
97
0






SpRY







883
GTC
AGCCAGGGCTGGGCATAAAA
17799
SpyCas9-
97
0






SpRY







884
GTCAGG
GAGCCAGGGCTGGGCATAAAA
17800
cCas9-v17
97
0





885
GTCAGG
GAGCCAGGGCTGGGCATAAAA
17801
cCas9-v42
97
0





886
TG
TCTATTGGTCTCCTTAAACC
17802
SpyCas9-NG
98
0





887
TG
TCTATTGGTCTCCTTAAACC
17803
SpyCas9-
98
0






xCas







888
TG
TCTATTGGTCTCCTTAAACC
17804
SpyCas9-
98
0






xCas-NG







889
AG
GAGCCAGGGCTGGGCATAAA
17805
SpyCas9-NG
98
0





890
AG
GAGCCAGGGCTGGGCATAAA
17806
SpyCas9-
98
0






xCas







891
AG
GAGCCAGGGCTGGGCATAAA
17807
SpyCas9-
98
0






xCas-NG







892
TGT
TCTATTGGTCTCCTTAAACC
17808
SpyCas9-
98
0






SpG







893
TGT
TCTATTGGTCTCCTTAAACC
17809
SpyCas9-
98
0






SpRY







894
AGT
GAGCCAGGGCTGGGCATAAA
17810
SpyCas9-
98
0






SpG







895
AGT
GAGCCAGGGCTGGGCATAAA
17811
SpyCas9-
98
0






SpRY







896
TGTC
TCTATTGGTCTCCTTAAACC
17812
SpyCas9-
98
0






3var-NRTH







897
AGTC
GAGCCAGGGCTGGGCATAAA
17813
SpyCas9-
98
0






3var-NRTH







898
CTG
TTCTATTGGTCTCCTTAAAC
17814
ScaCas9
99
0





899
CTG
TTCTATTGGTCTCCTTAAAC
17815
ScaCas9-
99
0






HiFi-Sc++







900
CTG
TTCTATTGGTCTCCTTAAAC
17816
ScaCas9-
99
0






Sc++







901
CTG
TTCTATTGGTCTCCTTAAAC
17817
SpyCas9-
99
0






SpRY







902
AAG
GGAGCCAGGGCTGGGCATAA
17818
ScaCas9
99
0





903
AAG
GGAGCCAGGGCTGGGCATAA
17819
ScaCas9-
99
0






HiFi-Sc++







904
AAG
GGAGCCAGGGCTGGGCATAA
17820
ScaCas9-
99
0






Sc++







905
AAG
GGAGCCAGGGCTGGGCATAA
17821
SpyCas9-
99
0






SpRY







906
CTGTCTTG
agttTCTATTGGTCTCCTTAAAC
17822
BlatCas9
99
0





907
AAGTCAGG
gcagGAGCCAGGGCTGGGCATAA
17823
BlatCas9
99
0





908
CTGTC
agttTCTATTGGTCTCCTTAAAC
17824
BlatCas9
99
0





909
AAGTC
gcagGAGCCAGGGCTGGGCATAA
17825
BlatCas9
99
0





910
CTGTCTT
GTTTCTATTGGTCTCCTTAAAC
17826
CdiCas9
99
0





911
AAGT
GGAGCCAGGGCTGGGCATAA
17827
SpyCas9-
99
0






3var-NRRH







912
AAAG
CAGGAGCCAGGGCTGGGCATA
17828
SauriCas9-
100
0






KKH







913
AAAG
AGGAGCCAGGGCTGGGCATA
17829
SpyCas9-
100
0






QQR1







914
AAAG
caGGAGCCAGGGCTGGGCATA
17830
iSpyMacCas9
100
0





915
AAA
AGGAGCCAGGGCTGGGCATA
17831
SpyCas9-
100
0






SpRY







916
CCT
TTTCTATTGGTCTCCTTAAA
17832
SpyCas9-
100
0






SpRY










In the exemplary template sequences provided herein, capital letters indicate “core nucleotides” while lower case letters indicate “flanking nucleotides.” Herein, when an RNA sequence (e.g., a template RNA sequence) is said to comprise a particular sequence (e.g., a sequence of Table 1 or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table 1. More specifically, the present disclosure provides an RNA sequence according to every gRNA spacer sequence shown in Table 1, wherein the RNA sequence has a U in place of each T in the sequence in Table 1.
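The case convention and the T-to-U substitution described above can be illustrated with a short sketch. The function names `to_rna` and `split_by_case` are hypothetical and are not part of the disclosure; the example spacer is the one listed in the row with ID 641 of the preceding table.

```python
def to_rna(seq: str) -> str:
    """Return seq with every T/t replaced by U/u, keeping letter case."""
    return seq.replace("T", "U").replace("t", "u")


def split_by_case(seq: str) -> tuple[str, str]:
    """Split a table sequence into (core, flanking) nucleotides.

    Uppercase letters are treated as core nucleotides and lowercase
    letters as flanking nucleotides, as stated in the text.
    """
    core = "".join(ch for ch in seq if ch.isupper())
    flanking = "".join(ch for ch in seq if ch.islower())
    return core, flanking


# Spacer copied from the row with ID 641 (illustrative input only).
spacer = "tgTAACCTTGATACCAACCTGCC"

print(to_rna(spacer))         # ugUAACCUUGAUACCAACCUGCC
print(split_by_case(spacer))  # ('TAACCTTGATACCAACCTGCC', 'tg')
```

Letter case is preserved in the RNA form so that the core and flanking portions remain distinguishable after conversion.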


In some embodiments of the systems and methods herein, the heterologous object sequence comprises the core nucleotides of an RT template sequence from Table 3. In some embodiments, the heterologous object sequence additionally comprises one or more (e.g., 2, 3, 4, 5, 10, 20, 30, 40, or all) consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence. In some embodiments, the heterologous object sequence comprises the core nucleotides of the RT template sequence of Table 3 that corresponds to the gRNA spacer sequence. In the context of the sequence tables, a first component “corresponds to” a second component when both components have the same ID number in the referenced table. For example, for a gRNA spacer of ID #1, the corresponding RT template would be the RT template also having ID #1. In some embodiments, the heterologous object sequence additionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the RT template sequence.
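As a minimal sketch of the preceding paragraph, a heterologous object sequence can be derived from an RT template entry by keeping the core nucleotides and a chosen number of flanking nucleotides counted from the 3′ end of the flank. The function name `rt_template_with_flank` is hypothetical. The sketch assumes the layout seen in the Table 3 rows (lowercase flank at the 5′ end, uppercase core at the 3′ end) and assumes that the wrapped fragments of the ID 1 entry join into a single sequence.

```python
def rt_template_with_flank(rt_template: str, n_flank: int) -> str:
    """Return the core plus the n_flank flanking nucleotides nearest the core.

    Assumed layout: lowercase flanking run (5' side) followed by an
    uppercase core run (3' side).
    """
    core = rt_template.lstrip("acgtu")            # drop the leading lowercase flank
    flank = rt_template[: len(rt_template) - len(core)]
    if not core.isupper():
        raise ValueError("expected a lowercase flank followed by an uppercase core")
    if not 0 <= n_flank <= len(flank):
        raise ValueError(f"n_flank must be between 0 and {len(flank)}")
    # Flanking nucleotides are taken starting at the 3' end of the flank.
    return flank[len(flank) - n_flank:] + core


# RT template for ID 1, joined from its wrapped fragments (assumption).
rt_id1 = (
    "cttcatccacgttcaccttgcccca"
    "cagggcagtaacggcagacttctcC"
    "T"
)

print(rt_template_with_flank(rt_id1, 0))  # CT
print(rt_template_with_flank(rt_id1, 5))  # ttctcCT
```

Passing the full flank length returns the entry unchanged, which corresponds to the "all" option in the text.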


In some embodiments, the primer binding site (PBS) sequence has a sequence comprising the core nucleotides of a PBS sequence from the same row of Table 3 as the RT template sequence. In some embodiments, the PBS sequence additionally comprises one or more (e.g., 1, 2, 3, 4, 5, 6, 7, or all) consecutive nucleotides starting with the 5′ end of the flanking nucleotides of the primer region.
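The same idea applies to the PBS sequence, except that flanking nucleotides are added starting from the 5′ end of the flank. The following is an illustrative sketch only; `pbs_with_flank` is a hypothetical name, and the layout (uppercase core at the 5′ end, lowercase flank at the 3′ end) is assumed from the PBS entries shown in Table 3.

```python
def pbs_with_flank(pbs: str, n_flank: int) -> str:
    """Return the PBS core plus the first n_flank flanking nucleotides.

    Assumed layout: uppercase core run (5' side) followed by a
    lowercase flanking run (3' side).
    """
    core = pbs.rstrip("acgtu")   # drop the trailing lowercase flank
    flank = pbs[len(core):]
    if not core.isupper():
        raise ValueError("expected an uppercase core followed by a lowercase flank")
    if not 0 <= n_flank <= len(flank):
        raise ValueError(f"n_flank must be between 0 and {len(flank)}")
    # Flanking nucleotides are taken starting at the 5' end of the flank.
    return core + flank[:n_flank]


pbs_id1 = "CAGGAGTCagatgcacc"  # PBS entry listed for ID 1 in Table 3

print(pbs_with_flank(pbs_id1, 0))  # CAGGAGTC
print(pbs_with_flank(pbs_id1, 3))  # CAGGAGTCaga
```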


Table 3: Exemplary RT Sequence (Heterologous Object Sequence) and PBS Sequence Pairs

Table 3 provides exemplified PBS sequences and heterologous object sequences (reverse transcription template regions) of a template RNA for correcting the pathogenic E6V mutation in HBB. The gRNA spacers from Table 1 were filtered, e.g., filtered by occurrence within 15 nt of the desired editing location and use of a Tier 1 Cas enzyme. PBS sequences and heterologous object sequences (reverse transcription template regions) were designed relative to the nick site directed by the cognate gRNA from Table 1, as described in this application. For exemplification, these regions were designed to be 8-17 nt (priming) and 1-50 nt extended beyond the location of the edit (RT). Without wishing to be limited by example, given variability of length, sequences are provided that use the maximum length parameters and comprise all templates of shorter length within the given parameters. Sequences are shown with uppercase letters indicating core sequence and lowercase letters indicating flanking sequence that may be truncated within the described length parameters.
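Because each maximum-length entry comprises all shorter templates, the full set of variants for a row can be enumerated by truncating the flank one nucleotide at a time. The sketch below is illustrative; the function names are hypothetical, and the ID 1 RT template is assumed to be the concatenation of its wrapped fragments.

```python
def pbs_variants(pbs_max: str) -> list[str]:
    """All PBS sequences from core-only up to the full entry.

    The flank (lowercase, 3' side) is truncated from its 3' end.
    """
    core_len = len(pbs_max.rstrip("acgtu"))
    return [pbs_max[:n] for n in range(core_len, len(pbs_max) + 1)]


def rt_variants(rt_max: str) -> list[str]:
    """All RT templates from core-only up to the full entry.

    The flank (lowercase, 5' side) is truncated from its 5' end.
    """
    core_len = len(rt_max.lstrip("acgtu"))
    total = len(rt_max)
    return [rt_max[total - n:] for n in range(core_len, total + 1)]


# Entries for ID 1 of Table 3 (RT template joined from wrapped fragments).
pbs_id1 = "CAGGAGTCagatgcacc"
rt_id1 = "cttcatccacgttcaccttgcccca" "cagggcagtaacggcagacttctcC" "T"

pbs_all = pbs_variants(pbs_id1)
rt_all = rt_variants(rt_id1)

print([len(p) for p in pbs_all])  # [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
print(len(rt_all))                # 50
```

For this row the PBS lengths span the stated 8-17 nt range. The 50 RT variants are consistent with the stated 1-50 nt range if the core-only template is read as extending 1 nt beyond the edit; that reading is an assumption of this sketch rather than a statement from the text.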


















ID
RT Template Sequence
SEQ ID NO
PBS Sequence
SEQ ID NO



















1
cttcatccacgttcaccttgcccca
17833
CAGGAGTCagatgcacc
18010



cagggcagtaacggcagacttctcC






T








2
cttcatccacgttcaccttgcccca
17834
CAGGAGTCagatgcacc
18011



cagggcagtaacggcagacttctcC






T








5
cttcatccacgttcaccttgcccca
17835
CAGGAGTCagatgcacc
18012



cagggcagtaacggcagacttctcC






T








9
cttcatccacgttcaccttgcccca
17836
CAGGAGTCagatgcacc
18013



cagggcagtaacggcagacttctcC






T








13
cttcatccacgttcaccttgcccca
17837
AGGAGTCAgatgcacca
18014



cagggcagtaacggcagacttctcC






TC








14
cttcatccacgttcaccttgcccca
17838
AGGAGTCAgatgcacca
18015



cagggcagtaacggcagacttctcC






TC








15
ctgtgttcactagcaacctcaaaca
17839
GAGAAGTCtgccgttac
18016



gacaccatggtgcatctgactcctG






AG








16
ctgtgttcactagcaacctcaaaca
17840
GAGAAGTCtgccgttac
18017



gacaccatggtgcatctgactcctG






AG








17
ctgtgttcactagcaacctcaaaca
17841
GAGAAGTCtgccgttac
18018



gacaccatggtgcatctgactcctG






AG








18
ctgtgttcactagcaacctcaaaca
17842
GAGAAGTCtgccgttac
18019



gacaccatggtgcatctgactcctG






AG








23
cttcatccacgttcaccttgcccca
17843
AGGAGTCAgatgcacca
18020



cagggcagtaacggcagacttctcC






TC








24
cttcatccacgttcaccttgcccca
17844
AGGAGTCAgatgcacca
18021



cagggcagtaacggcagacttctcC






TC








27
ctgtgttcactagcaacctcaaaca
17845
GAGAAGTCtgccgttac
18022



gacaccatggtgcatctgactcctG






AG








28
ctgtgttcactagcaacctcaaaca
17846
GAGAAGTCtgccgttac
18023



gacaccatggtgcatctgactcctG






AG








31
ctgtgttcactagcaacctcaaaca
17847
GAGAAGTCtgccgttac
18024



gacaccatggtgcatctgactcctG






AG








32
ctgtgttcactagcaacctcaaaca
17848
GAGAAGTCtgccgttac
18025



gacaccatggtgcatctgactcctG






AG








39
ctgtgttcactagcaacctcaaaca
17849
AGAAGTCTgccgttact
18026



gacaccatggtgcatctgactcctG






AGG








40
ctgtgttcactagcaacctcaaaca
17850
AGAAGTCTgccgttact
18027



gacaccatggtgcatctgactcctG






AGG








41
cttcatccacgttcaccttgcccca
17851
GGAGTCAGatgcaccat
18028



cagggcagtaacggcagacttctcC






TCA








42
ctgtgttcactagcaacctcaaaca
17852
AGAAGTCTgccgttact
18029



gacaccatggtgcatctgactcctG






AGG








43
ctgtgttcactagcaacctcaaaca
17853
AGAAGTCTgccgttact
18030



gacaccatggtgcatctgactcctG






AGG








44
cttcatccacgttcaccttgcccca
17854
GGAGTCAGatgcaccat
18031



cagggcagtaacggcagacttctcC






TCA








48
ctgtgttcactagcaacctcaaaca
17855
AGAAGTCTgccgttact
18032



gacaccatggtgcatctgactcctG






AGG








49
ctgtgttcactagcaacctcaaaca
17856
AGAAGTCTgccgttact
18033



gacaccatggtgcatctgactcctG






AGG








50
cttcatccacgttcaccttgcccca
17857
GGAGTCAGatgcaccat
18034



cagggcagtaacggcagacttctcC






TCA








54
cttcatccacgttcaccttgcccca
17858
GGAGTCAGatgcaccat
18035



cagggcagtaacggcagacttctcC






TCA








59
cttcatccacgttcaccttgcccca
17859
GAGTCAGAtgcaccatg
18036



cagggcagtaacggcagacttctcC






TCAG








60
cttcatccacgttcaccttgcccca
17860
GAGTCAGAtgcaccatg
18037



cagggcagtaacggcagacttctcC






TCAG








61
ctgtgttcactagcaacctcaaaca
17861
GAAGTCTGccgttactg
18038



gacaccatggtgcatctgactcctG






AGGA








62
ctgtgttcactagcaacctcaaaca
17862
GAAGTCTGccgttactg
18039



gacaccatggtgcatctgactcctG






AGGA








65
cttcatccacgttcaccttgcccca
17863
GAGTCAGAtgcaccatg
18040



cagggcagtaacggcagacttctcC






TCAG








66
cttcatccacgttcaccttgcccca
17864
GAGTCAGAtgcaccatg
18041



cagggcagtaacggcagacttctcC






TCAG








69
cttcatccacgttcaccttgcccca
17865
GAGTCAGAtgcaccatg
18042



cagggcagtaacggcagacttctcC






TCAG








70
cttcatccacgttcaccttgcccca
17866
GAGTCAGAtgcaccatg
18043



cagggcagtaacggcagacttctcC






TCAG








73
ctgtgttcactagcaacctcaaaca
17867
GAAGTCTGccgttactg
18044



gacaccatggtgcatctgactcctG






AGGA








79
cttcatccacgttcaccttgcccca
17868
AGTCAGATgcaccatgg
18045



cagggcagtaacggcagacttctcC






TCAGG








80
cttcatccacgttcaccttgcccca
17869
AGTCAGATgcaccatgg
18046



cagggcagtaacggcagacttctcC






TCAGG








81
ctgtgttcactagcaacctcaaaca
17870
AAGTCTGCcgttactgc
18047



gacaccatggtgcatctgactcctG






AGGAG








82
cttcatccacgttcaccttgcccca
17871
AGTCAGATgcaccatgg
18048



cagggcagtaacggcagacttctcC






TCAGG








83
cttcatccacgttcaccttgcccca
17872
AGTCAGATgcaccatgg
18049



cagggcagtaacggcagacttctcC






TCAGG








86
cttcatccacgttcaccttgcccca
17873
AGTCAGATgcaccatgg
18050



cagggcagtaacggcagacttctcC






TCAGG








87
cttcatccacgttcaccttgcccca
17874
AGTCAGATgcaccatgg
18051



cagggcagtaacggcagacttctcC






TCAGG








88
ctgtgttcactagcaacctcaaaca
17875
AAGTCTGCcgttactgc
18052



gacaccatggtgcatctgactcctG






AGGAG








94
cttcatccacgttcaccttgcccca
17876
GTCAGATGcaccatggt
18053



cagggcagtaacggcagacttctcC






TCAGGA








95
cttcatccacgttcaccttgcccca
17877
GTCAGATGcaccatggt
18054



cagggcagtaacggcagacttctcC






TCAGGA








99
cttcatccacgttcaccttgcccca
17878
GTCAGATGcaccatggt
18055



cagggcagtaacggcagacttctcC






TCAGGA








100
ctgtgttcactagcaacctcaaaca
17879
AGTCTGCCgttactgcc
18056



gacaccatggtgcatctgactcctG






AGGAGA








103
cttcatccacgttcaccttgcccca
17880
TCAGATGCaccatggtg
18057



cagggcagtaacggcagacttctcC






TCAGGAG








104
cttcatccacgttcaccttgcccca
17881
TCAGATGCaccatggtg
18058



cagggcagtaacggcagacttctcC






TCAGGAG








105
ctgtgttcactagcaacctcaaaca
17882
GTCTGCCGttactgccc
18059



gacaccatggtgcatctgactcctG






AGGAGAA








106
ctgtgttcactagcaacctcaaaca
17883
GTCTGCCGttactgccc
18060



gacaccatggtgcatctgactcctG






AGGAGAA








107
ctgtgttcactagcaacctcaaaca
17884
GTCTGCCGttactgccc
18061



gacaccatggtgcatctgactcctG






AGGAGAA








108
ctgtgttcactagcaacctcaaaca
17885
TCTGCCGTtactgccct
18062



gacaccatggtgcatctgactcctG






AGGAGAAG








109
cttcatccacgttcaccttgcccca
17886
CAGATGCAccatggtgt
18063



cagggcagtaacggcagacttctcC






TCAGGAGT








110
ctgtgttcactagcaacctcaaaca
17887
CTGCCGTTactgccctg
18064



gacaccatggtgcatctgactcctG






AGGAGAAGT








111
cttcatccacgttcaccttgcccca
17888
AGATGCACcatggtgtc
18065



cagggcagtaacggcagacttctcC






TCAGGAGTC








112
ctgtgttcactagcaacctcaaaca
17889
CTGCCGTTactgccctg
18066



gacaccatggtgcatctgactcctG






AGGAGAAGT








113
ctgtgttcactagcaacctcaaaca
17890
TGCCGTTActgccctgt
18067



gacaccatggtgcatctgactcctG






AGGAGAAGTC








114
ctgtgttcactagcaacctcaaaca
17891
TGCCGTTActgccctgt
18068



gacaccatggtgcatctgactcctG






AGGAGAAGTC








115
cttcatccacgttcaccttgcccca
17892
GATGCACCatggtgtct
18069



cagggcagtaacggcagacttctcC






TCAGGAGTCA








116
ctgtgttcactagcaacctcaaaca
17893
TGCCGTTActgccctgt
18070



gacaccatggtgcatctgactcctG






AGGAGAAGTC








117
ctgtgttcactagcaacctcaaaca
17894
GCCGTTACtgccctgtg
18071



gacaccatggtgcatctgactcctG






AGGAGAAGTCT








118
cttcatccacgttcaccttgcccca
17895
ATGCACCAtggtgtctg
18072



cagggcagtaacggcagacttctcC






TCAGGAGTCAG








119
cttcatccacgttcaccttgcccca
17896
ATGCACCAtggtgtctg
18073



cagggcagtaacggcagacttctcC






TCAGGAGTCAG








120
cttcatccacgttcaccttgcccca
17897
ATGCACCAtggtgtctg
18074



cagggcagtaacggcagacttctcC






TCAGGAGTCAG








121
cttcatccacgttcaccttgcccca
17898
TGCACCATggtgtctgt
18075



cagggcagtaacggcagacttctcC






TCAGGAGTCAGA








122
cttcatccacgttcaccttgcccca
17899
TGCACCATggtgtctgt
18076



cagggcagtaacggcagacttctcC






TCAGGAGTCAGA








123
ctgtgttcactagcaacctcaaaca
17900
CCGTTACTgccctgtgg
18077



gacaccatggtgcatctgactcctG






AGGAGAAGTCTG








124
cttcatccacgttcaccttgcccca
17901
TGCACCATggtgtctgt
18078



cagggcagtaacggcagacttctcC






TCAGGAGTCAGA








125
ctgtgttcactagcaacctcaaaca
17902
CCGTTACTgccctgtgg
18079



gacaccatggtgcatctgactcctG






AGGAGAAGTCTG








126
cttcatccacgttcaccttgcccca
17903
TGCACCATggtgtctgt
18080



cagggcagtaacggcagacttctcC






TCAGGAGTCAGA








128
cttcatccacgttcaccttgcccca
17904
GCACCATGgtgtctgtt
18081



cagggcagtaacggcagacttctcC






TCAGGAGTCAGAT








131
ctgtgttcactagcaacctcaaaca
17905
CGTTACTGccctgtggg
18082



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGC








133
cttcatccacgttcaccttgcccca
17906
GCACCATGgtgtctgtt
18083



cagggcagtaacggcagacttctcC






TCAGGAGTCAGAT








140
cttcatccacgttcaccttgcccca
17907
CACCATGGtgtctgttt
18084



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATG








141
cttcatccacgttcaccttgcccca
17908
CACCATGGtgtctgttt
18085



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATG








142
ctgtgttcactagcaacctcaaaca
17909
GTTACTGCcctgtgggg
18086



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCC








146
ctgtgttcactagcaacctcaaaca
17910
GTTACTGCcctgtgggg
18087



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCC








147
cttcatccacgttcaccttgcccca
17911
CACCATGGtgtctgttt
18088



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATG








154
cttcatccacgttcaccttgcccca
17912
ACCATGGTgtctgtttg
18089



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGC








157
ctgtgttcactagcaacctcaaaca
17913
TTACTGCCctgtggggc
18090



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCG








158
ctgtgttcactagcaacctcaaaca
17914
TTACTGCCctgtggggc
18091



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCG








159
cttcatccacgttcaccttgcccca
17915
ACCATGGTgtctgtttg
18092



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGC








160
ctgtgttcactagcaacctcaaaca
17916
TTACTGCCctgtggggc
18093



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCG








165
ctgtgttcactagcaacctcaaaca
17917
TACTGCCCtgtggggca
18094



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGT








166
ctgtgttcactagcaacctcaaaca
17918
TACTGCCCtgtggggca
18095



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGT








167
ctgtgttcactagcaacctcaaaca
17919
TACTGCCCtgtggggca
18096



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGT








168
cttcatccacgttcaccttgcccca
17920
CCATGGTGtctgtttga
18097



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCA








172
ctgtgttcactagcaacctcaaaca
17921
ACTGCCCTgtggggcaa
18098



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTT








173
ctgtgttcactagcaacctcaaaca
17922
ACTGCCCTgtggggcaa
18099



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTT








177
ctgtgttcactagcaacctcaaaca
17923
ACTGCCCTgtggggcaa
18100



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTT








178
cttcatccacgttcaccttgcccca
17924
CATGGTGTctgtttgag
18101



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCAC








186
ctgtgttcactagcaacctcaaaca
17925
CTGCCCTGtggggcaag
18102



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTA








187
ctgtgttcactagcaacctcaaaca
17926
CTGCCCTGtggggcaag
18103



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTA








190
ctgtgttcactagcaacctcaaaca
17927
CTGCCCTGtggggcaag
18104



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTA








191
ctgtgttcactagcaacctcaaaca
17928
CTGCCCTGtggggcaag
18105



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTA








194
cttcatccacgttcaccttgcccca
17929
ATGGTGTCtgtttgagg
18106



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACC








195
cttcatccacgttcaccttgcccca
17930
ATGGTGTCtgtttgagg
18107



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACC








196
cttcatccacgttcaccttgcccca
17931
ATGGTGTCtgtttgagg
18108



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACC








198
ctgtgttcactagcaacctcaaaca
17932
TGCCCTGTggggcaagg
18109



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








199
ctgtgttcactagcaacctcaaaca
17933
TGCCCTGTggggcaagg
18110



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








202
ctgtgttcactagcaacctcaaaca
17934
TGCCCTGTggggcaagg
18111



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








203
ctgtgttcactagcaacctcaaaca
17935
TGCCCTGTggggcaagg
18112



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








204
cttcatccacgttcaccttgcccca
17936
TGGTGTCTgtttgaggt
18113



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCA








208
cttcatccacgttcaccttgcccca
17937
TGGTGTCTgtttgaggt
18114



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCA








209
ctgtgttcactagcaacctcaaaca
17938
TGCCCTGTggggcaagg
18115



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








210
ctgtgttcactagcaacctcaaaca
17939
TGCCCTGTggggcaagg
18116



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTAC








212
ctgtgttcactagcaacctcaaaca
17940
GCCCTGTGgggcaaggt
18117



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACT








215
cttcatccacgttcaccttgcccca
17941
GGTGTCTGtttgaggtt
18118



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCAT








216
cttcatccacgttcaccttgcccca
17942
GGTGTCTGtttgaggtt
18119



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCAT








217
ctgtgttcactagcaacctcaaaca
17943
GCCCTGTGgggcaaggt
18120



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACT








221
cttcatccacgttcaccttgcccca
17944
GTGTCTGTttgaggttg
18121



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATG








224
ctgtgttcactagcaacctcaaaca
17945
CCCTGTGGggcaaggtg
18122



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTG








226
cttcatccacgttcaccttgcccca
17946
GTGTCTGTttgaggttg
18123



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATG








227
cttcatccacgttcaccttgcccca
17947
GTGTCTGTttgaggttg
18124



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATG








232
cttcatccacgttcaccttgcccca
17948
TGTCTGTTtgaggttgc
18125



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGG








233
cttcatccacgttcaccttgcccca
17949
TGTCTGTTtgaggttgc
18126



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGG








236
cttcatccacgttcaccttgcccca
17950
TGTCTGTTtgaggttgc
18127



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGG








237
cttcatccacgttcaccttgcccca
17951
TGTCTGTTtgaggttgc
18128



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGG








240
ctgtgttcactagcaacctcaaaca
17952
CCTGTGGGgcaaggtga
18129



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGC








241
ctgtgttcactagcaacctcaaaca
17953
CCTGTGGGgcaaggtga
18130



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGC








243
ctgtgttcactagcaacctcaaaca
17954
CTGTGGGGcaaggtgaa
18131



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCC








244
cttcatccacgttcaccttgcccca
17955
GTCTGTTTgaggttgct
18132



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGT








245
cttcatccacgttcaccttgcccca
17956
GTCTGTTTgaggttgct
18133



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGT








248
cttcatccacgttcaccttgcccca
17957
GTCTGTTTgaggttgct
18134



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGT








249
cttcatccacgttcaccttgcccca
17958
GTCTGTTTgaggttgct
18135



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGT








250
ctgtgttcactagcaacctcaaaca
17959
CTGTGGGGcaaggtgaa
18136



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCC








254
ctgtgttcactagcaacctcaaaca
17960
CTGTGGGGcaaggtgaa
18137



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCC








258
cttcatccacgttcaccttgcccca
17961
TCTGTTTGaggttgcta
18138



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTG








259
cttcatccacgttcaccttgcccca
17962
TCTGTTTGaggttgcta
18139



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTG








262
ctgtgttcactagcaacctcaaaca
17963
TGTGGGGCaaggtgaac
18140



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCC








263
ctgtgttcactagcaacctcaaaca
17964
TGTGGGGCaaggtgaac
18141



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCC








264
cttcatccacgttcaccttgcccca
17965
TCTGTTTGaggttgcta
18142



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTG








267
ctgtgttcactagcaacctcaaaca
17966
GTGGGGCAaggtgaacg
18143



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT








268
ctgtgttcactagcaacctcaaaca
17967
GTGGGGCAaggtgaacg
18144



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT








269
cttcatccacgttcaccttgcccca
17968
CTGTTTGAggttgctag
18145



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT








270
ctgtgttcactagcaacctcaaaca
17969
TGGGGCAAggtgaacgt
18146



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






G








271
ctgtgttcactagcaacctcaaaca
17970
TGGGGCAAggtgaacgt
18147



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






G








274
ctgtgttcactagcaacctcaaaca
17971
TGGGGCAAggtgaacgt
18148



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






G








278
ctgtgttcactagcaacctcaaaca
17972
TGGGGCAAggtgaacgt
18149



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






G








279
cttcatccacgttcaccttgcccca
17973
TGTTTGAGgttgctagt
18150



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






C








283
ctgtgttcactagcaacctcaaaca
17974
GGGGCAAGgtgaacgtg
18151



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GT








284
ctgtgttcactagcaacctcaaaca
17975
GGGGCAAGgtgaacgtg
18152



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GT








287
ctgtgttcactagcaacctcaaaca
17976
GGGGCAAGgtgaacgtg
18153



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GT








288
ctgtgttcactagcaacctcaaaca
17977
GGGGCAAGgtgaacgtg
18154



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GT








291
cttcatccacgttcaccttgcccca
17978
GTTTGAGGttgctagtg
18155



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CT








294
ctgtgttcactagcaacctcaaaca
17979
GGGCAAGGtgaacgtgg
18156



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








295
ctgtgttcactagcaacctcaaaca
17980
GGGCAAGGtgaacgtgg
18157



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








298
ctgtgttcactagcaacctcaaaca
17981
GGGCAAGGtgaacgtgg
18158



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








299
ctgtgttcactagcaacctcaaaca
17982
GGGCAAGGtgaacgtgg
18159



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








302
ctgtgttcactagcaacctcaaaca
17983
GGGCAAGGtgaacgtgg
18160



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








303
ctgtgttcactagcaacctcaaaca
17984
GGGCAAGGtgaacgtgg
18161



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








306
cttcatccacgttcaccttgcccca
17985
TTTGAGGTtgctagtga
18162



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTG








307
ctgtgttcactagcaacctcaaaca
17986
GGGCAAGGtgaacgtgg
18163



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








308
cttcatccacgttcaccttgcccca
17987
TTTGAGGTtgctagtga
18164



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTG








309
ctgtgttcactagcaacctcaaaca
17988
GGGCAAGGtgaacgtgg
18165



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTG








310
cttcatccacgttcaccttgcccca
17989
TTTGAGGTtgctagtga
18166



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTG








312
cttcatccacgttcaccttgcccca
17990
TTGAGGTTgctagtgaa
18167



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGT








313
ctgtgttcactagcaacctcaaaca
17991
GGCAAGGTgaacgtgga
18168



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








314
ctgtgttcactagcaacctcaaaca
17992
GGCAAGGTgaacgtgga
18169



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








315
ctgtgttcactagcaacctcaaaca
17993
GGCAAGGTgaacgtgga
18170



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








316
ctgtgttcactagcaacctcaaaca
17994
GGCAAGGTgaacgtgga
18171



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








319
ctgtgttcactagcaacctcaaaca
17995
GGCAAGGTgaacgtgga
18172



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








320
ctgtgttcactagcaacctcaaaca
17996
GGCAAGGTgaacgtgga
18173



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGG








321
cttcatccacgttcaccttgcccca
17997
TTGAGGTTgctagtgaa
18174



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGT








322
cttcatccacgttcaccttgcccca
17998
TTGAGGTTgctagtgaa
18175



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGT








323
cttcatccacgttcaccttgcccca
17999
TTGAGGTTgctagtgaa
18176



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGT








327
ctgtgttcactagcaacctcaaaca
18000
GCAAGGTGaacgtggat
18177



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGGG








328
ctgtgttcactagcaacctcaaaca
18001
GCAAGGTGaacgtggat
18178



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGGG








329
cttcatccacgttcaccttgcccca
18002
TGAGGTTGctagtgaac
18179



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGTT








333
cttcatccacgttcaccttgcccca
18003
TGAGGTTGctagtgaac
18180



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGTT








334
ctgtgttcactagcaacctcaaaca
18004
GCAAGGTGaacgtggat
18181



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGGG








340
ctgtgttcactagcaacctcaaaca
18005
CAAGGTGAacgtggatg
18182



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGGGG








343
cttcatccacgttcaccttgcccca
18006
GAGGTTGCtagtgaaca
18183



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGTTT








344
cttcatccacgttcaccttgcccca
18007
GAGGTTGCtagtgaaca
18184



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGTTT








345
ctgtgttcactagcaacctcaaaca
18008
CAAGGTGAacgtggatg
18185



gacaccatggtgcatctgactcctG






AGGAGAAGTCTGCCGTTACTGCCCT






GTGGGG








346
cttcatccacgttcaccttgcccca
18009
GAGGTTGCtagtgaaca
18186



cagggcagtaacggcagacttctcC






TCAGGAGTCAGATGCACCATGGTGT






CTGTTT









Capital letters indicate “core nucleotides” while lower case letters indicate “flanking nucleotides.” Herein, when an RNA sequence (e.g., a template RNA sequence) is said to comprise a particular sequence (e.g., a sequence of Table 3 or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table 3. More specifically, the present disclosure provides an RNA sequence according to every heterologous object sequence and PBS sequence shown in Table 3, wherein the RNA sequence has a U in place of each T in the sequence of Table 3.
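By way of non-limiting illustration, the T-to-U substitution described above may be sketched in Python as follows. The function name dna_to_rna is hypothetical and is not part of the disclosure. The sketch assumes the input is written as in the tables and preserves the upper and lower case that marks core and flanking nucleotides.

```python
# Hypothetical helper (not from the disclosure): convert a sequence written
# with T into its RNA form with U, keeping the case of each letter.
_T_TO_U = str.maketrans({"T": "U", "t": "u"})


def dna_to_rna(seq: str) -> str:
    """Return seq with every T/t replaced by U/u; other letters are unchanged."""
    return seq.translate(_T_TO_U)


if __name__ == "__main__":
    # One entry as printed in the table above.
    print(dna_to_rna("ATGGTGTCtgtttgagg"))
```

Because the case of each letter is kept, the core/flanking annotation of a table entry survives the conversion.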


In some embodiments of the systems and methods herein, the template RNA comprises a gRNA scaffold (e.g., that binds a gene modifying polypeptide, e.g., a Cas polypeptide) that comprises a sequence of a gRNA scaffold of Table 12. In some embodiments, the gRNA scaffold comprises a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to a gRNA scaffold of Table 12. In some embodiments, the gRNA scaffold comprises a sequence of a scaffold region of Table 12 that corresponds to the RT template sequence, the spacer sequence, or both, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.


In some embodiments of the systems and methods herein, the system further comprises a second strand-targeting gRNA that directs a nick to the second strand of the human HBB gene. In some embodiments, the second strand-targeting gRNA comprises a left gRNA spacer sequence or a right gRNA spacer sequence from Table 2. In some embodiments, the gRNA spacer additionally comprises one or more (e.g., 2, 3, or all) consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the left gRNA spacer sequence or right gRNA spacer sequence. In some embodiments, the second strand-targeting gRNA comprises a sequence comprising the core nucleotides of a second nick gRNA sequence from Table 4, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto. In some embodiments, the second nick gRNA sequence additionally comprises one or more consecutive nucleotides starting with the 3′ end of the flanking nucleotides of the second nick gRNA sequence. In some embodiments, the second nick gRNA comprises a gRNA scaffold sequence that is orthogonal to the Cas domain of the gene modifying polypeptide. In some embodiments, the second nick gRNA comprises a gRNA scaffold sequence of Table 12.









TABLE 2







Exemplary left gRNA spacer and right gRNA spacer pairs


Table 2 provides exemplified second-nick gRNA species for optional use for correcting the pathogenic E6V mutation in HBB. The gRNA spacers from Table 1 were filtered, e.g., filtered by occurrence within 15 nt of the desired editing location and use of a Tier 1 Cas enzyme. Second-nick gRNAs were generated by searching the opposite strand of DNA in the regions −40 to −140 (“left”) and +40 to +140 (“right”), relative to the first nick site defined by the first gRNA, for the PAM utilized by the corresponding Cas variant. One exemplary spacer is shown for each side of the target nick site.














ID
Left gRNA spacer
SEQ ID NO
Left PAM
Right gRNA spacer
SEQ ID NO
Right PAM
















1
GCCCAGTTTC
18187
TTAAA
GGCTCTGCCC
18541
CCCAG



TATTGGTCTC


TGACTTTTAT





C


G







2
GCCCAGTTTC
18188
TTAAA
GGCTCTGCCC
18542
CCCAG



TATTGGTCTC


TGACTTTTAT





C


G







5
TCTATTGGTC
18189
TG
CTGCCCTGAC
18543
AG



TCCTTAAACC


TTTTATGCCC







9
TTCTATTGGT
18190
CTG
TCTGCCCTGA
18544
CAG



CTCCTTAAAC


CTTTTATGCC







13
tgTAACCTTG
18191
CAGGG
ccTGGCTCCT
18545
CTGGG



ATACCAACCT


GCCCTCCCTG





GCC


CTC







14
TTGGTCTCCT
18192
TGTAA
GGCTCTGCCC
18546
CCCAG



TAAACCTGTC


TGACTTTTAT





T


G







15
atCAAGGTTA
18193
AGGAG
gaGCCAGGGC
18547
CAGGG



CAAGACAGGT


TGGGCATAAA





TTA


AGT







16
TTACAAGACA
18194
ACCAA
GAGCCAGGGC
18548
GTCAG



GGTTTAAGGA


TGGGCATAAA





G


A







17
atCAAGGTTA
18195
AGGAG
gaGCCAGGGC
18549
CAGGG



CAAGACAGGT


TGGGCATAAA





TTA


AGT







18
TTACAAGACA
18196
ACCAA
GAGCCAGGGC
18550
GTCAG



GGTTTAAGGA


TGGGCATAAA





G


A







23
TTCTATTGGT
18197
CTG
TCTGCCCTGA
18551
CAG



CTCCTTAAAC


CTTTTATGCC







24
TCTATTGGTC
18198
TGT
CTGCCCTGAC
18552
AGC



TCCTTAAACC


TTTTATGCCC







27
GGTTACAAGA
18199
GAG
GGAGCCAGGG
18553
AAG



CAGGTTTAAG


CTGGGCATAA







28
AAGGTTACAA
18200
AGG
CAGGGCTGGG
18554
AGG



GACAGGTTTA


CATAAAAGTC







31
GTTACAAGAC
18201
AGA
GAGCCAGGGC
18555
AGT



AGGTTTAAGG


TGGGCATAAA







32
GTTACAAGAC
18202
AG
GAGCCAGGGC
18556
AG



AGGTTTAAGG


TGGGCATAAA







39
atCAAGGTTA
18203
AGGAG
gaGCCAGGGC
18557
CAGGG



CAAGACAGGT


TGGGCATAAA





TTA


AGT







40
TTACAAGACA
18204
ACCAA
GAGCCAGGGC
18558
GTCAG



GGTTTAAGGA


TGGGCATAAA





G


A







41
TTGGTCTCCT
18205
TGTAA
GCCCTGACTT
18559
CCTGG



TAAACCTGTC


TTATGCCCAG





T


C







42
TCAAGGTTAC
18206
AAGG
GCCAGGGCTG
18560
CAGG



AAGACAGGTT


GGCATAAAAG





T


T







43
AAGGTTACAA
18207
GGAG
AGCCAGGGCT
18561
TCAG



GACAGGTTTA


GGGCATAAAA





A


G







44
TCTCCACATG
18208
TTGG
CCCTGACTTT
18562
CTGG



CCCAGTTTCT


TATGCCCAGC





A


C







48
GGTTACAAGA
18209
GAG
CCAGGGCTGG
18563
CAG



CAGGTTTAAG


GCATAAAAGT







49
TTACAAGACA
18210
GAC
AGCCAGGGCT
18564
GTC



GGTTTAAGGA


GGGCATAAAA







50
TCTATTGGTC
18211
TG
CTGCCCTGAC
18565
AG



TCCTTAAACC


TTTTATGCCC







54
CTATTGGTCT
18212
GTC
TGCCCTGACT
18566
GCC



CCTTAAACCT


TTTATGCCCA







59
tgTAACCTTG
18213
CAGGG
ccTGGCTCCT
18567
CTGGG



ATACCAACCT


GCCCTCCCTG





GCC


CTC







60
TTGGTCTCCT
18214
TGTAA
GCCCTGACTT
18568
CCTGG



TAAACCTGTC


TTATGCCCAG





T


C







61
TTACAAGACA
18215
ACCAA
AGCCAGGGCT
18569
TCAGG



GGTTTAAGGA


GGGCATAAAA





G


G







62
AAGACAGGTT
18216
ATAG
AGCCAGGGCT
18570
TCAG



TAAGGAGACC


GGGCATAAAA





A


G







65
TTGGTCTCCT
18217
TTG
CCTGACTTTT
18571
CTG



TAAACCTGTC


ATGCCCAGCC







66
TCCACATGCC
18218
TGG
CTGACTTTTA
18572
TGG



CAGTTTCTAT


TGCCCAGCCC







69
TATTGGTCTC
18219
TCT
GCCCTGACTT
18573
CCC



CTTAAACCTG


TTATGCCCAG







70
TCTATTGGTC
18220
TG
CTGCCCTGAC
18574
AG



TCCTTAAACC


TTTTATGCCC







73
TACAAGACAG
18221
ACC
GCCAGGGCTG
18575
TCA



GTTTAAGGAG


GGCATAAAAG







79
tgTAACCTTG
18222
CAGGG
ccTGGCTCCT
18576
CTGGG



ATACCAACCT


GCCCTCCCTG





GCC


CTC







80
TTGGTCTCCT
18223
TGTAA
GCCCTGACTT
18577
CCTGG



TAAACCTGTC


TTATGCCCAG





T


C







81
TTACAAGACA
18224
ACCAA
GCCAGGGCTG
18578
CAGGG



GGTTTAAGGA


GGCATAAAAG





G


T







82
TCTCCACATG
18225
TTGG
CCCTGACTTT
18579
CTGG



CCCAGTTTCT


TATGCCCAGC





A


C







83
TCTCCACATG
18226
TTGG
CCCTGACTTT
18580
CTGG



CCCAGTTTCT


TATGCCCAGC





A


C







86
TTGGTCTCCT
18227
TTG
CCTGACTTTT
18581
CTG



TAAACCTGTC


ATGCCCAGCC







87
ATTGGTCTCC
18228
CTT
CCCTGACTTT
18582
CCT



TTAAACCTGT


TATGCCCAGC







88
ACAAGACAGG
18229
CCA
CCAGGGCTGG
18583
CAG



TTTAAGGAGA


GCATAAAAGT







94
TTGGTCTCCT
18230
TGTAA
GCCCTGACTT
18584
CCTGG



TAAACCTGTC


TTATGCCCAG





T


C







95
TGGTCTCCTT
18231
TG
CTGACTTTTA
18585
TG



AAACCTGTCT


TGCCCAGCCC







99
TTGGTCTCCT
18232
TTG
CCTGACTTTT
18586
CTG



TAAACCTGTC


ATGCCCAGCC







100
CAAGACAGGT
18233
CAA
CAGGGCTGGG
18587
AGG



TTAAGGAGAC


CATAAAAGTC







103
TTGGTCTCCT
18234
TTG
CTGACTTTTA
18588
TGG



TAAACCTGTC


TGCCCAGCCC







104
TGGTCTCCTT
18235
TGT
CTGACTTTTA
18589
TGG



AAACCTGTCT


TGCCCAGCCC







105
AAGACAGGTT
18236
AAT
AGGGCTGGGC
18590
GGG



TAAGGAGACC


ATAAAAGTCA







106
agacAGGTTT
18237
GAA
agccAGGGCT
18591
AGG



AAGGAGACCA

ACT
GGGCATAAAA

GCA



ATA

GG
GTC

GA





107
agacAGGTTT
18238
GAA
agccAGGGCT
18592
AGG



AAGGAGACCA

ACT
GGGCATAAAA

GCA



ATA

GG
GTC

GA





108
AGACAGGTTT
18239
ATA
GGGCTGGGCA
18593
GGC



AAGGAGACCA


TAAAAGTCAG







109
GGTCTCCTTA
18240
GTA
TGACTTTTAT
18594
GGC



AACCTGTCTT


GCCCAGCCCT







110
GACAGGTTTA
18241
TAG
GGCTGGGCAT
18595
GCA



AGGAGACCAA


AAAAGTCAGG







111
GTCTCCTTAA
18242
TAA
GACTTTTATG
18596
GCT



ACCTGTCTTG


CCCAGCCCTG







112
agacAGGTTT
18243
GAA
gggcTGGGCA
18597
AGA



AAGGAGACCA

ACT
TAAAAGTCAG

GC



ATA

GG
GGC







113
tcAAGGTTAC
18244
GAG
agGGCTGGGC
18598
AGA



AAGACAGGTT

ACC
ATAAAAGTCA

GCC



TAAG


GGGC







114
ACAGGTTTAA
18245
AGA
GCTGGGCATA
18599
CAG



GGAGACCAAT


AAAGTCAGGG







115
TCTCCTTAAA
18246
AAC
ACTTTTATGC
18600
CTC



CCTGTCTTGT


CCAGCCCTGG







116
agacAGGTTT
18247
GAA
gggcTGGGCA
18601
AGA



AAGGAGACCA

ACT
TAAAAGTCAG

GC



ATA

GG
GGC







117
CAGGTTTAAG
18248
GAA
CTGGGCATAA
18602
AGA



GAGACCAATA


AAGTCAGGGC







118
CTCCTTAAAC
18249
ACC
CTTTTATGCC
18603
TCC



CTGTCTTGTA


CAGCCCTGGC







119
ttggTCTCCT
18250
TAA
gactTTTATG
18604
CCT



TAAACCTGTC

CCT
CCCAGCCCTG

GC



TTG

TG
GCT







120
ttggTCTCCT
18251
TAA
gactTTTATG
18605
CCT



TAAACCTGTC

CCT
CCCAGCCCTG

GC



TTG

TG
GCT







121
taTTGGTCTC
18252
GTA
tgACTTTTAT
18606
CCT



CTTAAACCTG

ACC
GCCCAGCCCT

GCC



TCTT


GGCT







122
TCCTTAAACC
18253
CCT
TTTTATGCCC
18607
CCT



TGTCTTGTAA


AGCCCTGGCT







123
AGGTTTAAGG
18254
AAA
TGGGCATAAA
18608
GAG



AGACCAATAG


AGTCAGGGCA







124
ttggTCTCCT
18255
TAA
gactTTTATG
18609
CCT



TAAACCTGTC

CCT
CCCAGCCCTG

GC



TTG

TG
GCT







125
agacAGGTTT
18256
GAA
ggctGGGCAT
18610
GAG



AAGGAGACCA

ACT
AAAAGTCAGG

CC



ATA

GG
GCA







126
ttggTCTCCT
18257
TAA
gactTTTATG
18611
CCT



TAAACCTGTC

CCT
CCCAGCCCTG

GC



TTG

TG
GCT







128
TTAAACCTGT
18258
TG
TTATGCCCAG
18612
TG



CTTGTAACCT


CCCTGGCTCC







131
GGTTTAAGGA
18259
AAC
GGGCATAAAA
18613
AGC



GACCAATAGA


GTCAGGGCAG







133
CCTTAAACCT
18260
CTT
TTTATGCCCA
18614
CTG



GTCTTGTAAC


GCCCTGGCTC







140
CTTAAACCTG
18261
TTG
TTTATGCCCA
18615
CTG



TCTTGTAACC


GCCCTGGCTC







141
CTTAAACCTG
18262
TTG
TTATGCCCAG
18616
TGC



TCTTGTAACC


CCCTGGCTCC







142
TTAAGGAGAC
18263
TG
GGGCATAAAA
18617
AG



CAATAGAAAC


GTCAGGGCAG







146
GTTTAAGGAG
18264
ACT
GGCATAAAAG
18618
GCC



ACCAATAGAA


TCAGGGCAGA







147
ccttAAACCT
18265
GAT
ctttTATGCC
18619
TGCCC



GTCTTGTAAC

ACC
CAGCCCTGGC





CTT

AA
TCC







154
TCCTTAAACC
18266
CTT
GCCCTGACTT
18620
CCTGG



TGTCTTGTAA

GAT
TTATGCCCAG





C


C







157
TTTAAGGAGA
18267
CTG
TGGGCATAAA
18621
GAG



CCAATAGAAA


AGTCAGGGCA







158
TTTAAGGAGA
18268
CTG
GCATAAAAGT
18622
CCA



CCAATAGAAA


CAGGGCAGAG







159
TTAAACCTGT
18269
TGA
TATGCCCAGC
18623
GCC



CTTGTAACCT


CCTGGCTCCT







160
ggttTAAGGA
18270
TGG
tgggCATAAA
18624
CCA



GACCAATAGA

GC
AGTCAGGGCA

TC



AAC


GAG







165
GTTTAAGGAG
18271
CTG
GGCTGGGCAT
18625
CAG



ACCAATAGAA

GG
AAAAGTCAGG

AG



A


G







166
TTTAAGGAGA
18272
TGGG
GCTGGGCATA
18626
AGAG



CCAATAGAAA


AAAGTCAGGG





C


C







167
TTAAGGAGAC
18273
TGG
CATAAAAGTC
18627
CAT



CAATAGAAAC


AGGGCAGAGC







168
TAAACCTGTC
18274
GAT
ATGCCCAGCC
18628
CCC



TTGTAACCTT


CTGGCTCCTG







172
GTTTAAGGAG
18275
CTG
GGCTGGGCAT
18629
CAG



ACCAATAGAA

GG
AAAAGTCAGG

AG



A


G







173
TAAGGAGACC
18276
GG
GGGCATAAAA
18630
AG



AATAGAAACT


GTCAGGGCAG







177
TAAGGAGACC
18277
GGG
ATAAAAGTCA
18631
ATC



AATAGAAACT


GGGCAGAGCC







178
AAACCTGTCT
18278
ATA
TGCCCAGCCC
18632
CCT



TGTAACCTTG


TGGCTCCTGC







186
TAAGGAGACC
18279
GGG
AGTCAGGGCA
18633
TTG



AATAGAAACT


GAGCCATCTA







187
TAAGGAGACC
18280
GGG
AGGGCTGGGC
18634
GGG



AATAGAAACT


ATAAAAGTCA







190
AAGGAGACCA
18281
GGC
TAAAAGTCAG
18635
TCT



ATAGAAACTG


GGCAGAGCCA







191
AAGGAGACCA
18282
GG
GTCAGGGCAG
18636
TG



ATAGAAACTG


AGCCATCTAT







194
AACCTGTCTT
18283
TAC
GCCCAGCCCT
18637
CTC



GTAACCTTGA


GGCTCCTGCC







195
cttaAACCTG
18284
ATA
tatgCCCAGC
18638
CTC



TCTTGTAACC

CC
CCTGGCTCCT

CC



TTG


GCC







196
cttaAACCTG
18285
ATA
tatgCCCAGC
18639
CTC



TCTTGTAACC

CC
CCTGGCTCCT

CC



TTG


GCC







198
TTTAAGGAGA
18286
TGGG
CCAGGGCTGG
18640
AGGG



CCAATAGAAA


GCATAAAAGT





C


C







199
TTTAAGGAGA
18287
TGGG
GCTGGGCATA
18641
AGAG



CCAATAGAAA


AAAGTCAGGG





C


C







202
TAAGGAGACC
18288
GGG
AGTCAGGGCA
18642
TTG



AATAGAAACT


GAGCCATCTA







203
AGGAGACCAA
18289
GCA
AAAAGTCAGG
18643
CTA



TAGAAACTGG


GCAGAGCCAT







204
TTAAACCTGT
18290
TG
GCCCTGGCTC
18644
TG



CTTGTAACCT


CTGCCCTCCC







208
ACCTGTCTTG
18291
ACC
CCCAGCCCTG
18645
TCC



TAACCTTGAT


GCTCCTGCCC







209
ggttTAAGGA
18292
TGG
taaaAGTCAG
18646
ATT



GACCAATAGA

GC
GGCAGAGCCA

GC



AAC


TCT







210
ggttTAAGGA
18293
TGG
taaaAGTCAG
18647
ATT



GACCAATAGA

GC
GGCAGAGCCA

GC



AAC


TCT







212
GAGACCAATA
18294
TGT
GGCTGGGCAT
18648
CAG



GAAACTGGGC

GG
AAAAGTCAGG

AG



A


G







215
CTTGTAACCT
18295
CTG
AGCCCTGGCT
18649
CTG



TGATACCAAC


CCTGCCCTCC







216
CCTGTCTTGT
18296
CCA
CCAGCCCTGG
18650
CCC



AACCTTGATA


CTCCTGCCCT







217
GGAGACCAAT
18297
CAT
AAAGTCAGGG
18651
TAT



AGAAACTGGG


CAGAGCCATC







221
TTGTAACCTT
18298
TG
GCCCTGGCTC
18652
TG



GATACCAACC


CTGCCCTCCC







224
GAGACCAATA
18299
ATG
AAGTCAGGGC
18653
ATT



GAAACTGGGC


AGAGCCATCT







226
CTGTCTTGTA
18300
CAA
CAGCCCTGGC
18654
CCT



ACCTTGATAC


TCCTGCCCTC







227
aaccTGTCTT
18301
CAA
gcccAGCCCT
18655
CCTGC



GTAACCTTGA

CC
GGCTCCTGCC





TAC


CTC







232
CTTGTAACCT
18302
CTG
AGCCCTGGCT
18656
CTG



TGATACCAAC


CCTGCCCTCC







233
ACCTTGATAC
18303
AGG
GCTCCTGCCC
18657
TGG



CAACCTGCCC


TCCCTGCTCC







236
TGTCTTGTAA
18304
AAC
AGCCCTGGCT
18658
CTG



CCTTGATACC


CCTGCCCTCC







237
TTGTAACCTT
18305
TG
GCCCTGGCTC
18659
TG



GATACCAACC


CTGCCCTCCC







240
AGACCAATAG
18306
TGT
AGTCAGGGCA
18660
TTG



AAACTGGGCA


GAGCCATCTA







241
gaccAATAGA
18307
GAG
taaaAGTCAG
18661
ATTGC



AACTGGGCAT

ACA
GGCAGAGCCA





GTG

GA
TCT







243
AGACCAATAG
18308
GTGGA
GGCTGGGCAT
18662
CAGAG



AAACTGGGCA


AAAAGTCAGG





T


G







244
TAACCTTGAT
18309
CAGG
TGGCTCCTGC
18663
CTGG



ACCAACCTGC


CCTCCCTGCT





C


C







245
GTAACCTTGA
18310
CCAG
TGGCTCCTGC
18664
CTGG



TACCAACCTG


CCTCCCTGCT





C


C







248
CTTGTAACCT
18311
CTG
AGCCCTGGCT
18665
CTG



TGATACCAAC


CCTGCCCTCC







249
GTCTTGTAAC
18312
ACC
GCCCTGGCTC
18666
TGC



CTTGATACCA


CTGCCCTCCC







250
AGACCAATAG
18313
TG
GTCAGGGCAG
18667
TG



AAACTGGGCA


AGCCATCTAT







254
GACCAATAGA
18314
GTG
GTCAGGGCAG
18668
TGC



AACTGGGCAT


AGCCATCTAT







258
TGTAACCTTG
18315
CCCAG
CTGGCTCCTG
18669
CCT



ATACCAACCT


CCCTCCCTGC

GG



G


T







259
TGTAACCTTG
18316
CCCAG
CTGGCTCCTG
18670
CCT



ATACCAACCT


CCCTCCCTGC

GG



G


T







262
ACCAATAGAA
18317
TGG
AGTCAGGGCA
18671
TTG



ACTGGGCATG


GAGCCATCTA







263
ACCAATAGAA
18318
TGG
TCAGGGCAGA
18672
GCT



ACTGGGCATG


GCCATCTATT







264
TCTTGTAACC
18319
CCT
CCCTGGCTCC
18673
GCT



TTGATACCAA


TGCCCTCCCT







267
ACCAATAGAA
18320
GGAG
GCTGGGCATA
18674
AGAG



ACTGGGCATG


AAAGTCAGGG





T


C







268
CCAATAGAAA
18321
GGA
CAGGGCAGAG
18675
CTT



CTGGGCATGT


CCATCTATTG







269
CTTGTAACCT
18322
CTG
CCTGGCTCCT
18676
CTC



TGATACCAAC


GCCCTCCCTG







270
ACCAATAGAA
18323
GGA
CATCTATTGC
18677
TCT



ACTGGGCATG

GA
TTACATTTGC

GA



T


T







271
ACCAATAGAA
18324
GGA
CATCTATTGC
18678
TCT



ACTGGGCATG

GA
TTACATTTGC

GA



T


T







274
CCAATAGAAA
18325
GG
GTCAGGGCAG
18679
TG



CTGGGCATGT


AGCCATCTAT







278
CAATAGAAAC
18326
GAG
AGGGCAGAGC
18680
TTA



TGGGCATGTG


CATCTATTGC







279
TTGTAACCTT
18327
TGC
CTGGCTCCTG
18681
TCC



GATACCAACC


CCCTCCCTGC







283
CAATAGAAAC
18328
GAG
GAGCCATCTA
18682
TTG



TGGGCATGTG


TTGCTTACAT







284
ACCAATAGAA
18329
TGG
AGGGCTGGGC
18683
GGG



ACTGGGCATG


ATAAAAGTCA







287
AATAGAAACT
18330
AGA
GGGCAGAGCC
18684
TAC



GGGCATGTGG


ATCTATTGCT







288
AATAGAAACT
18331
AG
GTCAGGGCAG
18685
TG



GGGCATGTGG


AGCCATCTAT







291
TGTAACCTTG
18332
GCC
TGGCTCCTGC
18686
CCT



ATACCAACCT


CCTCCCTGCT







294
AGACCAATAG
18333
GTGG
CCAGGGCTGG
18687
AGGG



AAACTGGGCA


GCATAAAAGT





T


C







295
ATAGAAACTG
18334
ACAG
GCTGGGCATA
18688
AGAG



GGCATGTGGA


AAAGTCAGGG





G


C







298
CAATAGAAAC
18335
GAG
GAGCCATCTA
18689
TTG



TGGGCATGTG


TTGCTTACAT







299
ACCAATAGAA
18336
TGG
AGGGCTGGGC
18690
GGG



ACTGGGCATG


ATAAAAGTCA







302
ATAGAAACTG
18337
GAC
GGCAGAGCCA
18691
ACA



GGCATGTGGA


TCTATTGCTT







303
AATAGAAACT
18338
AG
GTCAGGGCAG
18692
TG



GGGCATGTGG


AGCCATCTAT







306
GTAACCTTGA
18339
CCC
GGCTCCTGCC
18693
CTG



TACCAACCTG


CTCCCTGCTC







307
gaccAATAGA
18340
GAG
agtcAGGGCA
18694
CTT



AACTGGGCAT

ACA
GAGCCATCTA

ACA



GTG

GA
TTG

TT





308
gtctTGTAAC
18341
TGC
cagcCCTGGC
18695
GCT



CTTGATACCA

CCA
TCCTGCCCTC

CCT



ACC

GG
CCT

GG





309
gaccAATAGA
18342
GAG
agtcAGGGCA
18696
CTT



AACTGGGCAT

ACA
GAGCCATCTA

ACA



GTG

GA
TTG

TT





310
gtctTGTAAC
18343
TGC
cagcCCTGGC
18697
GCT



CTTGATACCA

CCA
TCCTGCCCTC

CCT



ACC

GG
CCT

GG





312
tgTAACCTTG
18344
AGG
ccCAGCCCTG
18698
TGC



ATACCAACCT

GCC
GCTCCTGCCC

TCC



GCCC


TCCC







313
aaTAGAAACT
18345
CAG
agGGCTGGGC
18699
CAG



GGGCATGTGG

AG
ATAAAAGTCA

AG



AGA


GGG







314
ATAGAAACTG
18346
ACA
CATCTATTGC
18700
TCT



GGCATGTGGA

GA
TTACATTTGC

GA



G


T







315
AGACCAATAG
18347
GTGG
CCAGGGCTGG
18701
AGGG



AAACTGGGCA


GCATAAAAGT





T


C







316
ATAGAAACTG
18348
ACAG
GCTGGGCATA
18702
AGAG



GGCATGTGGA


AAAGTCAGGG





G


C







319
AGAAACTGGG
18349
CAG
GAGCCATCTA
18703
TTG



CATGTGGAGA


TTGCTTACAT







320
TAGAAACTGG
18350
ACA
GCAGAGCCAT
18704
CAT



GCATGTGGAG


CTATTGCTTA







321
TAACCTTGAT
18351
CCA
GCTCCTGCCC
18705
TGG



ACCAACCTGC


TCCCTGCTCC







322
gtaaCCTTGA
18352
AGG
cagcCCTGGC
18706
GCTC



TACCAACCTG

GC
TCCTGCCCTC

CTGG



CCC


CCT







323
gtaaCCTTGA
18353
AGG
cagcCCTGGC
18707
GCT



TACCAACCTG

GC
TCCTGCCCTC

CCT



CCC


CCT

GG





327
TAGAAACTGG
18354
CAG
CATCTATTGC
18708
TCT



GCATGTGGAG

AG
TTACATTTGC

GA



A


T







328
ATAGAAACTG
18355
ACAG
GCTGGGCATA
18709
AGAG



GGCATGTGGA


AAAGTCAGGG





G


C







329
ACCTTGATAC
18356
AG
CTCCTGCCCT
18710
GG



CAACCTGCCC


CCCTGCTCCT







333
AACCTTGATA
18357
CAG
CTCCTGCCCT
18711
GGG



CCAACCTGCC


CCCTGCTCCT







334
AGAAACTGGG
18358
CAG
CAGAGCCATC
18712
ATT



CATGTGGAGA


TATTGCTTAC







340
AGAAACTGGG
18359
AGA
CATCTATTGC
18713
TCTGA



CATGTGGAGA

GA
TTACATTTGC





C


T







343
ACCTTGATAC
18360
AGG
CCTGCCCTCC
18714
GAG



CAACCTGCCC


CTGCTCCTGG







344
ACCTTGATAC
18361
AGG
TCCTGCCCTC
18715
GGA



CAACCTGCCC


CCTGCTCCTG







345
GAAACTGGGC
18362
AGA
AGAGCCATCT
18716
TTT



ATGTGGAGAC


ATTGCTTACA







346
gtaaCCTTGA
18363
AGG
cagcCCTGGC
18717
GCT



TACCAACCTG

GC
TCCTGCCCTC

CCT



CCC


CCT

GG









Capital letters indicate “core nucleotides” while lower case letters indicate “flanking nucleotides.” Herein, when an RNA sequence (e.g., a gRNA to produce a second nick) is said to comprise a particular sequence (e.g., a sequence of Table 2 or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table 2. More specifically, the present disclosure provides an RNA sequence according to every gRNA spacer sequence shown in Table 2, wherein the RNA sequence has a U in place of each T in the sequence in Table 2.
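By way of non-limiting illustration, the core/flanking convention of Table 2 may be sketched in Python as follows. The sketch assumes that each spacer is written with its lower case flanking nucleotides at the 5′ end followed by upper case core nucleotides, as in the Table 2 entries above. It further assumes that extending a spacer "starting with the 3′ end of the flanking nucleotides" means adding the flanking nucleotides nearest to the core first. The names split_flank_core and extend_core are hypothetical and are not part of the disclosure.

```python
from typing import Tuple


def split_flank_core(spacer: str) -> Tuple[str, str]:
    """Split a spacer into (5' lower case flank, upper case core)."""
    i = 0
    while i < len(spacer) and spacer[i].islower():
        i += 1
    flank, core = spacer[:i], spacer[i:]
    if not core or not core.isupper():
        raise ValueError("expected lower case flank followed by upper case core")
    return flank, core


def extend_core(spacer: str, k: int) -> str:
    """Return the core preceded by the k flanking nucleotides nearest to it."""
    flank, core = split_flank_core(spacer)
    if not 0 <= k <= len(flank):
        raise ValueError("k must be between 0 and the number of flanking nucleotides")
    # Slice from an explicit index so that k == 0 yields an empty extension.
    return flank[len(flank) - k:] + core


if __name__ == "__main__":
    # Left gRNA spacer of ID 13 in Table 2, written as one string.
    spacer = "tgTAACCTTGATACCAACCTGCC"
    print(split_flank_core(spacer))
    print(extend_core(spacer, 1))
```

With k equal to 0 the function returns the core nucleotides alone. With k equal to the number of flanking nucleotides it returns the full spacer as printed.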


In some embodiments, the systems and methods provided herein may comprise a template sequence listed in Table 4. Table 4 provides exemplary template RNA sequences (column 4) and optional second-nick gRNA sequences (column 5) designed to be paired with a gene modifying polypeptide to correct a mutation in the HBB gene. The templates in Table 4 are meant to exemplify the total sequence of: (1) gRNA spacer (e.g., for targeting for first strand nick), (2) gRNA scaffold, (3) heterologous object sequence, and (4) PBS sequence (e.g., for initiating TPRT at first strand nick).
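By way of non-limiting illustration, the four-part composition described above may be sketched in Python as follows, using the template RNA printed for gRNA ID 5 in Table 4. The sketch assumes that the four listed components are joined in the listed order from 5′ to 3′. It infers the spacer/scaffold boundary from the scaffold sequence shared with the second-nick gRNA of the same row. It takes the final 12 nt as the PBS, following the PBS length stated in the Table 4 caption. These boundaries are assumptions made for illustration, and the function name assemble_template is hypothetical.

```python
def assemble_template(spacer: str, scaffold: str, heterologous_object: str, pbs: str) -> str:
    """Join the four template RNA components in 5'-to-3' order (assumed order)."""
    return spacer + scaffold + heterologous_object + pbs


# Components inferred from the gRNA ID 5 row of Table 4 (boundaries are assumptions).
SPACER = "GGTGCATCTGACTCCTGTGG"
SCAFFOLD = (
    "GTTTT"
    "AGAGCTAGAAATAGCAAGTTAAAAT"
    "AAGGCTAGTCCGTTATCAACTTGAA"
    "AAAGTGGCACCGAGTCGGTGC"
)
TAIL = "ccca" "cagggcagtaacggcagacttctcC" "TCAGGAGTCagat"

PBS_LEN = 12  # PBS length stated in the Table 4 caption
HETEROLOGOUS_OBJECT, PBS = TAIL[:-PBS_LEN], TAIL[-PBS_LEN:]

TEMPLATE = assemble_template(SPACER, SCAFFOLD, HETEROLOGOUS_OBJECT, PBS)

# The same row as printed in Table 4, one table line per chunk.
ROW_AS_PRINTED = "".join([
    "GGTGCATCTGACTCCTGTGGGTTTT",
    "AGAGCTAGAAATAGCAAGTTAAAAT",
    "AAGGCTAGTCCGTTATCAACTTGAA",
    "AAAGTGGCACCGAGTCGGTGCccca",
    "cagggcagtaacggcagacttctcC",
    "TCAGGAGTCagat",
])

if __name__ == "__main__":
    print(TEMPLATE == ROW_AS_PRINTED)
    print(len(HETEROLOGOUS_OBJECT), len(PBS))
```

Under these assumptions, joining the four parts reproduces the printed row, and the remainder after the 12 nt PBS is split off is 30 nt long.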









TABLE 4







Exemplary template RNA sequences and second nick gRNA sequences


Table 4 provides design of RNA components of gene modifying systems for correcting the pathogenic E6V mutation in HBB. The gRNA spacers from Table 1 were filtered, e.g., filtered by occurrence within 15 nt of the desired editing location and use of a Tier 1 Cas enzyme. For each gRNA ID, this table details the sequence of a complete template RNA, optional second-nick gRNA, and Cas variant for use in a Cas-RT fusion gene modifying polypeptide. For exemplification, PBS sequences and post-edit homology regions (after the location of the edit) are set to 12 nt and 30 nt, respectively. Additionally, a second-nick gRNA is selected with preference for a distance near 100 nt from the first nick and a first preference for a design resulting in a PAM-in system, as described elsewhere in this application.

















ID
Cas species
strand
Template RNA
SEQ ID NO
second-nick gRNA
SEQ ID NO
















1
SauCas9KKH

TGGTGCATCTGACTCCTGTGGGTTT
18895
GCCCAGTTTCTATTGGTCTCCGTTT
19072





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAccc

TCTCGTCAACTTGTTGGCGAGA






acagggcagtaacggcagacttctc








CTCAGGAGTCagat








2
SauCas9KKH

TGGTGCATCTGACTCCTGTGGGTTT
18896
GCCCAGTTTCTATTGGTCTCCGTTT
19073





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAccc

TCTCGTCAACTTGTTGGCGAGA






acagggcagtaacggcagacttctc








CTCAGGAGTCagat








5
SpyCas9-

GGTGCATCTGACTCCTGTGGGTTTT
18897
TCTATTGGTCTCCTTAAACCGTTTT
19074



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCccca

AAAGTGGCACCGAGTCGGTGC






cagggcagtaacggcagacttctcC








TCAGGAGTCagat








9
SpyCas9-

GGTGCATCTGACTCCTGTGGGTTTT
18898
TTCTATTGGTCTCCTTAAACGTTTT
19075



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCccca

AAAGTGGCACCGAGTCGGTGC






cagggcagtaacggcagacttctcC








TCAGGAGTCagat








13
SauCas9

ccATGGTGCATCTGACTCCTGTGGT
18899
tgTAACCTTGATACCAACCTGCCGT
19076





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAc

TATCTCGTCAACTTGTTGGCGAGA






cacagggcagtaacggcagacttct








cCTCAGGAGTCAgatg








14
SauCas9KKH

ATGGTGCATCTGACTCCTGTGGTTT
18900
TTGGTCTCCTTAAACCTGTCTGTTT
19077





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcca

TCTCGTCAACTTGTTGGCGAGA






cagggcagtaacggcagacttctcC








TCAGGAGTCAgatg








15
SauCas9
+
gcAGTAACGGCAGACTTCTCCACGT
18901
gaGCCAGGGCTGGGCATAAAAGTGT
19078





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAa

TATCTCGTCAACTTGTTGGCGAGA






cagacaccatggtgcatctgactcc








tGAGGAGAAGTCtgcc








16
SauCas9KKH
+
AGTAACGGCAGACTTCTCCACGTTT
18902
GAGCCAGGGCTGGGCATAAAAGTTT
19079





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAaca

TCTCGTCAACTTGTTGGCGAGA






gacaccatggtgcatctgactcctG








AGGAGAAGTCtgcc








17
SauCas9
+
gcAGTAACGGCAGACTTCTCCACGT
18903
gaGCCAGGGCTGGGCATAAAAGTGT
19080





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAa

TATCTCGTCAACTTGTTGGCGAGA






cagacaccatggtgcatctgactcc








tGAGGAGAAGTCtgcc








18
SauCas9KKH
+
AGTAACGGCAGACTTCTCCACGTTT
18904
GAGCCAGGGCTGGGCATAAAAGTTT
19081





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAaca

TCTCGTCAACTTGTTGGCGAGA






gacaccatggtgcatctgactcctG








AGGAGAAGTCtgcc








23
ScaCas9-

TGGTGCATCTGACTCCTGTGGTTTT
18905
TTCTATTGGTCTCCTTAAACGTTTT
19082



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCccac

AAAGTGGCACCGAGTCGGTGC






agggcagtaacggcagacttctcCT








CAGGAGTCAgatg








24
SpyCas9-

TGGTGCATCTGACTCCTGTGGTTTT
18906
TCTATTGGTCTCCTTAAACCGTTTT
19083



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCccac

AAAGTGGCACCGAGTCGGTGC






agggcagtaacggcagacttctcCT








CAGGAGTCAgatg








27
ScaCas9-
+
GTAACGGCAGACTTCTCCACGTTTT
18907
GGAGCCAGGGCTGGGCATAAGTTTT
19084



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






acaccatggtgcatctgactcctGA








GGAGAAGTCtgcc








28
SpyCas9
+
GTAACGGCAGACTTCTCCACGTTTT
18908
CAGGGCTGGGCATAAAAGTCGTTTT
19085





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






acaccatggtgcatctgactcctGA








GGAGAAGTCtgcc








31
SpyCas9-
+
GTAACGGCAGACTTCTCCACGTTTT
18909
GAGCCAGGGCTGGGCATAAAGTTTT
19086



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






acaccatggtgcatctgactcctGA








GGAGAAGTCtgcc








32
SpyCas9-
+
GTAACGGCAGACTTCTCCACGTTTT
18910
GAGCCAGGGCTGGGCATAAAGTTTT
19087



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






acaccatggtgcatctgactcctGA








GGAGAAGTCtgcc








39
SauCas9
+
ggCAGTAACGGCAGACTTCTCCAGT
18911
gaGCCAGGGCTGGGCATAAAAGTGT
19088





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAc

TATCTCGTCAACTTGTTGGCGAGA






agacaccatggtgcatctgactcct








GAGGAGAAGTCTgccg








40
SauCas9KKH
+
CAGTAACGGCAGACTTCTCCAGTTT
18912
GAGCCAGGGCTGGGCATAAAAGTTT
19089





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






acaccatggtgcatctgactcctGA








GGAGAAGTCTgccg








41
SauCas9KKH

CATGGTGCATCTGACTCCTGTGTTT
18913
TTGGTCTCCTTAAACCTGTCTGTTT
19090





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcac

TCTCGTCAACTTGTTGGCGAGA






agggcagtaacggcagacttctcCT








CAGGAGTCAGatgc








42
SauriCas9
+
CAGTAACGGCAGACTTCTCCAGTTT
18914
GCCAGGGCTGGGCATAAAAGTGTTT
19091





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






acaccatggtgcatctgactcctGA








GGAGAAGTCTgccg








43
SauriCas9-
+
CAGTAACGGCAGACTTCTCCAGTTT
18915
AGCCAGGGCTGGGCATAAAAGGTTT
19092



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






acaccatggtgcatctgactcctGA








GGAGAAGTCTgccg








44
SauriCas9-

CATGGTGCATCTGACTCCTGTGTTT
18916
TCTCCACATGCCCAGTTTCTAGTTT
19093



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcac

TCTCGTCAACTTGTTGGCGAGA






agggcagtaacggcagacttctcCT








CAGGAGTCAGatgc








48
ScaCas9-
+
AGTAACGGCAGACTTCTCCAGTTTT
18917
CCAGGGCTGGGCATAAAAGTGTTTT
19094



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaga

AAAGTGGCACCGAGTCGGTGC






caccatggtgcatctgactcctGAG








GAGAAGTCTgccg








49
SpyCas9-
+
AGTAACGGCAGACTTCTCCAGTTTT
18918
AGCCAGGGCTGGGCATAAAAGTTTT
19095



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaga

AAAGTGGCACCGAGTCGGTGC






caccatggtgcatctgactcctGAG








GAGAAGTCTgccg








50
SpyCas9-

ATGGTGCATCTGACTCCTGTGTTTT
18919
TCTATTGGTCTCCTTAAACCGTTTT
19096



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaca

AAAGTGGCACCGAGTCGGTGC






gggcagtaacggcagacttctcCTC








AGGAGTCAGatgc








54
SpyCas9-

ATGGTGCATCTGACTCCTGTGTTTT
18920
CTATTGGTCTCCTTAAACCTGTTTT
19097



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaca

AAAGTGGCACCGAGTCGGTGC






gggcagtaacggcagacttctcCTC








AGGAGTCAGatgc








59
SauCas9

caCCATGGTGCATCTGACTCCTGGT
18921
tgTAACCTTGATACCAACCTGCCGT
19098





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAa

TATCTCGTCAACTTGTTGGCGAGA






cagggcagtaacggcagacttctcC








TCAGGAGTCAGAtgca








60
SauCas9KKH

CCATGGTGCATCTGACTCCTGGTTT
18922
TTGGTCTCCTTAAACCTGTCTGTTT
19099





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAaca

TCTCGTCAACTTGTTGGCGAGA






gggcagtaacggcagacttctcCTC








AGGAGTCAGAtgca








61
SauCas9KKH
+
GCAGTAACGGCAGACTTCTCCGTTT
18923
AGCCAGGGCTGGGCATAAAAGGTTT
19100





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAaga

TCTCGTCAACTTGTTGGCGAGA






caccatggtgcatctgactcctGAG








GAGAAGTCTGccgt








62
SauriCas9-
+
GCAGTAACGGCAGACTTCTCCGTTT
18924
AGCCAGGGCTGGGCATAAAAGGTTT
19101



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAaga

TCTCGTCAACTTGTTGGCGAGA






caccatggtgcatctgactcctGAG








GAGAAGTCTGccgt








65
ScaCas9-

CATGGTGCATCTGACTCCTGGTTTT
18925
TTGGTCTCCTTAAACCTGTCGTTTT
19102



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






ggcagtaacggcagacttctcCTCA








GGAGTCAGAtgca








66
SpyCas9

CATGGTGCATCTGACTCCTGGTTTT
18926
TCCACATGCCCAGTTTCTATGTTTT
19103





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






ggcagtaacggcagacttctcCTCA








GGAGTCAGAtgca








69
SpyCas9-

CATGGTGCATCTGACTCCTGGTTTT
18927
TATTGGTCTCCTTAAACCTGGTTTT
19104



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






ggcagtaacggcagacttctcCTCA








GGAGTCAGAtgca








70
SpyCas9-

CATGGTGCATCTGACTCCTGGTTTT
18928
TCTATTGGTCTCCTTAAACCGTTTT
19105



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacag

AAAGTGGCACCGAGTCGGTGC






ggcagtaacggcagacttctcCTCA








GGAGTCAGAtgca








73
SpyCas9-
+
CAGTAACGGCAGACTTCTCCGTTTT
18929
GCCAGGGCTGGGCATAAAAGGTTTT
19106



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCagac

AAAGTGGCACCGAGTCGGTGC






accatggtgcatctgactcctGAGG








AGAAGTCTGccgt








79
SauCas9

acACCATGGTGCATCTGACTCCTGT
18930
tgTAACCTTGATACCAACCTGCCGT
19107





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAc

TATCTCGTCAACTTGTTGGCGAGA






agggcagtaacggcagacttctcCT








CAGGAGTCAGATgcac








80
SauCas9KKH

ACCATGGTGCATCTGACTCCTGTTT
18931
TTGGTCTCCTTAAACCTGTCTGTTT
19108





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






ggcagtaacggcagacttctcCTCA








GGAGTCAGATgcac








81
SauCas9KKH
+
GGCAGTAACGGCAGACTTCTCGTTT
18932
GCCAGGGCTGGGCATAAAAGTGTTT
19109





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAgac

TCTCGTCAACTTGTTGGCGAGA






accatggtgcatctgactcctGAGG








AGAAGTCTGCcgtt








82
SauriCas9

ACCATGGTGCATCTGACTCCTGTTT
18933
TCTCCACATGCCCAGTTTCTAGTTT
19110





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






ggcagtaacggcagacttctcCTCA








GGAGTCAGATgcac








83
SauriCas9-

ACCATGGTGCATCTGACTCCTGTTT
18934
TCTCCACATGCCCAGTTTCTAGTTT
19111



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcag

TCTCGTCAACTTGTTGGCGAGA






ggcagtaacggcagacttctcCTCA








GGAGTCAGATgcac








86
ScaCas9-

CCATGGTGCATCTGACTCCTGTTTT
18935
TTGGTCTCCTTAAACCTGTCGTTTT
19112



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcagg

AAAGTGGCACCGAGTCGGTGC






gcagtaacggcagacttctcCTCAG








GAGTCAGATgcac








87
SpyCas9-

CCATGGTGCATCTGACTCCTGTTTT
18936
ATTGGTCTCCTTAAACCTGTGTTTT
19113



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcagg

AAAGTGGCACCGAGTCGGTGC






gcagtaacggcagacttctcCTCAG








GAGTCAGATgcac








88
SpyCas9-
+
GCAGTAACGGCAGACTTCTCGTTTT
18937
CCAGGGCTGGGCATAAAAGTGTTTT
19114



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgaca

AAAGTGGCACCGAGTCGGTGC






ccatggtgcatctgactcctGAGGA








GAAGTCTGCcgtt








94
SauCas9

CACCATGGTGCATCTGACTCCGTTT
18938
TTGGTCTCCTTAAACCTGTCTGTTT
19115



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAagg

TCTCGTCAACTTGTTGGCGAGA






gcagtaacggcagacttctcCTCAG








GAGTCAGATGcacc








95
SpyCas9-

ACCATGGTGCATCTGACTCCGTTTT
18939
TGGTCTCCTTAAACCTGTCTGTTTT
19116



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCaggg

AAAGTGGCACCGAGTCGGTGC






cagtaacggcagacttctcCTCAGG








AGTCAGATGcacc








99
SpyCas9-

ACCATGGTGCATCTGACTCCGTTTT
18940
TTGGTCTCCTTAAACCTGTCGTTTT
19117



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCaggg

AAAGTGGCACCGAGTCGGTGC






cagtaacggcagacttctcCTCAGG








AGTCAGATGcacc








100
SpyCas9-
+
GGCAGTAACGGCAGACTTCTGTTTT
18941
CAGGGCTGGGCATAAAAGTCGTTTT
19118



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacac

AAAGTGGCACCGAGTCGGTGC






catggtgcatctgactcctGAGGAG








AAGTCTGCCgtta








103
ScaCas9-

CACCATGGTGCATCTGACTCGTTTT
18942
TTGGTCTCCTTAAACCTGTCGTTTT
19119



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgggc

AAAGTGGCACCGAGTCGGTGC






agtaacggcagacttctcCTCAGGA








GTCAGATGCacca








104
SpyCas9-

CACCATGGTGCATCTGACTCGTTTT
18943
TGGTCTCCTTAAACCTGTCTGTTTT
19120



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgggc

AAAGTGGCACCGAGTCGGTGC






agtaacggcagacttctcCTCAGGA








GTCAGATGCacca








105
SpyCas9-
+
GGGCAGTAACGGCAGACTTCGTTTT
18944
AGGGCTGGGCATAAAAGTCAGTTTT
19121



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcacc

AAAGTGGCACCGAGTCGGTGC






atggtgcatctgactcctGAGGAGA








AGTCTGCCGttac








106
BlatCas9
+
acagGGCAGTAACGGCAGACTTCGC
18945
agccAGGGCTGGGCATAAAAGTCGC
19122





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTcacca

GCATTTATCTCCGAGGTGCT






tggtgcatctgactcctGAGGAGAA








GTCTGCCGttac








107
BlatCas9
+
acagGGCAGTAACGGCAGACTTCGC
18946
agccAGGGCTGGGCATAAAAGTCGC
19123





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTcacca

GCATTTATCTCCGAGGTGCT






tggtgcatctgactcctGAGGAGAA








GTCTGCCGttac








108
SpyCas9-
+
AGGGCAGTAACGGCAGACTTGTTTT
18947
GGGCTGGGCATAAAAGTCAGGTTTT
19124



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacca

AAAGTGGCACCGAGTCGGTGC






tggtgcatctgactcctGAGGAGAA








GTCTGCCGTtact








109
SpyCas9-

ACACCATGGTGCATCTGACTGTTTT
18948
GGTCTCCTTAAACCTGTCTTGTTTT
19125



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCggca

AAAGTGGCACCGAGTCGGTGC






gtaacggcagacttctcCTCAGGAG








TCAGATGCAccat








110
SpyCas9-
+
CAGGGCAGTAACGGCAGACTGTTTT
18949
GGCTGGGCATAAAAGTCAGGGTTTT
19126



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCccat

AAAGTGGCACCGAGTCGGTGC






ggtgcatctgactcctGAGGAGAAG








TCTGCCGTTactg








111
SpyCas9-

GACACCATGGTGCATCTGACGTTTT
18950
GTCTCCTTAAACCTGTCTTGGTTTT
19127



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgcag

AAAGTGGCACCGAGTCGGTGC






taacggcagacttctcCTCAGGAGT








CAGATGCACcatg








112
BlatCas9
+
ccacAGGGCAGTAACGGCAGACTGC
18951
gggcTGGGCATAAAAGTCAGGGCGC
19128





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTccatg

GCATTTATCTCCGAGGTGCT






gtgcatctgactcctGAGGAGAAGT








CTGCCGTTactg








113
Nme2Cas9
+
ccCCACAGGGCAGTAACGGCAGACG
18952
agGGCTGGGCATAAAAGTCAGGGCG
19129





TTGTAGCTCCCTTTCTCATTTCGGA

TTGTAGCTCCCTTTCTCATTTCGGA






AACGAAATGAGAACCGTTGCTACAA

AACGAAATGAGAACCGTTGCTACAA






TAAGGCCGTCTGAAAAGATGTGCCG

TAAGGCCGTCTGAAAAGATGTGCCG






CAACGCTCTGCCCCTTAAAGCTTCT

CAACGCTCTGCCCCTTAAAGCTTCT






GCTTTAAGGGGCATCGTTTAcatgg

GCTTTAAGGGGCATCGTTTA






tgcatctgactcctGAGGAGAAGTC








TGCCGTTActgc








114
SpyCas9-
+
ACAGGGCAGTAACGGCAGACGTTTT
18953
GCTGGGCATAAAAGTCAGGGGTTTT
19130



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcatg

AAAGTGGCACCGAGTCGGTGC






gtgcatctgactcctGAGGAGAAGT








CTGCCGTTActgc








115
SpyCas9-

AGACACCATGGTGCATCTGAGTTTT
18954
TCTCCTTAAACCTGTCTTGTGTTTT
19131



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcagt

AAAGTGGCACCGAGTCGGTGC






aacggcagacttctcCTCAGGAGTC








AGATGCACCatgg








116
BlatCas9
+
cccaCAGGGCAGTAACGGCAGACGC
18955
gggcTGGGCATAAAAGTCAGGGCGC
19132





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTcatgg

GCATTTATCTCCGAGGTGCT






tgcatctgactcctGAGGAGAAGTC








TGCCGTTActgc








117
SpyCas9-
+
CACAGGGCAGTAACGGCAGAGTTTT
18956
CTGGGCATAAAAGTCAGGGCGTTTT
19133



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCatgg

AAAGTGGCACCGAGTCGGTGC






tgcatctgactcctGAGGAGAAGTC








TGCCGTTACtgcc








118
SpyCas9-

CAGACACCATGGTGCATCTGGTTTT
18957
CTCCTTAAACCTGTCTTGTAGTTTT
19134



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCagta

AAAGTGGCACCGAGTCGGTGC






acggcagacttctcCTCAGGAGTCA








GATGCACCAtggt








119
BlatCas9

aaacAGACACCATGGTGCATCTGGC
18958
ttggTCTCCTTAAACCTGTCTTGGC
19135





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTagtaa

GCATTTATCTCCGAGGTGCT






cggcagacttctcCTCAGGAGTCAG








ATGCACCAtggt








120
BlatCas9

aaacAGACACCATGGTGCATCTGGC
18959
ttggTCTCCTTAAACCTGTCTTGGC
19136





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTagtaa

GCATTTATCTCCGAGGTGCT






cggcagacttctcCTCAGGAGTCAG








ATGCACCAtggt








121
Nme2Cas9

tcAAACAGACACCATGGTGCATCTG
18960
taTTGGTCTCCTTAAACCTGTCTTG
19137





TTGTAGCTCCCTTTCTCATTTCGGA

TTGTAGCTCCCTTTCTCATTTCGGA






AACGAAATGAGAACCGTTGCTACAA

AACGAAATGAGAACCGTTGCTACAA






TAAGGCCGTCTGAAAAGATGTGCCG

TAAGGCCGTCTGAAAAGATGTGCCG






CAACGCTCTGCCCCTTAAAGCTTCT

CAACGCTCTGCCCCTTAAAGCTTCT






GCTTTAAGGGGCATCGTTTAgtaac

GCTTTAAGGGGCATCGTTTA






ggcagacttctcCTCAGGAGTCAGA








TGCACCATggtg








122
SpyCas9-

ACAGACACCATGGTGCATCTGTTTT
18961
TCCTTAAACCTGTCTTGTAAGTTTT
19138



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgtaa

AAAGTGGCACCGAGTCGGTGC






cggcagacttctcCTCAGGAGTCAG








ATGCACCATggtg








123
SpyCas9-
+
CCACAGGGCAGTAACGGCAGGTTTT
18962
TGGGCATAAAAGTCAGGGCAGTTTT
19139



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtggt

AAAGTGGCACCGAGTCGGTGC






gcatctgactcctGAGGAGAAGTCT








GCCGTTACTgccc








124
BlatCas9

caaaCAGACACCATGGTGCATCTGC
18963
ttggTCTCCTTAAACCTGTCTTGGC
19140





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgtaac

GCATTTATCTCCGAGGTGCT






ggcagacttctcCTCAGGAGTCAGA








TGCACCATggtg








125
BlatCas9
+
gcccCACAGGGCAGTAACGGCAGGC
18964
ggctGGGCATAAAAGTCAGGGCAGC
19141





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtggtg

GCATTTATCTCCGAGGTGCT






catctgactcctGAGGAGAAGTCTG








CCGTTACTgccc








126
BlatCas9

caaaCAGACACCATGGTGCATCTGC
18965
ttggTCTCCTTAAACCTGTCTTGGC
19142





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgtaac

GCATTTATCTCCGAGGTGCT






ggcagacttctcCTCAGGAGTCAGA








TGCACCATggtg








128
SpyCas9-

AACAGACACCATGGTGCATCGTTTT
18966
TTAAACCTGTCTTGTAACCTGTTTT
19143



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtaac

AAAGTGGCACCGAGTCGGTGC






ggcagacttctcCTCAGGAGTCAGA








TGCACCATGgtgt








131
SpyCas9-
+
CCCACAGGGCAGTAACGGCAGTTTT
18967
GGGCATAAAAGTCAGGGCAGGTTTT
19144



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCggtg

AAAGTGGCACCGAGTCGGTGC






catctgactcctGAGGAGAAGTCTG








CCGTTACTGccct








133
SpyCas9-

AACAGACACCATGGTGCATCGTTTT
18968
CCTTAAACCTGTCTTGTAACGTTTT
19145



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtaac

AAAGTGGCACCGAGTCGGTGC






ggcagacttctcCTCAGGAGTCAGA








TGCACCATGgtgt








140
ScaCas9-

AAACAGACACCATGGTGCATGTTTT
18969
CTTAAACCTGTCTTGTAACCGTTTT
19146



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCaacg

AAAGTGGCACCGAGTCGGTGC






gcagacttctcCTCAGGAGTCAGAT








GCACCATGGtgtc








141
SpyCas9-

AAACAGACACCATGGTGCATGTTTT
18970
CTTAAACCTGTCTTGTAACCGTTTT
19147



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCaacg

AAAGTGGCACCGAGTCGGTGC






gcagacttctcCTCAGGAGTCAGAT








GCACCATGGtgtc








142
SpyCas9-
+
CCCCACAGGGCAGTAACGGCGTTTT
18971
GGGCATAAAAGTCAGGGCAGGTTTT
19148



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgtgc

AAAGTGGCACCGAGTCGGTGC






atctgactcctGAGGAGAAGTCTGC








CGTTACTGCcctg








146
SpyCas9-
+
CCCCACAGGGCAGTAACGGCGTTTT
18972
GGCATAAAAGTCAGGGCAGAGTTTT
19149



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgtgc

AAAGTGGCACCGAGTCGGTGC






atctgactcctGAGGAGAAGTCTGC








CGTTACTGCcctg








147
BlatCas9

ctcaAACAGACACCATGGTGCATGC
18973
ccttAAACCTGTCTTGTAACCTTGC
19150





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTaacgg

GCATTTATCTCCGAGGTGCT






cagacttctcCTCAGGAGTCAGATG








CACCATGGtgtc








154
SauCas9KKH

TCAAACAGACACCATGGTGCAGTTT
18974
TCCTTAAACCTGTCTTGTAACGTTT
19151





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAacg

TCTCGTCAACTTGTTGGCGAGA






gcagacttctcCTCAGGAGTCAGAT








GCACCATGGTgtct








157
ScaCas9-
+
GCCCCACAGGGCAGTAACGGGTTTT
18975
TGGGCATAAAAGTCAGGGCAGTTTT
19152



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtgca

AAAGTGGCACCGAGTCGGTGC






tctgactcctGAGGAGAAGTCTGCC








GTTACTGCCctgt








158
SpyCas9-
+
GCCCCACAGGGCAGTAACGGGTTTT
18976
GCATAAAAGTCAGGGCAGAGGTTTT
19153



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtgca

AAAGTGGCACCGAGTCGGTGC






tctgactcctGAGGAGAAGTCTGCC








GTTACTGCCctgt








159
SpyCas9-

CAAACAGACACCATGGTGCAGTTTT
18977
TTAAACCTGTCTTGTAACCTGTTTT
19154



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCacgg

AAAGTGGCACCGAGTCGGTGC






cagacttctcCTCAGGAGTCAGATG








CACCATGGTgtct








160
BlatCas9
+
cttgCCCCACAGGGCAGTAACGGGC
18978
tgggCATAAAAGTCAGGGCAGAGGC
19155





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtgcat

GCATTTATCTCCGAGGTGCT






ctgactcctGAGGAGAAGTCTGCCG








TTACTGCCctgt








165
SauCas9KKH
+
TTGCCCCACAGGGCAGTAACGGTTT
18979
GGCTGGGCATAAAAGTCAGGGGTTT
19156





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAgca

TCTCGTCAACTTGTTGGCGAGA






tctgactcctGAGGAGAAGTCTGCC








GTTACTGCCCtgtg








166
SauriCas9-
+
TTGCCCCACAGGGCAGTAACGGTTT
18980
GCTGGGCATAAAAGTCAGGGCGTTT
19157



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAgca

TCTCGTCAACTTGTTGGCGAGA






tctgactcctGAGGAGAAGTCTGCC








GTTACTGCCCtgtg








167
SpyCas9-
+
TGCCCCACAGGGCAGTAACGGTTTT
18981
CATAAAAGTCAGGGCAGAGCGTTTT
19158



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgcat

AAAGTGGCACCGAGTCGGTGC






ctgactcctGAGGAGAAGTCTGCCG








TTACTGCCCtgtg








168
SpyCas9-

TCAAACAGACACCATGGTGCGTTTT
18982
TAAACCTGTCTTGTAACCTTGTTTT
19159



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcggc

AAAGTGGCACCGAGTCGGTGC






agacttctcCTCAGGAGTCAGATGC








ACCATGGTGtctg








172
SauCas9KKH
+
CTTGCCCCACAGGGCAGTAACGTTT
18983
GGCTGGGCATAAAAGTCAGGGGTTT
19160





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcat

TCTCGTCAACTTGTTGGCGAGA






ctgactcctGAGGAGAAGTCTGCCG








TTACTGCCCTgtgg








173
SpyCas9-
+
TTGCCCCACAGGGCAGTAACGTTTT
18984
GGGCATAAAAGTCAGGGCAGGTTTT
19161



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcatc

AAAGTGGCACCGAGTCGGTGC






tgactcctGAGGAGAAGTCTGCCGT








TACTGCCCTgtgg








177
SpyCas9-
+
TTGCCCCACAGGGCAGTAACGTTTT
18985
ATAAAAGTCAGGGCAGAGCCGTTTT
19162



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcatc

AAAGTGGCACCGAGTCGGTGC






tgactcctGAGGAGAAGTCTGCCGT








TACTGCCCTgtgg








178
SpyCas9-

CTCAAACAGACACCATGGTGGTTTT
18986
AAACCTGTCTTGTAACCTTGGTTTT
19163



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCggca

AAAGTGGCACCGAGTCGGTGC






gacttctcCTCAGGAGTCAGATGCA








CCATGGTGTctgt








186
ScaCas9-
+
CTTGCCCCACAGGGCAGTAAGTTTT
18987
AGTCAGGGCAGAGCCATCTAGTTTT
19164



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCatct

AAAGTGGCACCGAGTCGGTGC






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGtggg








187
SpyCas9
+
CTTGCCCCACAGGGCAGTAAGTTTT
18988
AGGGCTGGGCATAAAAGTCAGTTTT
19165





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCatct

AAAGTGGCACCGAGTCGGTGC






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGtggg








190
SpyCas9-
+
CTTGCCCCACAGGGCAGTAAGTTTT
18989
TAAAAGTCAGGGCAGAGCCAGTTTT
19166



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCatct

AAAGTGGCACCGAGTCGGTGC






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGtggg








191
SpyCas9-
+
CTTGCCCCACAGGGCAGTAAGTTTT
18990
GTCAGGGCAGAGCCATCTATGTTTT
19167



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCatct

AAAGTGGCACCGAGTCGGTGC






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGtggg








194
SpyCas9-

CCTCAAACAGACACCATGGTGTTTT
18991
AACCTGTCTTGTAACCTTGAGTTTT
19168



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgcag

AAAGTGGCACCGAGTCGGTGC






acttctcCTCAGGAGTCAGATGCAC








CATGGTGTCtgtt








195
BlatCas9

caacCTCAAACAGACACCATGGTGC
18992
cttaAACCTGTCTTGTAACCTTGGC
19169





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgcaga

GCATTTATCTCCGAGGTGCT






cttctcCTCAGGAGTCAGATGCACC








ATGGTGTCtgtt








196
BlatCas9

caacCTCAAACAGACACCATGGTGC
18993
cttaAACCTGTCTTGTAACCTTGGC
19170





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgcaga

GCATTTATCTCCGAGGTGCT






cttctcCTCAGGAGTCAGATGCACC








ATGGTGTCtgtt








198
SauriCas9
+
ACCTTGCCCCACAGGGCAGTAGTTT
18994
CCAGGGCTGGGCATAAAAGTCGTTT
19171





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAtct

TCTCGTCAACTTGTTGGCGAGA






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGTgggg








199
SauriCas9-
+
ACCTTGCCCCACAGGGCAGTAGTTT
18995
GCTGGGCATAAAAGTCAGGGCGTTT
19172



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAtct

TCTCGTCAACTTGTTGGCGAGA






gactcctGAGGAGAAGTCTGCCGTT








ACTGCCCTGTgggg








202
ScaCas9-
+
CCTTGCCCCACAGGGCAGTAGTTTT
18996
AGTCAGGGCAGAGCCATCTAGTTTT
19173



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtctg

AAAGTGGCACCGAGTCGGTGC






actcctGAGGAGAAGTCTGCCGTTA








CTGCCCTGTgggg








203
SpyCas9-
+
CCTTGCCCCACAGGGCAGTAGTTTT
18997
AAAAGTCAGGGCAGAGCCATGTTTT
19174



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtctg

AAAGTGGCACCGAGTCGGTGC






actcctGAGGAGAAGTCTGCCGTTA








CTGCCCTGTgggg








204
SpyCas9-

ACCTCAAACAGACACCATGGGTTTT
18998
TTAAACCTGTCTTGTAACCTGTTTT
19175



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaga

AAAGTGGCACCGAGTCGGTGC






cttctcCTCAGGAGTCAGATGCACC








ATGGTGTCTgttt








208
SpyCas9-

ACCTCAAACAGACACCATGGGTTTT
18999
ACCTGTCTTGTAACCTTGATGTTTT
19176



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcaga

AAAGTGGCACCGAGTCGGTGC






cttctcCTCAGGAGTCAGATGCACC








ATGGTGTCTgttt








209
BlatCas9
+
tcacCTTGCCCCACAGGGCAGTAGC
19000
taaaAGTCAGGGCAGAGCCATCTGC
19177





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtctga

GCATTTATCTCCGAGGTGCT






ctcctGAGGAGAAGTCTGCCGTTAC








TGCCCTGTgggg








210
BlatCas9
+
tcacCTTGCCCCACAGGGCAGTAGC
19001
taaaAGTCAGGGCAGAGCCATCTGC
19178





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtctga

GCATTTATCTCCGAGGTGCT






ctcctGAGGAGAAGTCTGCCGTTAC








TGCCCTGTgggg








212
SauCas9KKH
+
CACCTTGCCCCACAGGGCAGTGTTT
19002
GGCTGGGCATAAAAGTCAGGGGTTT
19179





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGActg

TCTCGTCAACTTGTTGGCGAGA






actcctGAGGAGAAGTCTGCCGTTA








CTGCCCTGTGgggc








215
ScaCas9-

AACCTCAAACAGACACCATGGTTTT
19003
CTTGTAACCTTGATACCAACGTTTT
19180



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCagac

AAAGTGGCACCGAGTCGGTGC






ttctcCTCAGGAGTCAGATGCACCA








TGGTGTCTGtttg








216
SpyCas9-

AACCTCAAACAGACACCATGGTTTT
19004
CCTGTCTTGTAACCTTGATAGTTTT
19181



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCagac

AAAGTGGCACCGAGTCGGTGC






ttctcCTCAGGAGTCAGATGCACCA








TGGTGTCTGtttg








217
SpyCas9-
+
ACCTTGCCCCACAGGGCAGTGTTTT
19005
AAAGTCAGGGCAGAGCCATCGTTTT
19182



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctga

AAAGTGGCACCGAGTCGGTGC






ctcctGAGGAGAAGTCTGCCGTTAC








TGCCCTGTGgggc








221
SpyCas9-

CAACCTCAAACAGACACCATGTTTT
19006
TTGTAACCTTGATACCAACCGTTTT
19183



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgact

AAAGTGGCACCGAGTCGGTGC






tctcCTCAGGAGTCAGATGCACCAT








GGTGTCTGTttga








224
SpyCas9-
+
CACCTTGCCCCACAGGGCAGGTTTT
19007
AAGTCAGGGCAGAGCCATCTGTTTT
19184



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtgac

AAAGTGGCACCGAGTCGGTGC






tcctGAGGAGAAGTCTGCCGTTACT








GCCCTGTGGggca








226
SpyCas9-

CAACCTCAAACAGACACCATGTTTT
19008
CTGTCTTGTAACCTTGATACGTTTT
19185



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgact

AAAGTGGCACCGAGTCGGTGC






tctcCTCAGGAGTCAGATGCACCAT








GGTGTCTGTttga








227
BlatCas9

tagcAACCTCAAACAGACACCATGC
19009
aaccTGTCTTGTAACCTTGATACGC
19186





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgactt

GCATTTATCTCCGAGGTGCT






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTttga








232
ScaCas9-

GCAACCTCAAACAGACACCAGTTTT
19010
CTTGTAACCTTGATACCAACGTTTT
19187



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactt

AAAGTGGCACCGAGTCGGTGC






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTtgag








233
SpyCas9

GCAACCTCAAACAGACACCAGTTTT
19011
ACCTTGATACCAACCTGCCCGTTTT
19188





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactt

AAAGTGGCACCGAGTCGGTGC






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTtgag








236
SpyCas9-

GCAACCTCAAACAGACACCAGTTTT
19012
TGTCTTGTAACCTTGATACCGTTTT
19189



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactt

AAAGTGGCACCGAGTCGGTGC






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTtgag








237
SpyCas9-

GCAACCTCAAACAGACACCAGTTTT
19013
TTGTAACCTTGATACCAACCGTTTT
19190



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactt

AAAGTGGCACCGAGTCGGTGC






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTtgag








240
SpyCas9-
+
TCACCTTGCCCCACAGGGCAGTTTT
19014
AGTCAGGGCAGAGCCATCTAGTTTT
19191



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCgact

AAAGTGGCACCGAGTCGGTGC






cctGAGGAGAAGTCTGCCGTTACTG








CCCTGTGGGgcaa








241
BlatCas9
+
cgttCACCTTGCCCCACAGGGCAGC
19015
taaaAGTCAGGGCAGAGCCATCTGC
19192





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTgactc

GCATTTATCTCCGAGGTGCT






ctGAGGAGAAGTCTGCCGTTACTGC








CCTGTGGGgcaa








243
SauCas9KKH
+
GTTCACCTTGCCCCACAGGGCGTTT
19016
GGCTGGGCATAAAAGTCAGGGGTTT
19193





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAact

TCTCGTCAACTTGTTGGCGAGA






cctGAGGAGAAGTCTGCCGTTACTG








CCCTGTGGGGcaag








244
SauriCas9

TAGCAACCTCAAACAGACACCGTTT
19017
TAACCTTGATACCAACCTGCCGTTT
19194





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGActt

TCTCGTCAACTTGTTGGCGAGA






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTTgagg








245
SauriCas9-

TAGCAACCTCAAACAGACACCGTTT
19018
GTAACCTTGATACCAACCTGCGTTT
19195



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGActt

TCTCGTCAACTTGTTGGCGAGA






ctcCTCAGGAGTCAGATGCACCATG








GTGTCTGTTTgagg








248
ScaCas9-

AGCAACCTCAAACAGACACCGTTTT
19019
CTTGTAACCTTGATACCAACGTTTT
19196



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcttc

AAAGTGGCACCGAGTCGGTGC






tcCTCAGGAGTCAGATGCACCATGG








TGTCTGTTTgagg








249
SpyCas9-

AGCAACCTCAAACAGACACCGTTTT
19020
GTCTTGTAACCTTGATACCAGTTTT
19197



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcttc

AAAGTGGCACCGAGTCGGTGC






tcCTCAGGAGTCAGATGCACCATGG








TGTCTGTTTgagg








250
SpyCas9-
+
TTCACCTTGCCCCACAGGGCGTTTT
19021
GTCAGGGCAGAGCCATCTATGTTTT
19198



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactc

AAAGTGGCACCGAGTCGGTGC






ctGAGGAGAAGTCTGCCGTTACTGC








CCTGTGGGGcaag








254
SpyCas9-
+
TTCACCTTGCCCCACAGGGCGTTTT
19022
GTCAGGGCAGAGCCATCTATGTTTT
19199



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCactc

AAAGTGGCACCGAGTCGGTGC






ctGAGGAGAAGTCTGCCGTTACTGC








CCTGTGGGGcaag








258
SauCas9KKH

CTAGCAACCTCAAACAGACACGTTT
19023
TGTAACCTTGATACCAACCTGGTTT
19200





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAttc

TCTCGTCAACTTGTTGGCGAGA






tcCTCAGGAGTCAGATGCACCATGG








TGTCTGTTTGaggt








259
SauCas9KKH

CTAGCAACCTCAAACAGACACGTTT
19024
TGTAACCTTGATACCAACCTGGTTT
19201





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAttc

TCTCGTCAACTTGTTGGCGAGA






tcCTCAGGAGTCAGATGCACCATGG








TGTCTGTTTGaggt








262
ScaCas9-
+
GTTCACCTTGCCCCACAGGGGTTTT
19025
AGTCAGGGCAGAGCCATCTAGTTTT
19202



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctcc

AAAGTGGCACCGAGTCGGTGC






tGAGGAGAAGTCTGCCGTTACTGCC








CTGTGGGGCaagg








263
SpyCas9-
+
GTTCACCTTGCCCCACAGGGGTTTT
19026
TCAGGGCAGAGCCATCTATTGTTTT
19203



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctcc

AAAGTGGCACCGAGTCGGTGC






tGAGGAGAAGTCTGCCGTTACTGCC








CTGTGGGGCaagg








264
SpyCas9-

TAGCAACCTCAAACAGACACGTTTT
19027
TCTTGTAACCTTGATACCAAGTTTT
19204



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCttct

AAAGTGGCACCGAGTCGGTGC






cCTCAGGAGTCAGATGCACCATGGT








GTCTGTTTGaggt








267
SauriCas9-
+
ACGTTCACCTTGCCCCACAGGGTTT
19028
GCTGGGCATAAAAGTCAGGGCGTTT
19205



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAtcc

TCTCGTCAACTTGTTGGCGAGA






tGAGGAGAAGTCTGCCGTTACTGCC








CTGTGGGGCAaggt








268
SpyCas9-
+
CGTTCACCTTGCCCCACAGGGTTTT
19029
CAGGGCAGAGCCATCTATTGGTTTT
19206



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtcct

AAAGTGGCACCGAGTCGGTGC






GAGGAGAAGTCTGCCGTTACTGCCC








TGTGGGGCAaggt








269
SpyCas9-

CTAGCAACCTCAAACAGACAGTTTT
19030
CTTGTAACCTTGATACCAACGTTTT
19207



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtctc

AAAGTGGCACCGAGTCGGTGC






CTCAGGAGTCAGATGCACCATGGTG








TCTGTTTGAggtt








270
SauCas9KKH
+
CACGTTCACCTTGCCCCACAGGTTT
19031
CATCTATTGCTTACATTTGCTGTTT
19208





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcct

TCTCGTCAACTTGTTGGCGAGA






GAGGAGAAGTCTGCCGTTACTGCCC








TGTGGGGCAAggtg








271
SauCas9KKH
+
CACGTTCACCTTGCCCCACAGGTTT
19032
CATCTATTGCTTACATTTGCTGTTT
19209





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAcct

TCTCGTCAACTTGTTGGCGAGA






GAGGAGAAGTCTGCCGTTACTGCCC








TGTGGGGCAAggtg








274
SpyCas9-
+
ACGTTCACCTTGCCCCACAGGTTTT
19033
GTCAGGGCAGAGCCATCTATGTTTT
19210



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcctG

AAAGTGGCACCGAGTCGGTGC






AGGAGAAGTCTGCCGTTACTGCCCT








GTGGGGCAAggtg








278
SpyCas9-
+
ACGTTCACCTTGCCCCACAGGTTTT
19034
AGGGCAGAGCCATCTATTGCGTTTT
19211



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcctG

AAAGTGGCACCGAGTCGGTGC






AGGAGAAGTCTGCCGTTACTGCCCT








GTGGGGCAAggtg








279
SpyCas9-

ACTAGCAACCTCAAACAGACGTTTT
19035
TTGTAACCTTGATACCAACCGTTTT
19212



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctcC

AAAGTGGCACCGAGTCGGTGC






TCAGGAGTCAGATGCACCATGGTGT








CTGTTTGAGgttg








283
ScaCas9-
+
CACGTTCACCTTGCCCCACAGTTTT
19036
GAGCCATCTATTGCTTACATGTTTT
19213



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctGA

AAAGTGGCACCGAGTCGGTGC






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGgtga








284
SpyCas9
+
CACGTTCACCTTGCCCCACAGTTTT
19037
AGGGCTGGGCATAAAAGTCAGTTTT
19214





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctGA

AAAGTGGCACCGAGTCGGTGC






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGgtga








287
SpyCas9-
+
CACGTTCACCTTGCCCCACAGTTTT
19038
GGGCAGAGCCATCTATTGCTGTTTT
19215



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctGA

AAAGTGGCACCGAGTCGGTGC






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGgtga








288
SpyCas9-
+
CACGTTCACCTTGCCCCACAGTTTT
19039
GTCAGGGCAGAGCCATCTATGTTTT
19216



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCctGA

AAAGTGGCACCGAGTCGGTGC






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGgtga








291
SpyCas9-

CACTAGCAACCTCAAACAGAGTTTT
19040
TGTAACCTTGATACCAACCTGTTTT
19217



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtcCT

AAAGTGGCACCGAGTCGGTGC






CAGGAGTCAGATGCACCATGGTGTC








TGTTTGAGGttgc








294
SauriCas9
+
TCCACGTTCACCTTGCCCCACGTTT
19041
CCAGGGCTGGGCATAAAAGTCGTTT
19218





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAtGA

TCTCGTCAACTTGTTGGCGAGA






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGGtgaa








295
SauriCas9-
+
TCCACGTTCACCTTGCCCCACGTTT
19042
GCTGGGCATAAAAGTCAGGGCGTTT
19219



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAtGA

TCTCGTCAACTTGTTGGCGAGA






GGAGAAGTCTGCCGTTACTGCCCTG








TGGGGCAAGGtgaa








298
ScaCas9-
+
CCACGTTCACCTTGCCCCACGTTTT
19043
GAGCCATCTATTGCTTACATGTTTT
19220



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtGAG

AAAGTGGCACCGAGTCGGTGC






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGtgaa








299
SpyCas9
+
CCACGTTCACCTTGCCCCACGTTTT
19044
AGGGCTGGGCATAAAAGTCAGTTTT
19221





AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtGAG

AAAGTGGCACCGAGTCGGTGC






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGtgaa








302
SpyCas9-
+
CCACGTTCACCTTGCCCCACGTTTT
19045
GGCAGAGCCATCTATTGCTTGTTTT
19222



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtGAG

AAAGTGGCACCGAGTCGGTGC






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGtgaa








303
SpyCas9-
+
CCACGTTCACCTTGCCCCACGTTTT
19046
GTCAGGGCAGAGCCATCTATGTTTT
19223



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCtGAG

AAAGTGGCACCGAGTCGGTGC






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGtgaa








306
SpyCas9-

TCACTAGCAACCTCAAACAGGTTTT
19047
GTAACCTTGATACCAACCTGGTTTT
19224



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCcCTC

AAAGTGGCACCGAGTCGGTGC






AGGAGTCAGATGCACCATGGTGTCT








GTTTGAGGTtgct








307
BlatCas9
+
catcCACGTTCACCTTGCCCCACGC
19048
agtcAGGGCAGAGCCATCTATTGGC
19225





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtGAGG

GCATTTATCTCCGAGGTGCT






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGtgaa








308
BlatCas9

tgttCACTAGCAACCTCAAACAGGC
19049
gtctTGTAACCTTGATACCAACCGC
19226





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTcCTCA

GCATTTATCTCCGAGGTGCT






GGAGTCAGATGCACCATGGTGTCTG








TTTGAGGTtgct








309
BlatCas9
+
catcCACGTTCACCTTGCCCCACGC
19050
agtcAGGGCAGAGCCATCTATTGGC
19227





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTtGAGG

GCATTTATCTCCGAGGTGCT






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGtgaa








310
BlatCas9

tgttCACTAGCAACCTCAAACAGGC
19051
gtctTGTAACCTTGATACCAACCGC
19228





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTcCTCA

GCATTTATCTCCGAGGTGCT






GGAGTCAGATGCACCATGGTGTCTG








TTTGAGGTtgct








312
Nme2Cas9

tgTGTTCACTAGCAACCTCAAACAG
19052
tgTAACCTTGATACCAACCTGCCCG
19229





TTGTAGCTCCCTTTCTCATTTCGGA

TTGTAGCTCCCTTTCTCATTTCGGA






AACGAAATGAGAACCGTTGCTACAA

AACGAAATGAGAACCGTTGCTACAA






TAAGGCCGTCTGAAAAGATGTGCCG

TAAGGCCGTCTGAAAAGATGTGCCG






CAACGCTCTGCCCCTTAAAGCTTCT

CAACGCTCTGCCCCTTAAAGCTTCT






GCTTTAAGGGGCATCGTTTACTCAG

GCTTTAAGGGGCATCGTTTA






GAGTCAGATGCACCATGGTGTCTGT








TTGAGGTTgcta








313
SauCas9
+
tcATCCACGTTCACCTTGCCCCAGT
19053
agGGCTGGGCATAAAAGTCAGGGGT
19230





TTTAGTACTCTGGAAACAGAATCTA

TTTAGTACTCTGGAAACAGAATCTA






CTAAAACAAGGCAAAATGCCGTGTT

CTAAAACAAGGCAAAATGCCGTGTT






TATCTCGTCAACTTGTTGGCGAGAG

TATCTCGTCAACTTGTTGGCGAGA






AGGAGAAGTCTGCCGTTACTGCCCT








GTGGGGCAAGGTgaac








314
SauCas9KKH
+
ATCCACGTTCACCTTGCCCCAGTTT
19054
CATCTATTGCTTACATTTGCTGTTT
19231





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAGAG

TCTCGTCAACTTGTTGGCGAGA






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGTgaac








315
SauriCas9
+
ATCCACGTTCACCTTGCCCCAGTTT
19055
CCAGGGCTGGGCATAAAAGTCGTTT
19232





TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAGAG

TCTCGTCAACTTGTTGGCGAGA






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGTgaac








316
SauriCas9-
+
ATCCACGTTCACCTTGCCCCAGTTT
19056
GCTGGGCATAAAAGTCAGGGCGTTT
19233



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAGAG

TCTCGTCAACTTGTTGGCGAGA






GAGAAGTCTGCCGTTACTGCCCTGT








GGGGCAAGGTgaac








319
ScaCas9-
+
TCCACGTTCACCTTGCCCCAGTTTT
19057
GAGCCATCTATTGCTTACATGTTTT
19234



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCGAGG

AAAGTGGCACCGAGTCGGTGC






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGTgaac








320
SpyCas9-
+
TCCACGTTCACCTTGCCCCAGTTTT
19058
GCAGAGCCATCTATTGCTTAGTTTT
19235



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCGAGG

AAAGTGGCACCGAGTCGGTGC






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGTgaac








321
SpyCas9-

TTCACTAGCAACCTCAAACAGTTTT
19059
TAACCTTGATACCAACCTGCGTTTT
19236



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCCTCA

AAAGTGGCACCGAGTCGGTGC






GGAGTCAGATGCACCATGGTGTCTG








TTTGAGGTTgcta








322
BlatCas9

gtgtTCACTAGCAACCTCAAACAGC
19060
gtaaCCTTGATACCAACCTGCCCGC
19237





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTCTCAG

GCATTTATCTCCGAGGTGCT






GAGTCAGATGCACCATGGTGTCTGT








TTGAGGTTgcta








323
BlatCas9

gtgtTCACTAGCAACCTCAAACAGC
19061
gtaaCCTTGATACCAACCTGCCCGC
19238





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTCTCAG

GCATTTATCTCCGAGGTGCT






GAGTCAGATGCACCATGGTGTCTGT








TTGAGGTTgcta








327
SauCas9
+
CATCCACGTTCACCTTGCCCCGTTT
19062
CATCTATTGCTTACATTTGCTGTTT
19239



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAAGG

TCTCGTCAACTTGTTGGCGAGA






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGTGaacg








328
SauriCas9-
+
CATCCACGTTCACCTTGCCCCGTTT
19063
GCTGGGCATAAAAGTCAGGGCGTTT
19240



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAAGG

TCTCGTCAACTTGTTGGCGAGA






AGAAGTCTGCCGTTACTGCCCTGTG








GGGCAAGGTGaacg








329
SpyCas9-

GTTCACTAGCAACCTCAAACGTTTT
19064
ACCTTGATACCAACCTGCCCGTTTT
19241



NG

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCTCAG

AAAGTGGCACCGAGTCGGTGC






GAGTCAGATGCACCATGGTGTCTGT








TTGAGGTTGctag








333
SpyCas9-

GTTCACTAGCAACCTCAAACGTTTT
19065
AACCTTGATACCAACCTGCCGTTTT
19242



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCTCAG

AAAGTGGCACCGAGTCGGTGC






GAGTCAGATGCACCATGGTGTCTGT








TTGAGGTTGctag








334
SpyCas9-
+
ATCCACGTTCACCTTGCCCCGTTTT
19066
CAGAGCCATCTATTGCTTACGTTTT
19243



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCAGGA

AAAGTGGCACCGAGTCGGTGC






GAAGTCTGCCGTTACTGCCCTGTGG








GGCAAGGTGaacg








340
SauCas9
+
TCATCCACGTTCACCTTGCCCGTTT
19067
CATCTATTGCTTACATTTGCTGTTT
19244



KKH

TAGTACTCTGGAAACAGAATCTACT

TAGTACTCTGGAAACAGAATCTACT






AAAACAAGGCAAAATGCCGTGTTTA

AAAACAAGGCAAAATGCCGTGTTTA






TCTCGTCAACTTGTTGGCGAGAGGA

TCTCGTCAACTTGTTGGCGAGA






GAAGTCTGCCGTTACTGCCCTGTGG








GGCAAGGTGAacgt








343
ScaCas9-

TGTTCACTAGCAACCTCAAAGTTTT
19068
ACCTTGATACCAACCTGCCCGTTTT
19245



Sc++

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCCAGG

AAAGTGGCACCGAGTCGGTGC






AGTCAGATGCACCATGGTGTCTGTT








TGAGGTTGCtagt








344
SpyCas9-

TGTTCACTAGCAACCTCAAAGTTTT
19069
ACCTTGATACCAACCTGCCCGTTTT
19246



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCCAGG

AAAGTGGCACCGAGTCGGTGC






AGTCAGATGCACCATGGTGTCTGTT








TGAGGTTGCtagt








345
SpyCas9-
+
CATCCACGTTCACCTTGCCCGTTTT
19070
AGAGCCATCTATTGCTTACAGTTTT
19247



SpRY

AGAGCTAGAAATAGCAAGTTAAAAT

AGAGCTAGAAATAGCAAGTTAAAAT






AAGGCTAGTCCGTTATCAACTTGAA

AAGGCTAGTCCGTTATCAACTTGAA






AAAGTGGCACCGAGTCGGTGCGGAG

AAAGTGGCACCGAGTCGGTGC






AAGTCTGCCGTTACTGCCCTGTGGG








GCAAGGTGAacgt








346
BlatCas9

ctgtGTTCACTAGCAACCTCAAAGC
19071
gtaaCCTTGATACCAACCTGCCCGC
19248





TATAGTTCCTTACTGAAAGGTAAGT

TATAGTTCCTTACTGAAAGGTAAGT






TGCTATAGTAAGGGCAACAGACCCG

TGCTATAGTAAGGGCAACAGACCCG






AGGCGTTGGGGATCGCCTAGCCCGT

AGGCGTTGGGGATCGCCTAGCCCGT






GTTTACGGGCTCTCCCCATATTCAA

GTTTACGGGCTCTCCCCATATTCAA






AATAATGACAGACGAGCACCTTGGA

AATAATGACAGACGAGCACCTTGGA






GCATTTATCTCCGAGGTGCTCAGGA

GCATTTATCTCCGAGGTGCT






GTCAGATGCACCATGGTGTCTGTTT








GAGGTTGCtagt









Capital letters indicate “core nucleotides” while lower case letters indicate “flanking nucleotides.” Herein, when an RNA sequence (e.g., a template RNA sequence) is said to comprise a particular sequence (e.g., a sequence of Table 4 or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table 4. More specifically, the present disclosure provides an RNA sequence according to every template sequence shown in Table 4, wherein the RNA sequence has a U in place of each T in the sequence of Table 4.
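The two conventions just described (upper case for core nucleotides, lower case for flanking nucleotides, and U in place of T when a listed sequence is read as RNA) can be illustrated with a minimal sketch. The function names `to_rna` and `split_core_flank` and the short example string are hypothetical and are not part of the disclosure; the sketch assumes a sequence is written as a single string whose flanking nucleotides, if any, appear only at its ends.

```python
def to_rna(seq: str) -> str:
    """Return seq with every T/t replaced by U/u; case and other symbols are kept."""
    return seq.translate(str.maketrans("Tt", "Uu"))


def split_core_flank(seq: str) -> tuple[str, str, str]:
    """Split a sequence into (5' flank, core, 3' flank) using letter case.

    Assumption for this sketch: lower-case (flanking) nucleotides occur only
    at the ends of the string, and upper-case (core) nucleotides in between.
    """
    lower = "acgtu"
    left = len(seq) - len(seq.lstrip(lower))       # length of the leading flank
    right = max(len(seq.rstrip(lower)), left)      # index where the trailing flank starts
    return seq[:left], seq[left:right], seq[right:]


if __name__ == "__main__":
    # Longest RT form quoted in the Table 5A caption, rewritten as RNA.
    print(to_rna("AGTAACGGCAGACTTCTCTTCAG"))   # AGUAACGGCAGACUUCUCUUCAG
    # Hypothetical string, used only to show the case convention.
    print(split_core_flank("acgtACGTTGCAtg"))  # ('acgt', 'ACGTTGCA', 'tg')
```

Because the conversion preserves case, the core/flanking annotation survives the change from the DNA notation to the RNA notation.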


In some embodiments, the systems and methods provided herein may comprise a template sequence listed in any of Tables 5A-5D. Tables 5A-5D provide exemplary template RNA sequences (column 2) designed to be paired with a gene modifying polypeptide to correct a mutation in the HBB gene. The templates in Tables 5A-5D exemplify the total sequence of: (1) a gRNA spacer (e.g., for targeting the first strand nick), (2) a gRNA scaffold, (3) an RT (heterologous object) sequence, and (4) a PBS sequence (e.g., for initiating TPRT at the first strand nick).
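As an illustration of this four-part layout, the following sketch concatenates the components in the order spacer, scaffold, RT, PBS. The four sequence constants are copied from the caption of Table 5A; the function name `build_template` is hypothetical. The truncation rule is an assumption inferred from the rows listed in Table 5A, where shorter RT forms omit nucleotides from the left end of the longest RT as written and shorter PBS forms omit nucleotides from the right end of the longest PBS. It is not presented as the applicant's design procedure.

```python
# Component sequences as given in the caption of Table 5A (DNA notation).
SPACER = "CATGGTGCACCTGACTCCTG"  # HBB5 spacer
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)
RT_LONGEST = "AGTAACGGCAGACTTCTCTTCAG"  # longest RT (heterologous object) form
PBS_LONGEST = "GAGTCAGGTGCACCATG"        # longest PBS form


def build_template(rt_len: int, pbs_len: int) -> str:
    """Assemble spacer + scaffold + RT + PBS for the requested RT and PBS lengths.

    Assumed truncation rule (inferred from the listed rows): the RT keeps its
    right-most rt_len nucleotides, the PBS keeps its left-most pbs_len nucleotides.
    """
    if not 1 <= rt_len <= len(RT_LONGEST):
        raise ValueError("rt_len out of range")
    if not 1 <= pbs_len <= len(PBS_LONGEST):
        raise ValueError("pbs_len out of range")
    rt = RT_LONGEST[len(RT_LONGEST) - rt_len:]
    pbs = PBS_LONGEST[:pbs_len]
    return SPACER + SCAFFOLD + rt + pbs


if __name__ == "__main__":
    full = build_template(23, 17)
    print(len(full))  # 136, the total length listed for the RT23/PBS17 row
    print(full)
```

Under these assumptions the total length is 20 (spacer) + 76 (scaffold) + RT length + PBS length, which agrees with the "Total length" column for the listed rows, for example 136 for RT 23 and PBS 17 and 126 for RT 22 and PBS 8.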









TABLE 5A







Exemplary template RNA sequences

Table 5A provides design of exemplary DNA components of gene modifying systems for correcting the pathogenic E6V mutation in HBB to the wild-type form. This table details the sequence of a complete template RNA for use in exemplary gene modifying systems comprising a gene modifying polypeptide. Templates in this table employ the HBB5 spacer (CATGGTGCACCTGACTCCTG SEQ ID NO: 19249) and a gRNA scaffold sequence of GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 20923). For exemplification, the lengths of the RT (heterologous object) sequences and PBS sequences were varied at the 3′ end. The length of these respective sequences is reflected in columns 3 and 4, respectively. The longest form of the RT sequence is AGTAACGGCAGACTTCTCTTCAG (SEQ ID NO: 20954). The longest form of the PBS is GAGTCAGGTGCACCATG (SEQ ID NO: 19431).














Sequence Name | Full DNA sequence | SEQ ID NO | RT length | PBS length | Total length















HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20958
23
17
136


RT23_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20959
23
16
135


RT23_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20960
23
15
134


RT23_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20961
23
14
133


RT23_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20962
23
13
132


RT23_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20963
23
12
131


RT23_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20964
23
11
130


RT23_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20965
23
10
129


RT23_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20966
23
9
128


RT23_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20967
23
8
127


RT23_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGG







CAGACTTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20968
22
17
135


RT22_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20969
22
16
134


RT22_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20970
22
15
133


RT22_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20971
22
14
132


RT22_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20972
22
13
131


RT22_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20973
22
12
130


RT22_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20974
22
11
129


RT22_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20975
22
10
128


RT22_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20976
22
9
127


RT22_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20977
22
8
126


RT22_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGC







AGACTTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20978
21
17
134


RT21_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20979
21
16
133


RT21_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20980
21
15
132


RT21_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20981
21
14
131


RT21_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20982
21
13
130


RT21_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20983
21
12
129


RT21_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20984
21
11
128


RT21_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20985
21
10
127


RT21_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20986
21
9
126


RT21_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20987
21
8
125


RT21_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCA







GACTTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20988
20
17
133


RT20_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20989
20
16
132


RT20_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20990
20
15
131


RT20_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20991
20
14
130


RT20_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20992
20
13
129


RT20_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20993
20
12
128


RT20_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20994
20
11
127


RT20_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20995
20
10
126


RT20_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20996
20
9
125


RT20_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20997
20
8
124


RT20_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAG







ACTTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20998
19
17
132


RT19_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
20999
19
16
131


RT19_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21000
19
15
130


RT19_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21001
19
14
129


RT19_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21002
19
13
128


RT19_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21003
19
12
127


RT19_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21004
19
11
126


RT19_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21005
19
10
125


RT19_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21006
19
9
124


RT19_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21007
19
8
123


RT19_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGA







CTTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21008
18
17
131


RT18_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21009
18
16
130


RT18_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21010
18
15
129


RT18_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21011
18
14
128


RT18_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21012
18
13
127


RT18_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21013
18
12
126


RT18_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21014
18
11
125


RT18_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21015
18
10
124


RT18_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21016
18
9
123


RT18_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21017
18
8
122


RT18_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGAC







TTCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21018
17
17
130


RT17_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21019
17
16
129


RT17_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21020
17
15
128


RT17_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21021
17
14
127


RT17_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21022
17
13
126


RT17_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21023
17
12
125


RT17_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21024
17
11
124


RT17_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21025
17
10
123


RT17_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21026
17
9
122


RT17_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21027
17
8
121


RT17_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACT







TCTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21028
16
17
129


RT16_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21029
16
16
128


RT16_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21030
16
15
127


RT16_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21031
16
14
126


RT16_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21032
16
13
125


RT16_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21033
16
12
124


RT16_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21034
16
11
123


RT16_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21035
16
10
122


RT16_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21036
16
9
121


RT16_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21037
16
8
120


RT16_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTT







CTCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21038
15
17
128


RT15_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21039
15
16
127


RT15_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21040
15
15
126


RT15_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21041
15
14
125


RT15_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21042
15
13
124


RT15_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21043
15
12
123


RT15_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21044
15
11
122


RT15_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21045
15
10
121


RT15_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21046
15
9
120


RT15_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21047
15
8
119


RT15_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTC







TCTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21048
14
17
127


RT14_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21049
14
16
126


RT14_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21050
14
15
125


RT14_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21051
14
14
124


RT14_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21052
14
13
123


RT14_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21053
14
12
122


RT14_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21054
14
11
121


RT14_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21055
14
10
120


RT14_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21056
14
9
119


RT14_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21057
14
8
118


RT14_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCT







CTTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21058
13
17
126


RT13_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21059
13
16
125


RT13_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21060
13
15
124


RT13_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21061
13
14
123


RT13_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21062
13
13
122


RT13_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21063
13
12
121


RT13_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21064
13
11
120


RT13_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21065
13
10
119


RT13_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21066
13
9
118


RT13_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21067
13
8
117


RT13_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTC







TTCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21068
12
17
125


RT12_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21069
12
16
124


RT12_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21070
12
15
123


RT12_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21071
12
14
122


RT12_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21072
12
13
121


RT12_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21073
12
12
120


RT12_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21074
12
11
119


RT12_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21075
12
10
118


RT12_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21076
12
9
117


RT12_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21077
12
8
116


RT12_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCT







TCAGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21078
11
17
124


RT11_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21079
11
16
123


RT11_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21080
11
15
122


RT11_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21081
11
14
121


RT11_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21082
11
13
120


RT11_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21083
11
12
119


RT11_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21084
11
11
118


RT11_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21085
11
10
117


RT11_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21086
11
9
116


RT11_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21087
11
8
115


RT11_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTTC







AGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21088
10
17
123


RT10_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21089
10
16
122


RT10_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21090
10
15
121


RT10_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21091
10
14
120


RT10_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21092
10
13
119


RT10_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21093
10
12
118


RT10_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21094
10
11
117


RT10_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21095
10
10
116


RT10_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21096
10
9
115


RT10_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21097
10
8
114


RT10_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTTC







AGGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21098
9
17
122


RT9_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21099
9
16
121


RT9_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21100
9
15
120


RT9_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21101
9
14
119


RT9_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21102
9
13
118


RT9_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21103
9
12
117


RT9_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21104
9
11
116


RT9_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21105
9
10
115


RT9_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21106
9
9
114


RT9_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21107
9
8
113


RT9_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTTCA







GGAGTCAGG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21108
8
17
121


RT8_PBS17
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCACCATG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21109
8
16
120


RT8_PBS16
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCACCAT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21110
8
15
119


RT8_PBS15
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCACCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21111
8
14
118


RT8_PBS14
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCACC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21112
8
13
117


RT8_PBS13
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCAC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21113
8
12
116


RT8_PBS12
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGCA









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21114
8
11
115


RT8_PBS11
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTGC









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21115
8
10
114


RT8_PBS10
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGTG









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21116
8
9
113


RT8_PBS9
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGGT









HBB5_corr_WT
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA
21117
8
8
112


RT8_PBS8
GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG







GAGTCAGG
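
The length columns of the rows above can be cross-checked against the sequences themselves. The following is a minimal sketch, not part of the disclosed methods. It takes the last row of the table, rejoins the wrapped sequence fragments, and compares the result with the listed RT, PBS, and total lengths. The helper names `join_fragments` and `lengths_from_name` are hypothetical, and the underscore-joined form of the sequence name is an assumption, since the table prints the name as two wrapped pieces.

```python
import re


def join_fragments(fragments):
    """Concatenate the wrapped pieces of one 'Full DNA sequence' cell."""
    return "".join(piece.strip() for piece in fragments)


def lengths_from_name(name):
    """Read the RT and PBS lengths encoded at the end of a sequence name."""
    match = re.search(r"RT(\d+)_PBS(\d+)$", name)
    if match is None:
        raise ValueError(f"no RT/PBS suffix in {name!r}")
    return int(match.group(1)), int(match.group(2))


# Last row of the table above, copied fragment by fragment.
# Assumption: the two printed name pieces join with an underscore.
name = "HBB5_corr_WT_RT8_PBS8"
fragments = [
    "CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA",
    "GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTTCAG",
    "GAGTCAGG",
]
listed_total = 112  # value printed in the 'Total length' column

sequence = join_fragments(fragments)
rt_len, pbs_len = lengths_from_name(name)

# Whatever precedes the RT and PBS segments is shared by every row.
fixed_len = len(sequence) - rt_len - pbs_len

# The PBS is the 3' terminal segment; the RT segment sits just before it.
pbs = sequence[-pbs_len:]
rt = sequence[-(rt_len + pbs_len):-pbs_len]

print(len(sequence) == listed_total, fixed_len, rt, pbs)
```

For this row the check passes, and the shared leading portion comes to 96 nucleotides. The other rows shown above fit the same rule: the total length equals 96 plus the RT length plus the PBS length (for example, 96 + 23 + 9 = 128).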
















TABLE 5B







Exemplary template RNA sequences

Table 5B provides design of exemplary DNA components of gene modifying systems for correcting the pathogenic E6V mutation in HBB to the Makassar form. This table details the sequence of a complete template RNA for use in an exemplary gene modifying system comprising a gene modifying polypeptide. Templates in this table employ the HBB5 spacer (CATGGTGCACCTGACTCCTG SEQ ID NO: 19249) and a gRNA scaffold sequence of GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 20923). For exemplification, the lengths of the RT (heterologous object) sequences and PBS sequences were varied at the 3′ end. The length of these respective sequences is reflected in columns 3 and 4, respectively. The longest form of the RT sequence is AGTAACGGCAGACTTCTCTGCAG (SEQ ID NO: 20955). The longest form of the PBS is GAGTCAGGTGCACCATG (SEQ ID NO: 19431).
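
The design described in this caption can be illustrated with a short sketch, offered under stated assumptions and not as the method used to generate the table. It concatenates the HBB5 spacer, the gRNA scaffold, an RT sequence, and a PBS sequence, in that order. The function name `build_template` is hypothetical. The truncation rule is an assumption read off the rows of the table: shorter RT forms keep the 3′ portion of the longest RT sequence, and shorter PBS forms keep the 5′ portion of the longest PBS sequence.

```python
# Components quoted from the caption of Table 5B.
SPACER = "CATGGTGCACCTGACTCCTG"  # HBB5 spacer
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCA"
    "ACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # gRNA scaffold
RT_LONGEST = "AGTAACGGCAGACTTCTCTGCAG"  # longest RT form (23 nt)
PBS_LONGEST = "GAGTCAGGTGCACCATG"  # longest PBS form (17 nt)


def build_template(rt_len, pbs_len):
    """Return spacer + scaffold + RT + PBS as a DNA string.

    Assumed truncation rule (inferred from the table rows, not stated
    in this form in the caption): a shorter RT drops bases from its
    5' side, and a shorter PBS drops bases from its 3' side.
    """
    if not 1 <= rt_len <= len(RT_LONGEST):
        raise ValueError("rt_len is outside the range of the longest RT form")
    if not 1 <= pbs_len <= len(PBS_LONGEST):
        raise ValueError("pbs_len is outside the range of the longest PBS form")
    rt = RT_LONGEST[len(RT_LONGEST) - rt_len:]
    pbs = PBS_LONGEST[:pbs_len]
    return SPACER + SCAFFOLD + rt + pbs


longest = build_template(23, 17)
print(len(longest))  # spacer (20) + scaffold (76) + RT (23) + PBS (17)
```

Under these assumptions, `build_template(23, 17)` yields a 136-nucleotide sequence, which matches the total length listed for the RT23/PBS17 row, and `build_template(20, 11)` yields 127 nucleotides, matching the RT20/PBS11 row.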














Sequence Name
Full DNA sequence
SEQ ID NO
RT length
PBS length
Total length















HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21118
23
17
136


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS17
ACTTCTCTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21119
23
16
135


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS16
ACTTCTCTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21120
23
15
134


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS15
ACTTCTCTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21121
23
14
133


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS14
ACTTCTCTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21122
23
13
132


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS13
ACTTCTCTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21123
23
12
131


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS12
ACTTCTCTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21124
23
11
130


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS11
ACTTCTCTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21125
23
10
129


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS10
ACTTCTCTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21126
23
9
128


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS9
ACTTCTCTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21127
23
8
127


Mak_RT23
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGTAACGGCAG






PBS8
ACTTCTCTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21128
22
17
135


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS17
TTCTCTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21129
22
16
134


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS16
TTCTCTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21130
22
15
133


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS15
TTCTCTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21131
22
14
132


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS14
TTCTCTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21132
22
13
131


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS13
TTCTCTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21133
22
12
130


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS12
TTCTCTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21134
22
11
129


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS11
TTCTCTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21135
22
10
128


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS10
TTCTCTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21136
22
9
127


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS9
TTCTCTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21137
22
8
126


Mak_RT22
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTAACGGCAGAC






PBS8
TTCTCTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21138
21
17
134


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS17
TCTCTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21139
21
16
133


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS16
TCTCTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21140
21
15
132


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS15
TCTCTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21141
21
14
131


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS14
TCTCTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21142
21
13
130


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS13
TCTCTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21143
21
12
129


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS12
TCTCTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21144
21
11
128


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS11
TCTCTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21145
21
10
127


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS10
TCTCTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21146
21
9
126


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS9
TCTCTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21147
21
8
125


Mak_RT21
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTAACGGCAGACT






PBS8
TCTCTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21148
20
17
133


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS17
CTCTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21149
20
16
132


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS16
CTCTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21150
20
15
131


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS15
CTCTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21151
20
14
130


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS14
CTCTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21152
20
13
129


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS13
CTCTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21153
20
12
128


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS12
CTCTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21154
20
11
127


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS11
CTCTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21155
20
10
126


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS10
CTCTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21156
20
9
125


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS9
CTCTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21157
20
8
124


Mak_RT20
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTT






PBS8
CTCTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21158
19
17
132


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS17
TCTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21159
19
16
131


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS16
TCTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21160
19
15
130


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS15
TCTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21161
19
14
129


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS14
TCTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21162
19
13
128


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS13
TCTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21163
19
12
127


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS12
TCTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21164
19
11
126


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS11
TCTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21165
19
10
125


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS10
TCTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21166
19
9
124


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS9
TCTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21167
19
8
123


Mak_RT19
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTC






PBS8
TCTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21168
18
17
131


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS17
CTGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21169
18
16
130


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS16
CTGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21170
18
15
129


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS15
CTGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21171
18
14
128


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS14
CTGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21172
18
13
127


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS13
CTGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21173
18
12
126


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS12
CTGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21174
18
11
125


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS11
CTGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21175
18
10
124


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS10
CTGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21176
18
9
123


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS9
CTGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21177
18
8
122


Mak_RT18
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCGGCAGACTTCT






PBS8
CTGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21178
17
17
130


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS17
TGCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21179
17
16
129


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS16
TGCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21180
17
15
128


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS15
TGCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21181
17
14
127


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS14
TGCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21182
17
13
126


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS13
TGCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21183
17
12
125


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS12
TGCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21184
17
11
124


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS11
TGCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21185
17
10
123


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS10
TGCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21186
17
9
122


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS9
TGCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21187
17
8
121


Mak_RT17
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGCAGACTTCTC






PBS8
TGCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21188
16
17
129


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS17
GCAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21189
16
16
128


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS16
GCAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21190
16
15
127


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS15
GCAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21191
16
14
126


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS14
GCAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21192
16
13
125


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS13
GCAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21193
16
12
124


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS12
GCAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21194
16
11
123


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS11
GCAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21195
16
10
122


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS10
GCAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21196
16
9
121


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS9
GCAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21197
16
8
120


Mak_RT16
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCAGACTTCTCT






PBS8
GCAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21198
15
17
128


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS17
CAGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21199
15
16
127


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS16
CAGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21200
15
15
126


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS15
CAGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21201
15
14
125


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS14
CAGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21202
15
13
124


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS13
CAGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21203
15
12
123


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS12
CAGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21204
15
11
122


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS11
CAGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21205
15
10
121


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS10
CAGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21206
15
9
120


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS9
CAGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21207
15
8
119


Mak_RT15
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCAGACTTCTCTG






PBS8
CAGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21208
14
17
127


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS17
AGGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21209
14
16
126


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS16
AGGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21210
14
15
125


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS15
AGGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21211
14
14
124


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS14
AGGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21212
14
13
123


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS13
AGGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21213
14
12
122


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS12
AGGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21214
14
11
121


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS11
AGGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21215
14
10
120


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS10
AGGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21216
14
9
119


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS9
AGGAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21217
14
8
118


Mak_RT14
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAGACTTCTCTGC






PBS8
AGGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21218
13
17
126


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS17
GGAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21219
13
16
125


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS16
GGAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21220
13
15
124


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS15
GGAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21221
13
14
123


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS14
GGAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21222
13
13
122


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS13
GGAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21223
13
12
121


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS12
GGAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21224
13
11
120


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS11
GGAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21225
13
10
119


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS10
GGAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21226
13
9
118


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS9
GGAGTCAGGT










HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21227
13
8
117


Mak_RT13
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTTCTCTGCA






PBS8
GGAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21228
12
17
125


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS17
GAGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21229
12
16
124


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS16
GAGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21230
12
15
123


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS15
GAGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21231
12
14
122


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS14
GAGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21232
12
13
121


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS13
GAGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21233
12
12
120


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS12
GAGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21234
12
11
119


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS11
GAGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21235
12
10
118


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS10
GAGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21236
12
9
117


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS9
GAGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21237
12
8
116


Mak_RT12
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTTCTCTGCAG






PBS8
GAGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21238
11
17
124


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS17
AGTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21239
11
16
123


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS16
AGTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21240
11
15
122


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS15
AGTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21241
11
14
121


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS14
AGTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21242
11
13
120


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS13
AGTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21243
11
12
119


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS12
AGTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21244
11
11
118


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS11
AGTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21245
11
10
117


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS10
AGTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21246
11
9
116


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS9
AGTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21247
11
8
115


Mak_RT11
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTTCTCTGCAGG






PBS8
AGTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21248
10
17
123


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS17
GTCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21249
10
16
122


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS16
GTCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21250
10
15
121


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS15
GTCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21251
10
14
120


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS14
GTCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21252
10
13
119


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS13
GTCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21253
10
12
118


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS12
GTCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21254
10
11
117


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS11
GTCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21255
10
10
116


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS10
GTCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21256
10
9
115


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS9
GTCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21257
10
8
114


Mak_RT10
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTCTCTGCAGGA






PBS8
GTCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21258
9
17
122


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS17
TCAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21259
9
16
121


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS16
TCAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21260
9
15
120


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS15
TCAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21261
9
14
119


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS14
TCAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21262
9
13
118


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS13
TCAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21263
9
12
117


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS12
TCAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21264
9
11
116


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS11
TCAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21265
9
10
115


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS10
TCAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21266
9
9
114


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS9
TCAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21267
9
8
113


Mak_RT9_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCTCTGCAGGAG






BS8
TCAGG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21268
8
17
121


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS17
CAGGTGCACCATG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21269
8
16
120


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS16
CAGGTGCACCAT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21270
8
15
119


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS15
CAGGTGCACCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21271
8
14
118


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS14
CAGGTGCACC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21272
8
13
117


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS13
CAGGTGCAC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21273
8
12
116


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS12
CAGGTGCA









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21274
8
11
115


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS11
CAGGTGC









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21275
8
10
114


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS10
CAGGTG









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21276
8
9
113


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS9
CAGGT









HBB5_corr
CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG
21277
8
8
112


Mak_RT8_P
CTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCTGCAGGAGT






BS8
CAGG
















TABLE 5C







Exemplary template RNA sequences

Table 5C provides the design of exemplary DNA components of gene modifying systems for correcting the pathogenic E6V mutation in HBB to the wild-type form. This table details the sequence of a complete template RNA for use in exemplary gene modifying systems comprising a gene modifying polypeptide. Templates in this table employ the HBB8 spacer (GTAACGGCAGACTTCTCCAC SEQ ID NO: 19971) and a gRNA scaffold sequence of GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 20923). For exemplification, the lengths of the RT (heterologous object) sequences and PBS sequences were varied at the 3′ end. The lengths of these sequences are given in the RT length and PBS length columns, respectively. The longest form of the RT sequence is CCATGGTGCACCTGACTCCTGAG (SEQ ID NO: 20956). The longest form of the PBS is GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957).
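As an illustration of how the entries of Table 5C relate to the components named above, the following minimal Python sketch concatenates the spacer, gRNA scaffold, RT sequence, and PBS sequence, and reports the total length. The function name `assemble_template` is hypothetical and is not part of the disclosure. The trimming directions are an assumption inferred from the listed rows, in which shorter RT sequences retain the 3′ portion of the longest RT form and shorter PBS sequences retain the 5′ portion of the longest PBS form.

```python
# Sequences quoted from the Table 5C description.
SPACER = "GTAACGGCAGACTTCTCCAC"  # HBB8 spacer (SEQ ID NO: 19971)
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # gRNA scaffold (SEQ ID NO: 20923)
RT_LONGEST = "CCATGGTGCACCTGACTCCTGAG"  # longest RT form (SEQ ID NO: 20956)
PBS_LONGEST = "GAGAAGTCTGCCGTTAC"  # longest PBS form (SEQ ID NO: 20957)


def assemble_template(rt_len: int, pbs_len: int) -> str:
    """Return spacer + scaffold + RT + PBS for the requested lengths.

    Assumption (inferred from the table rows, not stated in the text):
    the RT is shortened by dropping 5' bases, the PBS by dropping 3' bases.
    """
    if not 1 <= rt_len <= len(RT_LONGEST):
        raise ValueError("rt_len is outside the range of the longest RT form")
    if not 1 <= pbs_len <= len(PBS_LONGEST):
        raise ValueError("pbs_len is outside the range of the longest PBS form")
    rt = RT_LONGEST[len(RT_LONGEST) - rt_len:]  # keep the 3' portion of the RT
    pbs = PBS_LONGEST[:pbs_len]  # keep the 5' portion of the PBS
    return SPACER + SCAFFOLD + rt + pbs


if __name__ == "__main__":
    template = assemble_template(23, 17)
    print(len(template), template)
```

Under these assumptions, the total length of each entry is the sum of the spacer length (20), the scaffold length (76), the RT length, and the PBS length, which agrees with the Total length column for the rows checked (for example, RT 23 and PBS 17 give 136).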














Sequence Name
Full DNA sequence
SEQ ID NO
RT length
PBS length
Total length















HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21278
23
17
136


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S17
ACTCCTGAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21279
23
16
135


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S16
ACTCCTGAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21280
23
15
134


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S15
ACTCCTGAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21281
23
14
133


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S14
ACTCCTGAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21282
23
13
132


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S13
ACTCCTGAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21283
23
12
131


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S12
ACTCCTGAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21284
23
11
130


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S11
ACTCCTGAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21285
23
10
129


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S10
ACTCCTGAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21286
23
9
128


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S9
ACTCCTGAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21287
23
8
127


WT_RT23_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGCACCTG






S8
ACTCCTGAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21288
22
17
135


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S17
CTCCTGAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21289
22
16
134


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S16
CTCCTGAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21290
22
15
133


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S15
CTCCTGAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21291
22
14
132


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S14
CTCCTGAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21292
22
13
131


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S13
CTCCTGAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21293
22
12
130


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S12
CTCCTGAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21294
22
11
129


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S11
CTCCTGAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21295
22
10
128


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S10
CTCCTGAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21296
22
9
127


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S9
CTCCTGAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21297
22
8
126


WT_RT22_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCACCTGA






S8
CTCCTGAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21298
21
17
134


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S17
CCTGAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21299
21
16
133


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S16
CCTGAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21300
21
15
132


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S15
CCTGAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21301
21
14
131


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S14
CCTGAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21302
21
13
130


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S13
CCTGAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21303
21
12
129


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S12
CCTGAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21304
21
11
128


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S11
CCTGAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21305
21
10
127


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S10
CCTGAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21306
21
9
126


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S9
CCTGAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21307
21
8
125


WT_RT21_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCACCTGACT






S8
CCTGAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21308
20
17
133


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S17
CTGAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21309
20
16
132


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S16
CTGAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21310
20
15
131


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S15
CTGAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21311
20
14
130


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S14
CTGAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21312
20
13
129


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S13
CTGAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21313
20
12
128


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S12
CTGAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21314
20
11
127


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S11
CTGAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21315
20
10
126


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S10
CTGAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21316
20
9
125


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S9
CTGAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21317
20
8
124


WT_RT20_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACCTGACTC






S8
CTGAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21318
19
17
132


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S17
TGAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21319
19
16
131


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S16
TGAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21320
19
15
130


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S15
TGAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21321
19
14
129


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S14
TGAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21322
19
13
128


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S13
TGAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21323
19
12
127


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S12
TGAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21324
19
11
126


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S11
TGAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21325
19
10
125


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S10
TGAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21326
19
9
124


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S9
TGAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21327
19
8
123


WT_RT19_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCC






S8
TGAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21328
18
17
131


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S17
GAGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21329
18
16
130


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S16
GAGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21330
18
15
129


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S15
GAGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21331
18
14
128


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S14
GAGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21332
18
13
127


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S13
GAGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21333
18
12
126


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S12
GAGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21334
18
11
125


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S11
GAGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21335
18
10
124


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S10
GAGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21336
18
9
123


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S9
GAGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21337
18
8
122


WT_RT18_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCT






S8
GAGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21338
17
17
130


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S17
AGGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21339
17
16
129


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S16
AGGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21340
17
15
128


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S15
AGGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21341
17
14
127


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S14
AGGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21342
17
13
126


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S13
AGGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21343
17
12
125


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S12
AGGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21344
17
11
124


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S11
AGGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21345
17
10
123


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S10
AGGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21346
17
9
122


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S9
AGGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21347
17
8
121


WT_RT17_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTG






S8
AGGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21348
16
17
129


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S17
GGAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21349
16
16
128


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S16
GGAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21350
16
15
127


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S15
GGAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21351
16
14
126


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S14
GGAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21352
16
13
125


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S13
GGAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21353
16
12
124


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S12
GGAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21354
16
11
123


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S11
GGAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21355
16
10
122


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S10
GGAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21356
16
9
121


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S9
GGAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21357
16
8
120


WT_RT16_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGA






S8
GGAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21358
15
17
128


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S17
GAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21359
15
16
127


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S16
GAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21360
15
15
126


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S15
GAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21361
15
14
125


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S14
GAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21362
15
13
124


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S13
GAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21363
15
12
123


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S12
GAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21364
15
11
122


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S11
GAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21365
15
10
121


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S10
GAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21366
15
9
120


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S9
GAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21367
15
8
119


WT_RT15_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACTCCTGAG






S8
GAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21368
14
17
127


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S17
GAGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21369
14
16
126


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S16
GAGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21370
14
15
125


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S15
GAGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21371
14
14
124


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S14
GAGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21372
14
13
123


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S13
GAGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21373
14
12
122


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S12
GAGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21374
14
11
121


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S11
GAGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21375
14
10
120


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S10
GAGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21376
14
9
119


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S9
GAGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21377
14
8
118


WT_RT14_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTCCTGAG






S8
GAGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21378
13
17
126


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S17
AGAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21379
13
16
125


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S16
AGAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21380
13
15
124


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S15
AGAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21381
13
14
123


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S14
AGAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21382
13
13
122


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S13
AGAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21383
13
12
121


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S12
AGAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21384
13
11
120


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S11
AGAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21385
13
10
119


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S10
AGAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21386
13
9
118


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S9
AGAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21387
13
8
117


WT_RT13_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCCTGAGG






S8
AGAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21388
12
17
125


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S17
GAAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21389
12
16
124


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S16
GAAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21390
12
15
123


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S15
GAAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21391
12
14
122


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S14
GAAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21392
12
13
121


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S13
GAAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21393
12
12
120


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S12
GAAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21394
12
11
119


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S11
GAAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21395
12
10
118


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S10
GAAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21396
12
9
117


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S9
GAAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21397
12
8
116


WT_RT12_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCTGAGGA






S8
GAAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21398
11
17
124


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S17
AAGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21399
11
16
123


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S16
AAGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21400
11
15
122


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S15
AAGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21401
11
14
121


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S14
AAGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21402
11
13
120


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S13
AAGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21403
11
12
119


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S12
AAGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21404
11
11
118


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S11
AAGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21405
11
10
117


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S10
AAGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21406
11
9
116


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S9
AAGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21407
11
8
115


WT_RT11_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTGAGGAG






S8
AAGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21408
10
17
123


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S17
AGTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21409
10
16
122


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S16
AGTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21410
10
15
121


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S15
AGTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21411
10
14
120


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S14
AGTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21412
10
13
119


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S13
AGTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21413
10
12
118


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S12
AGTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21414
10
11
117


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S11
AGTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21415
10
10
116


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S10
AGTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21416
10
9
115


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S9
AGTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21417
10
8
114


WT_RT10_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGAGGAGA






S8
AGTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21418
9
17
122


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S17
GTCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21419
9
16
121


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S16
GTCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21420
9
15
120


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S15
GTCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21421
9
14
119


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S14
GTCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21422
9
13
118


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S13
GTCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21423
9
12
117


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S12
GTCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21424
9
11
116


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S11
GTCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21425
9
10
115


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S10
GTCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21426
9
9
114


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S9
GTCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21427
9
8
113


WT_RT9_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGAGGAGAA






S8
GTC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21428
8
17
121


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S17
TCTGCCGTTAC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21429
8
16
120


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S16
TCTGCCGTTA









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21430
8
15
119


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S15
TCTGCCGTT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21431
8
14
118


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S14
TCTGCCGT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21432
8
13
117


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S13
TCTGCCG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21433
8
12
116


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S12
TCTGCC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21434
8
11
115


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S11
TCTGC









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21435
8
10
114


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S10
TCTG









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21436
8
9
113


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S9
TCT









HBB8_corr
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
21437
8
8
112


WT_RT8_PB
TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGAGGAGAAG






S8
TC
















TABLE 5D







Exemplary template RNA sequences


Table 5D provides design of exemplary DNA components of gene modifying systems for correcting the pathogenic E6V mutation in HBB to the Makassar form. This table details the sequence of a complete template RNA for use in exemplary gene modifying systems comprising a gene modifying polypeptide. Templates in this table employ the HBB8 spacer (GTAACGGCAGACTTCTCCAC SEQ ID NO: 19971) and a gRNA scaffold sequence of GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 20923). For exemplification, the lengths of the RT (heterologous object) sequences and PBS sequences were varied at the 3′ end. The length of these respective sequences is reflected in columns 3 and 4, respectively. The longest form of the RT sequence is CCATGGTGCACCTGACTCCTGCG (SEQ ID NO: 21906). The longest form of the PBS is GAGAAGTCTGCCGTTAC (SEQ ID NO: 20957).
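For illustration only, the composition described in this legend can be sketched in Python. The spacer, scaffold, longest RT sequence, and longest PBS are copied from the legend above. The function name build_template is hypothetical. The trimming directions are an assumption inferred from the rows listed in the table: shorter RT sequences appear to drop bases from the start of the longest RT sequence as written, and shorter PBS sequences appear to drop bases from the end of the longest PBS. This is a sketch under those assumptions, not the method used to generate the table.

```python
# Sketch: assemble a Table 5D-style template sequence from its parts.
# Constants come from the Table 5D legend. The function name and the
# trimming directions are assumptions inferred from the listed rows.

SPACER = "GTAACGGCAGACTTCTCCAC"  # HBB8 spacer (SEQ ID NO: 19971)
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"
    "TAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # gRNA scaffold (SEQ ID NO: 20923)
RT_LONGEST = "CCATGGTGCACCTGACTCCTGCG"  # longest RT (SEQ ID NO: 21906)
PBS_LONGEST = "GAGAAGTCTGCCGTTAC"       # longest PBS (SEQ ID NO: 20957)


def build_template(rt_len: int, pbs_len: int) -> str:
    """Return spacer + scaffold + RT + PBS for the requested lengths."""
    if not 1 <= rt_len <= len(RT_LONGEST):
        raise ValueError(f"rt_len must be 1..{len(RT_LONGEST)}")
    if not 1 <= pbs_len <= len(PBS_LONGEST):
        raise ValueError(f"pbs_len must be 1..{len(PBS_LONGEST)}")
    # Assumption: a shorter RT keeps the last rt_len bases of the longest RT.
    rt = RT_LONGEST[len(RT_LONGEST) - rt_len:]
    # Assumption: a shorter PBS keeps the first pbs_len bases of the longest PBS.
    pbs = PBS_LONGEST[:pbs_len]
    return SPACER + SCAFFOLD + rt + pbs


if __name__ == "__main__":
    for rt_len, pbs_len in [(23, 17), (22, 8), (19, 10)]:
        seq = build_template(rt_len, pbs_len)
        print(f"HBB8_corr_Mak_RT{rt_len}_PBS{pbs_len}: {len(seq)} nt")
```

Under these assumptions the total length is 20 (spacer) + 76 (scaffold) + RT length + PBS length, which gives 136, 126, and 125 for the three examples, in agreement with the total length column for the corresponding rows.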














Sequence Name
Full DNA sequence
SEQ ID NO
RT length
PBS length
Total length















HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21438
23
17
136


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21439
23
16
135


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21440
23
15
134


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21441
23
14
133


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21442
23
13
132


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21443
23
12
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21444
23
11
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21445
23
10
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21446
23
9
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21447
23
8
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCATGGTGC






RT23_
ACCTGACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21448
22
17
135


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21449
22
16
134


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21450
22
15
133


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21451
22
14
132


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21452
22
13
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21453
22
12
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21454
22
11
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21455
22
10
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21456
22
9
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21457
22
8
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCATGGTGCA






RT22_
CCTGACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21458
21
17
134


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21459
21
16
133


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21460
21
15
132


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21461
21
14
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21462
21
13
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21463
21
12
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21464
21
11
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21465
21
10
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21466
21
9
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21467
21
8
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCATGGTGCAC






RT21_
CTGACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21468
20
17
133


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21469
20
16
132


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21470
20
15
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21471
20
14
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21472
20
13
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21473
20
12
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21474
20
11
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21475
20
10
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21476
20
9
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21477
20
8
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGGTGCACC






RT20_
TGACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21478
19
17
132


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21479
19
16
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21480
19
15
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21481
19
14
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21482
19
13
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21483
19
12
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21484
19
11
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21485
19
10
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21486
19
9
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21487
19
8
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCT






RT19_
GACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21488
18
17
131


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21489
18
16
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21490
18
15
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21491
18
14
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21492
18
13
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21493
18
12
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21494
18
11
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21495
18
10
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21496
18
9
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21497
18
8
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTG






RT18_
ACTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21498
17
17
130


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21499
17
16
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21500
17
15
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21501
17
14
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21502
17
13
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21503
17
12
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21504
17
11
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21505
17
10
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21506
17
9
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21507
17
8
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGA






RT17_
CTCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21508
16
17
129


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21509
16
16
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21510
16
15
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21511
16
14
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21512
16
13
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21513
16
12
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21514
16
11
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21515
16
10
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21516
16
9
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21517
16
8
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGAC






RT16_
TCCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21518
15
17
128


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21519
15
16
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21520
15
15
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21521
15
14
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21522
15
13
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21523
15
12
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21524
15
11
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21525
15
10
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21526
15
9
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21527
15
8
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCACCTGACT






RT15_
CCTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21528
14
17
127


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21529
14
16
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21530
14
15
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21531
14
14
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21532
14
13
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21533
14
12
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21534
14
11
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21535
14
10
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21536
14
9
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21537
14
8
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACCTGACTC






RT14_
CTGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21538
13
17
126


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21539
13
16
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21540
13
15
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21541
13
14
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21542
13
13
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21543
13
12
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21544
13
11
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21545
13
10
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21546
13
9
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21547
13
8
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCCTGACTCC






RT13_
TGCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21548
12
17
125


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21549
12
16
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21550
12
15
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21551
12
14
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21552
12
13
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21553
12
12
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21554
12
11
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21555
12
10
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21556
12
9
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21557
12
8
116


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTGACTCCT






RT12_
GCGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21558
11
17
124


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21559
11
16
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21560
11
15
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21561
11
14
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21562
11
13
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21563
11
12
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21564
11
11
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21565
11
10
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21566
11
9
116


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21567
11
8
115


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCTGACTCCTG






RT11_
CGGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21568
10
17
123


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21569
10
16
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21570
10
15
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21571
10
14
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21572
10
13
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21573
10
12
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21574
10
11
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21575
10
10
116


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21576
10
9
115


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21577
10
8
114


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCGACTCCTGC






RT10_
GGAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21578
9
17
122


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21579
9
16
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21580
9
15
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21581
9
14
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21582
9
13
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21583
9
12
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21584
9
11
116


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21585
9
10
115


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21586
9
9
114


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21587
9
8
113


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCACTCCTGCG






RT9_
GAGAAGTC






PBS8










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21588
8
17
121


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCCGTTAC






PBS17










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21589
8
16
120


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCCGTTA






PBS16










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21590
8
15
119


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCCGTT






PBS15










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21591
8
14
118


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCCGT






PBS14










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21592
8
13
117


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCCG






PBS13










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21593
8
12
116


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGCC






PBS12










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21594
8
11
115


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTGC






PBS11










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21595
8
10
114


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCTG






PBS10










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21596
8
9
113


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTCT






PBS9










HBB8_
GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAA
21597
8
8
112


corr_
ATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC






Mak_
TTGAAAAAGTGGCACCGAGTCGGTGCCTCCTGCGG






RT8_
AGAAGTC






PBS8









In some embodiments, the systems and methods provided herein may comprise second strand-targeting gRNAs comprising a spacer sequence listed in Table 6A. Table 6A provides exemplary second strand-targeting gRNA spacer sequences (Column 2) designed to be paired with a gene modifying polypeptide and a template RNA to correct a mutation in the HBB gene.


In some embodiments, the second strand-targeting gRNA targets a sequence overlapping the target mutation of the template RNA. In some embodiments, such an overlapping second strand-targeting gRNA comprises a sequence (e.g., spacer sequence) complementary to the sickle cell mutation. In some embodiments, such an overlapping second strand-targeting gRNA comprises a sequence (e.g., spacer sequence) complementary to the wild-type sequence at the sickle cell locus. In some embodiments, such an overlapping second strand-targeting gRNA comprises a sequence (e.g., spacer sequence) complementary to the Makassar sequence at the sickle cell locus. In some embodiments, such an overlapping second strand-targeting gRNA comprises a sequence (e.g., spacer sequence) complementary to a SNP proximal to the sickle cell locus, e.g., a SNP contained in the genomic DNA of a subject (e.g., a patient). In some embodiments, such an overlapping second strand-targeting gRNA comprises a sequence (e.g., spacer sequence) complementary to or comprising one or more silent substitutions proximal to the sickle cell locus. Examples of such second strand-targeting gRNAs can be found in Table 6A.
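As an illustration of how an overlapping second strand-targeting gRNA variant differs from its parent spacer, the following Python sketch lists the mismatched positions between two spacers of equal length. The spacer strings are copied from Table 6A; the function name spacer_differences is hypothetical and is not part of the disclosure.

```python
def spacer_differences(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, variant base) for each mismatch.

    Comparison is case-insensitive because Table 6A marks substituted
    bases in lowercase.
    """
    if len(reference) != len(variant):
        raise ValueError("spacers must have equal length")
    ref, var = reference.upper(), variant.upper()
    return [(i + 1, r, v) for i, (r, v) in enumerate(zip(ref, var)) if r != v]


# Spacers from Table 6A (SEQ ID NOs: 21645 and 21646).
hbb8_204_fw = "CATGGTGCATCTGACTCCTG"
hbb8_204_fw_mut = "CATGGTGCACCTGACTCCTG"

# Spacers from Table 6A (SEQ ID NOs: 21611 and 21612).
hbb5_g34 = "CAGACTTCTCCTCAGGAGTC"
hbb5_g34_mut = "CAGACTTCTCtgCAGGAGTC"

print(spacer_differences(hbb8_204_fw, hbb8_204_fw_mut))  # [(10, 'T', 'C')]
print(spacer_differences(hbb5_g34, hbb5_g34_mut))        # [(11, 'C', 'T'), (12, 'T', 'G')]
```

Positions are counted from the 5' end of the spacer as written in the table; no claim is made here about which genomic feature each substitution corresponds to.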









TABLE 6A

Exemplary second-strand targeting (second-nick) gRNA sequences

Table 6A provides spacer sequences for second strand-targeting gRNAs and relevant characteristics. Second-nick gRNAs in this table are designed to be used in combination with template RNAs comprising either the HBB5 (SEQ ID NO: 19249) or HBB8 (SEQ ID NO: 19971) spacers, as noted in Column 5. PAM orientation is included in Column 4. In some embodiments, a second-nick gRNA is selected with a preference for a distance of less than or equal to 100 nt from the first nick (i.e., the nick specified by the template RNA). In some embodiments, a second-nick gRNA is selected with a preference for a PAM-in orientation with the template RNA of the gene modifying system, as described elsewhere in this application.












Name             Second-strand-targeting gRNA  SEQ ID NO  PAM orientation  Spacer

HBB5_27_rev      GGGTGTGGCTCCACAGGGTG          21598      PAM out          HBB5
HBB5_32_rev      CCCTAGGGTGTGGCTCCACA          21599      PAM out          HBB5
HBB5_33_rev      ACCCTAGGGTGTGGCTCCAC          21600      PAM out          HBB5
HBB5_42_rev      GATTGGCCAACCCTAGGGTG          21601      PAM out          HBB5
HBB5_47_rev      GAGTAGATTGGCCAACCCTA          21602      PAM out          HBB5
HBB5_48_rev      GGAGTAGATTGGCCAACCCT          21603      PAM out          HBB5
HBB5_59_rev      CCCTGCTCCTGGGAGTAGAT          21604      PAM out          HBB5
HBB5_69_rev      CTCCTGCCCTCCCTGCTCCT          21605      PAM out          HBB5
HBB5_70_rev      GCTCCTGCCCTCCCTGCTCC          21606      PAM out          HBB5
HBB5_92_rev      CTGACTTTTATGCCCAGCCC          21607      PAM out          HBB5
HBB5_122_rev     AAGCAAATGTAAGCAATAGA          21608      PAM out          HBB5
HBB5_170_rev     TGCACCATGGTGTCTGTTTG          21609      PAM out          HBB5
HBB5_g24         CTCAGGAGTCAGATGCACCA          21610      PAM out          HBB5
HBB5_g34         CAGACTTCTCCTCAGGAGTC          21611      PAM out          HBB5
HBB5_g34_mut     CAGACTTCTCtgCAGGAGTC          21612      PAM out          HBB5
HBB5_g34_mut2    CAGACTTCTCtgccGGAGTC          21613      PAM out          HBB5
HBB5_g34_mut3    CAGACTTCTCttccGGAGTC          21614      PAM out          HBB5
HBB5_g34_mut4    CAGACTTCTCgtccGGAGTC          21615      PAM out          HBB5
HBB5_g34_mut5    CAGACTTCTCatccGGAGTC          21616      PAM out          HBB5
HBB5_g41         GTAACGGCAGACTTCTCCTC          21617      PAM in           HBB5
HBB5_g41_mut     GTAACGGCAGACTTCTCtgC          21618      PAM in           HBB5
HBB5_g41_mut2    GTAACGGCAGACTTCTCttc          21619      PAM in           HBB5
HBB5_g41_mut3    GTAACGGCAGACTTCTCgtc          21620      PAM in           HBB5
HBB5_g41_mut4    GTAACGGCAGACTTCTCatc          21621      PAM in           HBB5
HBB5_216_rev     CTTGCCCCACAGGGCAGTAA          21622      PAM in           HBB5
HBB5_g37         CACGTTCACCTTGCCCCACA          21623      PAM in           HBB5
HBB5_g38         CCACGTTCACCTTGCCCCAC          21624      PAM in           HBB5
HBB5_g27         CCTTGATACCAACCTGCCCA          21625      PAM in           HBB5
HBB5_g39         ACCTTGATACCAACCTGCCC          21626      PAM in           HBB5
HBB5_g40         TCCACATGCCCAGTTTCTAT          21627      PAM in           HBB5
HBB8_37_fw       ATCACTTAGACCTCACCCTG          21628      PAM in           HBB8
HBB8_51_fw       ACCCTGTGGAGCCACACCCT          21629      PAM in           HBB8
HBB8_52_fw       CCCTGTGGAGCCACACCCTA          21630      PAM in           HBB8
HBB8_56_fw       GTGGAGCCACACCCTAGGGT          21631      PAM in           HBB8
HBB8_72_fw       GGGTTGGCCAATCTACTCCC          21632      PAM in           HBB8
HBB8_78_fw       GCCAATCTACTCCCAGGAGC          21633      PAM in           HBB8
HBB8_79_fw       CCAATCTACTCCCAGGAGCA          21634      PAM in           HBB8
HBB8_82_fw       ATCTACTCCCAGGAGCAGGG          21635      PAM in           HBB8
HBB8_83_fw       TCTACTCCCAGGAGCAGGGA          21636      PAM in           HBB8
HBB8_87_fw       CTCCCAGGAGCAGGGAGGGC          21637      PAM in           HBB8
HBB8_94_fw       GAGCAGGGAGGGCAGGAGCC          21638      PAM in           HBB8
HBB8_95_fw       AGCAGGGAGGGCAGGAGCCA          21639      PAM in           HBB8
HBB8_99_fw       GGGAGGGCAGGAGCCAGGGC          21640      PAM in           HBB8
HBB8_g4          GGAGGGCAGGAGCCAGGGCT          21641      PAM in           HBB8
HBB8_g1          CAGGGCTGGGCATAAAAGTC          21642      PAM in           HBB8
HBB8_g2          AGGGCTGGGCATAAAAGTCA          21643      PAM in           HBB8
HBB8_g3          GCAACCTCAAACAGACACCA          21644      PAM in           HBB8
HBB8_204_fw      CATGGTGCATCTGACTCCTG          21645      PAM in           HBB8
HBB8_204_fw_mut  CATGGTGCACCTGACTCCTG          21646      PAM in           HBB8
HBB8_230_fw      AGTCTGCCGTTACTGCCCTG          21647      PAM out          HBB8
HBB8_231_fw      GTCTGCCGTTACTGCCCTGT          21648      PAM out          HBB8
HBB8_232_fw      TCTGCCGTTACTGCCCTGTG          21649      PAM out          HBB8
HBB8_237_fw      CGTTACTGCCCTGTGGGGCA          21650      PAM out          HBB8
HBB8_246_fw      CCTGTGGGGCAAGGTGAACG          21651      PAM out          HBB8
HBB8_256_fw      AAGGTGAACGTGGATGAAGT          21652      PAM out          HBB8
HBB8_259_fw      GTGAACGTGGATGAAGTTGG          21653      PAM out          HBB8
HBB8_264_fw      CGTGGATGAAGTTGGTGGTG          21654      PAM out          HBB8
HBB8_270_fw      TGAAGTTGGTGGTGAGGCCC          21655      PAM out          HBB8
HBB8_271_fw      GAAGTTGGTGGTGAGGCCCT          21656      PAM out          HBB8
HBB8_275_fw      TTGGTGGTGAGGCCCTGGGC          21657      PAM out          HBB8
HBB8_279_fw      TGGTGAGGCCCTGGGCAGGT          21658      PAM out          HBB8
HBB8_287_fw      CCCTGGGCAGGTTGGTATCA          21659      PAM out          HBB8
HBB8_299_fw      TGGTATCAAGGTTACAAGAC          21660      PAM out          HBB8
HBB8_306_fw      AAGGTTACAAGACAGGTTTA          21661      PAM out          HBB8
HBB8_323_fw      TTAAGGAGACCAATAGAAAC          21662      PAM out          HBB8
HBB8_324_fw      TAAGGAGACCAATAGAAACT          21663      PAM out          HBB8
HBB8_331_fw      ACCAATAGAAACTGGGCATG          21664      PAM out          HBB8
HBB8_350_fw      GTGGAGACAGAGAAGACTCT          21665      PAM out          HBB8
HBB8_351_fw      TGGAGACAGAGAAGACTCTT          21666      PAM out          HBB8
HBB8_362_fw      AAGACTCTTGGGTTTCTGAT          21667      PAM out          HBB8




The template RNA sequences shown in Tables 1-4, 5A-5D, and 6A may be customized depending on the cell being targeted. For example, in some embodiments it is desired to inactivate a PAM sequence upon editing (e.g., using a “PAM-kill” modification) to decrease the potential for further gene editing (e.g., by Cas retargeting) following the initial edit. Consequently, certain template RNAs described herein are designed to write a mutation (e.g., a substitution) into the PAM of the target site, such that upon editing, the PAM site will be mutated to a sequence no longer recognized by the gene modifying polypeptide. Thus, a mutation region within the heterologous object sequence of the template RNA may comprise a PAM-kill sequence. Without wishing to be bound by theory, in some embodiments, a PAM-kill sequence prevents re-engagement of the gene modifying polypeptide upon completion of a genetic modification, or decreases re-engagement relative to a template RNA lacking a PAM-kill sequence. In some embodiments, a PAM-kill sequence does not alter the amino acid sequence encoded by a gene, e.g., the PAM-kill sequence results in a silent mutation. In other embodiments, it is desired to leave the PAM sequence intact (no PAM-kill).
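To make the PAM-kill concept concrete, the following Python sketch tests whether an edit converts a recognized PAM into one that is no longer recognized. It assumes an NGG PAM pattern (the canonical SpCas9 PAM) purely for illustration; the pattern actually required depends on the Cas domain used. The function names and example PAM strings are hypothetical.

```python
import re


def pam_is_recognized(pam: str, pattern: str = "NGG") -> bool:
    """Return True if `pam` matches `pattern`, where N stands for any base.

    Only the N wildcard is handled; other IUPAC ambiguity codes are not
    supported in this sketch.
    """
    regex = "".join("[ACGT]" if base == "N" else base for base in pattern.upper())
    return re.fullmatch(regex, pam.upper()) is not None


def is_pam_kill(original_pam: str, edited_pam: str, pattern: str = "NGG") -> bool:
    """Return True if the edit removes a PAM that was originally recognized."""
    return pam_is_recognized(original_pam, pattern) and not pam_is_recognized(
        edited_pam, pattern
    )


print(is_pam_kill("TGG", "TGC"))  # True: TGC no longer matches NGG
print(is_pam_kill("TGG", "AGG"))  # False: AGG still matches NGG
```

A check of this kind addresses only PAM recognition; whether the substitution is also silent at the protein level is a separate question that requires the reading frame of the edited region.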


Similarly, in some embodiments, to decrease the potential for further gene editing (e.g., by Cas retargeting) following the initial edit, it may be desirable to alter the first three nucleotides of the RT template sequence via a “seed-kill” motif. Consequently, certain template RNAs described herein are designed to write a mutation (e.g., a substitution) into the portion of the target site corresponding to the first three nucleotides of the RT template sequence, such that upon editing, the target site will be mutated to a sequence with lower homology to the RT template sequence. Thus, a mutation region within the heterologous object sequence of the template RNA may comprise a seed-kill sequence. Without wishing to be bound by theory, in some embodiments, a seed-kill sequence prevents re-engagement of the gene modifying polypeptide upon completion of genetic modification, or decreases re-engagement relative to an otherwise similar template RNA lacking a seed-kill sequence. In some embodiments, a seed-kill sequence does not alter the amino acid sequence encoded by a gene, e.g., the seed-kill sequence results in a silent mutation. In other embodiments, it is desired to leave the seed region intact, and a seed-kill sequence is not used.


In further embodiments, to optimize or improve gene editing efficiency, it may be desirable to evade the target cell's mismatch repair or nucleotide repair pathways or to bias the target cell's repair pathways toward preservation of the edited strand. In some embodiments, multiple silent mutations (for example, silent substitutions) may be introduced within the RT template sequence to evade the target cell's mismatch repair or nucleotide repair pathways or to bias the target cell's repair pathways toward preservation of the edited strand.


Table 7A provides exemplary silent mutations for various positions within the HBB gene.









TABLE 7A
Exemplary Silent Mutation Codons for the HBB Gene

Amino Acid Position       WT Amino    WT
(counting initial Met)    Acid        Codon    All Codons
2                         V           GTG      GTT, GTC, GTA, GTG
3                         H           CAT      CAT, CAC
4                         L           CTG      TTA, TTG, CTT, CTC, CTA, CTG
5                         T           ACT      ACT, ACC, ACA, ACG
6                         P           CCT      CCT, CCC, CCA, CCG
8                         E           GAG      GAA, GAG
9                         K           AAG      AAA, AAG
10                        S           TCT      TCT, TCC, TCA, TCG, AGT, AGC
11                        A           GCC      GCT, GCC, GCA, GCG
12                        V           GTT      GTT, GTC, GTA, GTG
13                        T           ACT      ACT, ACC, ACA, ACG
14                        A           GCC      GCT, GCC, GCA, GCG
15                        L           CTG      TTA, TTG, CTT, CTC, CTA, CTG
16                        W           TGG      TGG
17                        G           GGC      GGT, GGC, GGA, GGG
18                        K           AAG      AAA, AAG
19                        V           GTG      GTT, GTC, GTA, GTG
20                        N           AAC      AAT, AAC
21                        V           GTG      GTT, GTC, GTA, GTG
22                        D           GAT      GAT, GAC
23                        E           GAA      GAA, GAG
24                        V           GTT      GTT, GTC, GTA, GTG
25                        G           GGT      GGT, GGC, GGA, GGG
26                        G           GGT      GGT, GGC, GGA, GGG
27                        E           GAG      GAA, GAG
28                        A           GCC      GCT, GCC, GCA, GCG
29                        L           CTG      TTA, TTG, CTT, CTC, CTA, CTG
30                        G           GGC      GGT, GGC, GGA, GGG









In some embodiments, the template RNA comprises one or more silent mutations.


In some embodiments, the silent mutation comprises a mutation of the codon encoding the 6th amino acid, counting the initial methionine, of the HBB gene (proline), e.g., to CCC or CCG.


In some embodiments, the template RNA comprises one or more silent substitutions as illustrated in Tables X1-X4 herein.


It should be understood that the silent mutations illustrated in Table 7A may be used individually or combined in any manner in a template RNA sequence described herein.
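
By way of illustration only, the synonymous codons listed for each position in Table 7A can be derived from the standard genetic code. The Python sketch below builds the standard codon table and returns, for a given wild-type codon, the alternative codons that encode the same amino acid. The function names are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons (sorted) encoding the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def silent_alternatives(codon):
    """Return synonymous codons other than the input codon itself."""
    return [c for c in synonymous_codons(codon) if c != codon.upper()]


# Position 6 (proline, wild-type codon CCT) per Table 7A.
print(silent_alternatives("CCT"))
```

For the proline codon CCT at position 6, the alternatives returned include CCC and CCG, consistent with the embodiment described above. A codon such as TGG (tryptophan, position 16) has no silent alternative.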


gRNAs with Inducible Activity


In some embodiments, a gRNA described herein (e.g., a gRNA that is part of a template RNA or a gRNA used for second strand nicking) has inducible activity. Inducible activity may be achieved by the template nucleic acid, e.g., template RNA, further comprising (in addition to the gRNA) a blocking domain, wherein the sequence of a portion of or all of the blocking domain is at least partially complementary to a portion or all of the gRNA. The blocking domain is thus capable of hybridizing or substantially hybridizing to a portion of or all of the gRNA. In some embodiments, the blocking domain and inducibly active gRNA are disposed on the template nucleic acid, e.g., template RNA, such that the gRNA can adopt a first conformation where the blocking domain is hybridized or substantially hybridized to the gRNA, and a second conformation where the blocking domain is not hybridized or not substantially hybridized to the gRNA. In some embodiments, in the first conformation the gRNA is unable to bind to the gene modifying polypeptide (e.g., the template nucleic acid binding domain, DNA binding domain, or endonuclease domain (e.g., a CRISPR/Cas protein)) or binds with substantially decreased affinity compared to an otherwise similar template RNA lacking the blocking domain. In some embodiments, in the second conformation the gRNA is able to bind to the gene modifying polypeptide (e.g., the template nucleic acid binding domain, DNA binding domain, or endonuclease domain (e.g., a CRISPR/Cas protein)). In some embodiments, whether the gRNA is in the first or second conformation can influence whether the DNA binding or endonuclease activities of the gene modifying polypeptide (e.g., of the CRISPR/Cas protein the gene modifying polypeptide comprises) are active.


In some embodiments, the gRNA that coordinates the second nick has inducible activity. In some embodiments, the gRNA that coordinates the second nick is induced after the template is reverse transcribed. In some embodiments, hybridization of the gRNA to the blocking domain can be disrupted using an opener molecule. In some embodiments, an opener molecule comprises an agent that binds to a portion or all of the gRNA or blocking domain and inhibits hybridization of the gRNA to the blocking domain. In some embodiments, the opener molecule comprises a nucleic acid, e.g., comprising a sequence that is partially or wholly complementary to the gRNA, blocking domain, or both. By choosing or designing an appropriate opener molecule, providing the opener molecule can promote a change in the conformation of the gRNA such that it can associate with a CRISPR/Cas protein and provide the associated functions of the CRISPR/Cas protein (e.g., DNA binding and/or endonuclease activity). Without wishing to be bound by theory, providing the opener molecule at a selected time and/or location may allow for spatial and temporal control of the activity of the gRNA, CRISPR/Cas protein, or gene modifying system comprising the same. In some embodiments, the opener molecule is exogenous to the cell comprising the gene modifying polypeptide and/or template nucleic acid. In some embodiments, the opener molecule comprises an endogenous agent (e.g., endogenous to the cell comprising the gene modifying polypeptide and/or template nucleic acid comprising the gRNA and blocking domain). For example, an inducible gRNA, blocking domain, and opener molecule may be chosen such that the opener molecule is an endogenous agent expressed in a target cell or tissue, e.g., thereby ensuring activity of a gene modifying system in the target cell or tissue. 
As a further example, an inducible gRNA, blocking domain, and opener molecule may be chosen such that the opener molecule is absent or not substantially expressed in one or more non-target cells or tissues, e.g., thereby ensuring that activity of a gene modifying system does not occur or substantially occur in the one or more non-target cells or tissues, or occurs at a reduced level compared to a target cell or tissue. Exemplary blocking domains, opener molecules, and uses thereof are described in PCT App. Publication WO2020044039A1, which is incorporated herein by reference in its entirety. In some embodiments, the template nucleic acid, e.g., template RNA, may comprise one or more sequences or structures for binding by one or more components of a gene modifying polypeptide, e.g., by a reverse transcriptase or RNA binding domain, and a gRNA. In some embodiments, the gRNA facilitates interaction with the template nucleic acid binding domain (e.g., RNA binding domain) of the gene modifying polypeptide. In some embodiments, the gRNA directs the gene modifying polypeptide to the matching target sequence, e.g., in a target cell genome.
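
As a simplified, non-limiting sketch of the complementarity relationship between a blocking domain and a gRNA, the Python code below computes the longest contiguous stretch of a gRNA that a blocking domain could pair with in antiparallel orientation. Only Watson-Crick pairs are considered; G-U wobble pairs, secondary structure, and hybridization thermodynamics are ignored. The function names and the example sequence are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (Watson-Crick only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def longest_paired_run(grna, blocking_domain):
    """Length of the longest contiguous gRNA stretch the blocking domain can pair with.

    Computed as the longest common substring between the gRNA and the
    reverse complement of the blocking domain.
    """
    grna = grna.upper()
    target = reverse_complement(blocking_domain)
    best = 0
    previous = [0] * (len(target) + 1)
    for g in grna:
        current = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if g == t:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Hypothetical gRNA and a blocking domain complementary to 10 of its nucleotides.
grna = "GUAACGGCAGACUUCUCCUC"
blocking = reverse_complement(grna[5:15])
print(longest_paired_run(grna, blocking))
```

The same comparison could be applied between a candidate opener molecule and the blocking domain to estimate whether the opener is likely to compete for hybridization.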


Circular RNAs and Ribozymes in Gene Modifying Systems

It is contemplated that it may be useful to employ circular and/or linear RNA states during the formulation, delivery, or gene modifying reaction within the target cell. Thus, in some embodiments of any of the aspects described herein, a gene modifying system comprises one or more circular RNAs (circRNAs). In some embodiments of any of the aspects described herein, a gene modifying system comprises one or more linear RNAs. In some embodiments, a nucleic acid as described herein (e.g., a template nucleic acid, a nucleic acid molecule encoding a gene modifying polypeptide, or both) is a circRNA. In some embodiments, a circular RNA molecule encodes the gene modifying polypeptide. In some embodiments, the circRNA molecule encoding the gene modifying polypeptide is delivered to a host cell. In some embodiments, a circular RNA molecule encodes a recombinase, e.g., as described herein. In some embodiments, the circRNA molecule encoding the recombinase is delivered to a host cell. In some embodiments, the circRNA molecule encoding the gene modifying polypeptide is linearized (e.g., in the host cell, e.g., in the nucleus of the host cell) prior to translation.


Circular RNAs (circRNAs) have been found to occur naturally in cells and have been found to have diverse functions, including both non-coding and protein coding roles in human cells. It has been shown that a circRNA can be engineered by incorporating a self-splicing intron into an RNA molecule (or DNA encoding the RNA molecule) that results in circularization of the RNA, and that an engineered circRNA can have enhanced protein production and stability (Wesselhoeft et al. Nature Communications 2018). In some embodiments, the gene modifying polypeptide is encoded as circRNA. In certain embodiments, the template nucleic acid is a DNA, such as a dsDNA or ssDNA. In certain embodiments, the circDNA comprises a template RNA.


In some embodiments, the circRNA comprises one or more ribozyme sequences. In some embodiments, the ribozyme sequence is activated for autocleavage, e.g., in a host cell, e.g., thereby resulting in linearization of the circRNA. In some embodiments, the ribozyme is activated when the concentration of magnesium reaches a sufficient level for cleavage, e.g., in a host cell. In some embodiments the circRNA is maintained in a low magnesium environment prior to delivery to the host cell. In some embodiments, the ribozyme is a protein-responsive ribozyme. In some embodiments, the ribozyme is a nucleic acid-responsive ribozyme. In some embodiments, the circRNA comprises a cleavage site. In some embodiments, the circRNA comprises a second cleavage site.


In some embodiments, the circRNA is linearized in the nucleus of a target cell. In some embodiments, linearization of a circRNA in the nucleus of a cell involves components present in the nucleus of the cell, e.g., to activate a cleavage event. In some embodiments, a ribozyme, e.g., a ribozyme from a B2 or ALU element, that is responsive to a nuclear element, e.g., a nuclear protein, e.g., a genome-interacting protein, e.g., an epigenetic modifier, e.g., EZH2, is incorporated into a circRNA, e.g., of a gene modifying system. In some embodiments, nuclear localization of the circRNA results in an increase in autocatalytic activity of the ribozyme and linearization of the circRNA.


In some embodiments, the ribozyme is heterologous to one or more of the other components of the gene modifying system. In some embodiments, an inducible ribozyme (e.g., in a circRNA as described herein) is created synthetically, for example, by utilizing a protein ligand-responsive aptamer design. A system for utilizing the satellite RNA of tobacco ringspot virus hammerhead ribozyme with an MS2 coat protein aptamer has been described (Kennedy et al. Nucleic Acids Res 42(19): 12306-12321 (2014), incorporated herein by reference in its entirety) that results in activation of the ribozyme activity in the presence of the MS2 coat protein. In embodiments, such a system responds to protein ligand localized to the cytoplasm or the nucleus. In some embodiments the protein ligand is not MS2. Methods for generating RNA aptamers to target ligands have been described, for example, based on the systematic evolution of ligands by exponential enrichment (SELEX) (Tuerk and Gold, Science 249(4968):505-510 (1990); Ellington and Szostak, Nature 346(6287):818-822 (1990); the methods of each of which are incorporated herein by reference) and have, in some instances, been aided by in silico design (Bell et al. PNAS 117(15):8486-8493, the methods of which are incorporated herein by reference). Thus, in some embodiments, an aptamer for a target ligand is generated and incorporated into a synthetic ribozyme system, e.g., to trigger ribozyme-mediated cleavage and circRNA linearization, e.g., in the presence of the protein ligand. In some embodiments, circRNA linearization is triggered in the cytoplasm, e.g., using an aptamer that associates with a ligand in the cytoplasm. In some embodiments, circRNA linearization is triggered in the nucleus, e.g., using an aptamer that associates with a ligand in the nucleus. In embodiments, the ligand in the nucleus comprises an epigenetic modifier or a transcription factor. 
In some embodiments the ligand that triggers linearization is present at higher levels in on-target cells than off-target cells.


It is further contemplated that a nucleic acid-responsive ribozyme system can be employed for circRNA linearization. For example, biosensors that sense defined target nucleic acid molecules to trigger ribozyme activation are described, e.g., in Penchovsky (Biotechnology Advances 32(5): 1015-1027 (2014), incorporated herein by reference). By these methods, a ribozyme naturally folds into an inactive state and is only activated in the presence of a defined target nucleic acid molecule (e.g., an RNA molecule). In some embodiments, a circRNA of a gene modifying system comprises a nucleic acid-responsive ribozyme that is activated in the presence of a defined target nucleic acid, e.g., an RNA, e.g., an mRNA, miRNA, guide RNA, gRNA, sgRNA, ncRNA, lncRNA, tRNA, snRNA, or mtRNA. In some embodiments the nucleic acid that triggers linearization is present at higher levels in on-target cells than off-target cells.


In some embodiments of any of the aspects herein, a gene modifying system incorporates one or more ribozymes with inducible specificity to a target tissue or target cell of interest, e.g., a ribozyme that is activated by a ligand or nucleic acid present at higher levels in a target tissue or target cell of interest. In some embodiments, the gene modifying system incorporates a ribozyme with inducible specificity to a subcellular compartment, e.g., the nucleus, nucleolus, cytoplasm, or mitochondria. In some embodiments, the ribozyme is activated by a ligand or nucleic acid present at higher levels in the target subcellular compartment. In some embodiments, an RNA component of a gene modifying system is provided as circRNA, e.g., that is activated by linearization. In some embodiments, linearization of a circRNA encoding a gene modifying polypeptide activates the molecule for translation. In some embodiments, a signal that activates a circRNA component of a gene modifying system is present at higher levels in on-target cells or tissues, e.g., such that the system is specifically activated in these cells.


In some embodiments, an RNA component of a gene modifying system is provided as a circRNA that is inactivated by linearization. In some embodiments, a circRNA encoding the gene modifying polypeptide is inactivated by cleavage and degradation. In some embodiments, a circRNA encoding the gene modifying polypeptide is inactivated by cleavage that separates a translation signal from the coding sequence of the polypeptide. In some embodiments, a signal that inactivates a circRNA component of a gene modifying system is present at higher levels in off-target cells or tissues, such that the system is specifically inactivated in these cells.


Target Nucleic Acid Site

In some embodiments, after gene modification, the target site surrounding the edited sequence contains a limited number of insertions or deletions, for example, in less than about 50% or 10% of editing events, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated herein by reference in its entirety). In some embodiments, the target site does not show multiple consecutive editing events, e.g., head-to-tail or head-to-head duplications, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated herein by reference in its entirety). In some embodiments, the target site contains an integrated sequence corresponding to the template RNA. In some embodiments, the target site does not contain insertions resulting from endogenous RNA in more than about 1% or 10% of events, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated herein by reference in its entirety). In some embodiments, the target site contains the integrated sequence corresponding to the template RNA.


In certain aspects of the present invention, the host DNA-binding site into which the gene modifying system integrates a sequence can be in a gene, in an intron, in an exon, in an ORF, outside of a coding region of any gene, in a regulatory region of a gene, or outside of a regulatory region of a gene. In other aspects, the polypeptide may bind to one or more than one host DNA sequence.


In some embodiments, a gene modifying system is used to edit a target locus in multiple alleles. In some embodiments, a gene modifying system is designed to edit a specific allele. For example, a gene modifying polypeptide may be directed to a specific sequence that is only present on one allele, e.g., comprises a template RNA with homology to a target allele, e.g., a gRNA or annealing domain, but not to a second cognate allele. In some embodiments, a gene modifying system can alter a haplotype-specific allele. In some embodiments, a gene modifying system that targets a specific allele preferentially targets that allele, e.g., has at least a 2, 4, 6, 8, or 10-fold preference for a target allele.


Second Strand Nicking

In some embodiments, a gene modifying system described herein comprises a nickase activity (e.g., in the gene modifying polypeptide) that nicks the first strand, and a nickase activity (e.g., in a polypeptide separate from the gene modifying polypeptide) that nicks the second strand of target DNA. As discussed herein, without wishing to be bound by theory, nicking of the first strand of the target site DNA is thought to provide a 3′ OH that can be used by an RT domain to reverse transcribe a sequence of a template RNA, e.g., a heterologous object sequence. Without wishing to be bound by theory, it is thought that introducing an additional nick to the second strand may bias the cellular DNA repair machinery to adopt the heterologous object sequence-based sequence more frequently than the original genomic sequence. In some embodiments, the additional nick to the second strand is made by the same endonuclease domain (e.g., nickase domain) as the nick to the first strand. In some embodiments, the same gene modifying polypeptide performs both the nick to the first strand and the nick to the second strand. In some embodiments, the gene modifying polypeptide comprises a CRISPR/Cas domain and the additional nick to the second strand is directed by an additional nucleic acid, e.g., comprising a second gRNA directing the CRISPR/Cas domain to nick the second strand. In other embodiments, the additional second strand nick is made by a different endonuclease domain (e.g., nickase domain) than the nick to the first strand. In some embodiments, that different endonuclease domain is situated in an additional polypeptide (e.g., a system of the invention further comprises the additional polypeptide), separate from the gene modifying polypeptide. In some embodiments, the additional polypeptide comprises an endonuclease domain (e.g., nickase domain) described herein. In some embodiments, the additional polypeptide comprises a DNA binding domain, e.g., described herein.


It is contemplated herein that the position at which the second strand nick occurs relative to the first strand nick may influence the extent to which one or more of: desired gene modifying DNA modifications are obtained, undesired double-strand breaks (DSBs) occur, undesired insertions occur, or undesired deletions occur. Without wishing to be bound by theory, second strand nicking may occur in two general orientations: inward nicks and outward nicks.


In some embodiments, in the inward nick orientation, the RT domain polymerizes (e.g., using the template RNA (e.g., the heterologous object sequence)) away from the second strand nick. In some embodiments, in the inward nick orientation, the location of the nick to the first strand and the location of the nick to the second strand are positioned between the first PAM site and second PAM site (e.g., in a scenario wherein both nicks are made by a polypeptide (e.g., a gene modifying polypeptide) comprising a CRISPR/Cas domain). When there are two PAMs on the outside and two nicks on the inside, this inward nick orientation can also be referred to as “PAM-out”. In some embodiments, in the inward nick orientation, the location of the nick to the first strand and the location of the nick to the second strand are between the sites where the polypeptide and the additional polypeptide bind to the target DNA. In some embodiments, in the inward nick orientation, the location of the nick to the second strand is positioned between the binding sites of the polypeptide and additional polypeptide, and the nick to the first strand is also located between the binding sites of the polypeptide and additional polypeptide. In some embodiments, in the inward nick orientation, the location of the nick to the first strand and the location of the nick to the second strand are positioned between the PAM site and the binding site of the second polypeptide which is at a distance from the target site.


An example of a gene modifying system that provides an inward nick orientation comprises a gene modifying polypeptide comprising a CRISPR/Cas domain, a template RNA comprising a gRNA that directs nicking of the target site DNA on the first strand, and an additional nucleic acid comprising an additional gRNA that directs nicking at a site a distance from the location of the first nick, wherein the location of the first nick and the location of the second nick are between the PAM sites of the sites to which the two gRNAs direct the gene modifying polypeptide. As a further example, another gene modifying system that provides an inward nick orientation comprises a gene modifying polypeptide comprising a zinc finger molecule and a first nickase domain wherein the zinc finger molecule binds to the target DNA in a manner that directs the first nickase domain to nick the first strand of the target site; an additional polypeptide comprising a CRISPR/Cas domain, and an additional nucleic acid comprising a gRNA that directs the additional polypeptide to nick a site a distance from the target site DNA on the second strand, wherein the location of the first nick and the location of the second nick are between the PAM site and the site to which the zinc finger molecule binds. 
As a further example, another gene modifying system that provides an inward nick orientation comprises a gene modifying polypeptide comprising a zinc finger molecule and a first nickase domain wherein the zinc finger molecule binds to the target DNA in a manner that directs the first nickase domain to nick the first strand of the target site; an additional polypeptide comprising a TAL effector molecule and a second nickase domain wherein the TAL effector molecule binds to a site a distance from the target site in a manner that directs the additional polypeptide to nick the second strand, wherein the location of the first nick and the location of the second nick are between the site to which the TAL effector molecule binds and the site to which the zinc finger molecule binds.


In some embodiments, in the outward nick orientation, the RT domain polymerizes (e.g., using the template RNA (e.g., the heterologous object sequence)) toward the second strand nick. In some embodiments, in the outward nick orientation when both the first and second nicks are made by a polypeptide comprising a CRISPR/Cas domain (e.g., a gene modifying polypeptide), the first PAM site and second PAM site are positioned between the location of the nick to the first strand and the location of the nick to the second strand. When there are two PAMs on the inside and two nicks on the outside, this outward nick orientation also can be referred to as “PAM-in”. In some embodiments, in the outward nick orientation, the polypeptide (e.g., the gene modifying polypeptide) and the additional polypeptide bind to sites on the target DNA between the location of the nick to the first strand and the location of the nick to the second strand. In some embodiments, in the outward nick orientation, the location of the nick to the second strand is positioned on the opposite side of the binding sites of the polypeptide and additional polypeptide relative to the location of the nick to the first strand. In some embodiments, in the outward nick orientation, the PAM site and the binding site of the second polypeptide which is at a distance from the target site are positioned between the location of the nick to the first strand and the location of the nick to the second strand.


An example of a gene modifying system that provides an outward nick orientation comprises a gene modifying polypeptide comprising a CRISPR/Cas domain, a template RNA comprising a gRNA that directs nicking of the target site DNA on the first strand, and an additional nucleic acid comprising an additional gRNA that directs nicking at a site a distance from the location of the first nick, wherein the location of the first nick and the location of the second nick are outside of the PAM sites of the sites to which the two gRNAs direct the gene modifying polypeptide (i.e., the PAM sites are between the location of the first nick and the location of the second nick). As a further example, another gene modifying system that provides an outward nick orientation comprises a gene modifying polypeptide comprising a zinc finger molecule and a first nickase domain wherein the zinc finger molecule binds to the target DNA in a manner that directs the first nickase domain to nick the first strand of the target site; an additional polypeptide comprising a CRISPR/Cas domain, and an additional nucleic acid comprising a gRNA that directs the additional polypeptide to nick a site a distance from the target site DNA on the second strand, wherein the location of the first nick and the location of the second nick are outside the PAM site and the site to which the zinc finger molecule binds (i.e., the PAM site and the site to which the zinc finger molecule binds are between the location of the first nick and the location of the second nick). 
As a further example, another gene modifying system that provides an outward nick orientation comprises a gene modifying polypeptide comprising a zinc finger molecule and a first nickase domain wherein the zinc finger molecule binds to the target DNA in a manner that directs the first nickase domain to nick the first strand of the target site; an additional polypeptide comprising a TAL effector molecule and a second nickase domain wherein the TAL effector molecule binds to a site a distance from the target site in a manner that directs the additional polypeptide to nick the second strand, wherein the location of the first nick and the location of the second nick are outside the site to which the TAL effector molecule binds and the site to which the zinc finger molecule binds (i.e., the site to which the TAL effector molecule binds and the site to which the zinc finger molecule binds are between the location of the first nick and the location of the second nick).
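
The inward (“PAM-out”) and outward (“PAM-in”) orientations described above can be summarized, for the case where both nicks are directed by a CRISPR/Cas domain, as a comparison of positions along the target DNA. The following Python sketch is a simplified illustration only; it assumes that each nick and each PAM is represented by a single integer coordinate on a common axis, and the function name and example coordinates are hypothetical.

```python
def classify_nick_orientation(first_nick, second_nick, first_pam, second_pam):
    """Classify a two-nick design from coordinates on the target DNA.

    Returns "inward (PAM-out)" when both nicks lie between the two PAM sites,
    "outward (PAM-in)" when both PAM sites lie between the two nicks, and
    "unclassified" for any other arrangement.
    """
    low_nick, high_nick = sorted((first_nick, second_nick))
    low_pam, high_pam = sorted((first_pam, second_pam))
    if low_pam < low_nick and high_nick < high_pam:
        return "inward (PAM-out)"
    if low_nick < low_pam and high_pam < high_nick:
        return "outward (PAM-in)"
    return "unclassified"


# Hypothetical coordinates.
print(classify_nick_orientation(120, 180, first_pam=100, second_pam=200))
print(classify_nick_orientation(100, 200, first_pam=120, second_pam=180))
```

For systems in which one or both nicks are directed by a zinc finger molecule or a TAL effector molecule, the binding site coordinate of that molecule could be supplied in place of the corresponding PAM coordinate.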


Without wishing to be bound by theory, it is thought that, for gene modifying systems where a second strand nick is provided, an outward nick orientation is preferred in some embodiments. As is described herein, an inward nick orientation may produce a higher number of double-strand breaks (DSBs) than an outward nick orientation. DSBs may be recognized by the DSB repair pathways in the nucleus of a cell, which can result in undesired insertions and deletions. An outward nick orientation may provide a decreased risk of DSB formation, and a correspondingly lower amount of undesired insertions and deletions. In some embodiments, undesired insertions and deletions are insertions and deletions not encoded by the heterologous object sequence, e.g., an insertion or deletion produced by the double-strand break repair pathway unrelated to the modification encoded by the heterologous object sequence. In some embodiments, a desired gene modification comprises a change to the target DNA (e.g., a substitution, insertion, or deletion) encoded by the heterologous object sequence (e.g., and achieved by the gene modifying system writing the heterologous object sequence into the target site). In some embodiments, the first strand nick and the second strand nick are in an outward orientation.


In addition, the distance between the first strand nick and second strand nick may influence the extent to which one or more of: desired gene modifying system DNA modifications are obtained, undesired double-strand breaks (DSBs) occur, undesired insertions occur, or undesired deletions occur. Without wishing to be bound by theory, it is thought that the benefit of the second strand nick, i.e., the biasing of DNA repair toward incorporation of the heterologous object sequence into the target DNA, increases as the distance between the first strand nick and second strand nick decreases. However, it is thought that the risk of DSB formation also increases as the distance between the first strand nick and second strand nick decreases. Correspondingly, it is thought that the number of undesired insertions and/or deletions may increase as the distance between the first strand nick and second strand nick decreases. In some embodiments, the distance between the first strand nick and second strand nick is chosen to balance the benefit of biasing DNA repair toward incorporation of the heterologous object sequence into the target DNA and the risk of DSB formation and of undesired deletions and/or insertions. In some embodiments, a system where the first strand nick and the second strand nick are at least a threshold distance apart has an increased level of desired gene modifying system modification outcomes, a decreased level of undesired deletions, and/or a decreased level of undesired insertions relative to an otherwise similar inward nick orientation system where the first nick and the second nick are less than the threshold distance apart. In some embodiments, the threshold distance(s) is given below.


In some embodiments, the first nick and the second nick are at least 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides apart. In some embodiments, the first nick and the second nick are no more than 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, or 250 nucleotides apart. In some embodiments, the first nick and the second nick are 20-200, 30-200, 40-200, 50-200, 60-200, 70-200, 80-200, 90-200, 100-200, 110-200, 120-200, 130-200, 140-200, 150-200, 160-200, 170-200, 180-200, 190-200, 20-190, 30-190, 40-190, 50-190, 60-190, 70-190, 80-190, 90-190, 100-190, 110-190, 120-190, 130-190, 140-190, 150-190, 160-190, 170-190, 180-190, 20-180, 30-180, 40-180, 50-180, 60-180, 70-180, 80-180, 90-180, 100-180, 110-180, 120-180, 130-180, 140-180, 150-180, 160-180, 170-180, 20-170, 30-170, 40-170, 50-170, 60-170, 70-170, 80-170, 90-170, 100-170, 110-170, 120-170, 130-170, 140-170, 150-170, 160-170, 20-160, 30-160, 40-160, 50-160, 60-160, 70-160, 80-160, 90-160, 100-160, 110-160, 120-160, 130-160, 140-160, 150-160, 20-150, 30-150, 40-150, 50-150, 60-150, 70-150, 80-150, 90-150, 100-150, 110-150, 120-150, 130-150, 140-150, 20-140, 30-140, 40-140, 50-140, 60-140, 70-140, 80-140, 90-140, 100-140, 110-140, 120-140, 130-140, 20-130, 30-130, 40-130, 50-130, 60-130, 70-130, 80-130, 90-130, 100-130, 110-130, 120-130, 20-120, 30-120, 40-120, 50-120, 60-120, 70-120, 80-120, 90-120, 100-120, 110-120, 20-110, 30-110, 40-110, 50-110, 60-110, 70-110, 80-110, 90-110, 100-110, 20-100, 30-100, 40-100, 50-100, 60-100, 70-100, 80-100, 90-100, 20-90, 30-90, 40-90, 50-90, 60-90, 70-90, 80-90, 20-80, 30-80, 40-80, 50-80, 60-80, 70-80, 20-70, 30-70, 40-70, 50-70, 60-70, 20-60, 30-60, 40-60, 50-60, 20-50, 30-50, 40-50, 20-40, 30-40, or 20-30 nucleotides apart. In some embodiments, the first nick and the second nick are 40-100 nucleotides apart.
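
As a brief, non-limiting illustration, the distance criteria recited above can be expressed as a simple range check. In the Python sketch below, nick positions are assumed to be integer coordinates on a common axis along the target DNA; the default bounds of 40 and 100 nucleotides correspond to one of the ranges recited above, and the function names are hypothetical.

```python
def nick_distance(first_nick, second_nick):
    """Return the distance, in nucleotides, between two nick coordinates."""
    return abs(first_nick - second_nick)


def nicks_within_range(first_nick, second_nick, min_nt=40, max_nt=100):
    """Return True if the two nicks are between min_nt and max_nt nucleotides apart."""
    return min_nt <= nick_distance(first_nick, second_nick) <= max_nt


# Hypothetical coordinates: nicks 60 nucleotides apart fall within 40-100.
print(nicks_within_range(100, 160))
# Nicks 10 nucleotides apart fall below a 20-nucleotide minimum.
print(nicks_within_range(100, 110, min_nt=20, max_nt=250))
```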


Without wishing to be bound by theory, it is thought that, for gene modifying systems where a second strand nick is provided and an inward nick orientation is selected, increasing the distance between the first strand nick and second strand nick may be preferred. As is described herein, an inward nick orientation may produce a higher number of DSBs than an outward nick orientation, and may result in a higher amount of undesired insertions and deletions than an outward nick orientation, but increasing the distance between the nicks may mitigate that increase in DSBs, undesired deletions, and/or undesired insertions. In some embodiments, an inward nick orientation wherein the first nick and the second nick are at least a threshold distance apart has an increased level of desired gene modifying system modification outcomes, a decreased level of undesired deletions, and/or a decreased level of undesired insertions relative to an otherwise similar inward nick orientation system where the first nick and the second nick are less than the threshold distance apart. In some embodiments, the threshold distance is given below.


In some embodiments, the first strand nick and the second strand nick are in an inward orientation. In some embodiments, the first strand nick and the second strand nick are in an inward orientation and the first strand nick and second strand nick are at least 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 350, 400, 450, or 500 nucleotides apart, e.g., at least 100 nucleotides apart (and optionally no more than 500, 400, 300, 200, 190, 180, 170, 160, 150, 140, 130, or 120 nucleotides apart). In some embodiments, the first strand nick and the second strand nick are in an inward orientation and the first strand nick and second strand nick are 100-200, 110-200, 120-200, 130-200, 140-200, 150-200, 160-200, 170-200, 180-200, 190-200, 100-190, 110-190, 120-190, 130-190, 140-190, 150-190, 160-190, 170-190, 180-190, 100-180, 110-180, 120-180, 130-180, 140-180, 150-180, 160-180, 170-180, 100-170, 110-170, 120-170, 130-170, 140-170, 150-170, 160-170, 100-160, 110-160, 120-160, 130-160, 140-160, 150-160, 100-150, 110-150, 120-150, 130-150, 140-150, 100-140, 110-140, 120-140, 130-140, 100-130, 110-130, 120-130, 100-120, 110-120, or 100-110 nucleotides apart.
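
By way of non-limiting illustration only, the spacing criteria described above can be expressed as a simple check. The following Python sketch assumes that each nick is given as an integer coordinate on a common reference sequence. The function name is hypothetical, and the default windows (40-100 nucleotides generally; at least 100 and no more than 500 nucleotides for an inward orientation) are merely example values drawn from the ranges recited above, not preferred or required values.

```python
def nick_distance_ok(first_nick, second_nick, inward=False,
                     min_dist=None, max_dist=None):
    """Return True if the spacing between two nicks lies in the chosen window.

    first_nick, second_nick: integer coordinates on a common reference (assumed).
    inward: whether the nicks are in an inward orientation.
    min_dist, max_dist: optional overrides for the illustrative defaults.
    """
    # Illustrative defaults only: 40-100 nt in general, 100-500 nt if inward.
    if min_dist is None:
        min_dist = 100 if inward else 40
    if max_dist is None:
        max_dist = 500 if inward else 100
    distance = abs(second_nick - first_nick)
    return min_dist <= distance <= max_dist


if __name__ == "__main__":
    print(nick_distance_ok(1000, 1060))               # 60 nt apart, general window
    print(nick_distance_ok(1000, 1060, inward=True))  # 60 nt apart, inward window
```

A different embodiment would simply pass other values for `min_dist` and `max_dist`.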


Chemically Modified Nucleic Acids and Nucleic Acid End Features

A nucleic acid described herein (e.g., a template nucleic acid, e.g., a template RNA; or a nucleic acid (e.g., mRNA) encoding a gene modifying polypeptide; or a gRNA) can comprise unmodified or modified nucleobases. Naturally occurring RNAs are synthesized from four basic ribonucleotides: ATP, CTP, UTP and GTP, but may contain post-transcriptionally modified nucleotides. Further, approximately one hundred different nucleoside modifications have been identified in RNA (Rozenski, J, Crain, P, and McCloskey, J. (1999). The RNA Modification Database: 1999 update. Nucl Acids Res 27: 196-197). An RNA can also comprise wholly synthetic nucleotides that do not occur in nature.


In some embodiments, the chemical modification is one provided in WO/2016/183482, US Pat. Pub. No. 20090286852, or International Application No. WO/2012/019168, WO/2012/045075, WO/2012/135805, WO/2012/158736, WO/2013/039857, WO/2013/039861, WO/2013/052523, WO/2013/090648, WO/2013/096709, WO/2013/101690, WO/2013/106496, WO/2013/130161, WO/2013/151669, WO/2013/151736, WO/2013/151672, WO/2013/151664, WO/2013/151665, WO/2013/151668, WO/2013/151671, WO/2013/151667, WO/2013/151670, WO/2013/151666, WO/2013/151663, WO/2014/028429, WO/2014/081507, WO/2014/093924, WO/2014/093574, WO/2014/113089, WO/2014/144711, WO/2014/144767, WO/2014/144039, WO/2014/152540, WO/2014/152030, WO/2014/152031, WO/2014/152027, WO/2014/152211, WO/2014/158795, WO/2014/159813, WO/2014/164253, WO/2015/006747, WO/2015/034928, WO/2015/034925, WO/2015/038892, WO/2015/048744, WO/2015/051214, WO/2015/051173, WO/2015/051169, WO/2015/058069, WO/2015/085318, WO/2015/089511, WO/2015/105926, WO/2015/164674, WO/2015/196130, WO/2015/196128, WO/2015/196118, WO/2016/011226, WO/2016/011222, WO/2016/011306, WO/2016/014846, WO/2016/022914, WO/2016/036902, WO/2016/077125, or WO/2016/077123, each of which is herein incorporated by reference in its entirety. It is understood that incorporation of a chemically modified nucleotide into a polynucleotide can result in the modification being incorporated into a nucleobase, the backbone, or both, depending on the location of the modification in the nucleotide. In some embodiments, the backbone modification is one provided in EP 2813570, which is herein incorporated by reference in its entirety. In some embodiments, the modified cap is one provided in US Pat. Pub. No. 20050287539, which is herein incorporated by reference in its entirety.


In some embodiments, the chemically modified nucleic acid (e.g., RNA, e.g., mRNA) comprises one or more of ARCA: anti-reverse cap analog (m27,3′-OGP3G), GP3G (Unmethylated Cap Analog), m7GP3G (Monomethylated Cap Analog), m32,2,7GP3G (Trimethylated Cap Analog), m5CTP (5-methyl-cytidine triphosphate), m6ATP (N6-methyl-adenosine-5′-triphosphate), s2UTP (2-thio-uridine triphosphate), and Ψ (pseudouridine triphosphate).


In some embodiments, the chemically modified nucleic acid comprises a 5′ cap, e.g.: a 7-methylguanosine cap (e.g., an O-Me-m7G cap); a hypermethylated cap analog; an NAD+-derived cap analog (e.g., as described in Kiledjian, Trends in Cell Biology 28, 454-464 (2018)); or a modified, e.g., biotinylated, cap analog (e.g., as described in Bednarek et al., Phil Trans R Soc B 373, 20180167 (2018)).


In some embodiments, the chemically modified nucleic acid comprises a 3′ feature selected from one or more of: a polyA tail; a 16-nucleotide long stem-loop structure flanked by 5 unpaired nucleotides (e.g., as described by Mannironi et al., Nucleic Acids Research 17, 9113-9126 (1989)); a triple-helical structure (e.g., as described by Brown et al., PNAS 109, 19202-19207 (2012)); a tRNA, Y RNA, or vault RNA structure (e.g., as described by Labno et al., Biochimica et Biophysica Acta 1863, 3125-3147 (2016)); incorporation of one or more deoxyribonucleotide triphosphates (dNTPs), 2′O-Methylated NTPs, or phosphorothioate-NTPs; a single nucleotide chemical modification (e.g., oxidation of the 3′ terminal ribose to a reactive aldehyde followed by conjugation of the aldehyde-reactive modified nucleotide); or chemical ligation to another nucleic acid molecule.


In some embodiments, the nucleic acid (e.g., template nucleic acid) comprises one or more modified nucleotides, e.g., selected from dihydrouridine, inosine, 7-methylguanosine, 5-methylcytidine (5mC), 5′ Phosphate ribothymidine, 2′-O-methyl ribothymidine, 2′-O-ethyl ribothymidine, 2′-fluoro ribothymidine, C-5 propynyl-deoxycytidine (pdC), C-5 propynyl-deoxyuridine (pdU), C-5 propynyl-cytidine (pC), C-5 propynyl-uridine (pU), 5-methyl cytidine, 5-methyl uridine, 5-methyl deoxycytidine, 5-methyl deoxyuridine methoxy, 2,6-diaminopurine, 5′-Dimethoxytrityl-N4-ethyl-2′-deoxycytidine, C-5 propynyl-f-cytidine (pfC), C-5 propynyl-f-uridine (pfU), 5-methyl f-cytidine, 5-methyl f-uridine, C-5 propynyl-m-cytidine (pmC), C-5 propynyl-m-uridine (pmU), 5-methyl m-cytidine, 5-methyl m-uridine, LNA (locked nucleic acid), MGB (minor groove binder), pseudouridine (Ψ), 1-N-methylpseudouridine (1-Me-Ψ), or 5-methoxyuridine (5-MO-U).


In some embodiments, the nucleic acid comprises a backbone modification, e.g., a modification to a sugar or phosphate group in the backbone. In some embodiments, the nucleic acid comprises a nucleobase modification.


In some embodiments, the nucleic acid comprises one or more chemically modified nucleotides of Table 13, one or more chemical backbone modifications of Table 14, and/or one or more chemically modified caps of Table 15. For instance, in some embodiments, the nucleic acid comprises two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of chemical modifications. As an example, the nucleic acid may comprise two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of modified nucleobases, e.g., as described herein, e.g., in Table 13. Alternatively or in combination, the nucleic acid may comprise two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of backbone modifications, e.g., as described herein, e.g., in Table 14. Alternatively or in combination, the nucleic acid may comprise one or more modified caps, e.g., as described herein, e.g., in Table 15. For instance, in some embodiments, the nucleic acid comprises one or more types of modified nucleobase and one or more types of backbone modification; one or more types of modified nucleobase and one or more modified caps; one or more types of modified cap and one or more types of backbone modification; or one or more types of modified nucleobase, one or more types of backbone modification, and one or more types of modified cap.


In some embodiments, the nucleic acid comprises one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, or more) modified nucleobases. In some embodiments, all nucleobases of the nucleic acid are modified. In some embodiments, the nucleic acid is modified at one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, or more) positions in the backbone. In some embodiments, all backbone positions of the nucleic acid are modified.









TABLE 13

Modified nucleotides

5-aza-uridine
2-thio-5-aza-uridine
2-thiouridine
4-thio-pseudouridine
2-thio-pseudouridine
5-hydroxyuridine
3-methyluridine
5-carboxymethyl-uridine
1-carboxymethyl-pseudouridine
5-propynyl-uridine
1-propynyl-pseudouridine
5-taurinomethyluridine
1-taurinomethyl-pseudouridine
5-taurinomethyl-2-thio-uridine
1-taurinomethyl-4-thio-uridine
5-methyl-uridine
1-methyl-pseudouridine
4-thio-1-methyl-pseudouridine
2-thio-1-methyl-pseudouridine
1-methyl-1-deaza-pseudouridine
2-thio-1-methyl-1-deaza-pseudouridine
dihydrouridine
dihydropseudouridine
2-thio-dihydrouridine
2-thio-dihydropseudouridine
2-methoxyuridine
2-methoxy-4-thio-uridine
4-methoxy-pseudouridine
4-methoxy-2-thio-pseudouridine
5-aza-cytidine
pseudoisocytidine
3-methyl-cytidine
N4-acetylcytidine
5-formylcytidine
N4-methylcytidine
5-hydroxymethylcytidine
1-methyl-pseudoisocytidine
pyrrolo-cytidine
pyrrolo-pseudoisocytidine
2-thio-cytidine
2-thio-5-methyl-cytidine
4-thio-pseudoisocytidine
4-thio-1-methyl-pseudoisocytidine
4-thio-1-methyl-1-deaza-pseudoisocytidine
1-methyl-1-deaza-pseudoisocytidine
zebularine
5-aza-zebularine
5-methyl-zebularine
5-aza-2-thio-zebularine
2-thio-zebularine
2-methoxy-cytidine
2-methoxy-5-methyl-cytidine
4-methoxy-pseudoisocytidine
4-methoxy-1-methyl-pseudoisocytidine
2-aminopurine
2,6-diaminopurine
7-deaza-adenine
7-deaza-8-aza-adenine
7-deaza-2-aminopurine
7-deaza-8-aza-2-aminopurine
7-deaza-2,6-diaminopurine
7-deaza-8-aza-2,6-diaminopurine
1-methyladenosine
N6-isopentenyladenosine
N6-(cis-hydroxyisopentenyl)adenosine
2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine
N6-glycinylcarbamoyladenosine
N6-threonylcarbamoyladenosine
2-methylthio-N6-threonylcarbamoyladenosine
N6,N6-dimethyladenosine
7-methyladenine
2-methylthio-adenine
2-methoxy-adenine
inosine
1-methyl-inosine
wyosine
wybutosine
7-deaza-guanosine
7-deaza-8-aza-guanosine
6-thio-guanosine
6-thio-7-deaza-guanosine
6-thio-7-deaza-8-aza-guanosine
7-methyl-guanosine
6-thio-7-methyl-guanosine
7-methylinosine
6-methoxy-guanosine
1-methylguanosine
N2-methylguanosine
N2,N2-dimethylguanosine
8-oxo-guanosine
7-methyl-8-oxo-guanosine
1-methyl-6-thio-guanosine
N2-methyl-6-thio-guanosine
N2,N2-dimethyl-6-thio-guanosine
pyridin-4-one ribonucleoside
2-thio-5-aza-uridine
2-thiouridine
4-thio-pseudouridine
2-thio-pseudouridine
3-methyluridine
1-propynyl-pseudouridine
1-methyl-1-deaza-pseudouridine
2-thio-1-methyl-1-deaza-pseudouridine
4-methoxy-pseudouridine
5′-O-(1-Thiophosphate)-Adenosine
5′-O-(1-Thiophosphate)-Cytidine
5′-O-(1-Thiophosphate)-Guanosine
5′-O-(1-Thiophosphate)-Uridine
5′-O-(1-Thiophosphate)-Pseudouridine
2′-O-methyl-Adenosine
2′-O-methyl-Cytidine
2′-O-methyl-Guanosine
2′-O-methyl-Uridine
2′-O-methyl-Pseudouridine
2′-O-methyl-Inosine
2-methyladenosine
2-methylthio-N6-methyladenosine
2-methylthio-N6-isopentenyladenosine
2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine
N6-methyl-N6-threonylcarbamoyladenosine
N6-hydroxynorvalylcarbamoyladenosine
2-methylthio-N6-hydroxynorvalylcarbamoyladenosine
2′-O-ribosyladenosine (phosphate)
1,2′-O-dimethylinosine
5,2′-O-dimethylcytidine
N4-acetyl-2′-O-methylcytidine
Lysidine
7-methylguanosine
N2,2′-O-dimethylguanosine
N2,N2,2′-O-trimethylguanosine
2′-O-ribosylguanosine (phosphate)
Wybutosine
Peroxywybutosine
Hydroxywybutosine
undermodified hydroxywybutosine
methylwyosine
queuosine
epoxyqueuosine
galactosyl-queuosine
mannosyl-queuosine
7-cyano-7-deazaguanosine
7-aminomethyl-7-deazaguanosine
archaeosine
5,2′-O-dimethyluridine
4-thiouridine
5-methyl-2-thiouridine
2-thio-2′-O-methyluridine
3-(3-amino-3-carboxypropyl)uridine
5-methoxyuridine
uridine 5-oxyacetic acid
uridine 5-oxyacetic acid methyl ester
5-(carboxyhydroxymethyl)uridine
5-(carboxyhydroxymethyl)uridine methyl ester
5-methoxycarbonylmethyluridine
5-methoxycarbonylmethyl-2′-O-methyluridine
5-methoxycarbonylmethyl-2-thiouridine
5-aminomethyl-2-thiouridine
5-methylaminomethyluridine
5-methylaminomethyl-2-thiouridine
5-methylaminomethyl-2-selenouridine
5-carbamoylmethyluridine
5-carbamoylmethyl-2′-O-methyluridine
5-carboxymethylaminomethyluridine
5-carboxymethylaminomethyl-2′-O-methyluridine
5-carboxymethylaminomethyl-2-thiouridine
N4,2′-O-dimethylcytidine
5-carboxymethyluridine
N6,2′-O-dimethyladenosine
N,N6,O-2′-trimethyladenosine
N2,7-dimethylguanosine
N2,N2,7-trimethylguanosine
3,2′-O-dimethyluridine
5-methyldihydrouridine
5-formyl-2′-O-methylcytidine
1,2′-O-dimethylguanosine
4-demethylwyosine
Isowyosine
N6-acetyladenosine
















TABLE 14

Backbone modifications

2′-O-Methyl backbone
Peptide Nucleic Acid (PNA) backbone
phosphorothioate backbone
morpholino backbone
carbamate backbone
siloxane backbone
sulfide backbone
sulfoxide backbone
sulfone backbone
formacetyl backbone
thioformacetyl backbone
methyleneformacetyl backbone
riboacetyl backbone
alkene containing backbone
sulfamate backbone
sulfonate backbone
sulfonamide backbone
methyleneimino backbone
methylenehydrazino backbone
amide backbone
















TABLE 15

Modified caps

m7GpppA
m7GpppC
m2,7GpppG
m2,2,7GpppG
m7Gpppm7G
m7,2′OmeGpppG
m7,2′dGpppG
m7,3′OmeGpppG
m7,3′dGpppG
GppppG
m7GppppG
m7GppppA
m7GppppC
m2,7GppppG
m2,2,7GppppG
m7Gppppm7G
m7,2′OmeGppppG
m7,2′dGppppG
m7,3′OmeGppppG
m7,3′dGppppG









The nucleotides comprising the template of the gene modifying system can be natural or modified bases, or a combination thereof. For example, the template may contain pseudouridine, dihydrouridine, inosine, 7-methylguanosine, or other modified bases. In some embodiments, the template may contain locked nucleic acid nucleotides. In some embodiments, the modified bases used in the template do not inhibit the reverse transcription of the template. In some embodiments, the modified bases used in the template may improve reverse transcription, e.g., specificity or fidelity.


In some embodiments, an RNA component of the system (e.g., a template RNA or a gRNA) comprises one or more nucleotide modifications. In some embodiments, the modification pattern of a gRNA can significantly affect in vivo activity compared to unmodified or end-modified guides (e.g., as shown in FIG. 1D from Finn et al. Cell Rep 22(9):2227-2235 (2018); incorporated herein by reference in its entirety). Without wishing to be bound by theory, this process may be due, at least in part, to a stabilization of the RNA conferred by the modifications. Non-limiting examples of such modifications may include 2′-O-methyl (2′-O-Me), 2′-O-(2-methoxyethyl) (2′-O-MOE), 2′-fluoro (2′-F), phosphorothioate (PS) bond between nucleotides, G-C substitutions, and inverted abasic linkages between nucleotides and equivalents thereof.


In some embodiments, the template RNA (e.g., at the portion thereof that binds a target site) or the guide RNA comprises a 5′ terminus region. In some embodiments, the template RNA or the guide RNA does not comprise a 5′ terminus region. In some embodiments, the 5′ terminus region comprises a gRNA spacer region, e.g., as described with respect to sgRNA in Briner AE et al, Molecular Cell 56: 333-339 (2014) (incorporated herein by reference in its entirety; applicable herein, e.g., to all guide RNAs). In some embodiments, the 5′ terminus region comprises a 5′ end modification. In some embodiments, a 5′ terminus region with or without a spacer region may be associated with a crRNA, trRNA, sgRNA and/or dgRNA. The gRNA spacer region can, in some instances, comprise a guide region, guide domain, or targeting domain.


In some embodiments, the template RNAs (e.g., at the portion thereof that binds a target site) or guide RNAs described herein comprise any of the sequences shown in Table 4 of WO2018107028A1, incorporated herein by reference in its entirety. In some embodiments, where a sequence shows a guide and/or spacer region, the composition may comprise this region or not. In some embodiments, a guide RNA comprises one or more of the modifications of any of the sequences shown in Table 4 of WO2018107028A1, e.g., as identified therein by a SEQ ID NO. In embodiments, the nucleotides may be the same or different, and/or the modification pattern shown may be the same as or similar to a modification pattern of a guide sequence as shown in Table 4 of WO2018107028A1. In some embodiments, a modification pattern includes the relative position and identity of modifications of the gRNA or a region of the gRNA (e.g., 5′ terminus region, lower stem region, bulge region, upper stem region, nexus region, hairpin 1 region, hairpin 2 region, 3′ terminus region). In some embodiments, the modification pattern contains at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the modifications of any one of the sequences shown in the sequence column of Table 4 of WO2018107028A1, and/or over one or more regions of the sequence. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the modification pattern of any one of the sequences shown in the sequence column of Table 4 of WO2018107028A1. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over one or more regions of the sequence shown in Table 4 of WO2018107028A1, e.g., in a 5′ terminus region, lower stem region, bulge region, upper stem region, nexus region, hairpin 1 region, hairpin 2 region, and/or 3′ terminus region. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the modification pattern of a sequence over the 5′ terminus region. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the lower stem. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the bulge. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the upper stem. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the nexus. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the hairpin 1. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the hairpin 2. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the 3′ terminus. In some embodiments, the modification pattern differs from the modification pattern of a sequence of Table 4 of WO2018107028A1, or a region (e.g., 5′ terminus, lower stem, bulge, upper stem, nexus, hairpin 1, hairpin 2, 3′ terminus) of such a sequence, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides. In some embodiments, the gRNA comprises modifications that differ from the modifications of a sequence of Table 4 of WO2018107028A1, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides. In some embodiments, the gRNA comprises modifications that differ from modifications of a region (e.g., 5′ terminus, lower stem, bulge, upper stem, nexus, hairpin 1, hairpin 2, 3′ terminus) of a sequence of Table 4 of WO2018107028A1, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides.
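
By way of non-limiting illustration only, one possible way to compute a percent identity between two modification patterns is sketched below in Python. The disclosure does not prescribe a particular calculation; this sketch assumes that each pattern is represented as an equal-length list with one modification label per nucleotide position (an empty string denoting an unmodified position), and that identity is the percentage of positions whose labels match. The function name, the labels, and the region coordinates are hypothetical.

```python
def pattern_identity(pattern_a, pattern_b, region=None):
    """Percent of positions whose modification labels match.

    pattern_a, pattern_b: equal-length lists of per-position labels (assumed format).
    region: optional (start, end) half-open index range, e.g., a 5' terminus region.
    """
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must be the same length")
    start, end = region if region is not None else (0, len(pattern_a))
    a = pattern_a[start:end]
    b = pattern_b[start:end]
    if not a:
        raise ValueError("region must contain at least one position")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Hypothetical 8-position patterns: "mPS" = 2'-O-Me plus PS linkage, "m" = 2'-O-Me.
    reference = ["mPS", "mPS", "mPS", "", "", "m", "m", ""]
    candidate = ["mPS", "mPS", "", "", "", "m", "m", ""]
    print(pattern_identity(reference, candidate))          # whole pattern
    print(pattern_identity(reference, candidate, (0, 3)))  # 5' terminus region only
```

Restricting the comparison to a region, as in the second call, corresponds to the region-by-region identity values recited above.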


In some embodiments, the template RNA (e.g., at the portion thereof that binds a target site) or the gRNA comprises a 2′-O-methyl (2′-O-Me) modified nucleotide. In some embodiments, the gRNA comprises a 2′-O-(2-methoxyethyl) (2′-O-MOE) modified nucleotide. In some embodiments, the gRNA comprises a 2′-fluoro (2′-F) modified nucleotide. In some embodiments, the gRNA comprises a phosphorothioate (PS) bond between nucleotides. In some embodiments, the gRNA comprises a 5′ end modification, a 3′ end modification, or 5′ and 3′ end modifications. In some embodiments, the 5′ end modification comprises a phosphorothioate (PS) bond between nucleotides. In some embodiments, the 5′ end modification comprises a 2′-O-methyl (2′-O-Me), 2′-O-(2-methoxyethyl) (2′-O-MOE), and/or 2′-fluoro (2′-F) modified nucleotide. In some embodiments, the 5′ end modification comprises at least one phosphorothioate (PS) bond and one or more of a 2′-O-methyl (2′-O-Me), 2′-O-(2-methoxyethyl) (2′-O-MOE), and/or 2′-fluoro (2′-F) modified nucleotide. The end modification may comprise a phosphorothioate (PS), 2′-O-methyl (2′-O-Me), 2′-O-(2-methoxyethyl) (2′-O-MOE), and/or 2′-fluoro (2′-F) modification. Equivalent end modifications are also encompassed by embodiments described herein. In some embodiments, the template RNA or gRNA comprises an end modification in combination with a modification of one or more regions of the template RNA or gRNA. Additional exemplary modifications and methods for protecting RNA, e.g., gRNA, and formulae thereof, are described in WO2018126176A1, which is incorporated herein by reference in its entirety.


In some embodiments, a template RNA described herein comprises three phosphorothioate linkages at the 5′ end and three phosphorothioate linkages at the 3′ end. In some embodiments, a template RNA described herein comprises three 2′-O-methyl ribonucleotides at the 5′ end and three 2′-O-methyl ribonucleotides at the 3′ end. In some embodiments, the 5′ most three nucleotides of the template RNA are 2′-O-methyl ribonucleotides, the 5′ most three internucleotide linkages of the template RNA are phosphorothioate linkages, the 3′ most three nucleotides of the template RNA are 2′-O-methyl ribonucleotides, and the 3′ most three internucleotide linkages of the template RNA are phosphorothioate linkages. In some embodiments, the template RNA comprises alternating blocks of ribonucleotides and 2′-O-methyl ribonucleotides, for instance, blocks of between 12 and 28 nucleotides in length. In some embodiments, the central portion of the template RNA comprises the alternating blocks and the 5′ and 3′ ends each comprise three 2′-O-methyl ribonucleotides and three phosphorothioate linkages.
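
By way of non-limiting illustration only, the end-modification layout described above (three 2′-O-methyl ribonucleotides and three phosphorothioate linkages at each end) can be sketched in Python as follows. The function names are hypothetical, the example sequence is arbitrary, and the rendering shorthand (an "m" prefix for a 2′-O-methyl ribonucleotide and a "*" suffix for a phosphorothioate linkage) is an assumed notation chosen for this sketch.

```python
def end_modification_pattern(sequence, n_end=3):
    """Assign 2'-O-Me sugars and PS linkages to the n_end terminal positions.

    Returns (sugars, linkages): one sugar label per nucleotide and one linkage
    label per internucleotide bond (len(sequence) - 1 bonds).
    """
    n = len(sequence)
    if n < 2 * n_end + 1:
        raise ValueError("sequence too short for non-overlapping end modifications")
    sugars = ["2'-O-Me" if (i < n_end or i >= n - n_end) else "ribo"
              for i in range(n)]
    n_links = n - 1
    linkages = ["PS" if (j < n_end or j >= n_links - n_end) else "PO"
                for j in range(n_links)]
    return sugars, linkages


def render(sequence, sugars, linkages):
    """Render the pattern using the assumed 'm' / '*' shorthand."""
    parts = []
    for i, base in enumerate(sequence):
        token = ("m" if sugars[i] == "2'-O-Me" else "") + base
        # linkages[i] is the bond between nucleotide i and nucleotide i + 1
        if i < len(linkages) and linkages[i] == "PS":
            token += "*"
        parts.append(token)
    return "".join(parts)


if __name__ == "__main__":
    seq = "ACGUACGUACGU"  # arbitrary example sequence
    sugars, linkages = end_modification_pattern(seq)
    print(render(seq, sugars, linkages))  # prints mA*mC*mG*UACGUA*mC*mG*mU
```

The alternating internal blocks of ribonucleotides and 2′-O-methyl ribonucleotides mentioned above are not modeled in this sketch.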


In some embodiments, structure-guided and systematic approaches are used to introduce modifications (e.g., 2′-OMe-RNA, 2′-F-RNA, and PS modifications) to a template RNA or guide RNA, for example, as described in Mir et al. Nat Commun 9:2641 (2018) (incorporated by reference herein in its entirety). In some embodiments, the incorporation of 2′-F-RNAs increases thermal and nuclease stability of RNA:RNA or RNA:DNA duplexes, e.g., while minimally interfering with C3′-endo sugar puckering. In some embodiments, 2′-F may be better tolerated than 2′-OMe at positions where the 2′-OH is important for RNA:DNA duplex stability. In some embodiments, a crRNA comprises one or more modifications that do not reduce Cas9 activity, e.g., C10, C20, or C21 (fully modified), e.g., as described in Supplementary Table 1 of Mir et al. Nat Commun 9:2641 (2018), incorporated herein by reference in its entirety. In some embodiments, a tracrRNA comprises one or more modifications that do not reduce Cas9 activity, e.g., T2, T6, T7, or T8 (fully modified) of Supplementary Table 1 of Mir et al. Nat Commun 9:2641 (2018). In some embodiments, a crRNA comprising one or more modifications (e.g., as described herein) may be paired with a tracrRNA comprising one or more modifications, e.g., C20 and T2. In some embodiments, a gRNA comprises a chimera, e.g., of a crRNA and a tracrRNA (e.g., Jinek et al. Science 337(6096):816-821 (2012)). In embodiments, modifications from the crRNA and tracrRNA are mapped onto the single-guide chimera, e.g., to produce a modified gRNA with enhanced stability.


In some embodiments, gRNA molecules may be modified by the addition or subtraction of the naturally occurring structural components, e.g., hairpins. In some embodiments, a gRNA may comprise a gRNA with one or more 3′ hairpin elements deleted, e.g., as described in WO2018106727, incorporated herein by reference in its entirety. In some embodiments, a gRNA may contain an added hairpin structure, e.g., an added hairpin structure in the spacer region, which was shown to increase specificity of a CRISPR-Cas system in the teachings of Kocak et al. Nat Biotechnol 37(6):657-666 (2019). Additional modifications, including examples of shortened gRNA and specific modifications improving in vivo activity, can be found in US20190316121, incorporated herein by reference in its entirety.


In some embodiments, structure-guided and systematic approaches (e.g., as described in Mir et al. Nat Commun 9:2641 (2018); incorporated herein by reference in its entirety) are employed to find modifications for the template RNA. In embodiments, the modifications are identified with the inclusion or exclusion of a guide region of the template RNA. In some embodiments, a structure of polypeptide bound to template RNA is used to determine non-protein-contacted nucleotides of the RNA that may then be selected for modifications, e.g., with lower risk of disrupting the association of the RNA with the polypeptide. Secondary structures in a template RNA can also be predicted in silico by software tools, e.g., the RNAstructure tool available at rna.urmc.rochester.edu/RNAstructureWeb (Bellaousov et al. Nucleic Acids Res 41:W471-W474 (2013); incorporated by reference herein in its entirety), e.g., to determine secondary structures for selecting modifications, e.g., hairpins, stems, and/or bulges.
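
By way of non-limiting illustration only, once a secondary structure has been predicted, structural elements such as hairpin loops can be located programmatically. The following Python sketch assumes the prediction is available as a dot-bracket string (matched parentheses for paired bases, dots for unpaired bases) without pseudoknots. It does not call the RNAstructure tool or any other prediction software, and the function names and the example structure are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return a dict mapping each paired index to its partner index."""
    stack = []
    pairs = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def hairpin_loops(structure):
    """Return (first, last) index ranges of unpaired runs closed by one base pair."""
    pairs = parse_dot_bracket(structure)
    loops = []
    for i, j in pairs.items():
        # A hairpin loop is enclosed by pair (i, j) and has no paired base inside.
        if i < j and j - i > 1 and all(k not in pairs for k in range(i + 1, j)):
            loops.append((i + 1, j - 1))
    return sorted(loops)


if __name__ == "__main__":
    example = "((((....))))..((...))"  # hypothetical predicted structure
    print(hairpin_loops(example))  # prints [(4, 7), (16, 18)]
```

Positions reported in this way could then be cross-referenced against protein-contacted nucleotides when selecting candidate sites for modification.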


Production of Compositions and Systems

As will be appreciated by one of skill, methods of designing and constructing nucleic acid constructs and proteins or polypeptides (such as the systems, constructs and polypeptides described herein) are routine in the art. Generally, recombinant methods may be used. See, in general, Smales & James (Eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); and Crommelin, Sindelar & Meibohm (Eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013). Methods of designing, preparing, evaluating, purifying and manipulating nucleic acid compositions are described in Green and Sambrook (Eds.), Molecular Cloning: A Laboratory Manual (Fourth Edition), Cold Spring Harbor Laboratory Press (2012).


The disclosure provides, in part, a nucleic acid, e.g., vector, encoding a gene modifying polypeptide described herein, a template nucleic acid described herein, or both. In some embodiments, a vector comprises a selective marker, e.g., an antibiotic resistance marker. In some embodiments, the antibiotic resistance marker is a kanamycin resistance marker. In some embodiments, the antibiotic resistance marker does not confer resistance to beta-lactam antibiotics. In some embodiments, the vector does not comprise an ampicillin resistance marker. In some embodiments, the vector comprises a kanamycin resistance marker and does not comprise an ampicillin resistance marker. In some embodiments, a vector encoding a gene modifying polypeptide is integrated into a target cell genome (e.g., upon administration to a target cell, tissue, organ, or subject). In some embodiments, a vector encoding a gene modifying polypeptide is not integrated into a target cell genome (e.g., upon administration to a target cell, tissue, organ, or subject). In some embodiments, a vector encoding a template nucleic acid (e.g., template RNA) is not integrated into a target cell genome (e.g., upon administration to a target cell, tissue, organ, or subject). In some embodiments, if a vector is integrated into a target site in a target cell genome, the selective marker is not integrated into the genome. In some embodiments, if a vector is integrated into a target site in a target cell genome, genes or sequences involved in vector maintenance (e.g., plasmid maintenance genes) are not integrated into the genome. In some embodiments, if a vector is integrated into a target site in a target cell genome, transfer regulating sequences (e.g., inverted terminal repeats, e.g., from an AAV) are not integrated into the genome. 
In some embodiments, administration of a vector (e.g., encoding a gene modifying polypeptide described herein, a template nucleic acid described herein, or both) to a target cell, tissue, organ, or subject results in integration of a portion of the vector into one or more target sites in the genome(s) of said target cell, tissue, organ, or subject. In some embodiments, less than 99, 95, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 4, 3, 2, or 1% of target sites (e.g., no target sites) comprising integrated material comprise a selective marker (e.g., an antibiotic resistance gene), a transfer regulating sequence (e.g., an inverted terminal repeat, e.g., from an AAV), or both from the vector.


Exemplary methods for producing a therapeutic pharmaceutical protein or polypeptide described herein involve expression in mammalian cells, although recombinant proteins can also be produced using insect cells, yeast, bacteria, or other cells under control of appropriate promoters. Mammalian expression vectors may comprise non-transcribed elements such as an origin of replication, a suitable promoter, and other 5′ or 3′ flanking non-transcribed sequences, and 5′ or 3′ non-translated sequences such as necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor sites, and termination sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, splice, and polyadenylation sites may be used to provide other genetic elements required for expression of a heterologous DNA sequence. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, and mammalian cellular hosts are described in Green & Sambrook, Molecular Cloning: A Laboratory Manual (Fourth Edition), Cold Spring Harbor Laboratory Press (2012).


Various mammalian cell culture systems can be employed to express and manufacture recombinant protein. Examples of mammalian expression systems include CHO, COS, HEK293, HeLa, and BHK cell lines. Processes of host cell culture for production of protein therapeutics are described in Zhou and Kantardjieff (Eds.), Mammalian Cell Cultures for Biologics Manufacturing (Advances in Biochemical Engineering Biotechnology), Springer (2014). Compositions described herein may include a vector, such as a viral vector, e.g., a lentiviral vector, encoding a recombinant protein. In some embodiments, a vector, e.g., a viral vector, may comprise a nucleic acid encoding a recombinant protein.


Purification of protein therapeutics is described in Franks, Protein Biotechnology: Isolation, Characterization, and Stabilization, Humana Press (2013); and in Cutler, Protein Purification Protocols (Methods in Molecular Biology), Humana Press (2010).


The disclosure also provides compositions and methods for the production of template nucleic acid molecules (e.g., template RNAs) with specificity for a gene modifying polypeptide and/or a genomic target site. In an aspect, the method comprises production of RNA segments including an upstream homology segment, a heterologous object sequence segment, a gene modifying polypeptide binding motif, and a gRNA segment.


Therapeutic Applications

In some embodiments, a gene modifying system as described herein can be used to modify a cell (e.g., an animal cell, plant cell, or fungal cell). In some embodiments, a gene modifying system as described herein can be used to modify a mammalian cell (e.g., a human cell). In some embodiments, a gene modifying system as described herein can be used to modify a cell from a livestock animal (e.g., a cow, horse, sheep, goat, pig, llama, alpaca, camel, yak, chicken, duck, goose, or ostrich). In some embodiments, a gene modifying system as described herein can be used as a laboratory tool or a research tool, or used in a laboratory method or research method, e.g., to modify an animal cell, e.g., a mammalian cell (e.g., a human cell), a plant cell, or a fungal cell.


By integrating coding genes into an RNA sequence template, the gene modifying system can address therapeutic needs, for example, by providing expression of a therapeutic transgene in individuals with loss-of-function mutations, by replacing gain-of-function mutations with normal transgenes, by providing regulatory sequences to eliminate gain-of-function mutation expression, and/or by controlling the expression of operably linked genes, transgenes and systems thereof. In certain embodiments, the RNA sequence template encodes a promoter region specific to the therapeutic needs of the host cell, for example a tissue specific promoter or enhancer. In still other embodiments, a promoter can be operably linked to a coding sequence.


Accordingly, provided herein are methods for treating sickle cell disease (SCD) (e.g., sickle cell anemia) in a subject in need thereof. In some embodiments, treatment results in amelioration of one or more symptoms associated with SCD.


In some embodiments, a system herein is used to treat a subject having a mutation in E6 (e.g., E6V).


In some embodiments, treatment with a system disclosed herein results in correction of the E6V mutation in between about 60-70% (e.g., about 60-65% or about 65-70%) of cells. In some embodiments, treatment with a system disclosed herein results in correction of the E6V mutation in between about 60-70% (e.g., about 60-65% or about 65-70%) of DNA isolated from the treated cells.
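By way of non-limiting illustration, a correction frequency of the kind recited above could be computed from sequencing read counts of DNA isolated from treated cells, as sketched below. The function names, the read-count inputs, and the example numbers are hypothetical assumptions for this sketch and do not represent measured results or the assay used in this disclosure.

```python
def percent_corrected(corrected_reads: int, total_reads: int) -> float:
    """Percentage of reads at the target site that carry the corrected sequence."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= corrected_reads <= total_reads:
        raise ValueError("corrected_reads must be between 0 and total_reads")
    return 100.0 * corrected_reads / total_reads


def within_range(percent: float, low: float = 60.0, high: float = 70.0) -> bool:
    """True if `percent` falls inside the inclusive range [low, high]."""
    return low <= percent <= high


# Hypothetical example: 6,500 corrected reads out of 10,000 reads at the E6 codon.
example_pct = percent_corrected(6500, 10000)
print(f"{example_pct:.1f}% corrected; in 60-70% range: {within_range(example_pct)}")
```

The same arithmetic applies when the unit counted is cells rather than reads, provided the numerator and denominator are counted in the same unit.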


In some embodiments, treatment with a gene modifying system described herein results in one or more of:

    • (a) a reduction in the number of sickle-shaped cells;
    • (b) a reduction in production of an abnormal version of beta-globin (e.g., hemoglobin S);
    • (c) a reduction of pain and/or organ damage associated with sickle cell-related blood vessel blockage; and/or
    • (d) an increase in normal blood flow, as compared to a subject having SCD that has not been treated with a gene modifying system described herein.


Administration and Delivery

The compositions and systems described herein may be used in vitro or in vivo. In some embodiments the system or components of the system are delivered to cells (e.g., mammalian cells, e.g., human cells), e.g., in vitro or in vivo. In some embodiments, the cells are eukaryotic cells, e.g., cells of a multicellular organism, e.g., an animal, e.g., a mammal (e.g., human, swine, bovine), a bird (e.g., poultry, such as chicken, turkey, or duck), or a fish. In some embodiments, the cells are non-human animal cells (e.g., a laboratory animal, a livestock animal, or a companion animal). In some embodiments, the cell is a stem cell (e.g., a hematopoietic stem cell), a fibroblast, or a T cell. In some embodiments, the cell is an immune cell, e.g., a T cell (e.g., a Treg, CD4, CD8, γδ, or memory T cell), B cell (e.g., memory B cell or plasma cell), or NK cell. In some embodiments, the cell is a non-dividing cell, e.g., a non-dividing fibroblast or non-dividing T cell. In some embodiments, the cell is an HSC and p53 is not upregulated or is upregulated by less than 10%, 5%, 2%, or 1%, e.g., as determined according to the method described in Example 30 of PCT/US2019/048607. The skilled artisan will understand that the components of the gene modifying system may be delivered in the form of polypeptide, nucleic acid (e.g., DNA, RNA), and combinations thereof.


In one embodiment the system and/or components of the system are delivered as nucleic acid. For example, the gene modifying polypeptide may be delivered in the form of a DNA or RNA encoding the polypeptide, and the template RNA may be delivered in the form of RNA or its complementary DNA to be transcribed into RNA. In some embodiments the system or components of the system are delivered on 1, 2, 3, 4, or more distinct nucleic acid molecules. In some embodiments the system or components of the system are delivered as a combination of DNA and RNA. In some embodiments the system or components of the system are delivered as a combination of DNA and protein. In some embodiments the system or components of the system are delivered as a combination of RNA and protein. In some embodiments the gene modifying polypeptide is delivered as a protein.


In some embodiments the system or components of the system are delivered to cells, e.g., mammalian cells or human cells, using a vector. The vector may be, e.g., a plasmid or a virus. In some embodiments, delivery is in vivo, in vitro, ex vivo, or in situ. In some embodiments the virus is an adeno-associated virus (AAV), a lentivirus, or an adenovirus. In some embodiments the system or components of the system are delivered to cells with a virus-like particle or a virosome. In some embodiments the delivery uses more than one virus, virus-like particle, or virosome.


In one embodiment, the compositions and systems described herein can be formulated in liposomes or other similar vesicles. Liposomes are spherical vesicle structures composed of a uni- or multilamellar lipid bilayer surrounding internal aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes may be anionic, neutral or cationic. Liposomes are biocompatible, nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood brain barrier (BBB) (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi: 10.1155/2011/469679 for review).


Vesicles can be made from several different types of lipids; however, phospholipids are most commonly used to generate liposomes as drug carriers. Methods for preparation of multilamellar vesicle lipids are known in the art (see for example U.S. Pat. No. 6,693,086, the teachings of which relating to multilamellar vesicle lipid preparation are incorporated herein by reference). Although vesicle formation can be spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking by using a homogenizer, sonicator, or an extrusion apparatus (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi: 10.1155/2011/469679 for review). Extruded lipids can be prepared by extruding through filters of decreasing size, as described in Templeton et al., Nature Biotech, 15:647-652, 1997, the teachings of which relating to extruded lipid preparation are incorporated herein by reference.


A variety of nanoparticles can be used for delivery, such as a liposome, a lipid nanoparticle, a cationic lipid nanoparticle, an ionizable lipid nanoparticle, a polymeric nanoparticle, a gold nanoparticle, a dendrimer, a cyclodextrin nanoparticle, a micelle, or a combination of the foregoing.


Lipid nanoparticles are an example of a carrier that provides a biocompatible and biodegradable delivery system for the pharmaceutical compositions described herein. Nanostructured lipid carriers (NLCs) are modified solid lipid nanoparticles (SLNs) that retain the characteristics of the SLN, improve drug stability and loading capacity, and prevent drug leakage. Polymer nanoparticles (PNPs) are an important component of drug delivery. These nanoparticles can effectively direct drug delivery to specific targets and improve drug stability and controlled drug release. Lipid-polymer nanoparticles (PLNs), a type of carrier that combines liposomes and polymers, may also be employed. These nanoparticles possess the complementary advantages of PNPs and liposomes. A PLN is composed of a core-shell structure; the polymer core provides a stable structure, and the phospholipid shell offers good biocompatibility. As such, the two components increase the drug encapsulation efficiency rate, facilitate surface modification, and prevent leakage of water-soluble drugs. For a review, see, e.g., Li et al. 2017, Nanomaterials 7, 122; doi: 10.3390/nano7060122.


Exosomes can also be used as drug delivery vehicles for the compositions and systems described herein. For a review, see Ha et al. July 2016. Acta Pharmaceutica Sinica B. Volume 6, Issue 4, Pages 287-296; doi.org/10.1016/j.apsb.2016.02.001.


Fusosomes interact and fuse with target cells, and thus can be used as delivery vehicles for a variety of molecules. They generally consist of a bilayer of amphipathic lipids enclosing a lumen or cavity and a fusogen that interacts with the amphipathic lipid bilayer. The fusogen component has been shown to be engineerable in order to confer target cell specificity for the fusion and payload delivery, allowing the creation of delivery vehicles with programmable cell specificity (see for example Patent Application WO2020014209, the teachings of which relating to fusosome design, preparation, and usage are incorporated herein by reference).


In some embodiments, the protein component(s) of the gene modifying system may be pre-associated with the template nucleic acid (e.g., template RNA). For example, in some embodiments, the gene modifying polypeptide may be first combined with the template nucleic acid (e.g., template RNA) to form a ribonucleoprotein (RNP) complex. In some embodiments, the RNP may be delivered to cells via, e.g., transfection, nucleofection, virus, vesicle, LNP, exosome, or fusosome.


A gene modifying system can be introduced into cells, tissues and multicellular organisms. In some embodiments the system or components of the system are delivered to the cells via mechanical means or physical means.


Formulation of protein therapeutics is described in Meyer (Ed.), Therapeutic Protein Drug Products: Practical Approaches to Formulation in the Laboratory, Manufacturing, and the Clinic, Woodhead Publishing Series (2012).


Tissue Specific Activity/Administration

In some embodiments, a system described herein can make use of one or more features (e.g., a promoter or microRNA binding site) to limit activity in off-target cells or tissues.


In some embodiments, a nucleic acid described herein (e.g., a template RNA or a DNA encoding a template RNA) comprises a promoter sequence, e.g., a tissue specific promoter sequence. In some embodiments, the tissue-specific promoter is used to increase the target-cell specificity of a gene modifying system. For instance, the promoter can be chosen on the basis that it is active in a target cell type but not active in (or active at a lower level in) a non-target cell type. Thus, even if the promoter integrated into the genome of a non-target cell, it would not drive expression (or only drive low level expression) of an integrated gene. A system having a tissue-specific promoter sequence in the template RNA may also be used in combination with a microRNA binding site, e.g., in the template RNA or a nucleic acid encoding a gene modifying protein, e.g., as described herein. A system having a tissue-specific promoter sequence in the template RNA may also be used in combination with a DNA encoding a gene modifying polypeptide, driven by a tissue-specific promoter, e.g., to achieve higher levels of gene modifying protein in target cells than in non-target cells. In some embodiments, e.g., for liver indications, a tissue-specific promoter is selected from Table 3 of WO2020014209, incorporated herein by reference.


In some embodiments, a nucleic acid described herein (e.g., a template RNA or a DNA encoding a template RNA) comprises a microRNA binding site. In some embodiments, the microRNA binding site is used to increase the target-cell specificity of a gene modifying system. For instance, the microRNA binding site can be chosen on the basis that it is recognized by a miRNA that is present in a non-target cell type, but that is not present (or is present at a reduced level relative to the non-target cell) in a target cell type. Thus, when the template RNA is present in a non-target cell, it would be bound by the miRNA, and when the template RNA is present in a target cell, it would not be bound by the miRNA (or bound but at reduced levels relative to the non-target cell). While not wishing to be bound by theory, binding of the miRNA to the template RNA may interfere with its activity, e.g., may interfere with insertion of the heterologous object sequence into the genome. Accordingly, the system would edit the genome of target cells more efficiently than it edits the genome of non-target cells, e.g., the heterologous object sequence would be inserted into the genome of target cells more efficiently than into the genome of non-target cells, or an insertion or deletion is produced more efficiently in target cells than in non-target cells. A system having a microRNA binding site in the template RNA (or DNA encoding it) may also be used in combination with a nucleic acid encoding a gene modifying polypeptide, wherein expression of the gene modifying polypeptide is regulated by a second microRNA binding site, e.g., as described herein. In some embodiments, e.g., for liver indications, a miRNA is selected from Table 4 of WO2020014209, incorporated herein by reference.
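By way of non-limiting illustration, the selection criterion described above (a miRNA that is present in a non-target cell type and absent or reduced in the target cell type) could be applied to an abundance table as sketched below. The function name, the cutoffs `min_non_target` and `max_ratio`, the placeholder miRNA names, and the abundance values are all hypothetical assumptions for this sketch; none is taken from this disclosure or from WO2020014209.

```python
from typing import Dict, List


def candidate_mirnas(
    abundance: Dict[str, Dict[str, float]],
    target: str,
    non_target: str,
    min_non_target: float = 100.0,  # assumed cutoff for "present" in non-target cells
    max_ratio: float = 0.1,         # assumed cutoff for "reduced" in target cells
) -> List[str]:
    """Return miRNAs that are abundant in `non_target` cells and low in `target` cells.

    `abundance` maps miRNA name -> {cell type -> abundance in arbitrary units}.
    """
    picks = []
    for mirna, levels in abundance.items():
        off_level = levels.get(non_target, 0.0)
        on_level = levels.get(target, 0.0)
        if off_level >= min_non_target and on_level <= max_ratio * off_level:
            picks.append(mirna)
    return sorted(picks)


# Hypothetical abundance table with placeholder miRNA names.
example_abundance = {
    "miR-A": {"target": 5.0, "non_target": 500.0},    # low in target, high off-target
    "miR-B": {"target": 400.0, "non_target": 450.0},  # high in both cell types
    "miR-C": {"target": 0.0, "non_target": 20.0},     # too scarce off-target
}
print(candidate_mirnas(example_abundance, "target", "non_target"))
```

A binding site for a miRNA returned by such a filter would be expected to reduce activity mainly in the non-target cell type, consistent with the rationale stated above.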


In some embodiments, the template RNA comprises a microRNA sequence, an siRNA sequence, a guide RNA sequence, or a piwi RNA sequence.


Promoters

In some embodiments, one or more promoter or enhancer elements are operably linked to a nucleic acid encoding a gene modifying protein or a template nucleic acid, e.g., that controls expression of the heterologous object sequence. In certain embodiments, the one or more promoter or enhancer elements comprise cell-type or tissue specific elements. In some embodiments, the promoter or enhancer is the same or derived from the promoter or enhancer that naturally controls expression of the heterologous object sequence. For example, the ornithine transcarbamylase promoter and enhancer may be used to control expression of the ornithine transcarbamylase gene in a system or method provided by the invention for correcting ornithine transcarbamylase deficiencies. In some embodiments, the promoter is a promoter of Table 16 or 17 or a functional fragment or variant thereof.


Exemplary tissue specific promoters that are commercially available can be found, for example, at a uniform resource locator (e.g., invivogen.com/tissue-specific-promoters). In some embodiments, a promoter is a native promoter or a minimal promoter, e.g., which consists of a single fragment from the 5′ region of a given gene. In some embodiments, a native promoter comprises a core promoter and its natural 5′ UTR. In some embodiments, the 5′ UTR comprises an intron. In other embodiments, these include composite promoters, which combine promoter elements of different origins or were generated by assembling a distal enhancer with a minimal promoter of the same origin.


Exemplary cell or tissue specific promoters are provided in the tables below, and exemplary nucleic acid sequences encoding them are known in the art and can be readily accessed using a variety of resources, such as the NCBI database, including RefSeq, as well as the Eukaryotic Promoter Database (//epd.epfl.ch//index.php).









TABLE 16
Exemplary cell or tissue-specific promoters

Promoter | Target cells
B29 Promoter | B cells
CD14 Promoter | Monocytic Cells
CD43 Promoter | Leukocytes and platelets
CD45 Promoter | Hematopoietic cells
CD68 promoter | macrophages
Desmin promoter | muscle cells
Elastase-1 promoter | pancreatic acinar cells
Endoglin promoter | endothelial cells
fibronectin promoter | differentiating cells, healing tissue
Flt-1 promoter | endothelial cells
GFAP promoter | Astrocytes
GPIIB promoter | megakaryocytes
ICAM-2 Promoter | Endothelial cells
INF-Beta promoter | Hematopoietic cells
Mb promoter | muscle cells
Nphs1 promoter | podocytes
OG-2 promoter | Osteoblasts, Odontoblasts
SP-B promoter | Lung
Syn1 promoter | Neurons
WASP promoter | Hematopoietic cells
SV40/bAlb promoter | Liver
SV40/bAlb promoter | Liver
SV40/Cd3 promoter | Leukocytes and platelets
SV40/CD45 promoter | hematopoietic cells
NSE/RU5′ promoter | Mature Neurons



















TABLE 17
Additional exemplary cell or tissue-specific promoters

Promoter | Gene Description | Gene Specificity
APOA2 | Apolipoprotein A-II | Hepatocytes (from hepatocyte progenitors)
SERPINA1 (hAAT) | Serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 (also named alpha 1 anti-trypsin) | Hepatocytes (from definitive endoderm stage)
CYP3A | Cytochrome P450, family 3, subfamily A, polypeptide | Mature Hepatocytes
MIR122 | MicroRNA 122 | Hepatocytes (from early stage embryonic liver cells) and endoderm

Pancreatic specific promoters
INS | Insulin | Pancreatic beta cells (from definitive endoderm stage)
IRS2 | Insulin receptor substrate 2 | Pancreatic beta cells
Pdx1 | Pancreatic and duodenal homeobox 1 | Pancreas (from definitive endoderm stage)
Alx3 | Aristaless-like homeobox 3 | Pancreatic beta cells (from definitive endoderm stage)
Ppy | Pancreatic polypeptide | PP pancreatic cells (gamma cells)

Cardiac specific promoters
Myh6 (αMHC) | Myosin, heavy chain 6, cardiac muscle, alpha | Late differentiation marker of cardiac muscle cells (atrial specificity)
MYL2 (MLC-2v) | Myosin, light chain 2, regulatory, cardiac, slow | Late differentiation marker of cardiac muscle cells (ventricular specificity)
TNNI3 (cTnI) | Troponin I type 3 (cardiac) | Cardiomyocytes (from immature state)
TNNI3 (cTnI) | Troponin I type 3 (cardiac) | Cardiomyocytes (from immature state)
NPPA (ANF) | Natriuretic peptide precursor A (also named Atrial Natriuretic Factor) | Atrial specificity in adult cells
Slc8a1 (Ncx1) | Solute carrier family 8 (sodium/calcium exchanger), member 1 | Cardiomyocytes from early developmental stages

CNS specific promoters
SYN1 (hSyn) | Synapsin I | Neurons
GFAP | Glial fibrillary acidic protein | Astrocytes
INA | Internexin neuronal intermediate filament protein, alpha (α-internexin) | Neuroprogenitors
NES | Nestin | Neuroprogenitors and ectoderm
MOBP | Myelin-associated oligodendrocyte basic protein | Oligodendrocytes
MBP | Myelin basic protein | Oligodendrocytes
TH | Tyrosine hydroxylase | Dopaminergic neurons
FOXA2 (HNF3 beta) | Forkhead box A2 | Dopaminergic neurons (also used as a marker of endoderm)

Skin specific promoters
FLG | Filaggrin | Keratinocytes from granular layer
K14 | Keratin 14 | Keratinocytes from granular and basal layers
TGM3 | Transglutaminase 3 | Keratinocytes from granular layer

Immune cell specific promoters
ITGAM (CD11B) | Integrin, alpha M (complement component 3 receptor 3 subunit) | Monocytes, macrophages, granulocytes, natural killer cells

Urogenital cell specific promoters
Pbsn | Probasin | Prostatic epithelium
Upk2 | Uroplakin 2 | Bladder
Sbp | Spermine binding protein | Prostate
Fer1l4 | Fer-1-like 4 | Bladder

Endothelial cell specific promoters
ENG | Endoglin | Endothelial cells

Pluripotent and embryonic cell specific promoters
Oct4 (POU5F1) | POU class 5 homeobox 1 | Pluripotent cells (germ cells, ES cells, iPS cells)
NANOG | Nanog homeobox | Pluripotent cells (ES cells, iPS cells)
Synthetic Oct4 | Synthetic promoter based on an Oct-4 core enhancer element | Pluripotent cells (ES cells, iPS cells)
T brachyury | Brachyury | Mesoderm
NES | Nestin | Neuroprogenitors and Ectoderm
SOX17 | SRY (sex determining region Y)-box 17 | Endoderm
FOXA2 (HNF3 beta) | Forkhead box A2 | Endoderm (also used as a marker of dopaminergic neurons)
MIR122 | MicroRNA 122 | Endoderm and hepatocytes (from early stage embryonic liver cells)









Depending on the host/vector system utilized, any of a number of suitable transcription and translation control elements, including constitutive and inducible promoters, transcription enhancer elements, transcription terminators, etc. may be used in the expression vector (see, e.g., Bitter et al. (1987) Methods in Enzymology, 153:516-544; incorporated herein by reference in its entirety).


In some embodiments, a nucleic acid encoding a gene modifying protein or template nucleic acid is operably linked to a control element, e.g., a transcriptional control element, such as a promoter. The transcriptional control element may, in some embodiments, be functional in either a eukaryotic cell, e.g., a mammalian cell; or a prokaryotic cell (e.g., bacterial or archaeal cell). In some embodiments, a nucleotide sequence encoding a polypeptide is operably linked to multiple control elements, e.g., that allow expression of the nucleotide sequence encoding the polypeptide in both prokaryotic and eukaryotic cells.


For illustration purposes, examples of spatially restricted promoters include, but are not limited to, neuron-specific promoters, adipocyte-specific promoters, cardiomyocyte-specific promoters, smooth muscle-specific promoters, photoreceptor-specific promoters, etc. Neuron-specific spatially restricted promoters include, but are not limited to, a neuron-specific enolase (NSE) promoter (see, e.g., EMBL HSENO2, X51956); an aromatic amino acid decarboxylase (AADC) promoter; a neurofilament promoter (see, e.g., GenBank HUMNFL, L04147); a synapsin promoter (see, e.g., GenBank HUMSYNIB, M55301); a thy-1 promoter (see, e.g., Chen et al. (1987) Cell 51:7-19; and Llewellyn et al. (2010) Nat. Med. 16(10):1161-1166); a serotonin receptor promoter (see, e.g., GenBank S62283); a tyrosine hydroxylase (TH) promoter (see, e.g., Oh et al. (2009) Gene Ther. 16:437; Sasaoka et al. (1992) Mol. Brain Res. 16:274; Boundy et al. (1998) J. Neurosci. 18:9989; and Kaneda et al. (1991) Neuron 6:583-594); a GnRH promoter (see, e.g., Radovick et al. (1991) Proc. Natl. Acad. Sci. USA 88:3402-3406); an L7 promoter (see, e.g., Oberdick et al. (1990) Science 248:223-226); a DNMT promoter (see, e.g., Bartge et al. (1988) Proc. Natl. Acad. Sci. USA 85:3648-3652); an enkephalin promoter (see, e.g., Comb et al. (1988) EMBO J. 17:3793-3805); a myelin basic protein (MBP) promoter; a Ca2+-calmodulin-dependent protein kinase II-alpha (CaMKIIα) promoter (see, e.g., Mayford et al. (1996) Proc. Natl. Acad. Sci. USA 93:13250; and Casanova et al. (2001) Genesis 31:37); a CMV enhancer/platelet-derived growth factor-β promoter (see, e.g., Liu et al. (2004) Gene Therapy 11:52-60); and the like.


Adipocyte-specific spatially restricted promoters include, but are not limited to, the aP2 gene promoter/enhancer, e.g., a region from −5.4 kb to +21 bp of a human aP2 gene (see, e.g., Tozzo et al. (1997) Endocrinol. 138:1604; Ross et al. (1990) Proc. Natl. Acad. Sci. USA 87:9590; and Pavjani et al. (2005) Nat. Med. 11:797); a glucose transporter-4 (GLUT4) promoter (see, e.g., Knight et al. (2003) Proc. Natl. Acad. Sci. USA 100:14725); a fatty acid translocase (FAT/CD36) promoter (see, e.g., Kuriki et al. (2002) Biol. Pharm. Bull. 25:1476; and Sato et al. (2002) J. Biol. Chem. 277:15703); a stearoyl-CoA desaturase-1 (SCD1) promoter (Tabor et al. (1999) J. Biol. Chem. 274:20603); a leptin promoter (see, e.g., Mason et al. (1998) Endocrinol. 139:1013; and Chen et al. (1999) Biochem. Biophys. Res. Comm. 262:187); an adiponectin promoter (see, e.g., Kita et al. (2005) Biochem. Biophys. Res. Comm. 331:484; and Chakrabarti (2010) Endocrinol. 151:2408); an adipsin promoter (see, e.g., Platt et al. (1989) Proc. Natl. Acad. Sci. USA 86:7490); a resistin promoter (see, e.g., Seo et al. (2003) Molec. Endocrinol. 17:1522); and the like.


Cardiomyocyte-specific spatially restricted promoters include, but are not limited to, control sequences derived from the following genes: myosin light chain-2, α-myosin heavy chain, AE3, cardiac troponin C, cardiac actin, and the like. See, e.g., Franz et al. (1997) Cardiovasc. Res. 35:560-566; Robbins et al. (1995) Ann. N.Y. Acad. Sci. 752:492-505; Linn et al. (1995) Circ. Res. 76:584-591; Parmacek et al. (1994) Mol. Cell. Biol. 14:1870-1885; Hunter et al. (1993) Hypertension 22:608-617; and Sartorelli et al. (1992) Proc. Natl. Acad. Sci. USA 89:4047-4051.


Smooth muscle-specific spatially restricted promoters include, but are not limited to, an SM22α promoter (see, e.g., Akyürek et al. (2000) Mol. Med. 6:983; and U.S. Pat. No. 7,169,874); a smoothelin promoter (see, e.g., WO 2001/018048); an α-smooth muscle actin promoter; and the like. For example, a 0.4 kb region of the SM22α promoter, within which lie two CArG elements, has been shown to mediate vascular smooth muscle cell-specific expression (see, e.g., Kim, et al. (1997) Mol. Cell. Biol. 17, 2266-2278; Li, et al., (1996) J. Cell Biol. 132, 849-859; and Moessler, et al. (1996) Development 122, 2415-2425).


Photoreceptor-specific spatially restricted promoters include, but are not limited to, a rhodopsin promoter; a rhodopsin kinase promoter (Young et al. (2003) Ophthalmol. Vis. Sci. 44:4076); a beta phosphodiesterase gene promoter (Nicoud et al. (2007) J. Gene Med. 9:1015); a retinitis pigmentosa gene promoter (Nicoud et al. (2007) supra); an interphotoreceptor retinoid-binding protein (IRBP) gene enhancer (Nicoud et al. (2007) supra); an IRBP gene promoter (Yokoyama et al. (1992) Exp. Eye Res. 55:225); and the like.


In some embodiments, a gene modifying system, e.g., DNA encoding a gene modifying polypeptide, DNA encoding a template RNA, or DNA or RNA encoding a heterologous object sequence, is designed such that one or more elements is operably linked to a tissue-specific promoter, e.g., a promoter that is active in T-cells. In further embodiments, the T-cell active promoter is inactive in other cell types, e.g., B-cells, NK cells. In some embodiments, the T-cell active promoter is derived from a promoter for a gene encoding a component of the T-cell receptor, e.g., TRAC, TRBC, TRGC, TRDC. In some embodiments, the T-cell active promoter is derived from a promoter for a gene encoding a component of a T-cell-specific cluster of differentiation protein, e.g., CD3, e.g., CD3D, CD3E, CD3G, CD3Z. In some embodiments, T-cell-specific promoters in gene modifying systems are discovered by comparing publicly available gene expression data across cell types and selecting promoters from the genes with enhanced expression in T-cells. In some embodiments, promoters may be selected depending on the desired expression breadth, e.g., promoters that are active in T-cells only, promoters that are active in NK cells only, promoters that are active in both T-cells and NK cells.
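By way of non-limiting illustration, the comparison of gene expression data across cell types described above could be sketched as a simple enrichment filter. The function name, the fold-change cutoff, the pseudocount, the placeholder gene names, and the expression values are hypothetical assumptions for this sketch; they are not drawn from any real dataset or from this disclosure.

```python
from typing import Dict, List, Tuple


def enriched_genes(
    expression: Dict[str, Dict[str, float]],
    wanted: List[str],
    unwanted: List[str],
    min_fold: float = 10.0,    # assumed enrichment cutoff
    pseudocount: float = 1.0,  # avoids division by zero for unexpressed genes
) -> List[Tuple[str, float]]:
    """Rank genes whose lowest expression across `wanted` cell types exceeds
    their highest expression across `unwanted` cell types by `min_fold`.

    `expression` maps gene -> {cell type -> expression in arbitrary units}.
    """
    hits = []
    for gene, levels in expression.items():
        on_level = min(levels[cell] for cell in wanted)
        off_level = max(levels[cell] for cell in unwanted)
        fold = (on_level + pseudocount) / (off_level + pseudocount)
        if fold >= min_fold:
            hits.append((gene, fold))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical expression table with placeholder gene names.
example_expression = {
    "GENE_1": {"T": 99.0, "NK": 0.0, "B": 1.0},    # T-cell restricted
    "GENE_2": {"T": 49.0, "NK": 49.0, "B": 0.0},   # shared by T and NK cells
    "GENE_3": {"T": 10.0, "NK": 10.0, "B": 10.0},  # broadly expressed
}

t_only = enriched_genes(example_expression, wanted=["T"], unwanted=["NK", "B"])
t_and_nk = enriched_genes(example_expression, wanted=["T", "NK"], unwanted=["B"])
print(t_only, t_and_nk)
```

Changing the `wanted` and `unwanted` lists corresponds to the different expression breadths mentioned above (T-cells only, NK cells only, or both); promoters of the returned genes would then be candidates for further evaluation.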


Cell-specific promoters known in the art may be used to direct expression of a gene modifying protein, e.g., as described herein. Nonlimiting exemplary mammalian cell-specific promoters have been characterized and used in mice expressing Cre recombinase in a cell-specific manner. Certain nonlimiting exemplary mammalian cell-specific promoters are listed in Table 1 of U.S. Pat. No. 9,845,481, incorporated herein by reference.


In some embodiments, a vector as described herein comprises an expression cassette. Typically, an expression cassette comprises the nucleic acid molecule of the instant invention operatively linked to a promoter sequence. For example, a promoter is operatively linked with a coding sequence when it is capable of affecting the expression of that coding sequence (e.g., the coding sequence is under the transcriptional control of the promoter). Encoding sequences can be operatively linked to regulatory sequences in sense or antisense orientation. In certain embodiments, the promoter is a heterologous promoter. In certain embodiments, an expression cassette may comprise additional elements, for example, an intron, an enhancer, a polyadenylation site, a woodchuck response element (WRE), and/or other elements known to affect expression levels of the encoding sequence. A promoter typically controls the expression of a coding sequence or functional RNA. In certain embodiments, a promoter sequence comprises proximal and more distal upstream elements and can further comprise an enhancer element. An enhancer can typically stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. In certain embodiments, the promoter is derived in its entirety from a native gene. In certain embodiments, the promoter is composed of different elements derived from different naturally occurring promoters. In certain embodiments, the promoter comprises a synthetic nucleotide sequence.
It will be understood by those skilled in the art that different promoters will direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions or to the presence or the absence of a drug or transcriptional co-factor. Ubiquitous, cell-type-specific, tissue-specific, developmental stage-specific, and conditional promoters, for example, drug-responsive promoters (e.g., tetracycline-responsive promoters), are well known to those of skill in the art. Exemplary promoters include, but are not limited to, the phosphoglycerate kinase (PGK) promoter, CAG (a composite of the CMV enhancer, the chicken beta actin promoter (CBA), and the rabbit beta globin intron), NSE (neuronal specific enolase), synapsin or NeuN promoters, the SV40 early promoter, the mouse mammary tumor virus LTR promoter, the adenovirus major late promoter (Ad MLP), a herpes simplex virus (HSV) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter region (CMVIE), the SFFV promoter, the Rous sarcoma virus (RSV) promoter, synthetic promoters, hybrid promoters, and the like. Other promoters can be of human origin or from other species, including from mice. Common promoters include, e.g., the human cytomegalovirus (CMV) immediate early gene promoter, the SV40 early promoter, the Rous sarcoma virus long terminal repeat, the beta-actin promoter, the rat insulin promoter, the phosphoglycerate kinase promoter, the human alpha-1 antitrypsin (hAAT) promoter, the transthyretin promoter, the TBG promoter and other liver-specific promoters, the desmin promoter and similar muscle-specific promoters, the EF1-alpha promoter, hybrid promoters with multi-tissue specificity, promoters specific for neurons like synapsin, and the glyceraldehyde-3-phosphate dehydrogenase promoter. All of these promoters are well known and readily available to those of skill in the art and can be used to obtain high-level expression of the coding sequence of interest.
In addition, sequences derived from non-viral genes, such as the murine metallothionein gene, will also find use herein. Such promoter sequences are commercially available from, e.g., Stratagene (San Diego, CA). Additional exemplary promoter sequences are described, for example, in WO2018213786A1 (incorporated by reference herein in its entirety).


In some embodiments, the apolipoprotein E enhancer (ApoE) or a functional fragment thereof is used, e.g., to drive expression in the liver. In some embodiments, two copies of the ApoE enhancer or a functional fragment thereof are used. In some embodiments, the ApoE enhancer or functional fragment thereof is used in combination with a promoter, e.g., the human alpha-1 antitrypsin (hAAT) promoter.


In some embodiments, the regulatory sequences impart tissue-specific gene expression capabilities. In some cases, the tissue-specific regulatory sequences bind tissue-specific transcription factors that induce transcription in a tissue-specific manner. Various tissue-specific regulatory sequences (e.g., promoters, enhancers, etc.) are known in the art. Exemplary tissue-specific regulatory sequences include, but are not limited to, the following tissue-specific promoters: a liver-specific thyroxin binding globulin (TBG) promoter, an insulin promoter, a glucagon promoter, a somatostatin promoter, a pancreatic polypeptide (PPY) promoter, a synapsin-1 (Syn) promoter, a creatine kinase (MCK) promoter, a mammalian desmin (DES) promoter, an α-myosin heavy chain (α-MHC) promoter, or a cardiac Troponin T (cTnT) promoter. Other exemplary promoters include Beta-actin promoter, hepatitis B virus core promoter (Sandig et al., Gene Ther., 3:1002-9 (1996)); alpha-fetoprotein (AFP) promoter (Arbuthnot et al., Hum. Gene Ther., 7:1503-14 (1996)), bone osteocalcin promoter (Stein et al., Mol. Biol. Rep., 24:185-96 (1997)); bone sialoprotein promoter (Chen et al., J. Bone Miner. Res., 11:654-64 (1996)), CD2 promoter (Hansal et al., J. Immunol., 161:1063-8 (1998)); immunoglobulin heavy chain promoter; T cell receptor α-chain promoter, neuronal promoters such as the neuron-specific enolase (NSE) promoter (Andersen et al., Cell. Mol. Neurobiol., 13:503-15 (1993)), neurofilament light-chain gene promoter (Piccioli et al., Proc. Natl. Acad. Sci. USA, 88:5611-5 (1991)), and the neuron-specific vgf gene promoter (Piccioli et al., Neuron, 15:373-84 (1995)), and others. Additional exemplary promoter sequences are described, for example, in U.S. Pat. No. 10,300,146 (incorporated herein by reference in its entirety). 
In some embodiments, a tissue-specific regulatory element, e.g., a tissue-specific promoter, is selected from one known to be operably linked to a gene that is highly expressed in a given tissue, e.g., as measured by RNA-seq or protein expression data, or a combination thereof. Methods for analyzing tissue specificity by expression are taught in Fagerberg et al. Mol Cell Proteomics 13(2): 397-406 (2014), which is incorporated herein by reference in its entirety.


In some embodiments, a vector described herein is a multicistronic expression construct. Multicistronic expression constructs include, for example, constructs harboring a first expression cassette, e.g., comprising a first promoter and a first encoding nucleic acid sequence, and a second expression cassette, e.g., comprising a second promoter and a second encoding nucleic acid sequence. Such multicistronic expression constructs may, in some instances, be particularly useful in the delivery of non-translated gene products, such as hairpin RNAs, together with a polypeptide, for example, a gene modifying polypeptide and gene modifying template. In some embodiments, multicistronic expression constructs may exhibit reduced expression levels of one or more of the included transgenes, for example, because of promoter interference or the presence of incompatible nucleic acid elements in close proximity. If a multicistronic expression construct is part of a viral vector, the presence of a self-complementary nucleic acid sequence may, in some instances, interfere with the formation of structures necessary for viral reproduction or packaging.


In some embodiments, the sequence encodes an RNA with a hairpin. In some embodiments, the hairpin RNA is a guide RNA, a template RNA, a shRNA, or a microRNA. In some embodiments, the first promoter is an RNA polymerase I promoter. In some embodiments, the first promoter is an RNA polymerase II promoter. In some embodiments, the second promoter is an RNA polymerase III promoter. In some embodiments, the second promoter is a U6 or H1 promoter.


Without wishing to be bound by theory, multicistronic expression constructs may not achieve optimal expression levels as compared to expression systems containing only one cistron. One of the suggested causes of lower expression levels achieved with multicistronic expression constructs comprising two or more promoter elements is the phenomenon of promoter interference (see, e.g., Curtin J A, Dane A P, Swanson A, Alexander I E, Ginn S L. Bidirectional promoter interference between two widely used internal heterologous promoters in a late-generation lentiviral construct. Gene Ther. 2008 March; 15(5):384-90; and Martin-Duque P, Jezzard S, Kaftansis L, Vassaux G. Direct comparison of the insulating properties of two genetic elements in an adenoviral vector containing two different expression cassettes. Hum Gene Ther. 2004 October; 15(10):995-1002; both references incorporated herein by reference for disclosure of promoter interference phenomenon). In some embodiments, the problem of promoter interference may be overcome, e.g., by producing multicistronic expression constructs comprising only one promoter driving transcription of multiple encoding nucleic acid sequences separated by internal ribosomal entry sites, or by separating cistrons comprising their own promoter with transcriptional insulator elements. In some embodiments, single-promoter driven expression of multiple cistrons may result in uneven expression levels of the cistrons. In some embodiments, a promoter cannot efficiently be isolated and isolation elements may not be compatible with some gene transfer vectors, for example, some retroviral vectors.


MicroRNAs

MicroRNAs (miRNAs) and other small interfering nucleic acids generally regulate gene expression via target RNA transcript cleavage/degradation or translational repression of the target messenger RNA (mRNA). miRNAs may, in some instances, be natively expressed, typically as final 19-25 nucleotide non-translated RNA products. miRNAs generally exhibit their activity through sequence-specific interactions with the 3′ untranslated regions (UTR) of target mRNAs. These endogenously expressed miRNAs may form hairpin precursors that are subsequently processed into an miRNA duplex, and further into a mature single stranded miRNA molecule. This mature miRNA generally guides a multiprotein complex, miRISC, which identifies target 3′ UTR regions of target mRNAs based upon their complementarity to the mature miRNA. Useful transgene products may include, for example, miRNAs or miRNA binding sites that regulate the expression of a linked polypeptide. A non-limiting list of miRNA genes, the products of which, and their homologues, are useful as transgenes or as targets for small interfering nucleic acids (e.g., miRNA sponges, antisense oligonucleotides), e.g., in methods such as those listed in U.S. Pat. No. 10,300,146, 22:25-25:48, is herein incorporated by reference. In some embodiments, one or more binding sites for one or more of the foregoing miRNAs are incorporated in a transgene, e.g., a transgene delivered by a rAAV vector, e.g., to inhibit the expression of the transgene in one or more tissues of an animal harboring the transgene. In some embodiments, a binding site may be selected to control the expression of a transgene in a tissue-specific manner. For example, binding sites for the liver-specific miR-122 may be incorporated into a transgene to inhibit expression of that transgene in the liver. Additional exemplary miRNA sequences are described, for example, in U.S. Pat. No. 10,300,146 (incorporated herein by reference in its entirety).
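By way of illustration only, the binding-site approach described above can be sketched in Python. A fully complementary binding site is taken here as the reverse complement of the mature miRNA. The function names, the number of copies, and the spacer are hypothetical choices, and the miR-122-5p sequence is the commonly reported human sequence, included as an assumption that should be verified against a curated miRNA database before use.

```python
# Minimal sketch; helper names, copy number, and spacer are hypothetical.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def binding_site_cassette(mirna: str, copies: int = 3, spacer: str = "AUCG") -> str:
    """Concatenate `copies` fully complementary sites, separated by a spacer."""
    site = reverse_complement_rna(mirna)
    return spacer.join([site] * copies)


# Assumed mature hsa-miR-122-5p sequence (verify before use).
MIR_122_5P = "UGGAGUGUGACAAUGGUGUUUG"

site = reverse_complement_rna(MIR_122_5P)
cassette = binding_site_cassette(MIR_122_5P, copies=3)
print(site)
print(len(cassette))
```

In this sketch the cassette would be placed in the 3′ UTR of the transgene transcript; the choice of three copies is arbitrary and not taken from the disclosure.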


An miR inhibitor or miRNA inhibitor is generally an agent that blocks miRNA expression and/or processing. Examples of such agents include, but are not limited to, microRNA antagonists, microRNA specific antisense, microRNA sponges, and microRNA oligonucleotides (double-stranded, hairpin, short oligonucleotides) that inhibit miRNA interaction with a Drosha complex. MicroRNA inhibitors, e.g., miRNA sponges, can be expressed in cells from transgenes (e.g., as described in Ebert, M. S. Nature Methods, Epub Aug. 12, 2007; incorporated by reference herein in its entirety). In some embodiments, microRNA sponges, or other miR inhibitors, are used with the AAVs. microRNA sponges generally specifically inhibit miRNAs through a complementary heptameric seed sequence. In some embodiments, an entire family of miRNAs can be silenced using a single sponge sequence. Other methods for silencing miRNA function (derepression of miRNA targets) in cells will be apparent to one of ordinary skill in the art.
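As a rough sketch of the seed-based sponge concept above, the following Python example extracts a heptameric seed and groups miRNAs that share it, so that one seed-complementary site could in principle address a whole family. It assumes the common convention that the seed spans nucleotides 2-8 of the mature miRNA. The function names are hypothetical, and the second sequence is an invented placeholder used only to demonstrate grouping.

```python
from collections import defaultdict

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mirna: str) -> str:
    """Heptameric seed, assumed here to be nucleotides 2-8 of the mature miRNA."""
    return mirna.upper()[1:8]


def seed_match(mirna: str) -> str:
    """Sequence complementary to the seed, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(seed(mirna)))


def group_by_seed(mirnas: dict) -> dict:
    """Map each seed to the names of the miRNAs that carry it."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[seed(sequence)].append(name)
    return dict(families)


candidates = {
    "miR-122-5p (assumed sequence)": "UGGAGUGUGACAAUGGUGUUUG",
    "hypothetical-family-member": "AGGAGUGUCCAAGGUUCCAAGG",  # invented placeholder
}
families = group_by_seed(candidates)
print(families)
print(seed_match(candidates["miR-122-5p (assumed sequence)"]))
```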


In some embodiments, a gene modifying system, template RNA, or polypeptide described herein is administered to or is active in (e.g., is more active in) a target tissue, e.g., a first tissue. In some embodiments, the gene modifying system, template RNA, or polypeptide is not administered to or is less active in (e.g., not active in) a non-target tissue. In some embodiments, a gene modifying system, template RNA, or polypeptide described herein is useful for modifying DNA in a target tissue, e.g., a first tissue, (e.g., and not modifying DNA in a non-target tissue).


In some embodiments, a gene modifying system comprises (a) a polypeptide described herein or a nucleic acid encoding the same, (b) a template nucleic acid (e.g., template RNA) described herein, and (c) one or more first tissue-specific expression-control sequences specific to the target tissue, wherein the one or more first tissue-specific expression-control sequences specific to the target tissue are in operative association with (a), (b), or (a) and (b), wherein, when associated with (a), (a) comprises a nucleic acid encoding the polypeptide.


In some embodiments, the nucleic acid in (b) comprises RNA.


In some embodiments, the nucleic acid in (b) comprises DNA.


In some embodiments, the nucleic acid in (b): (i) is single-stranded or comprises a single-stranded segment, e.g., is single-stranded DNA or comprises a single-stranded segment and one or more double stranded segments; (ii) has inverted terminal repeats; or (iii) both (i) and (ii).


In some embodiments, the nucleic acid in (b) is double-stranded or comprises a double-stranded segment.


In some embodiments, (a) comprises a nucleic acid encoding the polypeptide.


In some embodiments, the nucleic acid in (a) comprises RNA.


In some embodiments, the nucleic acid in (a) comprises DNA.


In some embodiments, the nucleic acid in (a): (i) is single-stranded or comprises a single-stranded segment, e.g., is single-stranded DNA or comprises a single-stranded segment and one or more double stranded segments; (ii) has inverted terminal repeats; or (iii) both (i) and (ii).


In some embodiments, the nucleic acid in (a) is double-stranded or comprises a double-stranded segment.


In some embodiments, the nucleic acid in (a), (b), or (a) and (b) is linear.


In some embodiments, the nucleic acid in (a), (b), or (a) and (b) is circular, e.g., a plasmid or minicircle.


In some embodiments, the heterologous object sequence is in operative association with a first promoter.


In some embodiments, the one or more first tissue-specific expression-control sequences comprises a tissue specific promoter.


In some embodiments, the tissue-specific promoter comprises a first promoter in operative association with: (i) the heterologous object sequence, (ii) a nucleic acid encoding the retroviral RT, or (iii) (i) and (ii).


In some embodiments, the one or more first tissue-specific expression-control sequences comprises a tissue-specific microRNA recognition sequence in operative association with: (i) the heterologous object sequence, (ii) a nucleic acid encoding the retroviral RT domain, or (iii) (i) and (ii).


In some embodiments, a system comprises a tissue-specific promoter, and the system further comprises one or more tissue-specific microRNA recognition sequences, wherein: (i) the tissue specific promoter is in operative association with: (I) the heterologous object sequence, (II) a nucleic acid encoding the retroviral RT domain, or (III) (I) and (II); and/or (ii) the one or more tissue-specific microRNA recognition sequences are in operative association with: (I) the heterologous object sequence, (II) a nucleic acid encoding the retroviral RT, or (III) (I) and (II).
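The combined use of a tissue-specific promoter and tissue-specific microRNA recognition sequences can be pictured as a simple logical rule: a cassette is expected to be expressed in a tissue when its promoter is active there and no miRNA present in that tissue matches one of its recognition sequences. The Python sketch below illustrates that rule only. The class, the function, and the tissue annotations are hypothetical simplifications; real expression is graded rather than on/off.

```python
from dataclasses import dataclass

# Illustrative annotation only: miR-122 is described above as liver-specific.
MIRNAS_BY_TISSUE = {"liver": {"miR-122"}, "muscle": set(), "neuron": set()}


@dataclass(frozen=True)
class Cassette:
    promoter_active_in: frozenset        # tissues in which the promoter is active
    mirna_sites: frozenset = frozenset() # miRNAs with recognition sequences in the transcript


def is_expressed(cassette: Cassette, tissue: str) -> bool:
    """Promoter must be active and no tissue miRNA may match a recognition site."""
    promoter_on = tissue in cassette.promoter_active_in
    silenced = bool(cassette.mirna_sites & MIRNAS_BY_TISSUE.get(tissue, set()))
    return promoter_on and not silenced


all_tissues = frozenset(MIRNAS_BY_TISSUE)
# Ubiquitous promoter de-targeted from the liver with miR-122 sites.
detargeted = Cassette(promoter_active_in=all_tissues, mirna_sites=frozenset({"miR-122"}))
# Liver-specific promoter (e.g., TBG) without miRNA sites.
liver_only = Cassette(promoter_active_in=frozenset({"liver"}))

for tissue in sorted(all_tissues):
    print(tissue, is_expressed(detargeted, tissue), is_expressed(liver_only, tissue))
```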


In some embodiments, wherein (a) comprises a nucleic acid encoding the polypeptide, the nucleic acid comprises a promoter in operative association with the nucleic acid encoding the polypeptide.


In some embodiments, the nucleic acid encoding the polypeptide comprises one or more second tissue-specific expression-control sequences specific to the target tissue in operative association with the polypeptide coding sequence.


In some embodiments, the one or more second tissue-specific expression-control sequences comprises a tissue specific promoter.


In some embodiments, the tissue-specific promoter is the promoter in operative association with the nucleic acid encoding the polypeptide.


In some embodiments, the one or more second tissue-specific expression-control sequences comprises a tissue-specific microRNA recognition sequence.


In some embodiments, the promoter in operative association with the nucleic acid encoding the polypeptide is a tissue-specific promoter, the system further comprising one or more tissue-specific microRNA recognition sequences.


In some embodiments, a nucleic acid component of a system provided by the invention is a sequence (e.g., encoding the polypeptide or comprising a heterologous object sequence) flanked by untranslated regions (UTRs) that modify protein expression levels. Various 5′ and 3′ UTRs can affect protein expression. For example, in some embodiments, the coding sequence may be preceded by a 5′ UTR that modifies RNA stability or protein translation. In some embodiments, the sequence may be followed by a 3′ UTR that modifies RNA stability or translation. In some embodiments, the sequence may be preceded by a 5′ UTR and followed by a 3′ UTR that modify RNA stability or translation. In some embodiments, the 5′ and/or 3′ UTR may be selected from the 5′ and 3′ UTRs of complement factor 3 (C3) (CACTCCTCCCCATCCTCTCCCTCTGTCCCTCTGTCCCTCTGACCCTGCACTGTCCCAGCACC; SEQ ID NO: 11,004) or orosomucoid 1 (ORM1) (CAGGACACAGCCTTGGATCAGGACAGAGACTTGGGGGCCATCCTGCCCCTCCAACCCGACATGTGTACCTCAGCTTTTTCCCTCACTTGCATCAATAAAGCTTCTGTGTTTGGAACAGCTAA; SEQ ID NO: 11,005) (Asrani et al. RNA Biology 2018). In certain embodiments, the 5′ UTR is the 5′ UTR from C3 and the 3′ UTR is the 3′ UTR from ORM1. In certain embodiments, a 5′ UTR and 3′ UTR for protein expression, e.g., mRNA (or DNA encoding the RNA) for a gene modifying polypeptide or heterologous object sequence, comprise optimized expression sequences. In some embodiments, the 5′ UTR comprises GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC (SEQ ID NO: 11,006) and/or the 3′ UTR comprises UGAUAAUAGGCUGGAGCCUCGGUGGCCAUGCUUCUUGCCCCUUGGGCCUCCCCCCAGCCCCUCCUCCCCUUCCUGCACCCGUACCCCCGUGGUCUUUGAAUAAAGUCUGA (SEQ ID NO: 11,007), e.g., as described in Richner et al. (Cell 168(6): P1114-1125 (2017)), the sequences of which are incorporated herein by reference. In some embodiments, a 5′ and/or 3′ UTR may be selected to enhance protein expression. 
In some embodiments, a 5′ and/or 3′ UTR may be selected to modify protein expression such that overproduction inhibition is minimized. In some embodiments, UTRs are around a coding sequence, e.g., outside the coding sequence and in other embodiments proximal to the coding sequence. In some embodiments, additional regulatory elements (e.g., miRNA binding sites, cis-regulatory sites) are included in the UTRs.


In some embodiments, an open reading frame of a gene modifying system, e.g., an ORF of an mRNA (or DNA encoding an mRNA) encoding a gene modifying polypeptide or one or more ORFs of an mRNA (or DNA encoding an mRNA) of a heterologous object sequence, is flanked by a 5′ and/or 3′ untranslated region (UTR) that enhances the expression thereof. In some embodiments, the 5′ UTR of an mRNA component (or transcript produced from a DNA component) of the system comprises the sequence 5′-GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC-3′ (SEQ ID NO: 11,008). In some embodiments, the 3′ UTR of an mRNA component (or transcript produced from a DNA component) of the system comprises the sequence 5′-UGAUAAUAGGCUGGAGCCUCGGUGGCCAUGCUUCUUGCCCCUUGGGCCUCCCCCCAGCCCCUCCUCCCCUUCCUGCACCCGUACCCCCGUGGUCUUUGAAUAAAGUCUGA-3′ (SEQ ID NO: 11,009). This combination of 5′ UTR and 3′ UTR has been shown to result in desirable expression of an operably linked ORF by Richner et al. (Cell 168(6): P1114-1125 (2017)), the teachings and sequences of which are incorporated herein by reference. In some embodiments, a system described herein comprises a DNA encoding a transcript, wherein the DNA comprises the corresponding 5′ UTR and 3′ UTR sequences, with T substituting for U in the above-listed sequences. In some embodiments, a DNA vector used to produce an RNA component of the system further comprises a promoter upstream of the 5′ UTR for initiating in vitro transcription, e.g., a T7, T3, or SP6 promoter. The 5′ UTR above begins with GGG, which is a suitable start for optimizing transcription using T7 RNA polymerase. For tuning transcription levels and altering the transcription start site nucleotides to fit alternative 5′ UTRs, the teachings of Davidson et al. Pac Symp Biocomput 433-443 (2010) describe T7 promoter variants, and the methods of discovery thereof, that fulfill both of these traits.
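For illustration, the arrangement described above (promoter, then 5′ UTR, then ORF, then 3′ UTR, with T substituting for U in the DNA) can be sketched in Python as follows. The UTR sequences are those recited above. The function names and the placeholder ORF are hypothetical, and the 17-nucleotide T7 promoter segment (positions −17 to −1 of the commonly cited consensus) is an assumption not taken from this disclosure.

```python
# Sketch only: helper names and the ORF are hypothetical placeholders.
UTR5_RNA = "GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC"
UTR3_RNA = (
    "UGAUAAUAGGCUGGAGCCUCGGUGGCCAUGCUUCUUGCCCCUUGGGCCUCCCCCC"
    "AGCCCCUCCUCCCCUUCCUGCACCCGUACCCCCGUGGUCUUUGAAUAAAGUCUGA"
)
# Assumed T7 promoter up to position -1; transcription would start at the
# first G of the 5' UTR that follows it.
T7_PROMOTER_UPSTREAM = "TAATACGACTCACTATA"


def rna_to_dna(seq: str) -> str:
    """Write an RNA sequence as DNA, with T substituting for U."""
    return seq.upper().replace("U", "T")


def build_ivt_template(orf_dna: str) -> str:
    """Assemble promoter + 5' UTR + ORF + 3' UTR as a DNA coding-strand string."""
    utr5 = rna_to_dna(UTR5_RNA)
    if not utr5.startswith("GGG"):
        raise ValueError("5' UTR is expected to begin with GGG for T7 initiation")
    return T7_PROMOTER_UPSTREAM + utr5 + orf_dna.upper() + rna_to_dna(UTR3_RNA)


placeholder_orf = "ATGGCCTAA"  # hypothetical three-codon ORF for demonstration
template = build_ivt_template(placeholder_orf)
print(len(template))
```

Other promoters named above (T3, SP6) would require their own upstream segments and possibly different initiating nucleotides; those are not modeled here.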


Viral Vectors and Components Thereof

Viruses are a useful source of delivery vehicles for the systems described herein, in addition to a source of relevant enzymes or domains as described herein, e.g., as sources of polymerases and polymerase functions used herein, e.g., DNA-dependent DNA polymerase, RNA-dependent RNA polymerase, RNA-dependent DNA polymerase, DNA-dependent RNA polymerase, reverse transcriptase. Some enzymes, e.g., reverse transcriptases, may have multiple activities, e.g., be capable of both RNA-dependent DNA polymerization and DNA-dependent DNA polymerization, e.g., first and second strand synthesis. In some embodiments, the virus used as a gene modifying delivery system or a source of components thereof may be selected from a group as described by Baltimore Bacteriol Rev 35(3):235-241 (1971).


In some embodiments, the virus is selected from a Group I virus, e.g., is a DNA virus and packages dsDNA into virions. In some embodiments, the Group I virus is selected from, e.g., Adenoviruses, Herpesviruses, Poxviruses.


In some embodiments, the virus is selected from a Group II virus, e.g., is a DNA virus and packages ssDNA into virions. In some embodiments, the Group II virus is selected from, e.g., Parvoviruses. In some embodiments, the parvovirus is a dependoparvovirus, e.g., an adeno-associated virus (AAV).


In some embodiments, the virus is selected from a Group III virus, e.g., is an RNA virus and packages dsRNA into virions. In some embodiments, the Group III virus is selected from, e.g., Reoviruses. In some embodiments, one or both strands of the dsRNA contained in such virions is a coding molecule able to serve directly as mRNA upon transduction into a host cell, e.g., can be directly translated into protein upon transduction into a host cell without requiring any intervening nucleic acid replication or polymerization steps.


In some embodiments, the virus is selected from a Group IV virus, e.g., is an RNA virus and packages ssRNA(+) into virions. In some embodiments, the Group IV virus is selected from, e.g., Coronaviruses, Picornaviruses, Togaviruses. In some embodiments, the ssRNA(+) contained in such virions is a coding molecule able to serve directly as mRNA upon transduction into a host cell, e.g., can be directly translated into protein upon transduction into a host cell without requiring any intervening nucleic acid replication or polymerization steps.


In some embodiments, the virus is selected from a Group V virus, e.g., is an RNA virus and packages ssRNA(−) into virions. In some embodiments, the Group V virus is selected from, e.g., Orthomyxoviruses, Rhabdoviruses. In some embodiments, an RNA virus with an ssRNA(−) genome also carries an enzyme inside the virion that is transduced to host cells with the viral genome, e.g., an RNA-dependent RNA polymerase, capable of copying the ssRNA(−) into ssRNA(+) that can be translated directly by the host.


In some embodiments, the virus is selected from a Group VI virus, e.g., is a retrovirus and packages ssRNA(+) into virions. In some embodiments, the Group VI virus is selected from, e.g., retroviruses. In some embodiments, the retrovirus is a lentivirus, e.g., HIV-1, HIV-2, SIV, BIV. In some embodiments, the retrovirus is a spumavirus, e.g., a foamy virus, e.g., HFV, SFV, BFV. In some embodiments, the ssRNA(+) contained in such virions is a coding molecule able to serve directly as mRNA upon transduction into a host cell, e.g., can be directly translated into protein upon transduction into a host cell without requiring any intervening nucleic acid replication or polymerization steps. In some embodiments, the ssRNA(+) is first reverse transcribed and copied to generate a dsDNA genome intermediate from which mRNA can be transcribed in the host cell. In some embodiments, an RNA virus with an ssRNA(+) genome also carries an enzyme inside the virion that is transduced to host cells with the viral genome, e.g., an RNA-dependent DNA polymerase, capable of copying the ssRNA(+) into dsDNA that can be transcribed into mRNA and translated by the host. In some embodiments, the reverse transcriptase from a Group VI retrovirus is incorporated as the reverse transcriptase domain of a gene modifying polypeptide.


In some embodiments, the virus is selected from a Group VII virus, e.g., is a retrovirus and packages dsRNA into virions. In some embodiments, the Group VII virus is selected from, e.g., Hepadnaviruses. In some embodiments, one or both strands of the dsRNA contained in such virions is a coding molecule able to serve directly as mRNA upon transduction into a host cell, e.g., can be directly translated into protein upon transduction into a host cell without requiring any intervening nucleic acid replication or polymerization steps. In some embodiments, one or both strands of the dsRNA contained in such virions is first reverse transcribed and copied to generate a dsDNA genome intermediate from which mRNA can be transcribed in the host cell. In some embodiments, an RNA virus with a dsRNA genome also carries an enzyme inside the virion that is transduced to host cells with the viral genome, e.g., an RNA-dependent DNA polymerase, capable of copying the dsRNA into dsDNA that can be transcribed into mRNA and translated by the host. In some embodiments, the reverse transcriptase from a Group VII retrovirus is incorporated as the reverse transcriptase domain of a gene modifying polypeptide.


In some embodiments, virions used to deliver nucleic acid in this invention may also carry enzymes involved in the process of gene modification. For example, a retroviral virion may contain a reverse transcriptase domain that is delivered into a host cell along with the nucleic acid. In some embodiments, an RNA template may be associated with a gene modifying polypeptide within a virion, such that both are co-delivered to a target cell upon transduction of the nucleic acid from the viral particle. In some embodiments, the nucleic acid in a virion may comprise DNA, e.g., linear ssDNA, linear dsDNA, circular ssDNA, circular dsDNA, minicircle DNA, dbDNA, ceDNA. In some embodiments, the nucleic acid in a virion may comprise RNA, e.g., linear ssRNA, linear dsRNA, circular ssRNA, circular dsRNA. In some embodiments, a viral genome may circularize upon transduction into a host cell, e.g., a linear ssRNA molecule may undergo a covalent linkage to form a circular ssRNA, a linear dsRNA molecule may undergo a covalent linkage to form a circular dsRNA or one or more circular ssRNA. In some embodiments, a viral genome may replicate by rolling circle replication in a host cell. In some embodiments, a viral genome may comprise a single nucleic acid molecule, e.g., comprise a non-segmented genome. In some embodiments, a viral genome may comprise two or more nucleic acid molecules, e.g., comprise a segmented genome. In some embodiments, a nucleic acid in a virion may be associated with one or more proteins. In some embodiments, one or more proteins in a virion may be delivered to a host cell upon transduction. In some embodiments, a natural virus may be adapted for nucleic acid delivery by the addition of virion packaging signals to the target nucleic acid, wherein a host cell is used to package the target nucleic acid containing the packaging signals.


In some embodiments, a virion used as a delivery vehicle may comprise a commensal human virus. In some embodiments, a virion used as a delivery vehicle may comprise an anellovirus, the use of which is described in WO2018232017A1, which is incorporated herein by reference in its entirety.


AAV Administration

In some embodiments, an adeno-associated virus (AAV) is used in conjunction with the system, template nucleic acid, and/or polypeptide described herein. In some embodiments, an AAV is used to deliver, administer, or package the system, template nucleic acid, and/or polypeptide described herein. In some embodiments, the AAV is a recombinant AAV (rAAV).


In some embodiments, a system comprises (a) a polypeptide described herein or a nucleic acid encoding the same, (b) a template nucleic acid (e.g., template RNA) described herein, and (c) one or more first tissue-specific expression-control sequences specific to the target tissue, wherein the one or more first tissue-specific expression-control sequences specific to the target tissue are in operative association with (a), (b), or (a) and (b), wherein, when associated with (a), (a) comprises a nucleic acid encoding the polypeptide.


In some embodiments, a system described herein further comprises a first recombinant adeno-associated virus (rAAV) capsid protein, wherein at least one of (a) or (b) is associated with the first rAAV capsid protein, and wherein at least one of (a) or (b) is flanked by AAV inverted terminal repeats (ITRs).


In some embodiments, (a) and (b) are associated with the first rAAV capsid protein.


In some embodiments, (a) and (b) are on a single nucleic acid.


In some embodiments, the system further comprises a second rAAV capsid protein, wherein at least one of (a) or (b) is associated with the second rAAV capsid protein, and wherein the at least one of (a) or (b) associated with the second rAAV capsid protein is different from the at least one of (a) or (b) associated with the first rAAV capsid protein.


In some embodiments, the at least one of (a) or (b) associated with the first or second rAAV capsid protein is dispersed in the interior of the first or second rAAV capsid protein, which first or second rAAV capsid protein is in the form of an AAV capsid particle.


In some embodiments, the system further comprises a nanoparticle, wherein the nanoparticle is associated with at least one of (a) or (b).


In some embodiments, (a) and (b), respectively, are associated with: a) a first rAAV capsid protein and a second rAAV capsid protein; b) a nanoparticle and a first rAAV capsid protein; c) a first rAAV capsid protein; d) a first adenovirus capsid protein; e) a first nanoparticle and a second nanoparticle; or f) a first nanoparticle.


Viral vectors are useful for delivering all or part of a system provided by the invention, e.g., for use in methods provided by the invention. Systems derived from different viruses have been employed for the delivery of polypeptides or nucleic acids; for example: integrase-deficient lentivirus, adenovirus, adeno-associated virus (AAV), herpes simplex virus, and baculovirus (reviewed in Hodge et al. Hum Gene Ther 2017; Narayanavari et al. Crit Rev Biochem Mol Biol 2017; Boehme et al. Curr Gene Ther 2015).


Adenoviruses are common viruses that have been used as gene delivery vehicles given well-defined biology, genetic stability, high transduction efficiency, and ease of large-scale production (see, for example, review by Lee et al. Genes & Diseases 2017). They possess linear dsDNA genomes and come in a variety of serotypes that differ in tissue and cell tropisms. In order to prevent replication of infectious virus in recipient cells, adenovirus genomes used for packaging are deleted of some or all endogenous viral proteins, which are provided in trans in viral production cells. This renders the genomes helper-dependent, meaning they can only be replicated and packaged into viral particles in the presence of the missing components provided by so-called helper functions. A helper-dependent adenovirus system with all viral ORFs removed may be compatible with packaging foreign DNA of up to ˜37 kb (Parks et al. J Virol 1997). In some embodiments, an adenoviral vector is used to deliver DNA corresponding to the polypeptide or template component of the gene modifying system, or both are contained on separate or the same adenoviral vector. In some embodiments, the adenovirus is a helper-dependent adenovirus (HD-AdV) that is incapable of self-packaging. In some embodiments, the adenovirus is a high-capacity adenovirus (HC-AdV) that has had all or a substantial portion of endogenous viral ORFs deleted, while retaining the necessary sequence components for packaging into adenoviral particles. For this type of vector, the only adenoviral sequences required for genome packaging are noncoding sequences: the inverted terminal repeats (ITRs) at both ends and the packaging signal at the 5′-end (Jager et al. Nat Protoc 2009). In some embodiments, the adenoviral genome also comprises stuffer DNA to meet a minimal genome size for optimal production and stability (see, for example, Hausl et al. Mol Ther 2010). 
In some embodiments, an adenovirus is used to deliver a gene modifying system to the liver.


In some embodiments, an adenovirus is used to deliver a gene modifying system to HSCs, e.g., HDAd5/35++. HDAd5/35++ is an adenovirus with modified serotype 35 fibers that de-target the vector from the liver (Wang et al. Blood Adv 2019). In some embodiments, the adenovirus that delivers a gene modifying system to HSCs utilizes a receptor that is expressed specifically on primitive HSCs, e.g., CD46.


Adeno-associated viruses (AAV) belong to the Parvoviridae family and more specifically constitute the Dependoparvovirus genus. The AAV genome is composed of a linear single-stranded DNA molecule which contains approximately 4.7 kilobases (kb) and consists of two major open reading frames (ORFs) encoding the non-structural Rep (replication) and structural Cap (capsid) proteins. A second ORF within the cap gene was identified that encodes the assembly-activating protein (AAP). The DNAs flanking the AAV coding regions are two cis-acting inverted terminal repeat (ITR) sequences, approximately 145 nucleotides in length, with interrupted palindromic sequences that can be folded into energetically stable hairpin structures that function as primers of DNA replication. In addition to their role in DNA replication, the ITR sequences have been shown to be involved in viral DNA integration into the cellular genome, rescue from the host genome or plasmid, and encapsidation of viral nucleic acid into mature virions (Muzyczka, (1992) Curr. Top. Micro. Immunol. 158:97-129). In some embodiments, one or more gene modifying nucleic acid components is flanked by ITRs derived from AAV for viral packaging. See, e.g., WO2019113310.
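The size figures given above (approximately 37 kb of foreign DNA for a helper-dependent adenovirus, and an AAV genome of approximately 4.7 kb flanked by ITRs of approximately 145 nucleotides) suggest a simple screening calculation when choosing a vector for a given cargo. The Python sketch below is illustrative only. It assumes, as a rough proxy, that the wild-type AAV genome length approximates the AAV packaging limit; the names and thresholds are hypothetical and are not design limits from this disclosure.

```python
# Approximate figures from the text; treating 4.7 kb as an AAV limit is an assumption.
APPROX_CAPACITY_BP = {"HD-AdV": 37_000, "AAV": 4_700}
AAV_ITR_BP = 145  # approximate length of each of the two ITRs


def fits(vector: str, cargo_bp: int) -> bool:
    """Return True if the cargo is within the approximate capacity of the vector."""
    capacity = APPROX_CAPACITY_BP[vector]
    # For AAV the ITRs count toward the packaged genome; the HD-AdV figure
    # already refers to foreign DNA, so no overhead is added there.
    overhead = 2 * AAV_ITR_BP if vector == "AAV" else 0
    return cargo_bp + overhead <= capacity


def vectors_that_fit(cargo_bp: int) -> list:
    """List the vectors, in alphabetical order, whose approximate capacity suffices."""
    return sorted(v for v in APPROX_CAPACITY_BP if fits(v, cargo_bp))


print(vectors_that_fit(4_000))
print(vectors_that_fit(6_000))
```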


In some embodiments, one or more components of the gene modifying system are carried via at least one AAV vector. In some embodiments, the at least one AAV vector is selected for tropism to a particular cell, tissue, or organism. In some embodiments, the AAV vector is pseudotyped, e.g., AAV2/8, wherein AAV2 describes the design of the construct but the capsid protein is replaced by that from AAV8. It is understood that any of the described vectors could be pseudotype derivatives, wherein the capsid protein used to package the AAV genome is derived from that of a different AAV serotype. Without wishing to be limited in vector choice, a list of exemplary AAV serotypes can be found in Table 18. In some embodiments, an AAV to be employed for gene modifying may be evolved for novel cell or tissue tropism as has been demonstrated in the literature (e.g., Davidsson et al. Proc Natl Acad Sci USA 2019).


In some embodiments, the AAV delivery vector is a vector which has two AAV inverted terminal repeats (ITRs) and a nucleotide sequence of interest (for example, a sequence coding for a gene modifying polypeptide or a DNA template, or both), each of said ITRs having an interrupted (or noncontiguous) palindromic sequence, i.e., a sequence composed of three segments: a first segment and a last segment that are identical when read 5′->3′ but hybridize when placed against each other, and a segment that is different that separates the identical segments. See, for example, WO2012123430.


Conventionally, AAV virions with capsids are produced by introducing a plasmid or plasmids encoding the rAAV or scAAV genome, Rep proteins, and Cap proteins (Grimm et al, 1998). Upon introduction of these helper plasmids in trans, the AAV genome is “rescued” (i.e., released and subsequently recovered) from the host genome, and is further encapsidated to produce infectious AAV. In some embodiments, one or more gene modifying nucleic acids are packaged into AAV particles by introducing the ITR-flanked nucleic acids into a packaging cell in conjunction with the helper functions.


In some embodiments, the AAV genome is a so-called self-complementary genome (referred to as scAAV), such that the sequence located between the ITRs contains both the desired nucleic acid sequence (e.g., DNA encoding the gene modifying polypeptide or template, or both) and the reverse complement of the desired nucleic acid sequence, such that these two components can fold over and self-hybridize. In some embodiments, the self-complementary modules are separated by an intervening sequence that permits the DNA to fold back on itself, e.g., forms a stem-loop. An scAAV has the advantage of being poised for transcription upon entering the nucleus, rather than being first dependent on ITR priming and second-strand synthesis to form dsDNA. In some embodiments, one or more gene modifying components is designed as an scAAV, wherein the sequence between the AAV ITRs contains two reverse complementing modules that can self-hybridize to create dsDNA.


In some embodiments, nucleic acid (e.g., encoding a polypeptide, or a template, or both) delivered to cells is closed-ended, linear duplex DNA (CELID DNA or ceDNA). In some embodiments, ceDNA is derived from the replicative form of the AAV genome (Li et al. PLOS One 2013). In some embodiments, the nucleic acid (e.g., encoding a polypeptide, or a template DNA, or both) is flanked by ITRs, e.g., AAV ITRs, wherein at least one of the ITRs comprises a terminal resolution site and a replication protein binding site (sometimes referred to as a replicative protein binding site). In some embodiments, the ITRs are derived from an adeno-associated virus, e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, or a combination thereof. In some embodiments, the ITRs are symmetric. In some embodiments, the ITRs are asymmetric. In some embodiments, at least one Rep protein is provided to enable replication of the construct. In some embodiments, the at least one Rep protein is derived from an adeno-associated virus, e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, or a combination thereof. In some embodiments, ceDNA is generated by providing a production cell with (i) DNA flanked by ITRs, e.g., AAV ITRs, and (ii) components required for ITR-dependent replication, e.g., AAV proteins Rep78 and Rep52 (or nucleic acid encoding the proteins). In some embodiments, ceDNA is free of any capsid protein, e.g., is not packaged into an infectious AAV particle. In some embodiments, ceDNA is formulated into LNPs (see, for example, WO2019051289A1).


In some embodiments, the ceDNA vector consists of two self-complementary sequences, e.g., asymmetrical or symmetrical or substantially symmetrical ITRs as defined herein, flanking said expression cassette, wherein the ceDNA vector is not associated with a capsid protein. In some embodiments, the ceDNA vector comprises two self-complementary sequences found in an AAV genome, where at least one ITR comprises an operative Rep-binding element (RBE) (also sometimes referred to herein as “RBS”) and a terminal resolution site (trs) of AAV or a functional variant of the RBE. See, for example, WO2019113310.


In some embodiments, the AAV genome comprises two genes that encode four replication proteins and three capsid proteins, respectively. In some embodiments, the genes are flanked on either side by 145-bp inverted terminal repeats (ITRs). In some embodiments, the virion comprises up to three capsid proteins (Vp1, Vp2, and/or Vp3), e.g., produced in a 1:1:10 ratio. In some embodiments, the capsid proteins are produced from the same open reading frame and/or from differential splicing (Vp1) and alternative translational start sites (Vp2 and Vp3, respectively). Generally, Vp3 is the most abundant subunit in the virion and participates in receptor recognition at the cell surface defining the tropism of the virus. In some embodiments, Vp1 comprises a phospholipase domain, e.g., which functions in viral infectivity, in the N-terminus of Vp1.


In some embodiments, packaging capacity of the viral vectors limits the size of the gene modifying system that can be packaged into the vector. For example, the packaging capacity of the AAVs can be about 4.5 kb (e.g., about 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, or 6.0 kb), e.g., including one or two inverted terminal repeats (ITRs), e.g., 145 base ITRs.


In some embodiments, recombinant AAV (rAAV) comprises cis-acting 145-bp ITRs flanking vector transgene cassettes, e.g., providing up to 4.5 kb for packaging of foreign DNA. Subsequent to infection, rAAV can, in some instances, express a fusion protein of the invention and persist without integration into the host genome by existing episomally in circular head-to-tail concatemers. rAAV can be used, for example, in vitro and in vivo. In some embodiments, AAV-mediated gene delivery requires that the length of the coding sequence of the gene is equal to or smaller in size than the wild-type AAV genome.


AAV delivery of genes that exceed this size and/or the use of large physiological regulatory elements can be accomplished, for example, by dividing the protein(s) to be delivered into two or more fragments. In some embodiments, the N-terminal fragment is fused to an intein-N sequence. In some embodiments, the C-terminal fragment is fused to an intein-C sequence. In embodiments, the fragments are packaged into two or more AAV vectors.


In some embodiments, dual AAV vectors are generated by splitting a large transgene expression cassette in two separate halves (5′ and 3′ ends, or head and tail), e.g., wherein each half of the cassette is packaged in a single AAV vector (of <5 kb). The re-assembly of the full-length transgene expression cassette can, in some embodiments, then be achieved upon co-infection of the same cell by both dual AAV vectors. In some embodiments, co-infection is followed by one or more of: (1) homologous recombination (HR) between 5′ and 3′ genomes (dual AAV overlapping vectors); (2) ITR-mediated tail-to-head concatemerization of 5′ and 3′ genomes (dual AAV trans-splicing vectors); and/or (3) a combination of these two mechanisms (dual AAV hybrid vectors). In some embodiments, the use of dual AAV vectors in vivo results in the expression of full-length proteins. In some embodiments, the use of the dual AAV vector platform represents an efficient and viable gene transfer strategy for transgenes of greater than about 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, or 5.0 kb in size. In some embodiments, AAV vectors can also be used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides. In some embodiments, AAV vectors can be used for in vivo and ex vivo gene therapy procedures (see, e.g., West et al., Virology 160:38-47 (1987); U.S. Pat. No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest. 94:1351 (1994); each of which is incorporated herein by reference in its entirety). The construction of recombinant AAV vectors is described in a number of publications, including U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin, et al., Mol. Cell. Biol. 4:2072-2081 (1984); Hermonat & Muzyczka, PNAS 81:6466-6470 (1984); and Samulski et al., J. Virol. 63:3822-3828 (1989) (incorporated by reference herein in their entirety).
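By way of illustration only, the splitting of a large cassette into 5′ and 3′ halves can be sketched as a coordinate calculation. The function name, the default overlap, and the example lengths are hypothetical and are not taken from the disclosure; the sketch ignores ITRs, splice signals, and any sequence-level design, and simply checks that each half stays under the <5 kb figure mentioned above.

```python
def split_for_dual_aav(cassette_len_bp, overlap_bp=0, max_half_bp=5000):
    """Return ((start, end), (start, end)) for the 5' and 3' halves.

    Coordinates are 0-based, half-open. overlap_bp > 0 models dual AAV
    overlapping vectors, where both halves share a region that could
    support homologous recombination. All values are illustrative.
    """
    if overlap_bp < 0 or overlap_bp >= cassette_len_bp:
        raise ValueError("overlap must be >= 0 and shorter than the cassette")
    midpoint = cassette_len_bp // 2
    left_share = overlap_bp // 2
    # Extend each half across the midpoint so the shared region is overlap_bp long.
    five_prime = (0, midpoint + (overlap_bp - left_share))
    three_prime = (midpoint - left_share, cassette_len_bp)
    for start, end in (five_prime, three_prime):
        if end - start >= max_half_bp:
            raise ValueError("a half is not below the assumed per-vector limit")
    return five_prime, three_prime


# Hypothetical 8 kb cassette with a 400 bp shared region.
halves = split_for_dual_aav(8000, overlap_bp=400)
print(halves)
```

A cassette that cannot be covered by two halves below the assumed limit raises an error, which would suggest that a different split strategy or a shorter cassette is needed.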


In some embodiments, a gene modifying polypeptide described herein (e.g., with or without one or more guide nucleic acids) can be delivered using AAV, lentivirus, adenovirus or other plasmid or viral vector types, in particular, using formulations and doses from, for example, U.S. Pat. No. 8,454,972 (formulations, doses for adenovirus), U.S. Pat. No. 8,404,658 (formulations, doses for AAV) and U.S. Pat. No. 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus. For example, for AAV, the route of administration, formulation and dose can be as described in U.S. Pat. No. 8,404,658 and as in clinical trials involving AAV. For adenovirus, the route of administration, formulation and dose can be as described in U.S. Pat. No. 8,454,972 and as in clinical trials involving adenovirus. For plasmid delivery, the route of administration, formulation and dose can be as described in U.S. Pat. No. 5,846,946 and as in clinical studies involving plasmids. Doses can be based on or extrapolated to an average 70 kg individual (e.g., a male adult human), and can be adjusted for patients, subjects, mammals of different weight and species. Frequency of administration is within the ambit of the medical or veterinary practitioner (e.g., physician, veterinarian), depending on usual factors including the age, sex, general health, other conditions of the patient or subject and the particular condition or symptoms being addressed. In some embodiments, the viral vectors can be injected into the tissue of interest. For cell-type specific gene modifying, the expression of the gene modifying polypeptide and optional guide nucleic acid can, in some embodiments, be driven by a cell-type specific promoter.


In some embodiments, AAV allows for low toxicity, for example, due to the purification method not requiring ultracentrifugation of cell particles that can activate the immune response. In some embodiments, AAV allows low probability of causing insertional mutagenesis, for example, because it does not substantially integrate into the host genome.


In some embodiments, AAV has a packaging limit of about 4.4, 4.5, 4.6, 4.7, or 4.75 kb. In some embodiments, a gene modifying polypeptide-encoding sequence, promoter, and transcription terminator can fit into a single viral vector. SpCas9 (4.1 kb) may, in some instances, be difficult to package into AAV. Therefore, in some embodiments, a gene modifying polypeptide coding sequence is used that is shorter in length than other gene modifying polypeptide coding sequences or base editors. In some embodiments, the gene modifying polypeptide encoding sequences are less than about 4.5 kb, 4.4 kb, 4.3 kb, 4.2 kb, 4.1 kb, 4 kb, 3.9 kb, 3.8 kb, 3.7 kb, 3.6 kb, 3.5 kb, 3.4 kb, 3.3 kb, 3.2 kb, 3.1 kb, 3 kb, 2.9 kb, 2.8 kb, 2.7 kb, 2.6 kb, 2.5 kb, 2 kb, or 1.5 kb.
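As a rough illustration of the size constraint described above, the following sketch sums the lengths of the elements of a cassette, adds two 145-base ITRs, and compares the total to an assumed packaging limit. The 4,700 bp limit is one of the values listed above; the promoter and terminator lengths, and the function names, are hypothetical placeholders rather than values from the disclosure.

```python
ITR_LEN_BP = 145  # ITR length stated in the text


def cassette_length_bp(element_lengths_bp, n_itrs=2):
    """Total vector genome length: payload elements plus flanking ITRs."""
    return sum(element_lengths_bp.values()) + n_itrs * ITR_LEN_BP


def fits_packaging_limit(element_lengths_bp, limit_bp=4700, n_itrs=2):
    """True if the ITR-flanked cassette is within the assumed limit."""
    return cassette_length_bp(element_lengths_bp, n_itrs) <= limit_bp


# Hypothetical element lengths (bp), for illustration only.
compact = {"promoter": 500, "polypeptide_cds": 3600, "terminator": 200}
# Same regulatory elements with a 4.1 kb coding sequence (the SpCas9 size cited above).
large = {"promoter": 500, "polypeptide_cds": 4100, "terminator": 200}

print(cassette_length_bp(compact), fits_packaging_limit(compact))
print(cassette_length_bp(large), fits_packaging_limit(large))
```

Under these assumed element lengths, the shorter coding sequence fits in a single vector while the 4.1 kb coding sequence does not, which is consistent with the preference for shorter coding sequences noted above.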


An AAV can be AAV1, AAV2, AAV5 or any combination thereof. In some embodiments, the type of AAV is selected with respect to the cells to be targeted; e.g., AAV serotypes 1, 2, 5 or a hybrid capsid AAV1, AAV2, AAV5 or any combination thereof can be selected for targeting brain or neuronal cells; or AAV4 can be selected for targeting cardiac tissue. In some embodiments, AAV8 is selected for delivery to the liver. Exemplary AAV serotypes as to these cells are described, for example, in Grimm, D. et al, J. Virol. 82: 5887-5911 (2008) (incorporated herein by reference in its entirety). In some embodiments, AAV refers to all serotypes, subtypes, and naturally-occurring AAV as well as recombinant AAV. AAV may be used to refer to the virus itself or a derivative thereof. In some embodiments, AAV includes AAV1, AAV2, AAV3, AAV3B, AAV4, AAV5, AAV6, AAV6.2, AAV7, AAVrh.64R1, AAVhu.37, AAVrh.8, AAVrh.32.33, AAV8, AAV9, AAV-DJ, AAV2/8, AAVrh10, AAVLK03, AAV10, AAV11, AAV12, rh10, and hybrids thereof, avian AAV, bovine AAV, canine AAV, equine AAV, primate AAV, non-primate AAV, and ovine AAV. The genomic sequences of various serotypes of AAV, as well as the sequences of the native terminal repeats (TRs), Rep proteins, and capsid subunits are known in the art. Such sequences may be found in the literature or in public databases such as GenBank. Additional exemplary AAV serotypes are listed in Table 18.









TABLE 18
Exemplary AAV serotypes. Bracketed numbers refer to the numbered references in the same row.

Target Tissue | Vehicle | Reference
Liver | AAV (AAV8[1], AAVrh.8[1], AAVhu.37[1], AAV2/8, AAV2/rh10[2], AAV9, AAV2, NP40[3], NP59[2,3], AAV3B[5], AAV-DJ[4], AAV-LK01[4], AAV-LK02[4], AAV-LK03[4], AAV-LK19[4], AAV5[7]); Adenovirus (Ad5, HC-AdV[6]) | 1. Wang et al., Mol. Ther. 18, 118-25 (2010); 2. Ginn et al., JHEP Reports, 100065 (2019); 3. Paulk et al., Mol. Ther. 26, 289-303 (2018); 4. L. Lisowski et al., Nature 506, 382-6 (2014); 5. L. Wang et al., Mol. Ther. 23, 1877-87 (2015); 6. Hausl Mol Ther (2010); 7. Davidoff et al., Mol. Ther. 11, 875-88 (2005)
Lung | AAV (AAV4, AAV5, AAV6[1], AAV9, H22[2]); Adenovirus (Ad5, Ad3, Ad21, Ad14)[3] | 1. Duncan et al., Mol Ther Methods Clin Dev (2018); 2. Cooney et al., Am J Respir Cell Mol Biol (2019); 3. Li et al., Mol Ther Methods Clin Dev (2019)
Skin | AAV (AAV6[1], AAV-LK19[2]) | 1. Petek et al., Mol. Ther. (2010); 2. L. Lisowski et al., Nature 506, 382-6 (2014)
HSCs | Adenovirus (HDAd5/35++) | Wang et al. Blood Adv (2019)









In some embodiments, a pharmaceutical composition (e.g., comprising an AAV as described herein) has less than 10% empty capsids, less than 8% empty capsids, less than 7% empty capsids, less than 5% empty capsids, less than 3% empty capsids, or less than 1% empty capsids. In some embodiments, the pharmaceutical composition has less than about 5% empty capsids. In some embodiments, the number of empty capsids is below the limit of detection. In some embodiments, it is advantageous for the pharmaceutical composition to have low amounts of empty capsids, e.g., because empty capsids may generate an adverse response (e.g., immune response, inflammatory response, liver response, and/or cardiac response), e.g., with little or no substantial therapeutic benefit.


In some embodiments, the residual host cell protein (rHCP) in the pharmaceutical composition is less than or equal to 100 ng/ml rHCP per 1×10^13 vg/ml, e.g., less than or equal to 40 ng/ml rHCP per 1×10^13 vg/ml or 1-50 ng/ml rHCP per 1×10^13 vg/ml. In some embodiments, the pharmaceutical composition comprises less than 10 ng rHCP per 1.0×10^13 vg, or less than 5 ng rHCP per 1.0×10^13 vg, less than 4 ng rHCP per 1.0×10^13 vg, or less than 3 ng rHCP per 1.0×10^13 vg, or any concentration in between. In some embodiments, the residual host cell DNA (hcDNA) in the pharmaceutical composition is less than or equal to 5×10^6 pg/ml hcDNA per 1×10^13 vg/ml, less than or equal to 1.2×10^6 pg/ml hcDNA per 1×10^13 vg/ml, or 1×10^5 pg/ml hcDNA per 1×10^13 vg/ml. In some embodiments, the residual host cell DNA in said pharmaceutical composition is less than 5.0×10^5 pg per 1×10^13 vg, less than 2.0×10^5 pg per 1.0×10^13 vg, less than 1.1×10^5 pg per 1.0×10^13 vg, less than 1.0×10^5 pg hcDNA per 1.0×10^13 vg, less than 0.9×10^5 pg hcDNA per 1.0×10^13 vg, less than 0.8×10^5 pg hcDNA per 1.0×10^13 vg, or any concentration in between.
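For illustration, residual impurity limits expressed per 1×10^13 vg can be compared to a measured lot by normalizing the measured concentration to the genomic titer. The sketch below assumes simple proportional scaling; the function name and the example measurements are hypothetical and do not come from the disclosure.

```python
REFERENCE_VG = 1e13  # normalization basis used in the text


def per_reference_vg(amount_per_ml, titer_vg_per_ml):
    """Scale an impurity concentration (amount/ml) to amount per 1e13 vg."""
    if titer_vg_per_ml <= 0:
        raise ValueError("titer must be positive")
    return amount_per_ml * REFERENCE_VG / titer_vg_per_ml


# Hypothetical lot: 8 ng/ml rHCP measured at a genomic titer of 2.0e13 vg/ml.
rhcp_ng = per_reference_vg(8.0, 2.0e13)
print(rhcp_ng, rhcp_ng < 5.0)  # compare to the "less than 5 ng rHCP" level
```

The same normalization applies to hcDNA, residual plasmid DNA, BSA, or benzonase, provided the measured amount and the limit use the same mass unit.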


In some embodiments, the residual plasmid DNA in the pharmaceutical composition is less than or equal to 1.7×10^5 pg/ml per 1.0×10^13 vg/ml, or 1×10^5 pg/ml per 1.0×10^13 vg/ml, or 1.7×10^6 pg/ml per 1.0×10^13 vg/ml. In some embodiments, the residual DNA plasmid in the pharmaceutical composition is less than 10.0×10^5 pg per 1.0×10^13 vg, less than 8.0×10^5 pg per 1.0×10^13 vg or less than 6.8×10^5 pg per 1.0×10^13 vg. In embodiments, the pharmaceutical composition comprises bovine serum albumin (BSA) at less than 0.5 ng per 1.0×10^13 vg, less than 0.3 ng per 1.0×10^13 vg, less than 0.22 ng per 1.0×10^13 vg or less than 0.2 ng per 1.0×10^13 vg, or any intermediate concentration. In embodiments, the benzonase in the pharmaceutical composition is less than 0.2 ng per 1.0×10^13 vg, less than 0.1 ng per 1.0×10^13 vg, less than 0.09 ng per 1.0×10^13 vg, less than 0.08 ng per 1.0×10^13 vg or any intermediate concentration. In embodiments, Poloxamer 188 in the pharmaceutical composition is about 10 to 150 ppm, about 15 to 100 ppm or about 20 to 80 ppm. In embodiments, the cesium in the pharmaceutical composition is less than 50 μg/g (ppm), less than 30 μg/g (ppm) or less than 20 μg/g (ppm) or any intermediate concentration.


In embodiments, the pharmaceutical composition comprises total impurities, e.g., as determined by SDS-PAGE, of less than 10%, less than 8%, less than 7%, less than 6%, less than 5%, less than 4%, less than 3%, less than 2%, or any percentage in between. In embodiments, the total purity, e.g., as determined by SDS-PAGE, is greater than 90%, greater than 92%, greater than 93%, greater than 94%, greater than 95%, greater than 96%, greater than 97%, greater than 98%, or any percentage in between. In embodiments, no single unnamed related impurity, e.g., as measured by SDS-PAGE, is greater than 5%, greater than 4%, greater than 3% or greater than 2%, or any percentage in between. In embodiments, the pharmaceutical composition comprises a percentage of filled capsids relative to total capsids (e.g., peak 1+peak 2 as measured by analytical ultracentrifugation) of greater than 85%, greater than 86%, greater than 87%, greater than 88%, greater than 89%, greater than 90%, greater than 91%, greater than 91.9%, greater than 92%, greater than 93%, or any percentage in between. In embodiments of the pharmaceutical composition, the percentage of filled capsids measured in peak 1 by analytical ultracentrifugation is 20-80%, 25-75%, 30-75%, 35-75%, or 37.4-70.3%. In embodiments of the pharmaceutical composition, the percentage of filled capsids measured in peak 2 by analytical ultracentrifugation is 20-80%, 20-70%, 22-65%, 24-62%, or 24.9-60.1%.


In one embodiment, the pharmaceutical composition comprises a genomic titer of 1.0 to 5.0×10^13 vg/mL, 1.2 to 3.0×10^13 vg/mL or 1.7 to 2.3×10^13 vg/ml. In one embodiment, the pharmaceutical composition exhibits a biological load of less than 5 CFU/mL, less than 4 CFU/mL, less than 3 CFU/mL, less than 2 CFU/mL or less than 1 CFU/mL or any intermediate concentration. In embodiments, the amount of endotoxin according to USP, for example, USP <85> (incorporated by reference in its entirety) is less than 1.0 EU/mL, less than 0.8 EU/mL or less than 0.75 EU/mL. In embodiments, the osmolarity of a pharmaceutical composition according to USP, for example, USP <785> (incorporated by reference in its entirety) is 350 to 450 mOsm/kg, 370 to 440 mOsm/kg or 390 to 430 mOsm/kg. In embodiments, the pharmaceutical composition contains less than 1200 particles that are greater than 25 μm per container, less than 1000 particles that are greater than 25 μm per container, less than 500 particles that are greater than 25 μm per container or any intermediate value. In embodiments, the pharmaceutical composition contains less than 10,000 particles that are greater than 10 μm per container, less than 8000 particles that are greater than 10 μm per container or less than 600 particles that are greater than 10 μm per container.


In one embodiment, the pharmaceutical composition has a genomic titer of 0.5 to 5.0×10^13 vg/mL, 1.0 to 4.0×10^13 vg/mL, 1.5 to 3.0×10^13 vg/ml or 1.7 to 2.3×10^13 vg/ml. In one embodiment, the pharmaceutical composition described herein comprises one or more of the following: less than about 0.09 ng benzonase per 1.0×10^13 vg, less than about 30 μg/g (ppm) of cesium, about 20 to 80 ppm Poloxamer 188, less than about 0.22 ng BSA per 1.0×10^13 vg, less than about 6.8×10^5 pg of residual DNA plasmid per 1.0×10^13 vg, less than about 1.1×10^5 pg of residual hcDNA per 1.0×10^13 vg, less than about 4 ng of rHCP per 1.0×10^13 vg, pH 7.7 to 8.3, about 390 to 430 mOsm/kg, less than about 600 particles that are >25 μm in size per container, less than about 6000 particles that are >10 μm in size per container, about 1.7×10^13-2.3×10^13 vg/mL genomic titer, infectious titer of about 3.9×10^8 to 8.4×10^10 IU per 1.0×10^13 vg, total protein of about 100-300 μg per 1.0×10^13 vg, mean survival of >24 days in Δ7SMA mice with about 7.5×10^13 vg/kg dose of viral vector, about 70 to 130% relative potency based on an in vitro cell based assay and/or less than about 5% empty capsid. In various embodiments, the pharmaceutical compositions described herein comprise any of the viral particles discussed here and retain a potency within ±20%, within ±15%, within ±10%, or within ±5% of a reference standard. In some embodiments, potency is measured using a suitable in vitro cell assay or in vivo animal model.


Additional methods of preparation, characterization, and dosing AAV particles are taught in WO2019094253, which is incorporated herein by reference in its entirety.


Additional rAAV constructs that can be employed consonant with the invention include those described in Wang et al 2019, available at doi.org/10.1038/s41573-019-0012-9, including Table 1 thereof, which is incorporated by reference in its entirety.


Lipid Nanoparticles

The methods and systems provided herein may employ any suitable carrier or delivery modality, including, in certain embodiments, lipid nanoparticles (LNPs). Lipid nanoparticles, in some embodiments, comprise one or more ionic lipids, such as non-cationic lipids (e.g., neutral or anionic, or zwitterionic lipids); one or more conjugated lipids (such as PEG-conjugated lipids or lipids conjugated to polymers described in Table 5 of WO2019217941; incorporated herein by reference in its entirety); one or more sterols (e.g., cholesterol); and, optionally, one or more targeting molecules (e.g., conjugated receptors, receptor ligands, antibodies); or combinations of the foregoing.


Lipids that can be used in nanoparticle formations (e.g., lipid nanoparticles) include, for example those described in Table 4 of WO2019217941, which is incorporated by reference—e.g., a lipid-containing nanoparticle can comprise one or more of the lipids in Table 4 of WO2019217941. Lipid nanoparticles can include additional elements, such as polymers, such as the polymers described in Table 5 of WO2019217941, incorporated by reference.


In some embodiments, conjugated lipids, when present, can include one or more of PEG-diacylglycerol (DAG) (such as 1-(monomethoxy-polyethyleneglycol)-2,3-dimyristoylglycerol (PEG-DMG)), PEG-dialkyloxypropyl (DAA), PEG-phospholipid, PEG-ceramide (Cer), a pegylated phosphatidylethanolamine (PEG-PE), PEG succinate diacylglycerol (PEGS-DAG) (such as 4-O-(2′,3′-di(tetradecanoyloxy)propyl-1-O-(ω-methoxy(polyethoxy)ethyl) butanedioate (PEG-S-DMG)), PEG dialkoxypropylcarbam, N-(carbonyl-methoxypolyethylene glycol 2000)-1,2-distearoyl-sn-glycero-3-phosphoethanolamine sodium salt, and those described in Table 2 of WO2019051289 (incorporated by reference), and combinations of the foregoing.


In some embodiments, sterols that can be incorporated into lipid nanoparticles include one or more of cholesterol or cholesterol derivatives, such as those in WO2009/127060 or US2010/0130588, which are incorporated by reference. Additional exemplary sterols include phytosterols, including those described in Eygeris et al (2020), dx.doi.org/10.1021/acs.nanolett.0c01386, incorporated herein by reference.


In some embodiments, the lipid particle comprises an ionizable lipid, a non-cationic lipid, a conjugated lipid that inhibits aggregation of particles, and a sterol. The amounts of these components can be varied independently to achieve desired properties. For example, in some embodiments, the lipid nanoparticle comprises an ionizable lipid in an amount from about 20 mol % to about 90 mol % of the total lipids (in other embodiments, 20-70 mol %, 30-60 mol %, or 40-50 mol %, or about 50 mol % to about 90 mol % of the total lipid present in the lipid nanoparticle), a non-cationic lipid in an amount from about 5 mol % to about 30 mol % of the total lipids, a conjugated lipid in an amount from about 0.5 mol % to about 20 mol % of the total lipids, and a sterol in an amount from about 20 mol % to about 50 mol % of the total lipids. The ratio of total lipid to nucleic acid (e.g., encoding the gene modifying polypeptide or template nucleic acid) can be varied as desired. For example, the total lipid to nucleic acid (mass or weight) ratio can be from about 10:1 to about 30:1.
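As a non-limiting illustration, a candidate four-component lipid composition can be checked against the mol % ranges recited above. The sketch assumes that the four listed components account for all lipids in the particle (so the values sum to 100 mol %); the component names, the function name, and the example composition are hypothetical.

```python
# Ranges (mol % of total lipid) taken from the paragraph above.
RANGES_MOL_PCT = {
    "ionizable_lipid": (20.0, 90.0),
    "non_cationic_lipid": (5.0, 30.0),
    "conjugated_lipid": (0.5, 20.0),
    "sterol": (20.0, 50.0),
}


def check_composition(mol_pct, tol=1e-6):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    total = sum(mol_pct.values())
    # Assumption: these four components are the only lipids present.
    if abs(total - 100.0) > tol:
        problems.append(f"total is {total} mol %, expected 100")
    for name, (low, high) in RANGES_MOL_PCT.items():
        value = mol_pct.get(name, 0.0)
        if not low <= value <= high:
            problems.append(f"{name}={value} outside {low}-{high} mol %")
    return problems


# Hypothetical composition for illustration only.
candidate = {
    "ionizable_lipid": 50.0,
    "non_cationic_lipid": 10.0,
    "conjugated_lipid": 1.5,
    "sterol": 38.5,
}
print(check_composition(candidate))
```

Because the recited ranges are stated with "about", a practical check might widen each bound slightly; the sketch applies the nominal bounds as written.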


In some embodiments, an ionizable lipid may be a cationic lipid, an ionizable cationic lipid, e.g., a cationic lipid that can exist in a positively charged or neutral form depending on pH, or an amine-containing lipid that can be readily protonated. In some embodiments, the cationic lipid is a lipid capable of being positively charged, e.g., under physiological conditions. Exemplary cationic lipids include one or more amine group(s) which bear the positive charge. In some embodiments, the lipid particle comprises a cationic lipid in formulation with one or more of neutral lipids, ionizable amine-containing lipids, biodegradable alkyne lipids, steroids, phospholipids including polyunsaturated lipids, structural lipids (e.g., sterols), PEG, cholesterol and polymer conjugated lipids. In some embodiments, the cationic lipid may be an ionizable cationic lipid. An exemplary cationic lipid as disclosed herein may have an effective pKa over 6.0. In embodiments, a lipid nanoparticle may comprise a second cationic lipid having a different effective pKa (e.g., greater than the first effective pKa) than the first cationic lipid. A lipid nanoparticle may comprise between 40 and 60 mol percent of a cationic lipid, a neutral lipid, a steroid, a polymer conjugated lipid, and a therapeutic agent, e.g., a nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide), encapsulated within or associated with the lipid nanoparticle. In some embodiments, the nucleic acid is co-formulated with the cationic lipid. The nucleic acid may be adsorbed to the surface of an LNP, e.g., an LNP comprising a cationic lipid. In some embodiments, the nucleic acid may be encapsulated in an LNP, e.g., an LNP comprising a cationic lipid. In some embodiments, the lipid nanoparticle may comprise a targeting moiety, e.g., coated with a targeting agent. In embodiments, the LNP formulation is biodegradable.
In some embodiments, a lipid nanoparticle comprising one or more lipids described herein, e.g., Formula (i), (ii), (iii), (vii) and/or (ix), encapsulates at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98% or 100% of an RNA molecule, e.g., template RNA and/or an mRNA encoding the gene modifying polypeptide.


In some embodiments, the lipid to nucleic acid ratio (mass/mass ratio; w/w ratio) can be in the range of from about 1:1 to about 25:1, from about 10:1 to about 14:1, from about 3:1 to about 15:1, from about 4:1 to about 10:1, from about 5:1 to about 9:1, or about 6:1 to about 9:1. The amounts of lipids and nucleic acid can be adjusted to provide a desired N/P ratio, for example, N/P ratio of 3, 4, 5, 6, 7, 8, 9, 10 or higher. Generally, the lipid nanoparticle formulation's overall lipid content can range from about 5 mg/ml to about 30 mg/mL.
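By way of a hedged example, the N/P ratio mentioned above can be estimated as the moles of ionizable amine nitrogen divided by the moles of nucleic acid phosphate. The sketch assumes one ionizable nitrogen per ionizable lipid molecule, one phosphate per nucleotide, and an approximate average nucleotide mass of 330 g/mol; these values, the 650 g/mol lipid molecular weight, and the function name are illustrative assumptions, not values from the disclosure.

```python
def n_to_p_ratio(ionizable_lipid_mg, lipid_mw_g_per_mol, nucleic_acid_mg,
                 nitrogens_per_lipid=1, nucleotide_mw_g_per_mol=330.0):
    """Estimate N/P: ionizable nitrogens per nucleic acid phosphate.

    Note that only the ionizable lipid mass is used here, not the total
    lipid mass. All default parameters are stated assumptions.
    """
    # mg divided by g/mol gives mmol.
    mmol_nitrogen = ionizable_lipid_mg / lipid_mw_g_per_mol * nitrogens_per_lipid
    mmol_phosphate = nucleic_acid_mg / nucleotide_mw_g_per_mol
    return mmol_nitrogen / mmol_phosphate


# Hypothetical formulation: 6.5 mg ionizable lipid (MW 650) per 1 mg RNA.
ratio = n_to_p_ratio(6.5, 650.0, 1.0)
print(round(ratio, 2))
```

To target a particular N/P ratio, the ionizable lipid mass can be scaled proportionally while the nucleic acid mass is held fixed.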


Exemplary ionizable lipids that can be used in lipid nanoparticle formulations include, without limitation, those listed in Table 1 of WO2019051289, incorporated herein by reference. Additional exemplary lipids include, without limitation, one or more of the following formulae: X of US2016/0311759; I of US20150376115 or in US2016/0376224; I, II or III of US20160151284; I, IA, II, or IIA of US20170210967; I-c of US20150140070; A of US2013/0178541; I of US2013/0303587 or US2013/0123338; I of US2015/0141678; II, III, IV, or V of US2015/0239926; I of US2017/0119904; I or II of WO2017/117528; A of US2012/0149894; A of US2015/0057373; A of WO2013/116126; A of US2013/0090372; A of US2013/0274523; A of US2013/0274504; A of US2013/0053572; A of WO2013/016058; A of WO2012/162210; I of US2008/042973; I, II, III, or IV of US2012/01287670; I or II of US2014/0200257; I, II, or III of US2015/0203446; I or III of US2015/0005363; I, IA, IB, IC, ID, II, IIA, IIB, IIC, IID, or III-XXIV of US2014/0308304; of US2013/0338210; I, II, III, or IV of WO2009/132131; A of US2012/01011478; I or XXXV of US2012/0027796; XIV or XVII of US2012/0058144; of US2013/0323269; I of US2011/0117125; I, II, or III of US2011/0256175; I, II, III, IV, V, VI, VII, VIII, IX, X, XI, XII of US2012/0202871; I, II, III, IV, V, VI, VII, VIII, X, XII, XIII, XIV, XV, or XVI of US2011/0076335; I or II of US2006/008378; I of US2013/0123338; I or X-A-Y-Z of US2015/0064242; XVI, XVII, or XVIII of US2013/0022649; I, II, or III of US2013/0116307; I or II of US2010/0062967; I-X of US2013/0189351; I of US2014/0039032; V of US2018/0028664; I of US2016/0317458; I of US2013/0195920; 5, 6, or 10 of U.S. Pat. No. 10,221,127; III-3 of WO2018/081480; I-5 or I-8 of WO2020/081938; 18 or 25 of U.S. Pat. No. 9,867,888; A of US2019/0136231; II of WO2020/219876; 1 of US2012/0027803; OF-02 of US2019/0240349; 23 of U.S. Pat. No. 10,086,013; cKK-E12/A6 of Miao et al (2020); C12-200 of WO2010/053572; 7C1 of Dahlman et al (2017); 304-013 or 503-013 of Whitehead et al; TS-P4C2 of U.S. Pat. No. 9,708,628; I of WO2020/106946.


In some embodiments, the ionizable lipid is MC3 (6Z,9Z,28Z,3 IZ)-heptatriaconta-6,9,28,3 1-tetraen-19-yl-4-(dimethylamino) butanoate (DLin-MC3-DMA or MC3), e.g., as described in Example 9 of WO2019051289A9 (incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is the lipid ATX-002, e.g., as described in Example 10 of WO2019051289A9 (incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is (13Z,16Z)-A,A-dimethyl-3-nonyldocosa-13, 16-dien-1-amine (Compound 32), e.g., as described in Example 11 of WO2019051289A9 (incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is Compound 6 or Compound 22, e.g., as described in Example 12 of WO2019051289A9 (incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is heptadecan-9-yl 8-((2-hydroxyethyl)(6-oxo-6-(undecyloxy)hexyl)amino)octanoate (SM-102); e.g., as described in Example 1 of U.S. Pat. No. 9,867,888(incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is 9Z, 12Z)-3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl octadeca-9,12-dienoate (LP01) e.g., as synthesized in Example 13 of WO2015/095340(incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is Di((Z)-non-2-en-1-yl) 9-((4-dimethylamino)butanoyl)oxy)heptadecanedioate (L319), e.g. as synthesized in Example 7, 8, or 9 of US2012/0027803(incorporated by reference herein in its entirety). In some embodiments, the ionizable lipid is 1,1′-((2-(4-(2-((2-(Bis(2-hydroxydodecyl)amino)ethyl)(2-hydroxydodecyl) amino)ethyl)piperazin-1-yl)ethyl)azanediyl)bis(dodecan-2-ol) (C12-200), e.g., as synthesized in Examples 14 and 16 of WO2010/053572(incorporated by reference herein in its entirety). 
In some embodiments, the ionizable lipid is imidazole cholesterol ester (ICE) lipid, (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, e.g., Structure (I) from WO2020/106946 (incorporated by reference herein in its entirety).


Some non-limiting examples of lipid compounds that may be used (e.g., in combination with other lipid components) to form lipid nanoparticles for the delivery of compositions described herein, e.g., nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide) include:




embedded image


In some embodiments an LNP comprising Formula (i) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (ii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (iii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (v) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (vi) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (viii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (ix) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


wherein

    • X1 is O, NR1, or a direct bond, X2 is C2-5 alkylene, X3 is C(═O) or a direct bond, R1 is H or Me, R3 is C1-3 alkyl, R2 is C1-3 alkyl, or R2 taken together with the nitrogen atom to which it is attached and 1-3 carbon atoms of X2 form a 4-, 5-, or 6-membered ring, or X1 is NR1, R1 and R2 taken together with the nitrogen atoms to which they are attached form a 5- or 6-membered ring, or R2 taken together with R3 and the nitrogen atom to which they are attached form a 5-, 6-, or 7-membered ring, Y1 is C2-12 alkylene, Y2 is selected from




embedded image




    • n is 0 to 3, R4 is C1-15 alkyl, Z1 is C1-6 alkylene or a direct bond,

    • Z2 is







embedded image


(in either orientation) or absent, provided that if Z1 is a direct bond, Z2 is absent;

    • R5 is C5-9 alkyl or C6-10 alkoxy, R6 is C5-9 alkyl or C6-10 alkoxy, W is methylene or a direct bond, and R7 is H or Me, or a salt thereof, provided that if R3 and R2 are C2 alkyls, X1 is O, X2 is linear C3 alkylene, X3 is C(═O), Y1 is linear Ce alkylene, (Y2)n-R4 is




embedded image


R4 is linear C5 alkyl, Z1 is C2 alkylene, Z2 is absent, W is methylene, and R7 is H, then R5 and


In some embodiments an LNP comprising Formula (xii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising Formula (xi) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprises a compound of Formula (xiii) and a compound of Formula (xiv).




embedded image


In some embodiments an LNP comprising Formula (xv) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.




embedded image


In some embodiments an LNP comprising a formulation of Formula (xvi) is used to deliver a gene modifying composition described herein to the lung endothelial cells.




text missing or illegible when filed


In some embodiments, a lipid compound used to form lipid nanoparticles for the delivery of compositions described herein, e.g., nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide) is made by one of the following reactions:




embedded image


Exemplary non-cationic lipids include, but are not limited to, distearoyl-sn-glycero-phosphoethanolamine, distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylcholine (DOPC), dipalmitoylphosphatidylcholine (DPPC), dioleoylphosphatidylglycerol (DOPG), dipalmitoylphosphatidylglycerol (DPPG), dioleoyl-phosphatidylethanolamine (DOPE), 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), palmitoyloleoylphosphatidylcholine (POPC), palmitoyloleoylphosphatidylethanolamine (POPE), dioleoyl-phosphatidylethanolamine 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (DOPE-mal), dipalmitoyl phosphatidyl ethanolamine (DPPE), dimyristoylphosphoethanolamine (DMPE), distearoyl-phosphatidyl-ethanolamine (DSPE), monomethyl-phosphatidylethanolamine (such as 16-O-monomethyl PE), dimethyl-phosphatidylethanolamine (such as 16-O-dimethyl PE), 18-1-trans PE, 1-stearoyl-2-oleoyl-phosphatidylethanolamine (SOPE), hydrogenated soy phosphatidylcholine (HSPC), egg phosphatidylcholine (EPC), dioleoylphosphatidylserine (DOPS), sphingomyelin (SM), dimyristoyl phosphatidylcholine (DMPC), dimyristoyl phosphatidylglycerol (DMPG), distearoylphosphatidylglycerol (DSPG), dierucoylphosphatidylcholine (DEPC), palmitoyloleoylphosphatidylglycerol (POPG), dielaidoyl-phosphatidylethanolamine (DEPE), lecithin, phosphatidylethanolamine, lysolecithin, lysophosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, sphingomyelin, egg sphingomyelin (ESM), cephalin, cardiolipin, phosphatidic acid, cerebrosides, dicetylphosphate, lysophosphatidylcholine, dilinoleoylphosphatidylcholine, or mixtures thereof. It is understood that other diacylphosphatidylcholine and diacylphosphatidylethanolamine phospholipids can also be used. The acyl groups in these lipids are preferably acyl groups derived from fatty acids having C10-C24 carbon chains, e.g., lauroyl, myristoyl, palmitoyl, stearoyl, or oleoyl. Additional exemplary lipids, in certain embodiments, include, without limitation, those described in Kim et al. (2020) dx.doi.org/10.1021/acs.nanolett.0c01386, incorporated herein by reference. Such lipids include, in some embodiments, plant lipids found to improve liver transfection with mRNA (e.g., DGTS). In some embodiments, the non-cationic lipid may have the following structure:




embedded image


Other examples of non-cationic lipids suitable for use in the lipid nanoparticles include, without limitation, nonphosphorous lipids such as, e.g., stearylamine, dodecylamine, hexadecylamine, acetyl palmitate, glycerol ricinoleate, hexadecyl stearate, isopropyl myristate, amphoteric acrylic polymers, triethanolamine-lauryl sulfate, alkyl-aryl sulfate polyethyloxylated fatty acid amides, dioctadecyl dimethyl ammonium bromide, ceramide, sphingomyelin, and the like. Other non-cationic lipids are described in WO2017/099823 or US patent publication US2018/0028664, the contents of which are incorporated herein by reference in their entirety.


In some embodiments, the non-cationic lipid is oleic acid or a compound of Formula I, II, or IV of US2018/0028664, incorporated herein by reference in its entirety. The non-cationic lipid can comprise, for example, 0-30% (mol) of the total lipid present in the lipid nanoparticle. In some embodiments, the non-cationic lipid content is 5-20% (mol) or 10-15% (mol) of the total lipid present in the lipid nanoparticle. In embodiments, the molar ratio of ionizable lipid to the neutral lipid ranges from about 2:1 to about 8:1 (e.g., about 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, or 8:1).


In some embodiments, the lipid nanoparticles do not comprise any phospholipids.


In some aspects, the lipid nanoparticle can further comprise a component, such as a sterol, to provide membrane integrity. Exemplary sterols that can be used in the lipid nanoparticle include cholesterol and derivatives thereof. Non-limiting examples of cholesterol derivatives include polar analogues such as 5α-cholestanol, 5β-coprostanol, cholesteryl-(2′-hydroxy)-ethyl ether, cholesteryl-(4′-hydroxy)-butyl ether, and 6-ketocholestanol; non-polar analogues such as 5α-cholestane, cholestenone, 5α-cholestanone, 5β-cholestanone, and cholesteryl decanoate; and mixtures thereof. In some embodiments, the cholesterol derivative is a polar analogue, e.g., cholesteryl-(4′-hydroxy)-butyl ether. Exemplary cholesterol derivatives are described in PCT publication WO2009/127060 and US patent publication US2010/0130588, each of which is incorporated herein by reference in its entirety.


In some embodiments, the component providing membrane integrity, such as a sterol, can comprise 0-50% (mol) (e.g., 0-10%, 10-20%, 20-30%, 30-40%, or 40-50%) of the total lipid present in the lipid nanoparticle. In some embodiments, such a component is 20-50% (mol) or 30-40% (mol) of the total lipid content of the lipid nanoparticle.


In some embodiments, the lipid nanoparticle can comprise a polyethylene glycol (PEG) or a conjugated lipid molecule. Generally, these are used to inhibit aggregation of lipid nanoparticles and/or provide steric stabilization. Exemplary conjugated lipids include, but are not limited to, PEG-lipid conjugates, polyoxazoline (POZ)-lipid conjugates, polyamide-lipid conjugates (such as ATTA-lipid conjugates), cationic-polymer lipid (CPL) conjugates, and mixtures thereof. In some embodiments, the conjugated lipid molecule is a PEG-lipid conjugate, for example, a (methoxy polyethylene glycol)-conjugated lipid.


Exemplary PEG-lipid conjugates include, but are not limited to, PEG-diacylglycerol (DAG) (such as 1-(monomethoxy-polyethyleneglycol)-2,3-dimyristoylglycerol (PEG-DMG)), PEG-dialkyloxypropyl (DAA), PEG-phospholipid, PEG-ceramide (Cer), a pegylated phosphatidylethanolamine (PEG-PE), 1,2-dimyristoyl-sn-glycerol, methoxypolyethylene glycol (DMG-PEG-2K), PEG succinate diacylglycerol (PEGS-DAG) (such as 4-O-(2′,3′-di(tetradecanoyloxy)propyl-1-O-(ω-methoxy(polyethoxy)ethyl) butanedioate (PEG-S-DMG)), PEG dialkoxypropylcarbamate, N-(carbonyl-methoxypolyethylene glycol 2000)-1,2-distearoyl-sn-glycero-3-phosphoethanolamine sodium salt, or a mixture thereof. Additional exemplary PEG-lipid conjugates are described, for example, in U.S. Pat. Nos. 5,885,613 and 6,287,591, US2003/0077829, US2005/0175682, US2008/0020058, US2011/0117125, US2010/0130588, US2016/0376224, US2017/0119904, and US/099823, the contents of all of which are incorporated herein by reference in their entirety. In some embodiments, a PEG-lipid is a compound of Formula III, III-a-1, III-a-2, III-b-1, III-b-2, or V of US2018/0028664, the content of which is incorporated herein by reference in its entirety. In some embodiments, a PEG-lipid is of Formula II of US20150376115 or US2016/0376224, the contents of both of which are incorporated herein by reference in their entirety. In some embodiments, the PEG-DAA conjugate can be, for example, PEG-dilauryloxypropyl, PEG-dimyristyloxypropyl, PEG-dipalmityloxypropyl, or PEG-distearyloxypropyl. 
The PEG-lipid can be one or more of PEG-DMG, PEG-dilaurylglycerol, PEG-dipalmitoylglycerol, PEG-distearylglycerol, PEG-dilaurylglycamide, PEG-dimyristylglycamide, PEG-dipalmitoylglycamide, PEG-distearylglycamide, PEG-cholesterol (1-[8′-(Cholest-5-en-3β-oxy)carboxamido-3′,6′-dioxaoctanyl]carbamoyl-ω-methyl-poly(ethylene glycol)), PEG-DMB (3,4-Ditetradecoxylbenzyl-ω-methyl-poly(ethylene glycol) ether), and 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000]. In some embodiments, the PEG-lipid comprises PEG-DMG or 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000]. In some embodiments, the PEG-lipid comprises a structure selected from:




embedded image


In some embodiments, lipids conjugated with a molecule other than a PEG can also be used in place of PEG-lipid. For example, polyoxazoline (POZ)-lipid conjugates, polyamide-lipid conjugates (such as ATTA-lipid conjugates), and cationic-polymer lipid (CPL) conjugates can be used in place of or in addition to the PEG-lipid.


Exemplary conjugated lipids, i.e., PEG-lipids, (POZ)-lipid conjugates, ATTA-lipid conjugates and cationic polymer-lipids are described in the PCT and US patent applications listed in Table 2 of WO2019051289A9 and in WO2020106946A1, the contents of all of which are incorporated herein by reference in their entirety.


In some embodiments an LNP comprises a compound of Formula (xix), a compound of Formula (xxi) and a compound of Formula (xxv). In some embodiments an LNP comprising a formulation of Formula (xix), Formula (xxi) and Formula (xxv) is used to deliver a gene modifying composition described herein to the lung or pulmonary cells.


In some embodiments, a lipid nanoparticle may comprise one or more cationic lipids selected from Formula (i), Formula (ii), Formula (iii), Formula (vii), and Formula (ix). In some embodiments, the LNP may further comprise one or more neutral lipids, e.g., DSPC, DPPC, DMPC, DOPC, POPC, DOPE, or SM; a steroid, e.g., cholesterol; and/or one or more polymer-conjugated lipids, e.g., a pegylated lipid, e.g., PEG-DAG, PEG-PE, PEG-S-DAG, PEG-cer, or a PEG dialkoxypropylcarbamate.


In some embodiments, the PEG or the conjugated lipid can comprise 0-20% (mol) of the total lipid present in the lipid nanoparticle. In some embodiments, PEG or the conjugated lipid content is 0.5-10% or 2-5% (mol) of the total lipid present in the lipid nanoparticle. Molar ratios of the ionizable lipid, non-cationic-lipid, sterol, and PEG/conjugated lipid can be varied as needed. For example, the lipid particle can comprise 30-70% ionizable lipid by mole or by total weight of the composition, 0-60% cholesterol by mole or by total weight of the composition, 0-30% non-cationic-lipid by mole or by total weight of the composition and 1-10% conjugated lipid by mole or by total weight of the composition. Preferably, the composition comprises 30-40% ionizable lipid by mole or by total weight of the composition, 40-50% cholesterol by mole or by total weight of the composition, and 10-20% non-cationic-lipid by mole or by total weight of the composition. In some other embodiments, the composition is 50-75% ionizable lipid by mole or by total weight of the composition, 20-40% cholesterol by mole or by total weight of the composition, and 5 to 10% non-cationic-lipid, by mole or by total weight of the composition and 1-10% conjugated lipid by mole or by total weight of the composition. The composition may contain 60-70% ionizable lipid by mole or by total weight of the composition, 25-35% cholesterol by mole or by total weight of the composition, and 5-10% non-cationic-lipid by mole or by total weight of the composition. The composition may also contain up to 90% ionizable lipid by mole or by total weight of the composition and 2 to 15% non-cationic lipid by mole or by total weight of the composition. 
The formulation may also be a lipid nanoparticle formulation, for example comprising 8-30% ionizable lipid by mole or by total weight of the composition, 5-30% non-cationic lipid by mole or by total weight of the composition, and 0-20% cholesterol by mole or by total weight of the composition; 4-25% ionizable lipid by mole or by total weight of the composition, 4-25% non-cationic lipid by mole or by total weight of the composition, 2 to 25% cholesterol by mole or by total weight of the composition, 10 to 35% conjugate lipid by mole or by total weight of the composition, and 5% cholesterol by mole or by total weight of the composition; or 2-30% ionizable lipid by mole or by total weight of the composition, 2-30% non-cationic lipid by mole or by total weight of the composition, 1 to 15% cholesterol by mole or by total weight of the composition, 2 to 35% conjugate lipid by mole or by total weight of the composition, and 1-20% cholesterol by mole or by total weight of the composition; or even up to 90% ionizable lipid by mole or by total weight of the composition and 2-10% non-cationic lipids by mole or by total weight of the composition, or even 100% cationic lipid by mole or by total weight of the composition. In some embodiments, the lipid particle formulation comprises ionizable lipid, phospholipid, cholesterol and a PEG-ylated lipid in a molar ratio of 50:10:38.5:1.5. In some other embodiments, the lipid particle formulation comprises ionizable lipid, cholesterol and a PEG-ylated lipid in a molar ratio of 60:38.5:1.5.


In some embodiments, the lipid particle comprises ionizable lipid, non-cationic lipid (e.g., phospholipid), a sterol (e.g., cholesterol), and a PEG-ylated lipid, where the mole percent of ionizable lipid ranges from 20 to 70, with a target of 40 to 60; the mole percent of non-cationic lipid ranges from 0 to 30, with a target of 0 to 15; the mole percent of sterol ranges from 20 to 70, with a target of 30 to 50; and the mole percent of PEG-ylated lipid ranges from 1 to 6, with a target of 2 to 5.
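
By way of illustration only, the mole percent ranges and targets recited in the preceding paragraph can be expressed as a simple check on a candidate composition. The following Python sketch is not part of the disclosed methods; the component labels (ionizable, non_cationic, sterol, peg) and the function name classify_composition are hypothetical names chosen for this example.

```python
# Mole-percent windows recited in the preceding paragraph: (minimum, maximum).
# The dictionary keys are hypothetical labels chosen for this sketch.
RANGES = {"ionizable": (20, 70), "non_cationic": (0, 30), "sterol": (20, 70), "peg": (1, 6)}
TARGETS = {"ionizable": (40, 60), "non_cationic": (0, 15), "sterol": (30, 50), "peg": (2, 5)}


def classify_composition(mol_percent):
    """Label each component as 'target', 'in range', or 'out of range'."""
    labels = {}
    for name, value in mol_percent.items():
        low, high = RANGES[name]
        target_low, target_high = TARGETS[name]
        if target_low <= value <= target_high:
            labels[name] = "target"
        elif low <= value <= high:
            labels[name] = "in range"
        else:
            labels[name] = "out of range"
    return labels


# Example composition in mole percent (a 50:10:38.5:1.5 molar ratio).
example = {"ionizable": 50, "non_cationic": 10, "sterol": 38.5, "peg": 1.5}
labels = classify_composition(example)
print(labels)
```

In this example, the PEG-ylated lipid at 1.5 mole percent falls within the 1 to 6 range but below the 2 to 5 target, while the other three components fall within their target windows.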


In some embodiments, the lipid particle comprises ionizable lipid/non-cationic-lipid/sterol/conjugated lipid at a molar ratio of 50:10:38.5:1.5.
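
As a non-limiting illustration, a molar ratio such as 50:10:38.5:1.5 can be normalized to mole percent, and the ratio of ionizable lipid to non-cationic lipid can be read off directly. The Python sketch below assumes hypothetical component labels and a hypothetical function name (mole_percent); it is offered only to make the arithmetic explicit.

```python
def mole_percent(molar_ratio):
    """Normalize a molar ratio (component -> parts) to mole percent."""
    total = sum(molar_ratio.values())
    if total <= 0:
        raise ValueError("molar ratio must sum to a positive value")
    return {name: 100.0 * parts / total for name, parts in molar_ratio.items()}


# Keys are hypothetical labels for the four components of the 50:10:38.5:1.5 ratio.
ratio = {"ionizable": 50, "non_cationic": 10, "sterol": 38.5, "conjugated": 1.5}
percent = mole_percent(ratio)
ionizable_to_non_cationic = percent["ionizable"] / percent["non_cationic"]
print(percent, ionizable_to_non_cationic)
```

Because the four parts sum to 100, this ratio can be read directly as mole percent. The ionizable lipid to non-cationic lipid ratio in this example is 5:1.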


In an aspect, the disclosure provides a lipid nanoparticle formulation comprising phospholipids, lecithin, phosphatidylcholine and phosphatidylethanolamine.


In some embodiments, one or more additional compounds can also be included. Those compounds can be administered separately or the additional compounds can be included in the lipid nanoparticles of the invention. In other words, the lipid nanoparticles can contain other compounds in addition to the nucleic acid or at least a second nucleic acid, different than the first. Without limitation, other additional compounds can be selected from the group consisting of small or large organic or inorganic molecules, monosaccharides, disaccharides, trisaccharides, oligosaccharides, polysaccharides, peptides, proteins, peptide analogs and derivatives thereof, peptidomimetics, nucleic acids, nucleic acid analogs and derivatives, an extract made from biological materials, or any combinations thereof.


In some embodiments, a lipid nanoparticle (or a formulation comprising lipid nanoparticles) lacks reactive impurities (e.g., aldehydes or ketones), or comprises less than a preselected level of reactive impurities (e.g., aldehydes or ketones). While not wishing to be bound by theory, in some embodiments, a lipid reagent is used to make a lipid nanoparticle formulation, and the lipid reagent may comprise a contaminating reactive impurity (e.g., an aldehyde or ketone). A lipid reagent may be selected for manufacturing based on having less than a preselected level of reactive impurities (e.g., aldehydes or ketones). Without wishing to be bound by theory, in some embodiments, aldehydes can cause modification and damage of RNA, e.g., cross-linking between bases and/or covalently conjugating lipid to RNA (e.g., forming lipid-RNA adducts). This may, in some instances, lead to failure of a reverse transcriptase reaction and/or incorporation of inappropriate bases, e.g., at the site(s) of lesion(s), e.g., a mutation in a newly synthesized target DNA.


In some embodiments, a lipid nanoparticle formulation is produced using a lipid reagent comprising less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content. In some embodiments, a lipid nanoparticle formulation is produced using a lipid reagent comprising less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species. In some embodiments, a lipid nanoparticle formulation is produced using a lipid reagent comprising: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species. In some embodiments, the lipid nanoparticle formulation is produced using a plurality of lipid reagents, and each lipid reagent of the plurality independently meets one or more criterion described in this paragraph. In some embodiments, each lipid reagent of the plurality meets the same criterion, e.g., a criterion of this paragraph.
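
For illustration, the two-part criterion of the preceding paragraph (a limit on total reactive impurity content and a limit on any single reactive impurity species) can be applied to each lipid reagent of a plurality as sketched below in Python. The measurement values, species names, chosen limits (1% total, 0.5% single species), and the function name meets_impurity_criteria are all hypothetical; the basis on which the percentages are expressed is left to the assay used.

```python
def meets_impurity_criteria(impurity_pct, total_limit_pct, single_limit_pct):
    """Return True if total and single-species impurity levels are both below their limits.

    impurity_pct maps each reactive impurity species to its measured percentage.
    """
    total = sum(impurity_pct.values())
    largest = max(impurity_pct.values(), default=0.0)
    return total < total_limit_pct and largest < single_limit_pct


# Hypothetical measurements for two lipid reagents (percent of each aldehyde species).
reagents = {
    "reagent_1": {"aldehyde_a": 0.25, "aldehyde_b": 0.125},
    "reagent_2": {"aldehyde_a": 0.75, "aldehyde_b": 0.5},
}
results = {
    name: meets_impurity_criteria(levels, total_limit_pct=1.0, single_limit_pct=0.5)
    for name, levels in reagents.items()
}
all_pass = all(results.values())
print(results, all_pass)
```

Here each reagent is evaluated independently against the same criterion, and all_pass is true only when every reagent of the plurality satisfies it.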


In some embodiments, the lipid nanoparticle formulation comprises less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content. In some embodiments, the lipid nanoparticle formulation comprises less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species. In some embodiments, the lipid nanoparticle formulation comprises: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.


In some embodiments, one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content. In some embodiments, one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species. In some embodiments, one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.


In some embodiments, total aldehyde content and/or quantity of any single reactive impurity (e.g., aldehyde) species is determined by liquid chromatography (LC), e.g., coupled with tandem mass spectrometry (MS/MS), e.g., according to the method described in Example 40 of PCT/US21/20948. In some embodiments, reactive impurity (e.g., aldehyde) content and/or quantity of reactive impurity (e.g., aldehyde) species is determined by detecting one or more chemical modifications of a nucleic acid molecule (e.g., an RNA molecule, e.g., as described herein) associated with the presence of reactive impurities (e.g., aldehydes), e.g., in the lipid reagents. In some embodiments, reactive impurity (e.g., aldehyde) content and/or quantity of reactive impurity (e.g., aldehyde) species is determined by detecting one or more chemical modifications of a nucleotide or nucleoside (e.g., a ribonucleotide or ribonucleoside, e.g., comprised in or isolated from a template nucleic acid, e.g., as described herein) associated with the presence of reactive impurities (e.g., aldehydes), e.g., in the lipid reagents, e.g., according to the method described in Example 41 of PCT/US21/20948. In embodiments, chemical modifications of a nucleic acid molecule, nucleotide, or nucleoside are detected by determining the presence of one or more modified nucleotides or nucleosides, e.g., using LC-MS/MS analysis, e.g., according to the method described in Example 41 of PCT/US21/20948.


In some embodiments, a nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide) does not comprise an aldehyde modification, or comprises less than a preselected amount of aldehyde modifications. In some embodiments, on average, a nucleic acid has less than 50, 20, 10, 5, 2, or 1 aldehyde modifications per 1000 nucleotides, e.g., wherein a single cross-linking of two nucleotides is a single aldehyde modification. In some embodiments, the aldehyde modification is an RNA adduct (e.g., a lipid-RNA adduct). In some embodiments, the aldehyde modification is a cross-link between bases. In some embodiments, a nucleic acid (e.g., RNA) described herein comprises less than 50, 20, 10, 5, 2, or 1 cross-links between nucleotides.
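
As a worked illustration of the "per 1000 nucleotides" measure, the Python sketch below converts a count of detected aldehyde modifications into an average rate and compares it with the recited thresholds. The counts and the function name are hypothetical and do not correspond to any measurement in this disclosure.

```python
def aldehyde_modifications_per_1000_nt(total_modifications, total_nucleotides):
    """Average number of aldehyde modifications per 1000 nucleotides."""
    if total_nucleotides <= 0:
        raise ValueError("total_nucleotides must be positive")
    return 1000.0 * total_modifications / total_nucleotides


# Hypothetical example: 3 modifications detected across 6000 analyzed nucleotides.
rate = aldehyde_modifications_per_1000_nt(3, 6000)
thresholds = [50, 20, 10, 5, 2, 1]
satisfied = [t for t in thresholds if rate < t]
print(rate, satisfied)
```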


In some embodiments, LNPs are directed to specific tissues by the addition of targeting domains. For example, biological ligands may be displayed on the surface of LNPs to enhance interaction with cells displaying cognate receptors, thus driving association with and cargo delivery to tissues wherein cells express the receptor. In some embodiments, the biological ligand may be a ligand that drives delivery to the liver, e.g., LNPs that display GalNAc result in delivery of nucleic acid cargo to hepatocytes that display asialoglycoprotein receptor (ASGPR). The work of Akinc et al. Mol Ther 18(7): 1357-1364 (2010) teaches the conjugation of a trivalent GalNAc ligand to a PEG-lipid (GalNAc-PEG-DSG) to yield LNPs dependent on ASGPR for observable LNP cargo effect (see, e.g., FIG. 6 therein). Other ligand-displaying LNP formulations, e.g., incorporating folate, transferrin, or antibodies, are discussed in WO2017223135, which is incorporated herein by reference in its entirety, in addition to the references used therein, namely Kolhatkar et al., Curr Drug Discov Technol. 2011 8:197-206; Musacchio and Torchilin, Front Biosci. 2011 16:1388-1412; Yu et al., Mol Membr Biol. 2010 27:286-298; Patil et al., Crit Rev Ther Drug Carrier Syst. 2008 25:1-61; Benoit et al., Biomacromolecules. 2011 12:2708-2714; Zhao et al., Expert Opin Drug Deliv. 2008 5:309-319; Akinc et al., Mol Ther. 2010 18:1357-1364; Srinivasan et al., Methods Mol Biol. 2012 820:105-116; Ben-Arie et al., Methods Mol Biol. 2012 757:497-507; Peer 2010 J Control Release. 20:63-68; Peer et al., Proc Natl Acad Sci USA. 2007 104:4095-4100; Kim et al., Methods Mol Biol. 2011 721:339-353; Subramanya et al., Mol Ther. 2010 18:2028-2037; Song et al., Nat Biotechnol. 2005 23:709-717; Peer et al., Science. 2008 319:627-630; and Peer and Lieberman, Gene Ther. 2011 18:1127-1133.


In some embodiments, LNPs are selected for tissue-specific activity by the addition of a Selective ORgan Targeting (SORT) molecule to a formulation comprising traditional components, such as ionizable cationic lipids, amphipathic phospholipids, cholesterol and poly(ethylene glycol) (PEG) lipids. The teachings of Cheng et al. Nat Nanotechnol 15(4):313-320 (2020) demonstrate that the addition of a supplemental “SORT” component precisely alters the in vivo RNA delivery profile and mediates tissue-specific (e.g., lungs, liver, spleen) gene delivery and editing as a function of the percentage and biophysical property of the SORT molecule.


In some embodiments, the LNPs comprise biodegradable, ionizable lipids. In some embodiments, the LNPs comprise (9Z,12Z)-3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl octadeca-9,12-dienoate (also called 3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl (9Z,12Z)-octadeca-9,12-dienoate) or another ionizable lipid. See, e.g., lipids of WO2019/067992, WO2017/173054, WO2015/095340, and WO2014/136086, as well as references provided therein. In some embodiments, the terms cationic and ionizable in the context of LNP lipids are interchangeable, e.g., wherein ionizable lipids are cationic depending on the pH.


In some embodiments, an LNP described herein comprises a lipid described in Table 19.









TABLE 19
Exemplary Lipids

LIPID ID | Chemical Name | Molecular Weight | Structure
LIPIDV003 | (9Z,12Z)-3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl octadeca-9,12-dienoate | 852.29 | embedded image
LIPIDV004 | Heptadecan-9-yl 8-((2-hydroxyethyl)(8-(nonyloxy)-8-oxooctyl)amino)octanoate | 710.18 | embedded image
LIPIDV005 | | 919.56 | embedded image











In some embodiments, multiple components of a gene modifying system may be prepared as a single LNP formulation, e.g., an LNP formulation comprising mRNA encoding the gene modifying polypeptide and an RNA template. Ratios of nucleic acid components may be varied in order to optimize the properties of a therapeutic. In some embodiments, the ratio of RNA template to mRNA encoding a gene modifying polypeptide is about 1:1 to 100:1, e.g., about 1:1 to 20:1, about 20:1 to 40:1, about 40:1 to 60:1, about 60:1 to 80:1, or about 80:1 to 100:1, by molar ratio. In other embodiments, a system of multiple nucleic acids may be prepared by separate formulations, e.g., one LNP formulation comprising a template RNA and a second LNP formulation comprising an mRNA encoding a gene modifying polypeptide. In some embodiments, the system may comprise more than two nucleic acid components formulated into LNPs. In some embodiments, the system may comprise a protein, e.g., a gene modifying polypeptide, and a template RNA formulated into at least one LNP formulation.
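
Because the ratios above are stated on a molar basis, it may be convenient when weighing out material to convert a molar ratio into an approximate mass ratio. The Python sketch below does so under the simplifying assumption that the template RNA and the mRNA have the same average mass per nucleotide, so that only their lengths matter; the lengths used (200 and 5000 nucleotides) and the function name are hypothetical.

```python
def template_to_mrna_mass_ratio(molar_ratio, template_length_nt, mrna_length_nt):
    """Convert a template:mRNA molar ratio to an approximate mass ratio.

    Assumes both RNAs have the same average mass per nucleotide, so the
    per-nucleotide mass cancels and only the lengths matter.
    """
    if template_length_nt <= 0 or mrna_length_nt <= 0:
        raise ValueError("lengths must be positive")
    return molar_ratio * template_length_nt / mrna_length_nt


# Hypothetical lengths: a 200-nt template RNA and a 5000-nt mRNA at a 40:1 molar ratio.
mass_ratio = template_to_mrna_mass_ratio(40, 200, 5000)
print(mass_ratio)
```

Under these assumptions, a 40:1 molar ratio corresponds to roughly 1.6 parts template RNA to 1 part mRNA by mass.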


In some embodiments, the average LNP diameter of the LNP formulation may be between tens of nanometers and hundreds of nanometers, e.g., as measured by dynamic light scattering (DLS). In some embodiments, the average LNP diameter of the LNP formulation may be from about 40 nm to about 150 nm, such as about 40 nm, 45 nm, 50 nm, 55 nm, 60 nm, 65 nm, 70 nm, 75 nm, 80 nm, 85 nm, 90 nm, 95 nm, 100 nm, 105 nm, 110 nm, 115 nm, 120 nm, 125 nm, 130 nm, 135 nm, 140 nm, 145 nm, or 150 nm. In some embodiments, the average LNP diameter of the LNP formulation may be from about 50 nm to about 100 nm, from about 50 nm to about 90 nm, from about 50 nm to about 80 nm, from about 50 nm to about 70 nm, from about 50 nm to about 60 nm, from about 60 nm to about 100 nm, from about 60 nm to about 90 nm, from about 60 nm to about 80 nm, from about 60 nm to about 70 nm, from about 70 nm to about 100 nm, from about 70 nm to about 90 nm, from about 70 nm to about 80 nm, from about 80 nm to about 100 nm, from about 80 nm to about 90 nm, or from about 90 nm to about 100 nm. In some embodiments, the average LNP diameter of the LNP formulation may be from about 70 nm to about 100 nm. In a particular embodiment, the average LNP diameter of the LNP formulation may be about 80 nm. In some embodiments, the average LNP diameter of the LNP formulation may be about 100 nm. In some embodiments, the average LNP diameter of the LNP formulation ranges from about 1 nm to about 500 nm, from about 5 nm to about 200 nm, from about 10 nm to about 100 nm, from about 20 nm to about 80 nm, from about 25 nm to about 60 nm, from about 30 nm to about 55 nm, from about 35 nm to about 50 nm, or from about 38 nm to about 42 nm.


An LNP may, in some instances, be relatively homogeneous. A polydispersity index may be used to indicate the homogeneity of an LNP, e.g., the particle size distribution of the lipid nanoparticles. A small (e.g., less than 0.3) polydispersity index generally indicates a narrow particle size distribution. An LNP may have a polydispersity index from about 0 to about 0.25, such as 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, or 0.25. In some embodiments, the polydispersity index of an LNP may be from about 0.10 to about 0.20.
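
To make the relationship between size distribution and homogeneity concrete, the Python sketch below computes a mean diameter and the squared ratio of standard deviation to mean for a list of particle diameters. This squared ratio is used here only as a rough stand-in for a polydispersity index; instruments for dynamic light scattering typically report a value derived from a cumulants fit of the correlation data rather than from individual particle sizes. The diameters and the function name are hypothetical.

```python
def mean_and_spread_index(diameters_nm):
    """Return the mean diameter and (standard deviation / mean) squared.

    The second value is only a rough stand-in for a polydispersity index;
    DLS instruments derive their reported value from a cumulants fit instead.
    """
    n = len(diameters_nm)
    if n == 0:
        raise ValueError("at least one diameter is required")
    mean = sum(diameters_nm) / n
    variance = sum((d - mean) ** 2 for d in diameters_nm) / n
    return mean, variance / mean ** 2


# Hypothetical particle diameters in nanometers.
mean_nm, index = mean_and_spread_index([70, 80, 90, 80, 80])
print(mean_nm, index)
```

A narrow distribution such as this one gives a value well below 0.3, consistent with the statement that a small index indicates a narrow particle size distribution.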


The zeta potential of an LNP may be used to indicate the electrokinetic potential of the composition. In some embodiments, the zeta potential may describe the surface charge of an LNP. Lipid nanoparticles with relatively low charges, positive or negative, are generally desirable, as more highly charged species may interact undesirably with cells, tissues, and other elements in the body. In some embodiments, the zeta potential of an LNP may be from about −10 mV to about +20 mV, from about −10 mV to about +15 mV, from about −10 mV to about +10 mV, from about −10 mV to about +5 mV, from about −10 mV to about 0 mV, from about −10 mV to about −5 mV, from about −5 mV to about +20 mV, from about −5 mV to about +15 mV, from about −5 mV to about +10 mV, from about −5 mV to about +5 mV, from about −5 mV to about 0 mV, from about 0 mV to about +20 mV, from about 0 mV to about +15 mV, from about 0 mV to about +10 mV, from about 0 mV to about +5 mV, from about +5 mV to about +20 mV, from about +5 mV to about +15 mV, or from about +5 mV to about +10 mV.


The efficiency of encapsulation of a protein and/or nucleic acid, e.g., gene modifying polypeptide or mRNA encoding the polypeptide, describes the amount of protein and/or nucleic acid that is encapsulated or otherwise associated with an LNP after preparation, relative to the initial amount provided. The encapsulation efficiency is desirably high (e.g., close to 100%). The encapsulation efficiency may be measured, for example, by comparing the amount of protein or nucleic acid in a solution containing the lipid nanoparticle before and after breaking up the lipid nanoparticle with one or more organic solvents or detergents. An anion exchange resin may be used to measure the amount of free protein or nucleic acid (e.g., RNA) in a solution. Fluorescence may be used to measure the amount of free protein and/or nucleic acid (e.g., RNA) in a solution. For the lipid nanoparticles described herein, the encapsulation efficiency of a protein and/or nucleic acid may be at least 50%, for example 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%. In some embodiments, the encapsulation efficiency may be at least 80%. In some embodiments, the encapsulation efficiency may be at least 90%. In some embodiments, the encapsulation efficiency may be at least 95%.
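By way of non-limiting illustration, the before-and-after comparison described above can be sketched as a short calculation. The sketch assumes the common convention in which the signal measured after disrupting the LNP represents total cargo and the signal measured on the intact sample represents free (unencapsulated) cargo; the function name and argument names are hypothetical and are not taken from this disclosure.

```python
def encapsulation_efficiency(total_signal: float, free_signal: float) -> float:
    """Return the percent of cargo associated with the LNP.

    total_signal: cargo measured after breaking up the LNP
                  (e.g., with an organic solvent or detergent).
    free_signal:  cargo measured in the intact sample, i.e., cargo
                  that is not encapsulated.
    Both values must be in the same units (e.g., fluorescence units).
    """
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    if not 0 <= free_signal <= total_signal:
        raise ValueError("free_signal must lie between 0 and total_signal")
    return 100.0 * (total_signal - free_signal) / total_signal


# Hypothetical readings: 1000 units after disruption, 50 units before.
print(encapsulation_efficiency(1000.0, 50.0))
```

Under these assumed readings the formulation would satisfy the "at least 90%" and "at least 95%" embodiments.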


An LNP may optionally comprise one or more coatings. In some embodiments, an LNP may be formulated in a capsule, film, or tablet having a coating. A capsule, film, or tablet including a composition described herein may have any useful size, tensile strength, hardness, or density.


Additional exemplary lipids, formulations, methods, and characterization of LNPs are taught by WO2020061457, which is incorporated herein by reference in its entirety.


In some embodiments, in vitro or ex vivo cell lipofections are performed using Lipofectamine MessengerMax (Thermo Fisher) or TransIT-mRNA Transfection Reagent (Mirus Bio). In certain embodiments, LNPs are formulated using the GenVoy-ILM ionizable lipid mix (Precision NanoSystems). In certain embodiments, LNPs are formulated using 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA) or dilinoleylmethyl-4-dimethylaminobutyrate (DLin-MC3-DMA or MC3), the formulation and in vivo use of which are taught in Jayaraman et al. Angew Chem Int Ed Engl 51(34):8529-8533 (2012), incorporated herein by reference in its entirety.


LNP formulations optimized for the delivery of CRISPR-Cas systems, e.g., Cas9-gRNA RNP, gRNA, Cas9 mRNA, are described in WO2019067992 and WO2019067910, both incorporated by reference.


Additional specific LNP formulations useful for delivery of nucleic acids are described in U.S. Pat. Nos. 8,158,601 and 8,168,775, both incorporated by reference, which include formulations used in patisiran, sold under the name ONPATTRO.


Exemplary dosing of gene modifying LNP may include about 0.1, 0.25, 0.3, 0.5, 1, 2, 3, 4, 5, 6, 8, 10, or 100 mg/kg (RNA). Exemplary dosing of AAV comprising a nucleic acid encoding one or more components of the system may include an MOI of about 10^11, 10^12, 10^13, or 10^14 vg/kg.


Kits, Articles of Manufacture, and Pharmaceutical Compositions

In an aspect the disclosure provides a kit comprising a gene modifying polypeptide or a gene modifying system, e.g., as described herein. In some embodiments, the kit comprises a gene modifying polypeptide (or a nucleic acid encoding the polypeptide) and a template RNA (or DNA encoding the template RNA). In some embodiments, the kit further comprises a reagent for introducing the system into a cell, e.g., transfection reagent, LNP, and the like. In some embodiments, the kit is suitable for any of the methods described herein. In some embodiments, the kit comprises one or more elements, compositions (e.g., pharmaceutical compositions), gene modifying polypeptides, and/or gene modifying systems, or a functional fragment or component thereof, e.g., disposed in an article of manufacture. In some embodiments, the kit comprises instructions for use thereof.


In an aspect, the disclosure provides an article of manufacture, e.g., in which a kit as described herein, or a component thereof, is disposed.


In an aspect, the disclosure provides a pharmaceutical composition comprising a gene modifying polypeptide or a gene modifying system, e.g., as described herein. In some embodiments, the pharmaceutical composition further comprises a pharmaceutically acceptable carrier or excipient. In some embodiments, the pharmaceutical composition comprises a template RNA and/or an RNA encoding the polypeptide. In embodiments, the pharmaceutical composition has one or more (e.g., 1, 2, 3, or 4) of the following characteristics:

    • (a) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) DNA template relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (b) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) uncapped RNA relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (c) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) partial length RNAs relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (d) substantially lacks unreacted cap dinucleotides.


Chemistry, Manufacturing, and Controls (CMC)

Purification of protein therapeutics is described, for example, in Franks, Protein Biotechnology: Isolation, Characterization, and Stabilization, Humana Press (2013); and in Cutler, Protein Purification Protocols (Methods in Molecular Biology), Humana Press (2010).


In some embodiments, a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA) conforms to certain quality standards. In some embodiments, a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA) produced by a method described herein conforms to certain quality standards. Accordingly, the disclosure is directed, in some aspects, to methods of manufacturing a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA) that conforms to certain quality standards, e.g., in which said quality standards are assayed. The disclosure is also directed, in some aspects, to methods of assaying said quality standards in a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA). In some embodiments, quality standards include, but are not limited to, one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12) of the following:

    • (i) the length of the template RNA, e.g., whether the template RNA has a length that is above a reference length or within a reference length range, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the template RNA present is greater than 100, 125, 150, 175, or 200 nucleotides long;
    • (ii) the presence, absence, and/or length of a polyA tail on the template RNA, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the template RNA present contains a polyA tail (e.g., a polyA tail that is at least 5, 10, 20, 30, 50, 70, or 100 nucleotides in length (SEQ ID NO: 22004));
    • (iii) the presence, absence, and/or type of a 5′ cap on the template RNA, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the template RNA present contains a 5′ cap, e.g., whether that cap is a 7-methylguanosine cap, e.g., an O-Me-m7G cap;
    • (iv) the presence, absence, and/or type of one or more modified nucleotides (e.g., selected from pseudouridine, dihydrouridine, inosine, 7-methylguanosine, 1-N-methylpseudouridine (1-Me-Ψ), 5-methoxyuridine (5-MO-U), 5-methylcytidine (5mC), or a locked nucleotide) in the template RNA, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the template RNA present contains one or more modified nucleotides;
    • (v) the stability of the template RNA (e.g., over time and/or under a pre-selected condition), e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the template RNA remains intact (e.g., greater than 100, 125, 150, 175, or 200 nucleotides long) after a stability test;
    • (vi) the potency of the template RNA in a system for modifying DNA, e.g., whether at least 1% of target sites are modified after a system comprising the template RNA is assayed for potency;
    • (vii) the length of the polypeptide, first polypeptide, or second polypeptide, e.g., whether the polypeptide, first polypeptide, or second polypeptide has a length that is above a reference length or within a reference length range, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the polypeptide, first polypeptide, or second polypeptide present is greater than 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1600, 1700, 1800, 1900, or 2000 amino acids long (and optionally, no larger than 2500, 2000, 1500, 1400, 1300, 1200, 1100, 1000, 900, 800, 700, or 600 amino acids long);
    • (viii) the presence, absence, and/or type of post-translational modification on the polypeptide, first polypeptide, or second polypeptide, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the polypeptide, first polypeptide, or second polypeptide contains phosphorylation, methylation, acetylation, myristoylation, palmitoylation, isoprenylation, glypiation, or lipoylation, or any combination thereof;
    • (ix) the presence, absence, and/or type of one or more artificial, synthetic, or non-canonical amino acids (e.g., selected from ornithine, β-alanine, GABA, δ-aminolevulinic acid, PABA, a D-amino acid (e.g., D-alanine or D-glutamate), aminoisobutyric acid, dehydroalanine, cystathionine, lanthionine, djenkolic acid, diaminopimelic acid, homoalanine, norvaline, norleucine, homonorleucine, homoserine, O-methyl-homoserine and O-ethyl-homoserine, ethionine, selenocysteine, selenohomocysteine, selenomethionine, selenoethionine, tellurocysteine, or telluromethionine) in the polypeptide, first polypeptide, or second polypeptide, e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the polypeptide, first polypeptide, or second polypeptide present contains one or more artificial, synthetic, or non-canonical amino acids;
    • (x) the stability of the polypeptide, first polypeptide, or second polypeptide (e.g., over time and/or under a pre-selected condition), e.g., whether at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the polypeptide, first polypeptide, or second polypeptide remains intact (e.g., greater than 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1600, 1700, 1800, 1900, or 2000 amino acids long (and optionally, no larger than 2500, 2000, 1500, 1400, 1300, 1200, 1100, 1000, 900, 800, 700, or 600 amino acids long)) after a stability test;
    • (xi) the potency of the polypeptide, first polypeptide, or second polypeptide in a system for modifying DNA, e.g., whether at least 1% of target sites are modified after a system comprising the polypeptide, first polypeptide, or second polypeptide is assayed for potency; or
    • (xii) the presence, absence, and/or level of one or more of a pyrogen, virus, fungus, bacterial pathogen, or host cell protein, e.g., whether the system is free or substantially free of pyrogen, virus, fungus, bacterial pathogen, or host cell protein contamination.
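By way of non-limiting illustration, a length-based quality standard such as (i) or (v) above can be evaluated as the fraction of molecules that exceed a reference length. The sketch below assumes that per-molecule lengths are already available (e.g., from an electrophoretic or sequencing-based assay); the function names, default thresholds, and sample data are hypothetical.

```python
from typing import Sequence


def fraction_above_length(lengths: Sequence[int], reference_length: int) -> float:
    """Fraction of molecules strictly longer than reference_length."""
    if not lengths:
        raise ValueError("at least one length is required")
    return sum(1 for n in lengths if n > reference_length) / len(lengths)


def meets_length_standard(
    lengths: Sequence[int],
    reference_length: int = 200,   # nucleotides; one of the listed options
    min_fraction: float = 0.95,    # one of the listed percentages
) -> bool:
    """True if at least min_fraction of molecules exceed reference_length."""
    return fraction_above_length(lengths, reference_length) >= min_fraction


# Hypothetical preparation: 96 full-length molecules and 4 fragments.
sample = [250] * 96 + [80] * 4
print(fraction_above_length(sample, 200))
print(meets_length_standard(sample, min_fraction=0.95))
print(meets_length_standard(sample, min_fraction=0.99))
```

The same pattern could be applied to the polypeptide length standards of (vii) and (x) by supplying amino acid lengths and an amino acid reference length.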


In some embodiments, a system or pharmaceutical composition described herein is endotoxin free.


In some embodiments, the presence, absence, and/or level of one or more of a pyrogen, virus, fungus, bacterial pathogen, and/or host cell protein is determined. In embodiments, whether the system is free or substantially free of pyrogen, virus, fungus, bacterial pathogen, and/or host cell protein contamination is determined.


In some embodiments, a pharmaceutical composition or system as described herein has one or more (e.g., 1, 2, 3, or 4) of the following characteristics:

    • (a) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) DNA template relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (b) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) uncapped RNA relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (c) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) partial length RNAs relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis;
    • (d) substantially lacks unreacted cap dinucleotides.
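By way of non-limiting illustration, characteristics (a) through (d) above can be tallied for a given preparation as follows. The sketch assumes the impurity levels have already been measured and expressed as percentages on a molar basis; the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RnaPrepImpurities:
    """Measured impurity levels, in percent relative to the RNA of interest."""
    dna_template_pct: float
    uncapped_rna_pct: float
    partial_length_rna_pct: float
    lacks_unreacted_cap_dinucleotides: bool


def characteristics_met(prep: RnaPrepImpurities, limit_pct: float = 1.0) -> List[str]:
    """Return the letters (a)-(d) of the characteristics the preparation meets."""
    met = []
    if prep.dna_template_pct < limit_pct:
        met.append("a")
    if prep.uncapped_rna_pct < limit_pct:
        met.append("b")
    if prep.partial_length_rna_pct < limit_pct:
        met.append("c")
    if prep.lacks_unreacted_cap_dinucleotides:
        met.append("d")
    return met


# Hypothetical preparation that fails only the uncapped RNA limit.
prep = RnaPrepImpurities(0.2, 1.5, 0.4, True)
print(characteristics_met(prep))
```

A stricter limit (e.g., 0.5% or 0.1%) can be applied by passing a different `limit_pct`.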


EXAMPLES
Example 1: Screening Configurations of Template RNAs that Correct a Sickle Cell Disease Associated Mutation in a Genomic Landing Pad in Human Cells

This example describes the use of a gene modifying system containing a gene modifying polypeptide and template RNAs comprising varied lengths of heterologous object sequences and PBS sequences to quantify the activity of template RNAs for correction of the HBB:E6V mutation (also referred to as E7V or the HbS variant; NC_000011.10: g.5227002T>A). In this example, a template RNA contains:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


One or more template RNAs described in Tables 1-4 can be tested as described in this example. The heterologous object sequences and PBS sequences were designed to correct the SCD mutation in a landing pad by replacing an “A” nucleotide with a “T” nucleotide at the mutation site via gene editing, to reverse an E6V mutation in the corresponding protein.


A cell line is created to have a “landing pad” or a stable integration that mimics a region of the HBB gene that contains the E6V mutation site and flanking sequences. In some embodiments, a cell line used for screening may contain one or more additional SNPs in the HBB locus relative to a patient or reference sequence, e.g., the hg38 human genome reference sequence, and a landing pad containing the target mutation is optionally designed to carry the one or more non-pathogenic SNPs to match the endogenous cell line HBB locus, e.g., designed to carry a mutation that recapitulates a SNP present in the endogenous HBB locus in HEK293T cells. Without wishing to be limited by example, it is understood that template RNA sequences found to successfully edit a target mutation at a site containing an additional SNP relative to a reference sequence would differ from a therapeutic template RNA in any region overlapping the additional SNP. For example, a successful template RNA in a HEK293T-based screening assay where a genomic landing pad contains the target mutation (corresponding to the endogenous E6V mutation caused by DNA substitution NC_000011.10: g.5227002T>A) and an additional substitution relative to hg38 (corresponding to the NC_000011.10: g.5227013T>C mutation at the endogenous HBB locus in HEK293T cells) in the protospacer may provide a candidate composition where the corresponding therapeutic template RNA would thus have a substitution (C>T) in the spacer region relative to the corresponding spacer region of the screening template RNA, in order to enable therapeutic correction of the E6V mutation at a target site lacking the additional substitution, e.g., at a target site comprising the pathogenic E6V mutation but otherwise matching the hg38 reference sequence. 
In this example, a screening cell line containing a target site landing pad comprising the pathogenic mutation with an additional T>C substitution in the protospacer region might be corrected using a screening template RNA comprising the spacer sequence 5′-CATGGTGCACCTGACTCCTG-3′ (SEQ ID NO: 19249), whereas the corresponding therapeutic template RNA might comprise the spacer sequence 5′-CATGGTGCATCTGACTCCTG-3′ (SEQ ID NO: 19250), where the nucleotide at position 10 of each spacer (C or T, respectively) is the position that is altered to match either the screening cell target sequence or the hg38 target sequence. In some embodiments, the spacer, PBS, and/or RT template regions may need to be adjusted in this manner to account for any discrepancies between screening and reference target sequences. It is further contemplated that a given patient or patient population may possess one or more SNPs relative to hg38 at the target locus in addition to the pathogenic E6V mutation and thus a similar adaptation of candidate template RNA molecules could be used to generate template RNA sequences specific for the patient or patient population.
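By way of non-limiting illustration, the position at which a screening spacer must be altered to obtain the corresponding therapeutic spacer can be located by a direct comparison of the two sequences recited above. The helper function name is hypothetical.

```python
from typing import List, Tuple

SCREENING_SPACER = "CATGGTGCACCTGACTCCTG"    # SEQ ID NO: 19249
THERAPEUTIC_SPACER = "CATGGTGCATCTGACTCCTG"  # SEQ ID NO: 19250


def spacer_differences(a: str, b: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("spacers must be the same length")
    return [
        (i + 1, base_a, base_b)
        for i, (base_a, base_b) in enumerate(zip(a, b))
        if base_a != base_b
    ]


print(spacer_differences(SCREENING_SPACER, THERAPEUTIC_SPACER))
# [(10, 'C', 'T')]
```

The same comparison could be applied to PBS or RT template regions, or to a patient-specific target sequence, to identify positions that require adaptation.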


The DNA for the landing pad is chemically synthesized and cloned into the pLenti-N-tGFP vector. The cloned landing pad sequence in the lentiviral expression vector is verified by Sanger sequencing. The sequence-verified plasmids (9 μg) along with the lentiviral packaging mix (9 μg, Biosettia) are transfected using Lipofectamine 2000™ according to the manufacturer's instructions into a packaging cell line, LentiX-293T (Takara Bio). The transfected cells are incubated at 37° C., 5% CO2 for 48 hours (including one medium change at 24 hrs) and the viral particle-containing medium is collected from the cell culture dish. The collected medium is filtered through a 0.2 μm filter to remove cell debris and is prepared for transduction of HEK293T cells. The virus-containing medium is diluted in DMEM and mixed with polybrene to prepare a dilution series for transduction of HEK293T cells where the final concentration of polybrene is 8 μg/mL. The HEK293T cells are grown in virus-containing medium for 48 hours and then split with fresh medium. The split cells are grown to confluence and transduction efficiency of the different dilutions of virus is measured by GFP expression via flow cytometry and ddPCR detection of the genomic integrated lentivirus that contains GFP and the HBB:E6V landing pads.


A gene modifying system comprising (i) a compatible gene modifying polypeptide described herein, e.g., having: an NLS of Table 11, a compatible Cas9 domain having a sequence of Table 8, a linker of Table 10, an RT sequence of Table 6 (e.g., MLVMS_P03355_PLV919), and a second NLS of Table 11 and (ii) a template RNA of any of Tables 1-4 is transfected into the HEK293T landing pad cell line. The gene modifying polypeptide and the template RNAs are delivered by nucleofection in RNA format. Specifically, 1 μg of gene modifying polypeptide mRNA is combined with 10 μM template RNAs. The mRNA and template RNAs are added to 25 μL SF buffer containing 250,000 HEK293T landing pad cells and cells are nucleofected using program DS-150. After nucleofection, cells are grown at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the HBB:E6V site are used to amplify across the locus. Amplicons are analyzed via short read sequencing using an Illumina MiSeq. In some embodiments, the assay will indicate that at least 10%, 20%, 30%, 40%, 50%, 60%, or 70% of copies of the HBB gene in the sample are converted to the desired wild-type sequence.
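By way of non-limiting illustration, the percentage of amplicon reads converted to the desired sequence can be estimated by classifying each read at the edit site. The sketch below is a simplification that uses exact matching of short coding-strand windows spanning the edit site; the window sequences, function names, and example reads are assumptions made for illustration, and an actual analysis would typically align reads to the amplicon reference and account for sequencing errors and indels.

```python
from collections import Counter
from typing import Iterable

# Illustrative coding-strand windows spanning the edit site (assumed here).
CORRECTED_WINDOW = "CTCCTGAGGAG"  # glutamate codon (GAG) at the edit site
MUTANT_WINDOW = "CTCCTGTGGAG"     # valine codon (GTG) at the edit site


def classify_read(read: str) -> str:
    """Label a read as 'corrected', 'unedited', or 'other'."""
    has_corrected = CORRECTED_WINDOW in read
    has_mutant = MUTANT_WINDOW in read
    if has_corrected and not has_mutant:
        return "corrected"
    if has_mutant and not has_corrected:
        return "unedited"
    return "other"  # e.g., indels or reads not covering the site


def percent_corrected(reads: Iterable[str]) -> float:
    """Percent of all supplied reads classified as corrected."""
    counts = Counter(classify_read(r) for r in reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return 100.0 * counts["corrected"] / total


# Hypothetical reads: 3 corrected, 6 unedited, 1 with a deletion.
reads = ["ACTCCTGAGGAGAAG"] * 3 + ["ACTCCTGTGGAGAAG"] * 6 + ["ACTCCTGGAGAAG"]
print(percent_corrected(reads))
```

Counting "other" reads in the denominator is a conservative choice; excluding them would raise the reported percentage.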


Example 2: Gene Modifying Polypeptide Selection by Pooled Screening in HEK293T & U2OS Cells

This example describes the use of an RNA gene modifying system for the targeted editing of a coding sequence in the human genome. More specifically, this example describes the infection of HEK293T and U2OS cells with a library of gene modifying candidates, followed by transfection of a template guide RNA (tgRNA) for in vitro gene modifying in the cells, e.g., as a means of evaluating a new gene modifying polypeptide for editing activity in human cells by a pooled screening approach.


The gene modifying polypeptide library candidates assayed herein each comprise: 1) a S. pyogenes (Spy) Cas9 nickase containing an N863A mutation that inactivates one endonuclease active site; 2) one of the 122 peptide linkers depicted at Table 10; and 3) a reverse transcriptase (RT) domain from Table 6 of retroviral origin. The particular retroviral RT domains utilized were selected if they were expected to function as a monomer. For each selected RT domain, the wild-type sequences were tested, as well as versions with point mutations installed in the primary wild-type sequence. In particular, 143 RT domains were tested, either wild type or containing various mutations. In total, 17,446 Cas-linker-RT gene modifying polypeptides were tested.
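By way of non-limiting illustration, the stated library size is consistent with a full combinatorial pairing of the 122 linkers with the 143 RT domains on a single Cas9 nickase, which can be checked as follows. The placeholder identifiers are hypothetical and stand in for the sequences of Tables 10 and 6.

```python
from itertools import product

N_LINKERS = 122     # peptide linkers of Table 10
N_RT_DOMAINS = 143  # RT domains tested (wild type or mutated)

# Placeholder identifiers standing in for the actual sequences.
linkers = [f"linker_{i:03d}" for i in range(1, N_LINKERS + 1)]
rt_domains = [f"rt_{i:03d}" for i in range(1, N_RT_DOMAINS + 1)]

# Each candidate pairs one linker with one RT domain.
library = list(product(linkers, rt_domains))

print(len(library))  # 17446
```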


The system described here is a two-component system comprising: 1) an expression plasmid encoding a human codon-optimized gene modifying polypeptide library candidate within a lentiviral cassette, and 2) a tgRNA expression plasmid expressing a non-coding tgRNA sequence that is recognized by Cas and localizes it to the genomic locus of interest, and that also templates reverse transcription of the desired edit into the genome by the RT domain, driven by a U6 promoter. The lentiviral cassette comprises: (i) a CMV promoter for expression in mammalian cells; (ii) a gene modifying polypeptide library candidate as shown; (iii) a self-cleaving T2A polypeptide; (iv) a puromycin resistance gene enabling selection in mammalian cells; and (v) a polyA tail termination signal.


To prepare a pool of cells expressing gene modifying polypeptide library candidates, HEK293T or U2OS cells were transduced with pooled lentiviral preparations of the gene modifying candidate plasmid library. HEK293 Lenti-X cells were seeded in 15 cm plates (12×10^6 cells) prior to lentiviral plasmid transfection. Lentiviral plasmid transfection using the Lentiviral Packaging Mix (Biosettia, 27 μg) and the plasmid DNA for the gene modifying candidate library (27 μg) was performed the following day using Lipofectamine 2000 and Opti-MEM media according to the manufacturer's protocol. Extracellular DNA was removed by a full media change the next day and virus-containing media was harvested 48 hours after. Lentiviral media was concentrated using Lenti-X Concentrator (TaKaRa Biosciences) and 5 mL lentiviral aliquots were made and stored at −80° C. Lentiviral titering was performed by enumerating colony forming units post Puromycin selection. HEK293T or U2OS cells carrying a BFP-expressing genomic landing pad were seeded at 6×10^7 cells in culture plates and transduced at a 0.3 multiplicity of infection (MOI) to minimize multiple infections per cell. Puromycin (2.5 μg/mL) was added 48 hours post infection to allow for selection of infected cells. Cells were kept under puromycin selection for at least 7 days and then scaled up for tgRNA electroporation.
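By way of non-limiting illustration, the effect of transducing at an MOI of 0.3 can be estimated under the common assumption that the number of integrating viral particles per cell follows a Poisson distribution. This assumption and the function name below are not taken from this disclosure.

```python
import math
from typing import Dict


def poisson_infection_stats(moi: float) -> Dict[str, float]:
    """Expected cell fractions at a given MOI under a Poisson model."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p_uninfected = math.exp(-moi)       # P(0 integrations)
    p_single = moi * math.exp(-moi)     # P(exactly 1 integration)
    p_multiple = 1.0 - p_uninfected - p_single
    p_infected = 1.0 - p_uninfected
    return {
        "uninfected": p_uninfected,
        "single": p_single,
        "multiple": p_multiple,
        # Fraction of infected (selectable) cells carrying one candidate.
        "single_among_infected": p_single / p_infected,
    }


for name, value in poisson_infection_stats(0.3).items():
    print(f"{name}: {value:.3f}")
```

Under this model, roughly 74% of cells are uninfected at an MOI of 0.3, and about 86% of the infected cells that survive selection carry a single library candidate.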


To determine the genome-editing capacity of the gene modifying library candidates in the assay, infected BFP-expressing HEK293T or U2OS cells were then transfected by electroporation of 250,000 cells/well with 200 ng of a tgRNA (either g4 or g10) plasmid, designed to convert BFP to GFP, at sufficient cell count for >1000x coverage per library candidate.
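By way of non-limiting illustration, the minimum number of electroporation wells needed to reach a given coverage per library candidate can be computed as follows. The sketch assumes that every electroporated cell carries exactly one candidate and that candidates are evenly represented; the function name is hypothetical.

```python
def wells_for_coverage(n_candidates: int, coverage: int, cells_per_well: int) -> int:
    """Smallest number of wells giving at least `coverage` cells per candidate."""
    if min(n_candidates, coverage, cells_per_well) <= 0:
        raise ValueError("all arguments must be positive")
    cells_needed = n_candidates * coverage
    # Integer ceiling division avoids floating-point rounding.
    return -(-cells_needed // cells_per_well)


# 17,446 candidates at 1000x coverage, 250,000 cells per well.
print(wells_for_coverage(17_446, 1000, 250_000))  # 70
```

Because cell loss and uneven representation are expected in practice, the result is better read as a lower bound.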


The g4 tgRNA (5′ to 3′) is as follows: 20 nucleotide spacer region (GCCGAAGCACTGCACGCCGT; SEQ ID NO: 11,011), a scaffold region (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC; SEQ ID NO: 11,012), the template region encoding the single base pair substitution to change BFP to GFP (bold) and a PAM inactivation that introduces a synonymous point mutation in the SpyCas9 PAM (NGG to NCG) that prevents re-engagement of the gene modifying polypeptide upon completion of a functional gene modifying reaction (underline) (ACCCTGACGTACG; SEQ ID NO: 11,013), and the 13 nucleotide PBS (GCGTGCAGTGCTT; SEQ ID NO: 11,014).


Similarly, the g10 tgRNA (5′ to 3′) is as follows: 20 nucleotide spacer region (AGAAGTCGTGCTGCTTCATG; SEQ ID NO: 11,015), a scaffold region (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC; SEQ ID NO: 11,016), the template region encoding the single base pair substitution to change BFP to GFP (bold) and a PAM inactivation that introduces a synonymous point mutation in the SpyCas9 PAM (NGG to NGA) that prevents re-engagement of the gene modifying polypeptide upon completion of a functional gene modifying reaction (underline) (ACCCTGACCTACGGCGTGCAGTGCTTCGGCCGCTACCCCGATCACAT; SEQ ID NO: 11,017), and the 13 nucleotide PBS (GAAGCAGCACGAC; SEQ ID NO: 11,018).
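By way of non-limiting illustration, the relationship between the spacer and the PBS of the g4 and g10 tgRNAs can be checked computationally. The sketch assumes the usual convention that SpyCas9 nicks the protospacer strand three nucleotides upstream of the PAM (i.e., after spacer position 17), so that the PBS is the reverse complement of the spacer nucleotides immediately 5′ of the nick. The function names are hypothetical, and the sequences are written in DNA letters as recited above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTG"
    "AAAAAGTGGCACCGAGTCGGTGC"
)
G4 = {"spacer": "GCCGAAGCACTGCACGCCGT", "template": "ACCCTGACGTACG",
      "pbs": "GCGTGCAGTGCTT"}
G10 = {"spacer": "AGAAGTCGTGCTGCTTCATG",
       "template": "ACCCTGACCTACGGCGTGCAGTGCTTCGGCCGCTACCCCGATCACAT",
       "pbs": "GAAGCAGCACGAC"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def pbs_from_spacer(spacer: str, pbs_length: int = 13, nick_after: int = 17) -> str:
    """PBS complementary to the pbs_length spacer bases 5' of the nick."""
    return reverse_complement(spacer[nick_after - pbs_length:nick_after])


def assemble_tgrna(spacer: str, scaffold: str, template: str, pbs: str) -> str:
    """Concatenate the four tgRNA regions in 5' to 3' order."""
    return spacer + scaffold + template + pbs


for name, parts in (("g4", G4), ("g10", G10)):
    derived = pbs_from_spacer(parts["spacer"])
    full = assemble_tgrna(parts["spacer"], SCAFFOLD, parts["template"], parts["pbs"])
    print(name, derived, derived == parts["pbs"], len(full))
```

For both tgRNAs the derived 13 nucleotide PBS matches the recited PBS, which is consistent with the assumed nick position.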


To assess the genome-editing capacity of the various constructs in the assay, cells were sorted by Fluorescence-Activated Cell Sorting (FACS) for GFP expression 6-7 days post-electroporation. Cells were sorted and harvested as distinct populations of unedited (BFP+) cells, edited (GFP+) cells and imperfect edit (BFP-, GFP-) cells. A sample of unsorted cells was also harvested as the input population to determine enrichment during analysis.


To determine which gene modifying library candidates have genome-editing capacity in this assay, genomic DNA (gDNA) was harvested from sorted and unsorted cell populations, and analyzed by sequencing the gene modifying library candidates in each population. Briefly, gene modifying sequences were amplified from the genome using primers specific to the lentiviral cassette, amplified in a second round of PCR to dilute genomic DNA, and then sequenced using Oxford Nanopore Sequencing Technology according to the manufacturer's protocol.


After quality control of sequencing reads, reads of at least 1500 and no more than 3200 nucleotides were mapped to the gene modifying polypeptide library sequences and those containing a minimum of an 80% match to a library sequence were considered to be successfully aligned to a given candidate. To identify gene modifying candidates capable of performing gene editing in the assay, the read count of each library candidate in the edited population was compared to its read count in the initial, unsorted population. For purposes of this pooled screen, gene modifying candidates with genome-editing capacity were selected as those candidates that were enriched in the converted (GFP+) population relative to unsorted (input) cells and wherein the enrichment was determined to be at or above the enrichment level of a reference (Element ID No: 17380).
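By way of non-limiting illustration, the read filtering and enrichment comparison described above can be sketched as follows. The sketch assumes that enrichment is computed as the ratio of a candidate's share of reads in the sorted (GFP+) population to its share in the unsorted input population; this normalization, the function names, and the example counts are assumptions and are not taken from this disclosure.

```python
from typing import Dict, List


def passes_read_filters(read_length: int, match_fraction: float,
                        min_length: int = 1500, max_length: int = 3200,
                        min_match: float = 0.80) -> bool:
    """Apply the length window and minimum library-match criteria."""
    return min_length <= read_length <= max_length and match_fraction >= min_match


def enrichment(sorted_counts: Dict[str, int],
               input_counts: Dict[str, int]) -> Dict[str, float]:
    """Ratio of read share in the sorted population to share in the input."""
    sorted_total = sum(sorted_counts.values())
    input_total = sum(input_counts.values())
    if sorted_total == 0 or input_total == 0:
        raise ValueError("both populations must contain reads")
    scores = {}
    for candidate, n_input in input_counts.items():
        if n_input == 0:
            continue  # enrichment is undefined without input reads
        sorted_share = sorted_counts.get(candidate, 0) / sorted_total
        scores[candidate] = sorted_share / (n_input / input_total)
    return scores


def select_hits(sorted_counts: Dict[str, int], input_counts: Dict[str, int],
                reference_id: str) -> List[str]:
    """Candidates enriched at or above the level of the reference."""
    scores = enrichment(sorted_counts, input_counts)
    threshold = scores[reference_id]
    return sorted(c for c, s in scores.items() if s >= threshold)


# Hypothetical read counts for a reference and three candidates.
gfp_positive = {"ref": 100, "A": 300, "B": 50, "C": 50}
unsorted_input = {"ref": 250, "A": 250, "B": 250, "C": 250}
print(select_hits(gfp_positive, unsorted_input, "ref"))
```

The reference is always included in its own hit list because it meets its own threshold; it can be dropped from the result if only new candidates are of interest.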


A large number of gene modifying polypeptide candidates were determined to be enriched in the GFP+ cell populations. For example, of the 17,446 candidates tested, over 3,300 exhibited enrichment in GFP+ sorted populations (relative to unsorted) that was at least equivalent to that of the reference under similar experimental conditions (HEK293T cells using g4 tgRNA; HEK293T cells using g10 tgRNA; or U2OS cells using g4 tgRNA), shown in Table D. Although the 17,446 candidates were also tested in U2OS cells using g10 tgRNA, the pooled screen did not yield candidates that were enriched in the converted (GFP+) population relative to unsorted (input) cells under that experimental condition; further investigation is required to explain these results.









TABLE D
Combinations of linker and RT sequences screened. The amino acid sequence of each RT in this table is provided in Table 6.

Linker amino acid sequence | Linker SEQ ID NO: | RT domain name
EAAAKGSS | 12,001 | PERV_Q4VFZ2_3mutA_WS
EAAAKEAAAKEAAAKEAAAK | 12,002 | MLVMS_P03355_PLV919
PAPEAAAK | 12,003 | MLVFF_P26809_3mutA
EAAAKPAPGGG | 12,004 | MLVFF_P26809_3mutA
GSSGSSGSSGSSGSSGSS | 12,005 | PERV_Q4VFZ2_3mut
PAPGGGEAAAK | 12,006 | MLVAV_P03356_3mutA
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,007 | MLVMS_P03355_PLV919
GSSEAAAK | 12,008 | MLVFF_P26809_3mutA
EAAAKPAPGGS | 12,009 | MLVFF_P26809_3mutA
GGSGGSGGSGGSGGSGGS | 12,010 | MLVFF_P26809_3mutA
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,011 | XMRV6_A1Z651_3mutA
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,012 | PERV_Q4VFZ2_3mutA_WS
EAAAKEAAAKEAAAK | 12,013 | MLVFF_P26809_3mutA
PAPEAAAKGSS | 12,014 | MLVFF_P26809_3mutA
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,015 | PERV_Q4VFZ2_3mutA_WS
EAAAKEAAAKEAAAK | 12,016 | PERV_Q4VFZ2_3mutA_WS
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,017 | AVIRE_P03360_3mutA
PAPAPAPAPAP | 12,018 | MLVCB_P08361_3mutA
PAPAPAPAPAP | 12,019 | MLVFF_P26809_3mutA
EAAAKGGSPAP | 12,020 | PERV_Q4VFZ2_3mutA_WS
PAP |  | MLVMS_P03355_PLV919
PAPGGGGSS | 12,022 | WMSV_P03359_3mutA
SGSETPGTSESATPES | 12,023 | MLVFF_P26809_3mutA
PAPEAAAKGSS | 12,024 | XMRV6_A1Z651_3mutA
EAAAKGGSGGG | 12,025 | MLVMS_P03355_PLV919
GGGGSGGGGS | 12,026 | MLVFF_P26809_3mutA
GGGPAPGSS | 12,027 | MLVAV_P03356_3mutA
GGSGGSGGSGGSGGSGGS | 12,028 | XMRV6_A1Z651_3mut
GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS | 12,029 | MLVCB_P08361_3mutA
GSSPAP | 12,030 | AVIRE_P03360_3mutA
EAAAKGSSPAP | 12,031 | MLVFF_P26809_3mutA
GSSGGGEAAAK | 12,032 | MLVFF_P26809_3mutA
GGSGGSGGSGGSGGSGGS | 12,033 | MLVMS_P03355_3mutA_WS
PAPAPAPAP | 12,034 | MLVFF_P26809_3mutA
EAAAKEAAAKEAAAKEAAAK | 12,035 | XMRV6_A1Z651_3mutA
EAAAKGGSPAP | 12,036 | MLVMS_P03355_3mutA_WS
PAPGGSEAAAK | 12,037 | AVIRE_P03360_3mutA
GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS | 12,038 | AVIRE_P03360_3mutA
EAAAKGGGGSEAAAK | 12,039 | MLVCB_P08361_3mutA
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,040 | WMSV_P03359_3mutA
GSS |  | MLVMS_P03355_PLV919
GSSGSSGSSGSS | 12,042 | MLVMS_P03355_PLV919
GSSPAPEAAAK | 12,043 | XMRV6_A1Z651_3mutA
GGSPAPEAAAK | 12,044 | MLVFF_P26809_3mutA
GGGEAAAKGGS | 12,045 | MLVFF_P26809_3mutA
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,046 | PERV_Q4VFZ2_3mutA_WS
GGGGGGGG | 12,047 | PERV_Q4VFZ2_3mut
GGGPAP | 12,048 | MLVCB_P08361_3mutA
PAPAPAPAPAPAP | 12,049 | MLVCB_P08361_3mutA
GGSGGSGGSGGSGGSGGS | 12,050 | MLVCB_P08361_3mutA
PAP |  | MLVMS_P03355_3mutA_WS
GGSGGSGGSGGSGGSGGS | 12,052 | PERV_Q4VFZ2_3mutA_WS
PAPAPAPAPAPAP | 12,053 | MLVMS_P03355_PLV919
EAAAKPAPGSS | 12,054 | MLVMS_P03355_3mutA_WS
EAAAKEAAAKEAAAKEAAAK | 12,055 | MLVMS_P03355_3mutA_WS
EAAAKGGS | 12,056 | MLVMS_P03355_3mutA_WS
GGGGSEAAAKGGGGS | 12,057 | MLVFF_P26809_3mutA
EAAAKPAPGSS | 12,058 | MLVFF_P26809_3mutA
GGGGSGGGGSGGGGSGGGGS | 12,059 | MLVMS_P03355_PLV919
EAAAKGGGGGS | 12,060 | MLVMS_P03355_PLV919
GGSPAP | 12,061 | XMRV6_A1Z651_3mutA
EAAAKGGGPAP | 12,062 | MLVMS_P03355_PLV919
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,063 | MLVFF_P26809_3mutA
PAP |  | MLVCB_P08361_3mutA
EAAAK | 12,065 | XMRV6_A1Z651_3mutA
GGSGSSPAP | 12,066 | PERV_Q4VFZ2_3mutA_WS
GSSGSSGSSGSSGSSGSS | 12,067 | MLVMS_P03355_PLV919
GSSEAAAKGGG | 12,068 | MLVAV_P03356_3mutA
GGGEAAAKGGS | 12,069 | XMRV6_A1Z651_3mutA
EAAAKGGGGSEAAAK | 12,070 | MLVAV_P03356_3mutA
GGGGSGGGGSGGGGS | 12,071 | MLVFF_P26809_3mutA
GGGGSGGGGSGGGGSGGGGS | 12,072 | AVIRE_P03360_3mutA
SGSETPGTSESATPES | 12,073 | AVIRE_P03360_3mutA
GGGEAAAKPAP | 12,074 | MLVFF_P26809_3mutA
EAAAKGSSGGG | 12,075 | MLVMS_P03355_3mutA_WS
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,076 | WMSV_P03359_3mut
GGSGGSGGSGGS | 12,077 | XMRV6_A1Z651_3mutA
GGSEAAAKPAP | 12,078 | MLVFF_P26809_3mutA
EAAAKGSSGGG | 12,079 | XMRV6_A1Z651_3mutA
GGGGS | 12,080 | MLVFF_P26809_3mutA
GGGEAAAKGSS | 12,081 | MLVMS_P03355_PLV919
PAPAPAPAPAPAP | 12,082 | MLVAV_P03356_3mutA
GGGGSGGGGSGGGGSGGGGS | 12,083 | MLVCB_P08361_3mutA
GGGEAAAKGSS | 12,084 | MLVCB_P08361_3mutA
PAPGGSGSS | 12,085 | MLVFF_P26809_3mutA
GSAGSAAGSGEF | 12,086 | MLVCB_P08361_3mutA
PAPGGSEAAAK | 12,087 | MLVMS_P03355_3mutA_WS
GGSGSS | 12,088 | XMRV6_A1Z651_3mutA
PAPGGGGSS | 12,089 | MLVMS_P03355_PLV919
GSSGSSGSS | 12,090 | XMRV6_A1Z651_3mut
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,091 | MLVMS_P03355_3mutA_WS
EAAAK | 12,092 | MLVMS_P03355_PLV919
GSSGSSGSSGSS | 12,093 | MLVFF_P26809_3mutA
PAPGGGGSS | 12,094 | MLVCB_P08361_3mutA
GGGEAAAKGGS | 12,095 | MLVCB_P08361_3mutA
PAPGGGEAAAK | 12,096 | MLVMS_P03355_PLV919
GGGGGSPAP | 12,097 | XMRV6_A1Z651_3mutA
EAAAKGGS | 12,098 | XMRV6_A1Z651_3mutA
EAAAKGSSPAP | 12,099 | XMRV6_A1Z651_3mut
PAPEAAAK | 12,100 | MLVAV_P03356_3mutA
GGSGGSGGSGGS | 12,101 | MLVMS_P03355_3mutA_WS
GGGPAPGGS | 12,102 | MLVMS_P03355_PLV919
GSSGSSGSSGSS | 12,103 | PERV_Q4VFZ2_3mutA_WS
EAAAKPAPGGS | 12,104 | MLVCB_P08361_3mutA
GSSGSS | 12,105 | MLVFF_P26809_3mutA
EAAAKEAAAKEAAAKEAAAK | 12,106 | MLVCB_P08361_3mutA
EAAAKEAAAKEAAAKEAAAK | 12,107 | FLV_P10273_3mutA
GSS |  | MLVFF_P26809_3mutA
EAAAKEAAAK | 12,109 | MLVMS_P03355_3mutA_WS
PAPEAAAKGGG | 12,110 | MLVAV_P03356_3mutA
GGSGSSEAAAK | 12,111 | MLVFF_P26809_3mutA
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,112 | PERV_Q4VFZ2
GSSEAAAKPAP | 12,113 | AVIRE_P03360_3mutA
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,114 | MLVCB_P08361_3mutA
EAAAKGGG | 12,115 | MLVFF_P26809_3mutA
GSSPAPGGG | 12,116 | MLVCB_P08361_3mutA
GGGPAPGSS | 12,117 | MLVMS_P03355_PLV919
GGGGGS | 12,118 | MLVMS_P03355_3mutA_WS
EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK | 12,119 | PERV_Q4VFZ2_3mut
GGGGSGGGGSGGGGSGGGGSGGGGS | 12,120 | WMSV_P03359_3mutA
EAAAKEAAAKEAAAK | 12,121 | PERV_Q4VFZ2_3mut
PAPAPAPAP | 12,122 | MLVCB_P08361_3mutA
GSSGSSGSSGSSGSS | 12,123 | PERV_Q4VFZ2_3mut
GGGGSSEAAAK | 12,124 | MLVMS_P03355_3mutA_WS
GGSGGSGGSGGS | 12,125 | MLVCB_P08361_3mutA
PAPEAAAKGGS | 12,126 | MLVCB_P08361_3mutA
EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK | 12,127 | MLVCB_P08361_3mutA
EAAAKGGGGSEAAAK | 12,128 | MLVMS_P03355_PLV919
EAAAKGGGGSEAAAK | 12,129 | MLVMS_P03355_3mutA_WS
EAAAKGGGPAP | 12,130 | XMRV6_A1Z651_3mut
EAAAKEAAAKEAAAKEAAAKEAAAK | 12,131 | MLVMS_P03355_3mutA_WS
AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA | 12,132 | FLV_P10273_3mutA
GGSEAAAKGGG | 12,133 | MLVMS_P03355_3mutA_WS
GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS | 12,134 | KORV_Q9TTC1-Pro_3mutA
GGGPAPGGS | 12,135 | MLVCB_P08361_3mutA
PAPAPAPAPAPAP | 12,136 | XMRV6_A1Z651_3mutA
GGSGSSGGG | 12,137 | XMRV6_A1Z651_3mutA
GGSGSSGGG | 12,138 | MLVCB_P08361_3mutA
GGGEAAAKGGS | 12,139 | MLVMS_P03355_3mutA_WS
EAAAK | 12,140 | MLVCB_P08361_3mutA
GGSPAPGSS | 12,141 | MLVMS_P03355_3mutA_WS
GGGGSSEAAAK | 12,142 | PERV_Q4VFZ2_3mut
PAPAPAPAPAP | 12,143 | MLVBM_Q7SVK7_3mut
EAAAKEAAAKEAAAKEAAAK | 12,144 | MLVAV_P03356_3mutA
GGGGGSGSS | 12,145 | MLVCB_P08361_3mutA
EAAAKGSSPAP | 12,146 | MLVMS_P03355_3mutA_WS





PAPAPAPAPAPAP
12,147
MLVMS_P03355_3mutA_WS





GSSGGGGGS
12,148
MLVMS_P03355_3mutA_WS





PAPGSSGGG
12,149
MLVMS_P03355_PLV919





GGSGGGPAP
12,150
MLVCB_P08361_3mutA





GGGGGGG
12,151
MLVCB_P08361_3mutA





GSSGSSGSSGSSGSSGSS
12,152
MLVCB_P08361_3mutA





GGGPAPGGS
12,153
MLVFF_P26809_3mutA





EAAAKGGSGGG
12,154
PERV_Q4VFZ2_3mut





EAAAKGGGGSS
12,155
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSSGSS
12,156
MLVMS_P03355_3mut





GGGGSGGGGSGGGGSGGGGS
12,157
MLVBM_Q7SVK7_3mutA_WS





PAPAPAPAPAP
12,158
MLVMS_P03355_PLV919





GGGEAAAKGGS
12,159
MLVMS_P03355_PLV919





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,160
MLVMS_P03355_3mut





GSAGSAAGSGEF
12,161
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
12,162
MLVFF_P26809_3mutA





EAAAKGGSGSS
12,163
MLVFF_P26809_3mutA





PAPGGG
12,164
MLVFF_P26809_3mutA





GGGPAPGSS
12,165
XMRV6_A1Z651_3mutA





PAPEAAAKGGS
12,166
AVIRE_P03360_3mutA





PAPGGGEAAAK
12,167
MLVFF_P26809_3mut





GGGGSSEAAAK
12,168
MLVCB_P08361_3mutA





EAAAK
12,169
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,170
BAEVM_P10272_3mutA





GGSGGGEAAAK
12,171
MLVMS_P03355_PLV919





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,172
MLVFF_P26809_3mutA





GSSPAPGGS
12,173
XMRV6_A1Z651_3mutA





GGSGGGPAP
12,174
MLVMS_P03355_PLV919





EAAAK
12,175
AVIRE_P03360_3mutA





GSS

XMRV6_A1Z651_3mutA





GGSGGSGGS
12,177
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAK
12,178
AVIRE_P03360_3mut





PAPEAAAKGGG
12,179
PERV_Q4VFZ2_3mutA_WS





GGGGGSEAAAK
12,180
BAEVM_P10272_3mutA





GGSGSSGGG
12,181
MLVMS_P03355_3mutA_WS





GGGGGGG
12,182
MLVMS_P03355_3mutA_WS





GSSEAAAKPAP
12,183
PERV_Q4VFZ2_3mut





GGGGGSEAAAK
12,184
WMSV_P03359_3mut





GGGGSGGGGSGGGGSGGGGSGGGGS
12,185
MLVFF_P26809_3mut





GGGEAAAKGGS
12,186
AVIRE_P03360_3mutA





GGSPAPGGG
12,187
AVIRE_P03360_3mutA





GSAGSAAGSGEF
12,188
MLVAV_P03356_3mutA





EAAAK
12,189
MLVAV_P03356_3mutA





EAAAKPAPGSS
12,190
WMSV_P03359_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,191
PERV_Q4VFZ2_3mutA_WS





GGSEAAAKPAP
12,192
MLVCB_P08361_3mutA





PAPAPAPAPAPAP
12,193
MLVBM_Q7SVK7_3mutA_WS





GGSPAPGGG
12,194
MLVMS_P03355_3mutA_WS





GGSEAAAKGGG
12,195
MLVMS_P03355_3mut





GGSGGSGGSGGS
12,196
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,197
MLVFF_P26809_3mutA





GGG

AVIRE_P03360_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,199
PERV_Q4VFZ2_3mut





GGSGGSGGSGGS
12,200
MLVMS_P03355_3mutA_WS





GGGEAAAK
12,201
MLVCB_P08361_3mutA





GSSGSSGSSGSSGSSGSS
12,202
MLVMS_P03355_3mutA_WS





GSSGGGPAP
12,203
MLVMS_P03355_3mutA_WS





GSSEAAAKPAP
12,204
MLVFF_P26809_3mutA





EAAAKEAAAK
12,205
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,206
MLVCB_P08361_3mut





GGGGGG
12,207
MLVMS_P03355_3mutA_WS





GGSGSSGGG
12,208
MLVFF_P26809_3mutA





GSSGGGEAAAK
12,209
PERV_Q4VFZ2_3mutA_WS





PAPAPAPAPAP
12,210
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,211
SFV3L_P27401_2mut





EAAAKGGSGGG
12,212
BAEVM_P10272_3mutA





GGGGSSPAP
12,213
PERV_Q4VFZ2_3mutA_WS





GGGEAAAKPAP
12,214
MLVMS_P03355_PLV919





GGSGGGPAP
12,215
BAEVM_P10272_3mutA





PAPGSSGGS
12,216
MLVMS_P03355_PLV919





GGSGGGPAP
12,217
MLVMS_P03355_3mutA_WS





EAAAKGGSPAP
12,218
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSGGG
12,219
MLVMS_P03355_3mutA_WS





PAPGSSGGG
12,220
MLVFF_P26809_3mutA





GSSEAAAKGGS
12,221
MLVFF_P26809_3mutA





PAPGSSEAAAK
12,222
MLVFF_P26809_3mutA





EAAAKGSSPAP
12,223
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
12,224
MLVBM_Q7SVK7_3mutA_WS





PAPGSSEAAAK
12,225
MLVMS_P03355_PLV919





EAAAKGSSGGG
12,226
MLVMS_P03355_3mutA_WS





EAAAKGGGGGS
12,227
AVIRE_P03360_3mutA





EAAAKEAAAKEAAAK
12,228
MLVMS_P03355_PLV919





PAPAPAPAPAPAP
12,229
MLVFF_P26809_3mutA





GGGGSGGGGSGGGGS
12,230
MLVCB_P08361_3mutA





PAPGGSEAAAK
12,231
MLVCB_P08361_3mutA





PAPGSSEAAAK
12,232
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAKGSS
12,233
AVIRE_P03360_3mutA





GGSPAPGSS
12,234
WMSV_P03359_3mutA





PAPGGSGGG
12,235
MLVMS_P03355_PLV919





EAAAKGGSGSS
12,236
MLVMS_P03355_3mutA_WS





GGSGGG
12,237
MLVFF_P26809_3mutA





GGSEAAAKGSS
12,238
KORV_Q9TTC1_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,239
MLVCB_P08361_3mutA





PAPAPAPAPAPAP
12,240
PERV_Q4VFZ2_3mutA_WS





PAPEAAAK
12,241
MLVMS_P03355_3mutA_WS





GGSEAAAKGGG
12,242
MLVMS_P03355_PLV919





GSSPAP
12,243
MLVMS_P03355_3mutA_WS





GGGGSS
12,244
MLVMS_P03355_PLV919





GGGEAAAKPAP
12,245
AVIRE_P03360_3mutA





EAAAKPAPGGS
12,246
MLVAV_P03356_3mutA





EAAAKGGGPAP
12,247
MLVAV_P03356_3mutA





PAPGGSEAAAK
12,248
BAEVM_P10272_3mutA





PAPGGSGSS
12,249
MLVMS_P03355_3mutA_WS





PAPGGSGSS
12,250
AVIRE_P03360_3mutA





GGSGGGPAP
12,251
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAK
12,252
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
12,253
MLVMS_P03355_PLV919





GGGGSSPAP
12,254
MLVCB_P08361_3mutA





GSSGGGPAP
12,255
MLVFF_P26809_3mutA





GGGGSSGGS
12,256
MLVMS_P03355_PLV919





GGSGGG
12,257
MLVCB_P08361_3mutA





GSSGGGGGS
12,258
MLVMS_P03355_PLV919





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,259
XMRV6_A1Z651_3mutA





GGGGGSGSS
12,260
KORV_Q9TTC1_3mut





GGGEAAAKGGS
12,261
BAEVM_P10272_3mutA





GGSGGG
12,262
BAEVM_P10272_3mutA





PAPAPAP
12,263
KORV_Q9TTC1-Pro_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,264
SFV3L_P27401_2mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,265
MLVBM_Q7SVK7_3mutA_WS





GSSGSSGSSGSSGSS
12,266
MLVMS_P03355_3mutA_WS





GSSGGGEAAAK
12,267
MLVMS_P03355_3mutA_WS





GSSGGSEAAAK
12,268
MLVFF_P26809_3mutA





PAP

MLVMS_P03355_PLV919





EAAAKGGGGSEAAAK
12,270
MLVBM_Q7SVK7_3mutA_WS





PAPAP
12,271
AVIRE_P03360_3mutA





PAP

MLVFF_P26809_3mutA





GSSGGG
12,273
MLVMS_P03355_3mut





GSSPAPGGS
12,274
MLVFF_P26809_3mutA





PAPAPAPAP
12,275
XMRV6_A1Z651_3mutA





EAAAKGSSGGS
12,276
PERV_Q4VFZ2_3mut





PAPEAAAKGGG
12,277
KORV_Q9TTC1-Pro_3mutA





PAPGGS
12,278
MLVCB_P08361_3mutA





EAAAKGGG
12,279
MLVCB_P08361_3mutA





GSSEAAAKPAP
12,280
MLVMS_P03355_PLV919





PAPGGS
12,281
MLVFF_P26809_3mutA





EAAAKGGS
12,282
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,283
FLV_P10273_3mutA





PAPGGSEAAAK
12,284
MLVAV_P03356_3mutA





GSS

MLVCB_P08361_3mutA





GSSGSSGSSGSS
12,286
AVIRE_P03360_3mutA





GSSGSSGSS
12,287
MLVFF_P26809_3mutA





GSSGGG
12,288
MLVMS_P03355_PLV919





EAAAK
12,289
MLVFF_P26809_3mutA





GGSPAPEAAAK
12,290
MLVCB_P08361_3mutA





GGSGSS
12,291
MLVCB_P08361_3mutA





GSSPAPGGG
12,292
MLVMS_P03355_PLV919





EAAAKEAAAKEAAAKEAAAKEAAAK
12,293
MLVAV_P03356_3mutA





EAAAKGSSPAP
12,294
FLV_P10273_3mutA





GGGGSS
12,295
XMRV6_A1Z651_3mutA





GGSPAPGSS
12,296
MLVMS_P03355_PLV919





EAAAKEAAAKEAAAKEAAAKEAAAK
12,297
MLVMS_P03355_3mutA_WS





PAPEAAAKGGG
12,298
FLV_P10273_3mutA





EAAAKPAPGGS
12,299
XMRV6_A1Z651_3mut





PAPAP
12,300
BAEVM_P10272_3mutA





EAAAKEAAAKEAAAKEAAAK
12,301
MLVMS_P03355_PLV919





GSSPAPGGG
12,302
MLVMS_P03355_PLV919





EAAAKGGGPAP
12,303
KORV_Q9TTC1_3mutA





PAPEAAAK
12,304
MLVMS_P03355_PLV919





PAPGGGEAAAK
12,305
PERV_Q4VFZ2_3mutA_WS





EAAAKGSSGGS
12,306
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAK
12,307
MLVMS_P03355_PLV919





GSSEAAAK
12,308
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSS
12,309
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGSGGGGS
12,310
MLVMS_P03355_3mutA_WS





EAAAKGGGGSEAAAK
12,311
MLVMS_P03355_3mut





GGS

MLVCB_P08361_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,313
XMRV6_A1Z651_3mutA





GGSGSSPAP
12,314
MLVCB_P08361_3mutA





GGGGSGGGGSGGGGS
12,315
XMRV6_A1Z651_3mutA





PAPAPAPAPAP
12,316
BAEVM_P10272_3mutA





PAPAPAPAPAP
12,317
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAK
12,318
MLVBM_Q7SVK7_3mut





GGGGSGGGGSGGGGSGGGGSGGGGS
12,319
BAEVM_P10272_3mutA





GGSGGSGGS
12,320
MLVMS_P03355_3mutA_WS





EAAAKPAPGSS
12,321
MLVMS_P03355_PLV919





GSS

MLVMS_P03355_3mutA_WS





PAPEAAAKGGS
12,323
MLVMS_P03355_3mutA_WS





GGGPAPGGS
12,324
MLVMS_P03355_3mutA_WS





EAAAKGGGGSS
12,325
MLVAV_P03356_3mutA





GSSGSSGSSGSSGSS
12,326
MLVFF_P26809_3mut





SGSETPGTSESATPES
12,327
PERV_Q4VFZ2_3mut





GGSEAAAKGGG
12,328
MLVMS_P03355_3mut





GSSGSSGSSGSSGSSGSS
12,329
AVIRE_P03360_3mutA





PAPAPAPAPAPAP
12,330
AVIRE_P03360_3mut





GGSGGS
12,331
XMRV6_A1Z651_3mutA





PAPGSSEAAAK
12,332
MLVCB_P08361_3mut





GGSPAPEAAAK
12,333
PERV_Q4VFZ2_3mut





EAAAKGGGGGS
12,334
MLVCB_P08361_3mutA





GGSGGSGGSGGS
12,335
MLVMS_P03355_PLV919





GGGGSSEAAAK
12,336
MLVMS_P03355_PLV919





GSSEAAAKGGG
12,337
MLVFF_P26809_3mutA





PAPGGS
12,338
MLVMS_P03355_3mutA_WS





EAAAKGGSGGG
12,339
MLVCB_P08361_3mutA





EAAAKGGG
12,340
PERV_Q4VFZ2_3mut





PAPGGS
12,341
XMRV6_A1Z651_3mutA





GSSPAPGGG
12,342
XMRV6_A1Z651_3mutA





PAPEAAAKGGG
12,343
MLVMS_P03355_3mutA_WS





GSSEAAAKGGG
12,344
PERV_Q4VFZ2_3mutA_WS





PAPGGSEAAAK
12,345
XMRV6_A1Z651_3mutA





GGGGGS
12,346
MLVMS_P03355_3mutA_WS





GGSPAPEAAAK
12,347
MLVMS_P03355_3mutA_WS





GGGPAP
12,348
MLVFF_P26809_3mutA





PAPGSSGGG
12,349
XMRV6_A1Z651_3mutA





PAPGSSGGG
12,350
MLVBM_Q7SVK7_3mutA_WS





GGGEAAAKGSS
12,351
MLVMS_P03355_3mutA_WS





GSSEAAAKGGS
12,352
MLVCB_P08361_3mutA





PAPGGSGSS
12,353
MLVCB_P08361_3mutA





EAAAKGGGGSEAAAK
12,354
BAEVM_P10272_3mutA





PAPAPAP
12,355
PERV_Q4VFZ2_3mutA_WS





GGGGGG
12,356
MLVAV_P03356_3mutA





GSSPAPEAAAK
12,357
MLVCB_P08361_3mutA





GGSGGSGGS
12,358
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
12,359
XMRV6_A1Z651_3mut





GGGPAPGGS
12,360
XMRV6_A1Z651_3mutA





GGGPAPEAAAK
12,361
BAEVM_P10272_3mutA





GGSGGG
12,362
AVIRE_P03360_3mutA





SGSETPGTSESATPES
12,363
PERV_Q4VFZ2_3mutA_WS





EAAAKGSSPAP
12,364
MLVMS_P03355_PLV919





GSSEAAAK
12,365
XMRV6_A1Z651_3mut





GSSGGSGGG
12,366
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
12,367
WMSV_P03359_3mutA





GGGGSEAAAKGGGGS
12,368
MLVMS_P03355_PLV919





PAPGGGGSS
12,369
MLVMS_P03355_3mutA_WS





SGSETPGTSESATPES
12,370
MLVMS_P03355_3mutA_WS





GGSPAPEAAAK
12,371
KORV_Q9TTC1-Pro_3mutA





GSSEAAAKGGG
12,372
MLVMS_P03355_3mutA_WS





GSSEAAAK
12,373
WMSV_P03359_3mutA





GGGGSEAAAKGGGGS
12,374
AVIRE_P03360_3mutA





GSS

WMSV_P03359_3mutA





PAPGGSEAAAK
12,376
MLVFF_P26809_3mutA





GGGGS
12,377
MLVMS_P03355_3mutA_WS





GGGPAP
12,378
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,379
MLVMS_P03355_3mutA_WS





EAAAKPAPGSS
12,380
PERV_Q4VFZ2_3mut





EAAAKPAPGSS
12,381
MLVCB_P08361_3mutA





GGGGGG
12,382
WMSV_P03359_3mutA





EAAAKPAPGGS
12,383
MLVMS_P03355_PLV919





PAPGGGEAAAK
12,384
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAK
12,385
AVIRE_P03360_3mutA





GSSEAAAKPAP
12,386
XMRV6_A1Z651_3mutA





PAPGGSEAAAK
12,387
MLVBM_Q7SVK7_3mutA_WS





PAPGSS
12,388
MLVCB_P08361_3mutA





EAAAKGGG
12,389
MLVMS_P03355_3mutA_WS





EAAAKPAP
12,390
MLVCB_P08361_3mutA





PAPEAAAKGGS
12,391
MLVBM_Q7SVK7_3mutA_WS





GGSPAPGGG
12,392
MLVCB_P08361_3mutA





PAPGGSGSS
12,393
WMSV_P03359_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,394
MLVMS_P03355_PLV919





GGSGGGPAP
12,395
MLVMS_P03355_PLV919





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,396
MLVMS_P03355





PAPEAAAKGSS
12,397
MLVCB_P08361_3mutA





EAAAKGSS
12,398
MLVMS_P03355_3mutA_WS





GGSGGS
12,399
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAK
12,400
BAEVM_P10272_3mutA





GGGGSEAAAKGGGGS
12,401
FLV_P10273_3mutA





GGSEAAAKGGG
12,402
MLVCB_P08361_3mutA





GSSGSSGSSGSSGSS
12,403
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,404
MLVFF_P26809_3mutA





EAAAKGGG
12,405
PERV_Q4VFZ2_3mut





GGGGGSEAAAK
12,406
MLVCB_P08361_3mutA





EAAAKPAPGGS
12,407
MLVMS_P03355_3mutA_WS





GGGGGSGSS
12,408
XMRV6_A1Z651_3mutA





PAPGSSEAAAK
12,409
MLVMS_P03355_3mutA_WS





GSSEAAAKPAP
12,410
MLVCB_P08361_3mutA





EAAAKGSSPAP
12,411
MLVAV_P03356_3mutA





GGGPAPGGS
12,412
WMSV_P03359_3mutA





GGSPAP
12,413
MLVMS_P03355_3mutA_WS





GGSEAAAKGGG
12,414
MLVMS_P03355_3mutA_WS





GGGGGGGG
12,415
MLVFF_P26809_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,416
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,417
MLVBM_Q7SVK7_3mutA_WS





GSSPAPGGG
12,418
MLVAV_P03356_3mutA





GGGGGG
12,419
AVIRE_P03360_3mutA





GSSGGS
12,420
MLVMS_P03355_3mutA_WS





GGSPAPGSS
12,421
MLVFF_P26809_3mutA





PAPEAAAKGGG
12,422
PERV_Q4VFZ2_3mut





EAAAKGGGPAP
12,423
MLVFF_P26809_3mutA





GGGEAAAKGGS
12,424
MLVMS_P03355_PLV919





GGSGSSPAP
12,425
MLVFF_P26809_3mutA





SGSETPGTSESATPES
12,426
WMSV_P03359_3mutA





PAPGGSEAAAK
12,427
MLVBM_Q7SVK7_3mutA_WS





GGSGGG
12,428
MLVMS_P03355_PLV919





GGGGSSPAP
12,429
PERV_Q4VFZ2_3mut





GGGEAAAKGSS
12,430
MLVAV_P03356_3mutA





PAPAPAPAPAPAP
12,431
MLVMS_P03355_3mutA_WS





EAAAKGGGGSEAAAK
12,432
PERV_Q4VFZ2





EAAAKEAAAKEAAAKEAAAKEAAAK
12,433
MLVMS_P03355_PLV919





GGGGGSEAAAK
12,434
PERV_Q4VFZ2_3mut





PAPGSSEAAAK
12,435
MLVCB_P08361_3mutA





GSAGSAAGSGEF
12,436
PERV_Q4VFZ2_3mutA_WS





EAAAKGGGGSEAAAK
12,437
MLVFF_P26809_3mutA





GGSPAPGGG
12,438
PERV_Q4VFZ2_3mutA_WS





GSSEAAAKGGG
12,439
AVIRE_P03360_3mutA





GGGEAAAKPAP
12,440
MLVMS_P03355_3mutA_WS





GGGPAP
12,441
AVIRE_P03360_3mutA





GGSEAAAK
12,442
MLVCB_P08361_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,443
PERV_Q4VFZ2_3mut





EAAAKPAPGGS
12,444
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,445
XMRV6_A1Z651_3mut





GGGGGGGG
12,446
MLVCB_P08361_3mutA





PAPGSS
12,447
PERV_Q4VFZ2_3mut





EAAAK
12,448
PERV_Q4VFZ2_3mut





GSAGSAAGSGEF
12,449
MLVMS_P03355_3mutA_WS





PAPGGGEAAAK
12,450
PERV_Q4VFZ2_3mut





EAAAKGSSGGS
12,451
MLVFF_P26809_3mut





GGGGSEAAAKGGGGS
12,452
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGS
12,453
MLVMS_P03355_PLV919





EAAAKGGGGSEAAAK
12,454
BAEVM_P10272_3mut





PAPGGGEAAAK
12,455
MLVMS_P03355_3mutA_WS





GGSEAAAKPAP
12,456
MLVMS_P03355_3mutA_WS





PAPAP
12,457
MLVCB_P08361_3mutA





PAPAP
12,458
MLVFF_P26809_3mutA





GGSPAP
12,459
AVIRE_P03360_3mutA





EAAAKGSSGGS
12,460
MLVCB_P08361_3mutA





PAPGSSGGS
12,461
AVIRE_P03360_3mutA





EAAAKGGGGSEAAAK
12,462
XMRV6_A1Z651_3mutA





PAPAPAP
12,463
BAEVM_P10272_3mutA





GGSGGSGGSGGSGGSGGS
12,464
MLVMS_P03355_PLV919





GGGGGSGSS
12,465
MLVMS_P03355_PLV919





PAPGSSEAAAK
12,466
XMRV6_A1Z651_3mut





GGSEAAAKPAP
12,467
XMRV6_A1Z651_3mutA





EAAAKEAAAKEAAAKEAAAK
12,468
XMRV6_A1Z651_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,469
WMSV_P03359_3mut





GGSGGGEAAAK
12,470
XMRV6_A1Z651_3mutA





GGGEAAAK
12,471
XMRV6_A1Z651_3mutA





GGGGSGGGGSGGGGS
12,472
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGSGGS
12,473
MLVFF_P26809_3mutA





GSSGGGGGS
12,474
MLVMS_P03355_3mut





PAPGGSEAAAK
12,475
MLVMS_P03355_3mutA_WS





GSSGGSPAP
12,476
MLVMS_P03355_3mutA_WS





SGSETPGTSESATPES
12,477
XMRV6_A1Z651_3mutA





GGGGSGGGGS
12,478
MLVMS_P03355_PLV919





PAPAPAPAPAP
12,479
MLVMS_P03355_3mut





GSSGSS
12,480
XMRV6_A1Z651_3mutA





GSSEAAAKPAP
12,481
PERV_Q4VFZ2_3mut





GGSGSSGGG
12,482
MLVMS_P03355_3mutA_WS





EAAAKEAAAK
12,483
MLVCB_P08361_3mutA





GSSGSSGSSGSS
12,484
MLVMS_P03355_3mutA_WS





GSSPAPGGG
12,485
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAK
12,486
MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,487
SFV1_P23074_2mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,488
MLVMS_P03355_PLV919





GSAGSAAGSGEF
12,489
MLVMS_P03355_PLV919





PAPGSSEAAAK
12,490
MLVMS_P03355_3mutA_WS





GGSEAAAK
12,491
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
12,492
PERV_Q4VFZ2_3mutA_WS





GGSEAAAKPAP
12,493
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGS
12,494
MLVCB_P08361_3mutA





EAAAKGGSGSS
12,495
MLVCB_P08361_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
12,496
FLV_P10273_3mutA





EAAAKEAAAKEAAAKEAAAK
12,497
MLVBM_Q7SVK7_3mutA_WS





GGSGSSPAP
12,498
BAEVM_P10272_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
12,499
XMRV6_A1Z651_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
12,500
MLVBM_Q7SVK7_3mutA_WS





GGSGSS
12,501
WMSV_P03359_3mutA





PAPEAAAK
12,502
MLVCB_P08361_3mutA





EAAAKPAP
12,503
BAEVM_P10272_3mutA





GSSPAP
12,504
PERV_Q4VFZ2_3mutA_WS





GGGPAP
12,505
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSGSS
12,506
MLVMS_P03355_3mutA_WS





EAAAKGGGGSEAAAK
12,507
AVIRE_P03360_3mutA





GGSGGG
12,508
KORV_Q9TTC1-Pro_3mutA





GSSPAP
12,509
MLVFF_P26809_3mutA





GGSGSSEAAAK
12,510
BAEVM_P10272_3mutA





PAPGSSGGS
12,511
BAEVM_P10272_3mutA





GGGGGG
12,512
MLVFF_P26809_3mutA





PAPGGSEAAAK
12,513
MLVMS_P03355_PLV919





PAPGGS
12,514
MLVMS_P03355_PLV919





GGSGGSGGSGGS
12,515
BAEVM_P10272_3mutA





GSSPAP
12,516
MLVCB_P08361_3mutA





PAPAPAPAP
12,517
MLVMS_P03355_3mutA_WS





GGGGGG
12,518
MLVCB_P08361_3mutA





GSSGSSGSSGSSGSSGSS
12,519
KORV_Q9TTC1-Pro_3mutA





GSSEAAAKGGS
12,520
BAEVM_P10272_3mutA





GGSEAAAK
12,521
FLV_P10273_3mutA





GGSGGSGGSGGSGGS
12,522
KORV_Q9TTC1-Pro_3mutA





GSSPAPEAAAK
12,523
PERV_Q4VFZ2_3mut





GSSGSSGSSGSSGSS
12,524
XMRV6_A1Z651_3mutA





EAAAKPAPGGS
12,525
MLVMS_P03355_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,526
FLV_P10273_3mut





GGSPAPEAAAK
12,527
XMRV6_A1Z651_3mut





EAAAKGGSGGG
12,528
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAK
12,529
MLVFF_P26809_3mutA





GSSPAP
12,530
WMSV_P03359_3mutA





PAPAPAPAP
12,531
MLVAV_P03356_3mutA





PAPGGSEAAAK
12,532
KORV_Q9TTC1_3mut





GGSGSSEAAAK
12,533
MLVBM_Q7SVK7_3mutA_WS





GSSGGG
12,534
MLVCB_P08361_3mutA





GGGEAAAKGSS
12,535
PERV_Q4VFZ2_3mut





PAPGGSGGG
12,536
MLVFF_P26809_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,537
FFV_O93209





PAPGGGGSS
12,538
MLVMS_P03355_3mutA_WS





EAAAKGGS
12,539
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,540
MLVBM_Q7SVK7_3mutA_WS





GGSGGSGGS
12,541
WMSV_P03359_3mutA





PAPAP
12,542
MLVMS_P03355_3mutA_WS





GSSGGGEAAAK
12,543
MLVAV_P03356_3mutA





GGGGSSEAAAK
12,544
MLVFF_P26809_3mutA





EAAAKGSSGGS
12,545
MLVMS_P03355_PLV919





EAAAKGGGGSEAAAK
12,546
MLVMS_P03355_3mutA_WS





GGGGGGGG
12,547
MLVMS_P03355_PLV919





GSSGSSGSS
12,548
MLVMS_P03355_PLV919





GGGEAAAKPAP
12,549
PERV_Q4VFZ2_3mutA_WS





GGGGGSGSS
12,550
MLVMS_P03355_3mutA_WS





GGGGGGG
12,551
MLVMS_P03355_PLV919





GGS

MLVMS_P03355_PLV919





GSSGGG
12,553
MLVMS_P03355_3mutA_WS





EAAAKGGSGSS
12,554
PERV_Q4VFZ2_3mutA_WS





PAPGSSEAAAK
12,555
MLVMS_P03355_PLV919





GSSEAAAKPAP
12,556
MLVMS_P03355_PLV919





GGSPAPGSS
12,557
BAEVM_P10272_3mutA





GSAGSAAGSGEF
12,558
MLVCB_P08361_3mut





GGSPAPGGG
12,559
PERV_Q4VFZ2_3mut





GGGGSGGGGSGGGGSGGGGS
12,560
MLVMS_P03355_3mut





GSSGSSGSS
12,561
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,562
PERV_Q4VFZ2_3mut





GGGGSEAAAKGGGGS
12,563
MLVCB_P08361_3mutA





GGSEAAAKGSS
12,564
MLVAV_P03356_3mutA





EAAAKGGGGSEAAAK
12,565
MLVCB_P08361_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,566
XMRV6_A1Z651_3mutA





PAPGGGEAAAK
12,567
MLVMS_P03355_3mutA_WS





GSSGGGEAAAK
12,568
PERV_Q4VFZ2_3mutA_WS





GSSGSS
12,569
MLVCB_P08361_3mut





PAPAPAPAPAPAP
12,570
PERV_Q4VFZ2_3mut





GGSPAPGGG
12,571
MLVFF_P26809_3mutA





GGSGGSGGSGGSGGS
12,572
MLVCB_P08361_3mutA





EAAAKEAAAK
12,573
MLVFF_P26809_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,574
GALV_P21414_3mut





PAPAPAPAPAPAP
12,575
WMSV_P03359_3mutA





GGGEAAAKGGS
12,576
KORV_Q9TTC1_3mutA





EAAAKGGGPAP
12,577
KORV_Q9TTC1_3mut





PAPEAAAKGSS
12,578
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAKGSS
12,579
FLV_P10273_3mutA





PAPGGSEAAAK
12,580
MLVMS_P03355_3mut





GSSPAPGGG
12,581
BAEVM_P10272_3mutA





GGGEAAAKPAP
12,582
KORV_Q9TTC1-Pro_3mutA





GGGGSGGGGS
12,583
MLVMS_P03355_PLV919





GGGEAAAKGSS
12,584
MLVFF_P26809_3mutA





PAPGGGGSS
12,585
MLVBM_Q7SVK7_3mutA_WS





GSSEAAAK
12,586
BAEVM_P10272_3mutA





GGGGGGGG
12,587
MLVMS_P03355_PLV919





PAPGSSGGS
12,588
MLVAV_P03356_3mutA





GGGGSGGGGSGGGGSGGGGS
12,589
BAEVM_P10272_3mutA





PAP

MLVMS_P03355_3mut





EAAAKGSSPAP
12,591
XMRV6_A1Z651_3mutA





PAPEAAAKGGS
12,592
MLVFF_P26809_3mutA





GSSGGGEAAAK
12,593
BAEVM_P10272_3mutA





PAPAPAP
12,594
MLVMS_P03355_3mutA_WS





GGSEAAAKGGG
12,595
MLVMS_P03355_PLV919





GSSEAAAK
12,596
PERV_Q4VFZ2_3mut





GGGG
12,597
MLVMS_P03355_3mutA_WS





GGGGGS
12,598
MLVMS_P03355_3mut





GGGGSSEAAAK
12,599
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,600
SFV3L_P27401-Pro_2mutA





GGSEAAAKGSS
12,601
MLVMS_P03355_3mutA_WS





PAPGSSGGS
12,602
XMRV6_A1Z651_3mutA





GGSPAP
12,603
MLVMS_P03355_3mutA_WS





GGGGSSEAAAK
12,604
BAEVM_P10272_3mut





GGSGGSGGSGGS
12,605
AVIRE_P03360_3mutA





PAPGSSGGS
12,606
MLVFF_P26809_3mutA





GSSPAPGGG
12,607
MLVMS_P03355_3mutA_WS





GGGGGGG
12,608
MLVMS_P03355_3mutA_WS





EAAAKGGGGGS
12,609
MLVMS_P03355_3mutA_WS





EAAAKGGSGGG
12,610
MLVMS_P03355_PLV919





GGGGSSEAAAK
12,611
XMRV6_A1Z651_3mutA





GGGGSEAAAKGGGGS
12,612
MLVBM_Q7SVK7_3mutA_WS





GSSGSS
12,613
MLVMS_P03355_PLV919





GGSGGG
12,614
MLVMS_P03355_PLV919





PAPEAAAKGGG
12,615
AVIRE_P03360_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,616
FOAMV_P14350-Pro_2mutA





GGGGGSGSS
12,617
PERV_Q4VFZ2_3mut





GSSGSSGSSGSSGSS
12,618
KORV_Q9TTC1-Pro_3mut





GGGGSEAAAKGGGGS
12,619
MLVMS_P03355_3mutA_WS





GGGGGSPAP
12,620
FLV_P10273_3mut





GGGEAAAK
12,621
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGS
12,622
FLV_P10273_3mutA





GGG

MLVMS_P03355_PLV919





GGSPAPEAAAK
12,624
BAEVM_P10272_3mutA





EAAAKEAAAK
12,625
FLV_P10273_3mutA





GGGEAAAKPAP
12,626
BAEVM_P10272_3mutA





GGGEAAAKGGS
12,627
PERV_Q4VFZ2_3mut





GGSGGSGGS
12,628
PERV_Q4VFZ2_3mut





EAAAKGGGPAP
12,629
XMRV6_A1Z651_3mutA





EAAAK
12,630
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAKGGG
12,631
PERV_Q4VFZ2_3mut





EAAAKGSS
12,632
MLVCB_P08361_3mutA





GGSEAAAKGGG
12,633
MLVBM_Q7SVK7_3mutA_WS





GGGGSGGGGSGGGGSGGGGS
12,634
XMRV6_A1Z651_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
12,635
BAEVM_P10272_3mut





GGGGSSPAP
12,636
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGSGGSGGSGGS
12,637
PERV_Q4VFZ2_3mut





GGGEAAAKPAP
12,638
PERV_Q4VFZ2_3mut





EAAAKEAAAK
12,639
BAEVM_P10272_3mutA





GGSGSSEAAAK
12,640
XMRV6_A1Z651_3mutA





PAPEAAAKGSS
12,641
WMSV_P03359_3mutA





PAPAPAPAPAP
12,642
XMRV6_A1Z651_3mutA





GSSGGGEAAAK
12,643
MLVMS_P03355_PLV919





GSSPAPGGG
12,644
MLVFF_P26809_3mutA





GGSPAPEAAAK
12,645
MLVFF_P26809_3mut





PAPGGSEAAAK
12,646
PERV_Q4VFZ2_3mut





GGGGSS
12,647
MLVFF_P26809_3mutA





GGSGSSGGG
12,648
BAEVM_P10272_3mutA





GSSGGGEAAAK
12,649
MLVMS_P03355_3mutA_WS





EAAAKGGS
12,650
MLVBM_Q7SVK7_3mutA_WS





GGGPAPGGS
12,651
MLVMS_P03355_PLV919





EAAAKEAAAK
12,652
MLVMS_P03355_PLV919





GSSGSSGSS
12,653
MLVMS_P03355_PLV919





GGGEAAAKPAP
12,654
MLVAV_P03356_3mutA





SGSETPGTSESATPES
12,655
FLV_P10273_3mutA





PAPAPAPAPAP
12,656
KORV_Q9TTC1-Pro_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,657
BAEVM_P10272_3mutA





PAPGSSGGG
12,658
MLVMS_P03355_3mutA_WS





GSSGGGEAAAK
12,659
XMRV6_A1Z651_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
12,660
XMRV6_A1Z651_3mutA





GGGGSSPAP
12,661
MLVFF_P26809_3mutA





GGSGGGPAP
12,662
PERV_Q4VFZ2_3mutA_WS





GSS

PERV_Q4VFZ2_3mut





EAAAKGSSPAP
12,664
MLVMS_P03355_3mut





EAAAKGGG
12,665
XMRV6_A1Z651_3mutA





GSSGSSGSSGSS
12,666
WMSV_P03359_3mutA





PAPEAAAKGSS
12,667
MLVMS_P03355_PLV919





GSSEAAAK
12,668
AVIRE_P03360_3mutA





EAAAKGGSGSS
12,669
AVIRE_P03360_3mutA





GSSEAAAK
12,670
MLVMS_P03355_3mut





GGSGSSEAAAK
12,671
MLVMS_P03355_PLV919





GGSEAAAKGGG
12,672
MLVFF_P26809_3mutA





GGGGSGGGGSGGGGSGGGGS
12,673
MLVAV_P03356_3mutA





PAPAPAPAPAPAP
12,674
MLVFF_P26809_3mut





EAAAKPAPGSS
12,675
KORV_Q9TTC1-Pro_3mut





PAPGSSEAAAK
12,676
MLVAV_P03356_3mutA





GGGGSSPAP
12,677
WMSV_P03359_3mutA





EAAAKGGGGGS
12,678
MLVMS_P03355_3mutA_WS





GGGEAAAKGGS
12,679
MLVMS_P03355_3mut





GGSGSSGGG
12,680
MLVMS_P03355_3mut





GGGPAPGGS
12,681
MLVAV_P03356_3mutA





PAPGGGGGS
12,682
MLVMS_P03355_PLV919





GGGPAPGSS
12,683
PERV_Q4VFZ2_3mut





GGGGGGG
12,684
MLVFF_P26809_3mutA





GGSGGGGSS
12,685
MLVCB_P08361_3mutA





GGGGGG
12,686
FLV_P10273_3mutA





GGSEAAAKGSS
12,687
PERV_Q4VFZ2_3mut





GGSPAPGGG
12,688
BAEVM_P10272_3mutA





GGSPAPGSS
12,689
AVIRE_P03360_3mutA





GGSGGSGGSGGS
12,690
KORV_Q9TTC1_3mut





EAAAKEAAAKEAAAKEAAAKEAAAK
12,691
MLVBM_Q7SVK7_3mut





PAPGSSGGS
12,692
XMRV6_A1Z651_3mut





EAAAKGGGGSS
12,693
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGSGGSGGS
12,694
PERV_Q4VFZ2_3mutA_WS





PAPGGSGGG
12,695
MLVMS_P03355_PLV919





PAPGSSGGG
12,696
PERV_Q4VFZ2_3mutA_WS





GSSGSS
12,697
BAEVM_P10272_3mutA





EAAAKGSS
12,698
MLVFF_P26809_3mutA





GGGPAP
12,699
MLVMS_P03355_PLV919





EAAAKGGGGGS
12,700
MLVFF_P26809_3mutA





EAAAKGGSPAP
12,701
MLVBM_Q7SVK7_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,702
WMSV_P03359_3mutA





GSSPAPGGG
12,703
MLVBM_Q7SVK7_3mutA_WS





GGGEAAAKGSS
12,704
AVIRE_P03360_3mutA





GGGGSSEAAAK
12,705
AVIRE_P03360_3mutA





GGGGGGGG
12,706
PERV_Q4VFZ2_3mutA_WS





PAPGSSEAAAK
12,707
BAEVM_P10272_3mutA





EAAAKGSS
12,708
MLVFF_P26809_3mut





GSSEAAAKGGG
12,709
MLVCB_P08361_3mutA





GGSEAAAK
12,710
MLVBM_Q7SVK7_3mutA_WS





GSSEAAAKGGG
12,711
PERV_Q4VFZ2_3mutA_WS





PAPGGSGGG
12,712
WMSV_P03359_3mutA





GSSGGSGGG
12,713
MLVCB_P08361_3mutA





EAAAKGSSGGG
12,714
FLV_P10273_3mutA





GSSEAAAK
12,715
MLVCB_P08361_3mutA





GSSGGGEAAAK
12,716
MLVMS_P03355_3mut





GGGGSGGGGS
12,717
MLVCB_P08361_3mutA





EAAAKGGGGSEAAAK
12,718
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGG
12,719
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSPAP
12,720
MLVMS_P03355_PLV919





GGGPAPGGS
12,721
AVIRE_P03360_3mutA





GSSEAAAK
12,722
MLVBM_Q7SVK7_3mutA_WS





GSSGGGEAAAK
12,723
PERV_Q4VFZ2_3mut





SGSETPGTSESATPES
12,724
MLVMS_P03355_PLV919





GGSGSSPAP
12,725
MLVMS_P03355_3mut





GGGGGG
12,726
MLVBM_Q7SVK7_3mutA_WS





GGSPAPGGG
12,727
XMRV6_A1Z651_3mutA





GGSGSS
12,728
PERV_Q4VFZ2_3mutA_WS





PAP

MLVBM_Q7SVK7_3mutA_WS





EAAAKPAPGSS
12,730
MLVMS_P03355_PLV919





EAAAKGGG
12,731
MLVMS_P03355_3mut





GSSEAAAKPAP
12,732
PERV_Q4VFZ2_3mutA_WS





GGGGSS
12,733
MLVMS_P03355_3mutA_WS





GGSGSSEAAAK
12,734
PERV_Q4VFZ2_3mut





GGGGSS
12,735
BAEVM_P10272_3mutA





PAPAP
12,736
MLVFF_P26809_3mut





PAPEAAAKGGG
12,737
BAEVM_P10272_3mutA





EAAAKGGS
12,738
MLVMS_P03355_PLV919





PAPAPAPAPAPAP
12,739
PERV_Q4VFZ2_3mutA_WS





GGGGGSEAAAK
12,740
MLVMS_P03355_3mut





PAPGGS
12,741
PERV_Q4VFZ2_3mut





GGGGSS
12,742
MLVCB_P08361_3mutA





GGGGS
12,743
MLVAV_P03356_3mutA





GSSPAPEAAAK
12,744
MLVMS_P03355_PLV919





GGGGSSGGS
12,745
MLVFF_P26809_3mutA





PAPEAAAKGSS
12,746
MLVMS_P03355_PLV919





GGSGSSEAAAK
12,747
MLVMS_P03355_3mutA_WS





EAAAKGGG
12,748
MLVAV_P03356_3mutA





PAPGSSEAAAK
12,749
FLV_P10273_3mutA





EAAAKGSSGGG
12,750
MLVCB_P08361_3mutA





PAPEAAAK
12,751
KORV_Q9TTC1-Pro_3mutA





GGSPAPEAAAK
12,752
KORV_Q9TTC1-Pro_3mut





GGSGGSGGSGGSGGSGGS
12,753
MLVAV_P03356_3mutA





GSSEAAAKPAP
12,754
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,755
KORV_Q9TTC1-Pro_3mutA





GSSGGGEAAAK
12,756
XMRV6_A1Z651_3mut





PAPGGSGGG
12,757
AVIRE_P03360_3mutA





PAPGGSEAAAK
12,758
PERV_Q4VFZ2_3mutA_WS





GGGGS
12,759
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGS
12,760
MLVBM_Q7SVK7_3mutA_WS





PAPAPAPAPAP
12,761
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAK
12,762
MLVMS_P03355_3mut





GSSGGSEAAAK
12,763
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGS
12,764
WMSV_P03359_3mutA





EAAAKGSSGGG
12,765
WMSV_P03359_3mutA





EAAAKGGG
12,766
PERV_Q4VFZ2_3mutA_WS





SGSETPGTSESATPES
12,767
PERV_Q4VFZ2_3mut





PAPGSSGGS
12,768
MLVMS_P03355_3mutA_WS





PAPEAAAKGSS
12,769
PERV_Q4VFZ2_3mut





PAPEAAAK
12,770
AVIRE_P03360_3mutA





GSSEAAAKGGG
12,771
BAEVM_P10272_3mutA





GSSPAP
12,772
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAKEAAAK
12,773
MLVFF_P26809_3mut





PAPGGSGSS
12,774
MLVAV_P03356_3mutA





GGGGSGGGGSGGGGS
12,775
PERV_Q4VFZ2_3mutA_WS





GSSGGSEAAAK
12,776
MLVCB_P08361_3mutA





EAAAKGGS
12,777
KORV_Q9TTC1-Pro_3mutA





EAAAKGGS
12,778
MLVFF_P26809_3mutA





GGSPAP
12,779
MLVMS_P03355_PLV919





GGSGSS
12,780
MLVMS_P03355_PLV919





SGSETPGTSESATPES
12,781
WMSV_P03359_3mut





GGGGGGG
12,782
WMSV_P03359_3mut





GGSPAPGSS
12,783
MLVCB_P08361_3mutA





GGGGSSGGS
12,784
WMSV_P03359_3mut





PAPGGS
12,785
MLVMS_P03355_PLV919





PAPGSSGGS
12,786
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
12,787
MLVFF_P26809_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,788
PERV_Q4VFZ2_3mut





GGSGGSGGSGGSGGS
12,789
BAEVM_P10272_3mutA





GSSEAAAK
12,790
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAK
12,791
KORV_Q9TTC1-Pro_3mutA





GGSGGSGGSGGSGGS
12,792
MLVMS_P03355_3mut





PAPAPAPAPAPAP
12,793
MLVMS_P03355_3mut





GGSPAPEAAAK
12,794
MLVMS_P03355_PLV919





EAAAK
12,795
WMSV_P03359_3mutA





EAAAKGSSGGS
12,796
MLVBM_Q7SVK7_3mutA_WS





GGSGGGGSS
12,797
MLVMS_P03355_3mutA_WS





GGGEAAAKPAP
12,798
MLVMS_P03355_3mut





EAAAKGGSGGG
12,799
XMRV6_A1Z651_3mutA





GGGGGSEAAAK
12,800
KORV_Q9TTC1-Pro_3mutA





GGGGGG
12,801
BAEVM_P10272_3mutA





GGGGGG
12,802
MLVMS_P03355_3mut





GGGGGGG
12,803
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,804
AVIRE_P03360





PAPGSSGGS
12,805
PERV_Q4VFZ2_3mut





GGGGGS
12,806
XMRV6_A1Z651_3mut





EAAAKPAP
12,807
XMRV6_A1Z651_3mutA





GGG

MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,809
FLV_P10273_3mut





EAAAKGSSPAP
12,810
MLVMS_P03355_3mut





SGSETPGTSESATPES
12,811
BAEVM_P10272_3mutA





GGSPAPEAAAK
12,812
MLVMS_P03355_3mut





GSSGSSGSSGSS
12,813
MLVAV_P03356_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,814
MLVMS_P03355_3mut





GGSPAP
12,815
MLVCB_P08361_3mutA





GGGGGSEAAAK
12,816
MLVMS_P03355_3mutA_WS





GGGGG
12,817
MLVFF_P26809_3mutA





GSSEAAAK
12,818
MLVAV_P03356_3mutA





GGS

BAEVM_P10272_3mut





EAAAKGGSPAP
12,820
MLVCB_P08361_3mutA





PAPAPAPAP
12,821
FLV_P10273_3mutA





PAPGGGEAAAK
12,822
MLVCB_P08361_3mutA





GGGGSSEAAAK
12,823
MLVMS_P03355_3mutA_WS





GGGGG
12,824
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGSGGSGGSGGS
12,825
PERV_Q4VFZ2_3mut





GGGGG
12,826
MLVMS_P03355_3mut





PAPEAAAKGGG
12,827
MLVBM_Q7SVK7_3mutA_WS





GSSGGGPAP
12,828
XMRV6_A1Z651_3mutA





GSSGSSGSSGSSGSSGSS
12,829
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSPAP
12,830
PERV_Q4VFZ2_3mut





GSSGGSEAAAK
12,831
MLVMS_P03355_PLV919





GSS

PERV_Q4VFZ2_3mut





EAAAKGGS
12,833
WMSV_P03359_3mutA





GGGGGSPAP
12,834
PERV_Q4VFZ2_3mutA_WS





EAAAKGSS
12,835
MLVMS_P03355_PLV919





EAAAKGGGGSS
12,836
KORV_Q9TTC1-Pro_3mutA





PAPGSSGGG
12,837
PERV_Q4VFZ2_3mut





GGGGSSEAAAK
12,838
MLVFF_P26809_3mut





PAPAPAP
12,839
MLVMS_P03355_3mut





GSSGGSEAAAK
12,840
XMRV6_A1Z651_3mut





PAPEAAAKGSS
12,841
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGSGGS
12,842
MLVMS_P03355_3mutA_WS





GGSGSSPAP
12,843
XMRV6_A1Z651_3mutA





GGGGSSPAP
12,844
MLVMS_P03355_PLV919





GGGGS
12,845
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAK
12,846
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAK
12,847
KORV_Q9TTC1_3mutA





PAPGGGEAAAK
12,848
BAEVM_P10272_3mutA





GSSGGSEAAAK
12,849
XMRV6_A1Z651_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
12,850
FLV_P10273_3mut





GSSEAAAKPAP
12,851
MLVMS_P03355_3mutA_WS





EAAAKPAPGSS
12,852
PERV_Q4VFZ2_3mutA_WS





GSSGGSPAP
12,853
XMRV6_A1Z651_3mutA





GSSEAAAKGGG
12,854
PERV_Q4VFZ2_3mut





GGGEAAAKGGS
12,855
WMSV_P03359_3mutA





GSSEAAAKGGG
12,856
MLVFF_P26809_3mut





PAPAPAP
12,857
KORV_Q9TTC1-Pro_3mutA





EAAAKGGSPAP
12,858
MLVMS_P03355_3mutA_WS





PAPGGSEAAAK
12,859
PERV_Q4VFZ2_3mut





GGGGS
12,860
MLVBM_Q7SVK7_3mutA_WS





EAAAKGSSGGG
12,861
KORV_Q9TTC1_3mut





EAAAKGGGPAP
12,862
MLVCB_P08361_3mutA





EAAAKGSS
12,863
BAEVM_P10272_3mutA





GGSPAPGGG
12,864
MLVBM_Q7SVK7_3mutA_WS





GGGGSEAAAKGGGGS
12,865
MLVMS_P03355_3mutA_WS





GGGEAAAKGGS
12,866
PERV_Q4VFZ2_3mutA_WS





EAAAKGGGGSS
12,867
MLVMS_P03355_3mutA_WS





EAAAKGGGPAP
12,868
MLVFF_P26809_3mut





GSSPAP
12,869
PERV_Q4VFZ2_3mutA_WS





EAAAKGGS
12,870
MLVMS_P03355_3mut





GGGGSS
12,871
KORV_Q9TTC1-Pro_3mutA





EAAAKGSSPAP
12,872
MLVMS_P03355_3mutA_WS





GGGPAP
12,873
PERV_Q4VFZ2_3mut





EAAAKGSSGGS
12,874
XMRV6_A1Z651_3mutA





PAPGGG
12,875
MLVAV_P03356_3mutA





GSSPAPEAAAK
12,876
BAEVM_P10272_3mutA





GGGPAP
12,877
MLVBM_Q7SVK7_3mutA_WS





GSSGGGGGS
12,878
AVIRE_P03360_3mutA





SGSETPGTSESATPES
12,879
MLVMS_P03355_PLV919





GGGPAP
12,880
MLVFF_P26809_3mut





EAAAKGGGGSS
12,881
XMRV6_A1Z651_3mutA





GGGGSSPAP
12,882
XMRV6_A1Z651_3mut





GGGGSEAAAKGGGGS
12,883
MLVMS_P03355_3mut





GSSPAP
12,884
MLVBM_Q7SVK7_3mutA_WS





GGSGSSEAAAK
12,885
FLV_P10273_3mutA





SGSETPGTSESATPES
12,886
MLVBM_Q7SVK7_3mutA_WS





PAPGGG
12,887
AVIRE_P03360_3mutA





GGGEAAAKPAP
12,888
MLVMS_P03355_3mutA_WS





EAAAKGGSGSS
12,889
PERV_Q4VFZ2_3mut





GGSPAPGGG
12,890
MLVAV_P03356_3mutA





PAPGGSGSS
12,891
BAEVM_P10272_3mutA





GSSGGSPAP
12,892
MLVFF_P26809_3mutA





EAAAKGSSGGG
12,893
PERV_Q4VFZ2_3mut





GGGGSGGGGS
12,894
PERV_Q4VFZ2_3mutA_WS





GSSGGGGGS
12,895
BAEVM_P10272_3mutA





GGGGSSGGS
12,896
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGS
12,897
PERV_Q4VFZ2_3mutA_WS





GSSGSSGSSGSS
12,898
MLVMS_P03355_3mut





GGS

MLVMS_P03355_3mutA_WS





GSSGGSEAAAK
12,900
MLVBM_Q7SVK7_3mutA_WS





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,901
XMRV6_A1Z651





GGGGG
12,902
FLV_P10273_3mutA





PAPEAAAKGSS
12,903
PERV_Q4VFZ2_3mut





GGGGGG
12,904
WMSV_P03359_3mut





EAAAKGGG
12,905
BAEVM_P10272_3mutA





GGGGSS
12,906
MLVMS_P03355_3mutA_WS





GSSGGGEAAAK
12,907
KORV_Q9TTC1_3mut





GGSGSS
12,908
AVIRE_P03360_3mutA





EAAAKPAP
12,909
MLVMS_P03355_3mut





EAAAKEAAAKEAAAK
12,910
FLV_P10273_3mutA





GGGG
12,911
XMRV6_A1Z651_3mutA





GSSPAPGGS
12,912
BAEVM_P10272_3mutA





GSSGGGGGS
12,913
MLVFF_P26809_3mutA





GGGGSSGGS
12,914
MLVAV_P03356_3mutA





GGS

PERV_Q4VFZ2_3mut





GGGGG
12,916
WMSV_P03359_3mutA





GSSGSSGSSGSSGSSGSS
12,917
FLV_P10273_3mutA





PAPGGGGSS
12,918
MLVAV_P03356_3mutA





GGGGGGGG
12,919
BAEVM_P10272_3mutA





SGSETPGTSESATPES
12,920
MLVCB_P08361_3mutA





PAPGGG
12,921
BAEVM_P10272_3mutA





GSSGSSGSS
12,922
MLVCB_P08361_3mutA





GGSGSS
12,923
MLVMS_P03355_3mutA_WS





EAAAKGGGGSEAAAK
12,924
WMSV_P03359_3mutA





GGGGGGGG
12,925
FLV_P10273_3mutA





GSSGSS
12,926
MLVMS_P03355_3mutA_WS





PAPEAAAKGGS
12,927
XMRV6_A1Z651_3mutA





EAAAKEAAAK
12,928
MLVMS_P03355_3mut





GGGGSGGGGSGGGGS
12,929
BAEVM_P10272_3mutA





EAAAKGSSPAP
12,930
MLVMS_P03355_PLV919





GGGGSSEAAAK
12,931
MLVMS_P03355_3mut





GGGGSSEAAAK
12,932
BAEVM_P10272_3mutA





PAPGGSGSS
12,933
PERV_Q4VFZ2_3mut





GGSGGGEAAAK
12,934
MLVFF_P26809_3mut





PAPEAAAKGGS
12,935
PERV_Q4VFZ2_3mut





GGGPAPGSS
12,936
AVIRE_P03360_3mut





PAPGGSGGG
12,937
PERV_Q4VFZ2_3mutA_WS





GGGGGGGG
12,938
PERV_Q4VFZ2_3mutA_WS





GSSEAAAK
12,939
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGS
12,940
PERV_Q4VFZ2_3mutA_WS





EAAAKGGS
12,941
MLVMS_P03355_3mut





GGGGGSGSS
12,942
MLVCB_P08361_3mut





GGGPAP
12,943
KORV_Q9TTC1-Pro_3mutA





EAAAKPAPGGG
12,944
MLVCB_P08361_3mut





GSSGGSPAP
12,945
MLVCB_P08361_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
12,946
MLVMS_P03355_3mut





PAPAPAPAP
12,947
MLVMS_P03355_3mut





GSSGGS
12,948
XMRV6_A1Z651_3mutA





GSSEAAAKGGG
12,949
MLVMS_P03355_3mut





GGSGSSPAP
12,950
MLVMS_P03355_3mutA_WS





GSSEAAAKGGS
12,951
MLVMS_P03355_PLV919





EAAAKEAAAKEAAAKEAAAKEAAAK
12,952
BAEVM_P10272_3mut





PAPGGGGSS
12,953
KORV_Q9TTC1_3mutA





EAAAKGSS
12,954
MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,955
FFV_093209_2mut





GGSGGSGGSGGSGGSGGS
12,956
BAEVM_P10272_3mutA





GGGGGG
12,957
MLVMS_P03355_PLV919





PAPEAAAK
12,958
BAEVM_P10272_3mutA





GGSGSSEAAAK
12,959
MLVAV_P03356_3mutA





GGG

MLVCB_P08361_3mutA





GGGGG
12,961
MLVCB_P08361_3mutA





GGSGGSGGSGGS
12,962
KORV_Q9TTC1-Pro_3mutA





GSSGSSGSSGSSGSSGSS
12,963
XMRV6_A1Z651_3mutA





GSSEAAAKPAP
12,964
FLV_P10273_3mutA





GGGEAAAKPAP
12,965
MLVCB_P08361_3mutA





GSSGSSGSS
12,966
MLVMS_P03355_3mutA_WS





PAPAPAPAP
12,967
MLVMS_P03355_PLV919





EAAAKGGG
12,968
MLVMS_P03355_PLV919





PAPAPAPAPAPAP
12,969
FLV_P10273_3mutA





EAAAKGGSGSS
12,970
MLVMS_P03355_3mut





GGGGGG
12,971
PERV_Q4VFZ2_3mutA_WS





PAPGGG
12,972
MLVCB_P08361_3mutA





GGGGGSGSS
12,973
KORV_Q9TTC1_3mutA





GGGGSGGGGSGGGGSGGGGS
12,974
XMRV6_A1Z651_3mut





GGSGGSGGS
12,975
KORV_Q9TTC1-Pro_3mutA





EAAAKPAPGGG
12,976
MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
12,977
XMRV6_A1Z651





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,978
FLV_P10273_3mutA





EAAAKGGGGSEAAAK
12,979
PERV_Q4VFZ2_3mutA_WS





GGGPAPGSS
12,980
AVIRE_P03360_3mutA





GGGGG
12,981
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
12,982
MLVMS_P03355_3mut





GGGGSGGGGS
12,983
MLVMS_P03355_3mutA_WS





EAAAKGGSPAP
12,984
XMRV6_A1Z651_3mutA





EAAAKGSSPAP
12,985
AVIRE_P03360_3mutA





PAPGGSGSS
12,986
KORV_Q9TTC1-Pro_3mutA





GSS

MLVBM_Q7SVK7_3mutA_WS





GSS

WMSV_P03359_3mut





GGGPAPGSS
12,989
MLVFF_P26809_3mutA





EAAAKPAP
12,990
MLVMS_P03355_3mut





GSSPAPEAAAK
12,991
FLV_P10273_3mutA





GGSPAPGSS
12,992
MLVBM_Q7SVK7_3mutA_WS





GGGGGSEAAAK
12,993
XMRV6_A1Z651_3mut





PAPEAAAKGGG
12,994
WMSV_P03359_3mutA





PAPGGG
12,995
PERV_Q4VFZ2_3mut





GGSPAPEAAAK
12,996
WMSV_P03359_3mutA





GGSGGGGSS
12,997
PERV_Q4VFZ2_3mut





EAAAKGGGGSS
12,998
PERV_Q4VFZ2_3mut





EAAAKGGSPAP
12,999
AVIRE_P03360_3mut





GGSGGGGSS
13,000
WMSV_P03359_3mutA





PAPGSSEAAAK
13,001
MLVFF_P26809_3mut





GSSEAAAK
13,002
MLVMS_P03355_PLV919





GSAGSAAGSGEF
13,003
AVIRE_P03360_3mutA





EAAAKGGSGSS
13,004
MLVMS_P03355_3mut





GGSEAAAKPAP
13,005
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGS
13,006
MLVFF_P26809_3mutA





PAPGSSEAAAK
13,007
PERV_Q4VFZ2_3mutA_WS





GGGGSSPAP
13,008
MLVMS_P03355_3mutA_WS





PAPAPAP
13,009
MLVCB_P08361_3mutA





EAAAKPAPGGG
13,010
MLVBM_Q7SVK7_3mutA_WS





GGGPAPGSS
13,011
BAEVM_P10272_3mutA





PAP

MLVMS_P03355_3mutA_WS





PAPGGSGGG
13,013
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGSGGS
13,014
MLVBM_Q7SVK7_3mutA_WS





PAPAPAPAP
13,015
XMRV6_A1Z651_3mut





GSSPAPGGG
13,016
MLVMS_P03355_3mutA_WS





GSSPAPGGG
13,017
MLVMS_P03355_3mut





PAPGGG
13,018
MLVMS_P03355_PLV919





GGGEAAAKGSS
13,019
WMSV_P03359_3mut





EAAAKGSS
13,020
KORV_Q9TTC1-Pro_3mutA





EAAAKGGS
13,021
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAK
13,022
PERV_Q4VFZ2_3mut





PAPEAAAKGGG
13,023
MLVMS_P03355_PLV919





EAAAKGSSGGG
13,024
MLVFF_P26809_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,025
PERV_Q4VFZ2





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,026
MLVAV_P03356_3mutA





GSSGGSGGG
13,027
MLVFF_P26809_3mut





GSSGSSGSSGSS
13,028
PERV_Q4VFZ2_3mutA_WS





GGSPAPGGG
13,029
MLVMS_P03355_PLV919





GSS

BAEVM_P10272_3mut





GGGPAPGSS
13,031
MLVMS_P03355_3mutA_WS





GGGGSS
13,032
KORV_Q9TTC1_3mutA





GSSGGSGGG
13,033
BAEVM_P10272_3mutA





EAAAKEAAAKEAAAK
13,034
MLVCB_P08361_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,035
FLV_P10273_3mutA





PAPGGGGGS
13,036
PERV_Q4VFZ2_3mut





PAPAPAPAPAP
13,037
KORV_Q9TTC1-Pro_3mutA





EAAAK
13,038
MLVMS_P03355_3mutA_WS





GGG

MLVCB_P08361_3mut





GGSEAAAKGGG
13,040
BAEVM_P10272_3mutA





GGGGGSGSS
13,041
MLVAV_P03356_3mutA





EAAAKGSSPAP
13,042
MLVBM_Q7SVK7_3mutA_WS





GGSGGSGGS
13,043
XMRV6_A1Z651_3mut





EAAAKPAPGGG
13,044
KORV_Q9TTC1-Pro_3mutA





GGGPAPEAAAK
13,045
FLV_P10273_3mutA





GGSPAPEAAAK
13,046
MLVMS_P03355_3mutA_WS





GGSGGSGGSGGSGGS
13,047
MLVFF_P26809_3mut





EAAAKGGSGSS
13,048
MLVMS_P03355_PLV919





GGGEAAAKGGS
13,049
MLVBM_Q7SVK7_3mutA_WS





PAPAPAPAP
13,050
BAEVM_P10272_3mutA





EAAAKEAAAKEAAAKEAAAK
13,051
MLVMS_P03355_3mut





EAAAKPAP
13,052
XMRV6_A1Z651_3mut





EAAAKEAAAK
13,053
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGG
13,054
BAEVM_P10272_3mut





EAAAKGSS
13,055
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,056
MLVFF_P26809_3mut





GGGPAPGSS
13,057
PERV_Q4VFZ2_3mutA_WS





GGGG
13,058
PERV_Q4VFZ2_3mut





EAAAKGGSGSS
13,059
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGS
13,060
MLVMS_P03355_3mutA_WS





EAAAK
13,061
MLVMS_P03355_3mutA_WS





GGGGSS
13,062
PERV_Q4VFZ2





PAPEAAAKGGS
13,063
MLVCB_P08361_3mut





GSS

MLVMS_P03355_3mut





GSAGSAAGSGEF
13,065
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,066
KORV_Q9TTC1-Pro_3mut





GGGGSGGGGS
13,067
AVIRE_P03360_3mutA





EAAAK
13,068
MLVMS_P03355_3mut





GGGPAPGGS
13,069
PERV_Q4VFZ2_3mut





GGGGSGGGGSGGGGS
13,070
MLVMS_P03355_PLV919





PAPGGG
13,071
MLVMS_P03355_3mutA_WS





GGGEAAAKPAP
13,072
PERV_Q4VFZ2_3mutA_WS





EAAAKPAPGSS
13,073
KORV_Q9TTC1-Pro_3mutA





PAPGSS
13,074
KORV_Q9TTC1_3mutA





GSAGSAAGSGEF
13,075
PERV_Q4VFZ2_3mut





PAPGGGGSS
13,076
KORV_Q9TTC1-Pro_3mutA





GSSGGGEAAAK
13,077
MLVCB_P08361_3mutA





GSS

AVIRE_P03360_3mutA





GSSGSSGSSGSS
13,079
XMRV6_A1Z651_3mutA





PAPEAAAKGGG
13,080
MLVMS_P03355_PLV919





GGGPAPEAAAK
13,081
MLVCB_P08361_3mutA





PAPGGGGGS
13,082
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAK
13,083
PERV_Q4VFZ2_3mutA_WS





GGGGGSPAP
13,084
MLVFF_P26809_3mutA





GSSGSSGSSGSSGSS
13,085
PERV_Q4VFZ2





GSSPAPEAAAK
13,086
MLVMS_P03355_PLV919





GSSGSSGSSGSSGSSGSS
13,087
MLVBM_Q7SVK7_3mutA_WS





GSSGSSGSSGSSGSSGSS
13,088
MLVMS_P03355_3mutA_WS





GGSPAPEAAAK
13,089
MLVAV_P03356_3mutA





GSSGGG
13,090
BAEVM_P10272_3mut





EAAAKGSSGGS
13,091
KORV_Q9TTC1-Pro_3mutA





GGSGSSEAAAK
13,092
MLVMS_P03355_3mutA_WS





GGGPAPEAAAK
13,093
MLVFF_P26809_3mutA





GGGPAPGGS
13,094
MLVMS_P03355_3mutA_WS





GGGGG
13,095
MLVMS_P03355_PLV919





GGGEAAAKPAP
13,096
MLVBM_Q7SVK7_3mutA_WS





GGGGSGGGGS
13,097
WMSV_P03359_3mut





GGGPAPEAAAK
13,098
PERV_Q4VFZ2_3mut





GGSGSSEAAAK
13,099
MLVMS_P03355_PLV919





EAAAKGGGPAP
13,100
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
13,101
KORV_Q9TTC1-Pro_3mutA





PAPAP
13,102
WMSV_P03359_3mutA





GGSPAPGSS
13,103
MLVAV_P03356_3mutA





GGSGGGPAP
13,104
MLVMS_P03355_3mut





GGSPAP
13,105
MLVMS_P03355_PLV919





EAAAKGGSPAP
13,106
PERV_Q4VFZ2_3mut





GSSPAPGGG
13,107
KORV_Q9TTC1-Pro_3mutA





GSAGSAAGSGEF
13,108
MLVMS_P03355_3mut





GGSPAP
13,109
PERV_Q4VFZ2_3mut





GSSGSS
13,110
KORV_Q9TTC1-Pro_3mut





GGGPAPGSS
13,111
MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,112
FOAMV_P14350





PAPGSSGGG
13,113
MLVMS_P03355_PLV919





GGSEAAAKPAP
13,114
BAEVM_P10272_3mutA





GGGGGS
13,115
MLVCB_P08361_3mutA





PAPEAAAKGGS
13,116
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,117
BAEVM_P10272_3mutA





GGSEAAAK
13,118
BAEVM_P10272_3mutA





GSSPAPEAAAK
13,119
MLVMS_P03355_3mutA_WS





PAPGGG
13,120
WMSV_P03359_3mut





EAAAKPAP
13,121
PERV_Q4VFZ2_3mut





GSSGSSGSSGSSGSS
13,122
WMSV_P03359_3mut





PAPGGG
13,123
MLVBM_Q7SVK7_3mutA_WS





GGSGGGEAAAK
13,124
BAEVM_P10272_3mutA





PAPGGS
13,125
MLVMS_P03355_3mut





GGSGGSGGSGGS
13,126
MLVBM_Q7SVK7_3mutA_WS





EAAAKEAAAKEAAAKEAAAK
13,127
PERV_Q4VFZ2_3mut





GGSEAAAKGGG
13,128
WMSV_P03359_3mutA





GGGPAP
13,129
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,130
XMRV6_A1Z651_3mut





GGSPAPGSS
13,131
KORV_Q9TTC1_3mut





GGGPAPGSS
13,132
MLVMS_P03355_3mut





GGGGSSGGS
13,133
BAEVM_P10272_3mutA





GGGEAAAKGSS
13,134
KORV_Q9TTC1-Pro_3mutA





PAPAP
13,135
MLVBM_Q7SVK7_3mutA_WS





GGSPAPGGG
13,136
PERV_Q4VFZ2_3mut





PAPGSS
13,137
PERV_Q4VFZ2_3mutA_WS





GSSGGSPAP
13,138
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGGGSEAAAK
13,139
PERV_Q4VFZ2_3mut





GSSEAAAKGGS
13,140
KORV_Q9TTC1-Pro_3mut





PAPAPAPAP
13,141
KORV_Q9TTC1-Pro_3mutA





GGSEAAAKPAP
13,142
WMSV_P03359_3mutA





PAPGGS
13,143
FLV_P10273_3mutA





EAAAKGGGPAP
13,144
PERV_Q4VFZ2_3mut





GGSGSSGGG
13,145
AVIRE_P03360_3mutA





EAAAKGGSGSS
13,146
BAEVM_P10272_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,147
MLVCB_P08361_3mutA





GSSEAAAKGGS
13,148
XMRV6_A1Z651_3mutA





GGGGG
13,149
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,150
SFV3L_P27401_2mutA





GGGEAAAKGSS
13,151
MLVMS_P03355_PLV919





EAAAKGGGGSEAAAK
13,152
KORV_Q9TTC1_3mutA





EAAAKGGG
13,153
AVIRE_P03360_3mut





GGSGGG
13,154
MLVMS_P03355_3mutA_WS





GGSGSSGGG
13,155
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,156
KORV_Q9TTC1_3mut





GGGGSEAAAKGGGGS
13,157
KORV_Q9TTC1_3mutA





PAPAPAPAPAP
13,158
FLV_P10273_3mutA





GGS

MLVBM_Q7SVK7_3mutA_WS





GGGGGSEAAAK
13,160
MLVBM_Q7SVK7_3mutA_WS





GSSGSSGSSGSSGSS
13,161
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAK
13,162
MLVMS_P03355_3mut





GGSGSSGGG
13,163
PERV_Q4VFZ2_3mut





PAP

MLVFF_P26809_3mut





GSSPAPEAAAK
13,165
MLVAV_P03356_3mutA





EAAAKGGGGSS
13,166
MLVMS_P03355_3mut





GGGEAAAKGGS
13,167
XMRV6_A1Z651_3mut





GGSGGGPAP
13,168
MLVBM_Q7SVK7_3mutA_WS





GSAGSAAGSGEF
13,169
BAEVM_P10272_3mutA





GSSEAAAK
13,170
MLVCB_P08361_3mut





PAPGSS
13,171
MLVMS_P03355_3mut





EAAAKEAAAKEAAAK
13,172
MLVAV_P03356_3mutA





GSAGSAAGSGEF
13,173
XMRV6_A1Z651_3mutA





GSSGSSGSSGSS
13,174
BAEVM_P10272_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,175
KORV_Q9TTC1-Pro_3mut





GGGGSSEAAAK
13,176
WMSV_P03359_3mut





GSSGGGEAAAK
13,177
MLVBM_Q7SVK7_3mutA_WS





EAAAKPAP
13,178
MLVFF_P26809_3mutA





GGSPAPGGG
13,179
KORV_Q9TTC1_3mutA





PAPEAAAK
13,180
FLV_P10273_3mutA





GSSGSSGSS
13,181
MLVBM_Q7SVK7_3mutA_WS





GSSGGGEAAAK
13,182
FLV_P10273_3mutA





GGSPAP
13,183
MLVBM_Q7SVK7_3mutA_WS





GSAGSAAGSGEF
13,184
KORV_Q9TTC1-Pro_3mutA





PAPGGSEAAAK
13,185
MLVMS_P03355_PLV919





GGSPAPEAAAK
13,186
MLVBM_Q7SVK7_3mutA_WS





GGGGGSPAP
13,187
MLVBM_Q7SVK7_3mutA_WS





EAAAKGSSPAP
13,188
WMSV_P03359_3mut





EAAAKGGGPAP
13,189
MLVBM_Q7SVK7_3mutA_WS





PAPGSS
13,190
KORV_Q9TTC1-Pro_3mutA





GGSGSSGGG
13,191
BAEVM_P10272_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,192
FFV_093209-Pro_2mut





GGSGGSGGSGGSGGSGGS
13,193
WMSV_P03359_3mutA





GGSGGSGGS
13,194
PERV_Q4VFZ2_3mutA_WS





GGGGG
13,195
PERV_Q4VFZ2_3mutA_WS





GGGPAP
13,196
FLV_P10273_3mutA





PAPGGSGGG
13,197
XMRV6_A1Z651_3mutA





GGGGSEAAAKGGGGS
13,198
XMRV6_A1Z651_3mut





EAAAKGSSGGG
13,199
KORV_Q9TTC1-Pro_3mutA





GSSGGSEAAAK
13,200
WMSV_P03359_3mut





EAAAKGGSGSS
13,201
PERV_Q4VFZ2_3mut





PAPAPAPAPAP
13,202
PERV_Q4VFZ2_3mut





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,203
MLVMS_P03355_3mutA_WS





GGGGGGG
13,204
KORV_Q9TTC1_3mutA





EAAAK
13,205
KORV_Q9TTC1-Pro_3mutA





GGGEAAAKGGS
13,206
KORV_Q9TTC1-Pro_3mutA





GGGEAAAKGGS
13,207
PERV_Q4VFZ2_3mutA_WS





GGGGGSPAP
13,208
XMRV6_A1Z651_3mut





GGGGSGGGGSGGGGSGGGGS
13,209
MLVFF_P26809_3mut





GGGGGGG
13,210
MLVFF_P26809_3mut





PAPAPAPAPAPAP
13,211
AVIRE_P03360_3mutA





GSSPAPGGG
13,212
FLV_P10273_3mutA





GGGGGSPAP
13,213
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGS
13,214
MLVMS_P03355_3mut





GGGGGGGGSGGGGS
13,215
KORV_Q9TTC1_3mut





GSSEAAAKGGS
13,216
MLVAV_P03356_3mutA





GSSGSSGSSGSSGSS
13,217
MLVMS_P03355_3mut





EAAAKGGGGGS
13,218
PERV_Q4VFZ2_3mutA_WS





GSSGGGGGS
13,219
PERV_Q4VFZ2_3mut





GGGEAAAKPAP
13,220
MLVMS_P03355_3mut





GSSGGSPAP
13,221
PERV_Q4VFZ2_3mutA_WS





GSSGGGPAP
13,222
BAEVM_P10272_3mutA





GGGGGSGSS
13,223
MLVMS_P03355_PLV919





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,224
BAEVM_P10272_3mut





PAPEAAAK
13,225
MLVMS_P03355_3mut





GGGGSGGGGSGGGGS
13,226
FLV_P10273_3mutA





GGSGSSGGG
13,227
WMSV_P03359_3mutA





EAAAKGGS
13,228
PERV_Q4VFZ2_3mut





EAAAKGSSPAP
13,229
MLVCB_P08361_3mut





EAAAKGGSGSS
13,230
WMSV_P03359_3mutA





GSSGSS
13,231
PERV_Q4VFZ2_3mutA_WS





PAPAPAPAP
13,232
MLVMS_P03355_PLV919





GGSGGG
13,233
PERV_Q4VFZ2_3mutA_WS





GSS

MLVBM_Q7SVK7_3mutA_WS





PAP

KORV_Q9TTC1-Pro_3mutA





GGSGSSEAAAK
13,236
MLVFF_P26809_3mut





PAPEAAAKGSS
13,237
KORV_Q9TTC1-Pro_3mutA





GGSGGS
13,238
MLVCB_P08361_3mutA





GGGGGGG
13,239
PERV_Q4VFZ2_3mutA_WS





GGSPAPEAAAK
13,240
MLVBM_Q7SVK7_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,241
KORV_Q9TTC1_3mutA





GGSPAP
13,242
MLVMS_P03355_3mut





GGSEAAAKGGG
13,243
PERV_Q4VFZ2_3mut





GGGGSGGGGS
13,244
FLV_P10273_3mutA





GGGEAAAK
13,245
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,246
SFV3L_P27401_2mut





GGSEAAAKPAP
13,247
KORV_Q9TTC1-Pro_3mutA





GSSGGGEAAAK
13,248
MLVMS_P03355_PLV919





GGGGGSEAAAK
13,249
MLVMS_P03355_PLV919





EAAAKGGSGGG
13,250
MLVMS_P03355_3mutA_WS





GGGGSSPAP
13,251
MLVAV_P03356_3mutA





EAAAKEAAAK
13,252
MLVMS_P03355_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,253
SFV3L_P27401_2mut





GSSGSSGSSGSSGSS
13,254
MLVMS_P03355_PLV919





GSSGGG
13,255
KORV_Q9TTC1-Pro_3mutA





GSSGGS
13,256
MLVFF_P26809_3mutA





GGGGSGGGGS
13,257
XMRV6_A1Z651_3mutA





PAPGSS
13,258
MLVBM_Q7SVK7_3mutA_WS





GGGPAPEAAAK
13,259
XMRV6_A1Z651_3mutA





EAAAKGGS
13,260
MLVFF_P26809_3mut





GSS

KORV_Q9TTC1_3mutA





GGGG
13,262
PERV_Q4VFZ2_3mut





GGGGGSEAAAK
13,263
AVIRE_P03360_3mutA





GSSGSSGSSGSSGSS
13,264
MLVMS_P03355_PLV919





PAPGGSGGG
13,265
PERV_Q4VFZ2_3mut





GGGPAP
13,266
PERV_Q4VFZ2_3mut





GGGPAPEAAAK
13,267
AVIRE_P03360_3mutA





GGGEAAAK
13,268
MLVCB_P08361_3mut





GGG

MLVFF_P26809_3mutA





EAAAKPAPGSS
13,270
XMRV6_A1Z651_3mutA





GGSGSSEAAAK
13,271
PERV_Q4VFZ2_3mutA_WS





EAAAKGSS
13,272
MLVMS_P03355_3mut





GGSGSSEAAAK
13,273
BAEVM_P10272_3mut





GGSGGG
13,274
MLVBM_Q7SVK7_3mutA_WS





GGGPAP
13,275
MLVMS_P03355_PLV919





GGSPAPGGG
13,276
PERV_Q4VFZ2_3mutA_WS





GGGGGSEAAAK
13,277
MLVFF_P26809_3mutA





EAAAKGSSGGS
13,278
MLVBM_Q7SVK7_3mut





PAPAP
13,279
XMRV6_A1Z651_3mut





GSSPAPGGS
13,280
MLVBM_Q7SVK7_3mutA_WS





GSSEAAAKGGG
13,281
WMSV_P03359_3mutA





EAAAKGGGGGS
13,282
PERV_Q4VFZ2_3mut





GSSGSSGSSGSSGSS
13,283
MLVCB_P08361_3mutA





EAAAKGGGGSS
13,284
PERV_Q4VFZ2_3mut





EAAAKGSS
13,285
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,286
AVIRE_P03360_3mutA





EAAAKGGS
13,287
MLVCB_P08361_3mut





GSSGGSEAAAK
13,288
MLVAV_P03356_3mutA





EAAAKPAPGGS
13,289
PERV_Q4VFZ2_3mut





GGSGGS
13,290
MLVAV_P03356_3mutA





EAAAKGSSGGG
13,291
AVIRE_P03360_3mutA





GGSGGSGGSGGS
13,292
PERV_Q4VFZ2_3mut





GGGGGGGG
13,293
KORV_Q9TTC1_3mutA





GGSGSSEAAAK
13,294
MLVCB_P08361_3mutA





EAAAKGGG
13,295
MLVBM_Q7SVK7_3mutA_WS





GGGGSGGGGSGGGGS
13,296
MLVCB_P08361_3mut





GGSGGSGGSGGS
13,297
PERV_Q4VFZ2_3mutA_WS





PAPAPAPAPAP
13,298
WMSV_P03359_3mut





EAAAKEAAAKEAAAKEAAAK
13,299
PERV_Q4VFZ2_3mut





GGSGGSGGS
13,300
XMRV6_A1Z651_3mutA





PAPGGGGSS
13,301
BAEVM_P10272_3mutA





GSSEAAAKGGS
13,302
MLVCB_P08361_3mut





GSSGGGPAP
13,303
MLVCB_P08361_3mutA





GGSGSS
13,304
MLVBM_Q7SVK7_3mutA_WS





GGGGGSEAAAK
13,305
MLVAV_P03356_3mutA





GSSEAAAK
13,306
PERV_Q4VFZ2_3mutA_WS





GGGGGSGSS
13,307
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGSGSS
13,308
MLVFF_P26809_3mut





PAP

FLV_P10273_3mutA





GGGGG
13,310
MLVMS_P03355_3mutA_WS





EAAAK
13,311
PERV_Q4VFZ2_3mut





GSS

FLV_P10273_3mutA





PAPAPAPAPAPAP
13,313
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAKEAAAK
13,314
MLVCB_P08361_3mut





EAAAKGGGGSEAAAK
13,315
XMRV6_A1Z651_3mut





PAPGGSGGG
13,316
MLVBM_Q7SVK7_3mutA_WS





GGSGGGPAP
13,317
WMSV_P03359_3mutA





GGGGSSEAAAK
13,318
MLVBM_Q7SVK7_3mutA_WS





PAPGGGGSS
13,319
MLVCB_P08361_3mut





GGSGGSGGSGGS
13,320
PERV_Q4VFZ2_3mutA_WS





PAPGGSGGG
13,321
MLVMS_P03355_3mutA_WS





GSSPAPGGS
13,322
MLVCB_P08361_3mutA





GSSGSSGSS
13,323
MLVFF_P26809_3mut





PAPGGGGGS
13,324
MLVBM_Q7SVK7_3mutA_WS





GSSPAP
13,325
PERV_Q4VFZ2_3mut





GGSGGG
13,326
KORV_Q9TTC1-Pro_3mut





EAAAKGGGGSEAAAK
13,327
PERV_Q4VFZ2_3mutA_WS





GGSPAPEAAAK
13,328
PERV_Q4VFZ2_3mutA_WS





EAAAKPAP
13,329
BAEVM_P10272_3mut





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,330
MLVMS_P03355_3mut





EAAAKGGGGSS
13,331
MLVFF_P26809_3mut





EAAAKEAAAK
13,332
MLVCB_P08361_3mut





GSSEAAAKGGS
13,333
PERV_Q4VFZ2_3mut





GGSPAP
13,334
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAKEAAAK
13,335
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
13,336
BAEVM_P10272_3mut





PAPEAAAK
13,337
MLVMS_P03355_3mut





GSSGGSPAP
13,338
PERV_Q4VFZ2





GGGPAPGGS
13,339
BAEVM_P10272_3mutA





EAAAKPAPGGS
13,340
MLVMS_P03355_PLV919





GGGGSGGGGS
13,341
PERV_Q4VFZ2





GGGEAAAK
13,342
KORV_Q9TTC1-Pro_3mut





EAAAKGGGGGS
13,343
FLV_P10273_3mutA





GGSPAPGSS
13,344
MLVMS_P03355_3mut





GSSPAPEAAAK
13,345
MLVMS_P03355_3mutA_WS





GSAGSAAGSGEF
13,346
MLVBM_Q7SVK7_3mutA_WS





EAAAK
13,347
BAEVM_P10272_3mutA





EAAAKGGGGSS
13,348
BAEVM_P10272_3mutA





GGG

WMSV_P03359_3mut





GGSGSSPAP
13,350
BAEVM_P10272_3mut





GGSEAAAKPAP
13,351
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGSGSS
13,352
MLVCB_P08361_3mut





PAPGSS
13,353
MLVAV_P03356_3mutA





PAPEAAAKGGG
13,354
MLVCB_P08361_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,355
FOAMV_P14350-Pro_2mut





GSSGSSGSS
13,356
PERV_Q4VFZ2_3mut





PAPGGG
13,357
MLVMS_P03355_3mut





PAPGGS
13,358
PERV_Q4VFZ2_3mut





GSSGGG
13,359
MLVMS_P03355_PLV919





GSSGSSGSSGSSGSSGSS
13,360
WMSV_P03359_3mut





PAP

AVIRE_P03360_3mutA





EAAAKGSSPAP
13,362
MLVBM_Q7SVK7_3mutA_WS





GSSGSSGSSGSS
13,363
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGS
13,364
AVIRE_P03360





GGGGS
13,365
PERV_Q4VFZ2_3mut





EAAAKGSSGGG
13,366
MLVBM_Q7SVK7_3mutA_WS





GGGGGG
13,367
KORV_Q9TTC1-Pro_3mut





GGSGSSEAAAK
13,368
PERV_Q4VFZ2_3mut





GSSPAPEAAAK
13,369
MLVBM_Q7SVK7_3mutA_WS





GGGGSGGGGS
13,370
MLVBM_Q7SVK7_3mutA_WS





GSSGGGGGS
13,371
MLVAV_P03356_3mutA





GSAGSAAGSGEF
13,372
WMSV_P03359_3mutA





GGGEAAAKGSS
13,373
BAEVM_P10272_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,374
FFV_093209-Pro_2mut





PAPGGSGGG
13,375
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
13,376
SFV3L_P27401_2mut





GGSGSSPAP
13,377
MLVMS_P03355_PLV919





GGGGGG
13,378
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAK
13,379
PERV_Q4VFZ2_3mut





EAAAKGSSPAP
13,380
MLVFF_P26809_3mut





GGGPAPGGS
13,381
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,382
SFV3L_P27401





PAP

PERV_Q4VFZ2_3mut





EAAAKGGS
13,384
MLVMS_P03355_PLV919





GSSGGSEAAAK
13,385
WMSV_P03359_3mutA





GGSGSSEAAAK
13,386
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAK
13,387
PERV_Q4VFZ2





GGSGGGEAAAK
13,388
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGSGGGGS
13,389
BAEVM_P10272_3mut





EAAAKGSS
13,390
XMRV6_A1Z651_3mutA





GSSGGGGGS
13,391
WMSV_P03359_3mutA





GSSGSSGSSGSSGSSGSS
13,392
MLVFF_P26809_3mutA





GGSGSS
13,393
MLVAV_P03356_3mutA





EAAAKGGGGSEAAAK
13,394
MLVMS_P03355_PLV919





EAAAKGGGPAP
13,395
PERV_Q4VFZ2





GGSEAAAKGGG
13,396
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,397
MLVBM_Q7SVK7_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,398
KORV_Q9TTC1-Pro_3mutA





GSSPAPEAAAK
13,399
MLVFF_P26809_3mutA





GGGGSEAAAKGGGGS
13,400
PERV_Q4VFZ2_3mut





GSSGSSGSSGSS
13,401
PERV_Q4VFZ2_3mut





GGSEAAAK
13,402
MLVFF_P26809_3mutA





GGGGGGGG
13,403
MLVMS_P03355_3mut





GSSGGG
13,404
XMRV6_A1Z651_3mutA





EAAAKGGS
13,405
BAEVM_P10272_3mutA





GGGGS
13,406
BAEVM_P10272_3mutA





GGSEAAAKGGG
13,407
KORV_Q9TTC1-Pro_3mutA





GGSGSSGGG
13,408
KORV_Q9TTC1_3mutA





GGSGSSEAAAK
13,409
WMSV_P03359_3mut





EAAAKGGSGSS
13,410
MLVBM_Q7SVK7_3mutA_WS





GGS

BAEVM_P10272_3mutA





GGGPAPGSS
13,412
WMSV_P03359_3mutA





GSSGSSGSSGSSGSS
13,413
AVIRE_P03360_3mut





GGGEAAAKPAP
13,414
XMRV6_A1Z651_3mut





GSSGGG
13,415
MLVFF_P26809_3mutA





GGSPAPGSS
13,416
PERV_Q4VFZ2_3mut





PAPGGS
13,417
MLVCB_P08361_3mut





PAPAPAPAPAP
13,418
KORV_Q9TTC1_3mutA





GSSGGS
13,419
MLVCB_P08361_3mutA





GSSGGSEAAAK
13,420
PERV_Q4VFZ2_3mut





EAAAKGSSGGS
13,421
MLVMS_P03355_PLV919





EAAAKGGG
13,422
WMSV_P03359_3mut





PAPGGGGGS
13,423
BAEVM_P10272_3mutA





GGGGSEAAAKGGGGS
13,424
WMSV_P03359_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,425
MLVMS_P03355_3mutA_WS





GGS

KORV_Q9TTC1-Pro_3mutA





GSSGGSPAP
13,427
BAEVM_P10272_3mutA





GGG

MLVMS_P03355_PLV919





PAPGSS
13,429
KORV_Q9TTC1-Pro_3mut





GGSEAAAKGGG
13,430
FLV_P10273_3mutA





GGSEAAAKPAP
13,431
PERV_Q4VFZ2_3mutA_WS





GGGGSSPAP
13,432
XMRV6_A1Z651_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
13,433
PERV_Q4VFZ2_3mutA_WS





GGGG
13,434
PERV_Q4VFZ2_3mutA_WS





GGSEAAAKPAP
13,435
MLVMS_P03355_3mut





PAPGSSGGG
13,436
MLVMS_P03355_3mutA_WS





PAPEAAAKGGS
13,437
AVIRE_P03360_3mut





GGGGSSPAP
13,438
MLVMS_P03355_3mutA_WS





GGGGSGGGGSGGGGSGGGGS
13,439
PERV_Q4VFZ2_3mut





GGGEAAAK
13,440
MLVMS_P03355_3mut





GGGGSS
13,441
MLVFF_P26809_3mut





GGSPAPGSS
13,442
XMRV6_A1Z651_3mut





GGGGS
13,443
KORV_Q9TTC1-Pro_3mutA





EAAAKGSSGGS
13,444
FLV_P10273_3mutA





GSS

MLVMS_P03355_PLV919





GGGG
13,446
MLVMS_P03355_PLV919





GSSGGS
13,447
MLVMS_P03355_PLV919





GGSGGSGGSGGS
13,448
MLVMS_P03355_3mut





PAPEAAAKGGS
13,449
MLVMS_P03355_3mut





EAAAKGSSGGG
13,450
BAEVM_P10272_3mutA





GSSEAAAK
13,451
KORV_Q9TTC1-Pro_3mutA





GSAGSAAGSGEF
13,452
KORV_Q9TTC1_3mutA





GGGGGSEAAAK
13,453
MLVCB_P08361_3mut





GGGG
13,454
WMSV_P03359_3mut





GGGGSSEAAAK
13,455
MLVMS_P03355_PLV919





PAPGGG
13,456
WMSV_P03359_3mutA





EAAAKGGSGGG
13,457
MLVAV_P03356_3mutA





GGGPAPGGS
13,458
MLVMS_P03355_3mut





EAAAKPAP
13,459
PERV_Q4VFZ2_3mutA_WS





GSSGSSGSS
13,460
KORV_Q9TTC1-Pro_3mutA





GSSPAPGGS
13,461
XMRV6_A1Z651_3mut





GGGGGSPAP
13,462
BAEVM_P10272_3mutA





GGSGSSGGG
13,463
PERV_Q4VFZ2_3mutA_WS





GGGEAAAKGSS
13,464
AVIRE_P03360_3mut





GSSEAAAK
13,465
FLV_P10273_3mutA





EAAAK
13,466
MLVMS_P03355_3mut





EAAAKGGSGSS
13,467
WMSV_P03359_3mut





GSSEAAAKGGG
13,468
PERV_Q4VFZ2_3mut





PAPGSSGGG
13,469
BAEVM_P10272_3mutA





EAAAKGGGGGS
13,470
MLVMS_P03355_3mut





GGSEAAAKPAP
13,471
AVIRE_P03360_3mut





GGGPAPGGS
13,472
XMRV6_A1Z651_3mut





GGGGS
13,473
KORV_Q9TTC1_3mutA





GGSGGSGGSGGSGGS
13,474
XMRV6_A1Z651_3mut





GGGPAP
13,475
KORV_Q9TTC1-Pro_3mut





EAAAKPAP
13,476
MLVBM_Q7SVK7_3mutA_WS





GGSEAAAK
13,477
MLVMS_P03355_PLV919





GSSEAAAKPAP
13,478
KORV_Q9TTC1-Pro_3mutA





GGSGSS
13,479
MLVMS_P03355_3mut





EAAAKPAPGGG
13,480
PERV_Q4VFZ2_3mut





GGSPAPEAAAK
13,481
KORV_Q9TTC1_3mutA





GGSEAAAKGGG
13,482
AVIRE_P03360_3mutA





GGGGSEAAAKGGGGS
13,483
MLVMS_P03355_PLV919





GSSGGGEAAAK
13,484
KORV_Q9TTC1-Pro_3mutA





EAAAKGGGPAP
13,485
WMSV_P03359_3mut





GSSPAP
13,486
XMRV6_A1Z651_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,487
SFV3L_P27401-Pro





GGSEAAAKGSS
13,488
MLVMS_P03355_PLV919





GSSGGSEAAAK
13,489
KORV_Q9TTC1-Pro_3mutA





GGSEAAAKGSS
13,490
KORV_Q9TTC1-Pro_3mutA





EAAAKGGG
13,491
AVIRE_P03360_3mutA





GSSGGSEAAAK
13,492
BAEVM_P10272_3mutA





GGGGSEAAAKGGGGS
13,493
KORV_Q9TTC1-Pro_3mut





PAPGSSEAAAK
13,494
MLVMS_P03355_3mut





PAPEAAAK
13,495
WMSV_P03359_3mut





PAPGGSGSS
13,496
PERV_Q4VFZ2_3mutA_WS





PAPGSS
13,497
BAEVM_P10272_3mut





PAPGGGGGS
13,498
MLVMS_P03355_3mut





EAAAKPAPGSS
13,499
MLVBM_Q7SVK7_3mutA_WS





GSSPAPGGS
13,500
MLVMS_P03355_PLV919





GGSGSSEAAAK
13,501
MLVMS_P03355_3mut





GGGGGG
13,502
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAKEAAAK
13,503
MLVBM_Q7SVK7_3mut





GGSPAPGSS
13,504
MLVMS_P03355_PLV919





PAPAPAPAPAP
13,505
MLVCB_P08361_3mut





GGSGSSPAP
13,506
WMSV_P03359_3mutA





EAAAKGGSGGG
13,507
PERV_Q4VFZ2_3mutA_WS





GSSGSSGSSGSSGSS
13,508
PERV_Q4VFZ2_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,509
KORV_Q9TTC1_3mutA





GSSGGGEAAAK
13,510
WMSV_P03359_3mutA





GSSGGSEAAAK
13,511
FLV_P10273_3mutA





GGGGGGGG
13,512
PERV_Q4VFZ2_3mut





PAPGGSEAAAK
13,513
FLV_P10273_3mutA





GGGGSSPAP
13,514
BAEVM_P10272_3mutA





PAPAPAPAP
13,515
WMSV_P03359_3mut





GGSEAAAKPAP
13,516
PERV_Q4VFZ2_3mut





PAPGGSGGG
13,517
BAEVM_P10272_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,518
MLVMS_P03355_3mut





GGGGSGGGGSGGGGS
13,519
PERV_Q4VFZ2_3mut





GGSGGGPAP
13,520
PERV_Q4VFZ2_3mut





GGGPAPEAAAK
13,521
MLVFF_P26809_3mut





GGGGGSGSS
13,522
MLVMS_P03355_3mutA_WS





GSS

MLVCB_P08361_3mut





GGGGGSPAP
13,524
MLVMS_P03355_PLV919





GGSPAP
13,525
MLVAV_P03356_3mutA





GGGPAPGGS
13,526
KORV_Q9TTC1-Pro_3mutA





PAPGSSGGG
13,527
FLV_P10273_3mutA





PAPGSSGGG
13,528
WMSV_P03359_3mutA





PAPGGS
13,529
MLVBM_Q7SVK7_3mutA_WS





GGGEAAAKGSS
13,530
PERV_Q4VFZ2_3mutA_WS





GGSEAAAKGSS
13,531
MLVBM_Q7SVK7_3mutA_WS





PAPGGSEAAAK
13,532
MLVCB_P08361_3mut





GGSEAAAKGGG
13,533
XMRV6_A1Z651_3mutA





GGSGGGGSS
13,534
WMSV_P03359_3mut





GGGEAAAKPAP
13,535
KORV_Q9TTC1_3mutA





EAAAKGSS
13,536
KORV_Q9TTC1-Pro_3mut





PAPEAAAKGSS
13,537
MLVFF_P26809_3mut





GSAGSAAGSGEF
13,538
PERV_Q4VFZ2_3mut





EAAAKGGGGGS
13,539
WMSV_P03359_3mut





EAAAKGSSPAP
13,540
WMSV_P03359_3mutA





GGGGSEAAAKGGGGS
13,541
XMRV6_A1Z651_3mutA





GSSEAAAKPAP
13,542
SFV3L_P27401-Pro_2mutA





0
13,543
PERV_Q4VFZ2_3mutA_WS





PAPGGS
13,544
BAEVM_P10272_3mut





PAP

AVIRE_P03360_3mut





PAPAPAP
13,546
MLVBM_Q7SVK7_3mutA_WS





GGGG
13,547
PERV_Q4VFZ2_3mutA_WS





GSSGGSEAAAK
13,548
MLVBM_Q7SVK7_3mut





GGSGGGGSS
13,549
MLVFF_P26809_3mut





GGGGSSGGS
13,550
AVIRE_P03360_3mutA





GSSPAPGGG
13,551
PERV_Q4VFZ2_3mutA_WS





GGSEAAAKPAP
13,552
MLVMS_P03355_PLV919





PAP

KORV_Q9TTC1-Pro_3mut





GSSGGS
13,554
PERV_Q4VFZ2_3mut





GGGGG
13,555
PERV_Q4VFZ2_3mut





GSSGGGPAP
13,556
FLV_P10273_3mutA





GSSEAAAKGGG
13,557
KORV_Q9TTC1-Pro_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,558
MLVCB_P08361_3mut





GGSEAAAKPAP
13,559
MLVCB_P08361_3mut





PAPAPAPAPAPAP
13,560
BAEVM_P10272_3mutA





GGGGSEAAAKGGGGS
13,561
MLVMS_P03355_3mut





EAAAKPAPGSS
13,562
MLVMS_P03355_3mut





GSSGSSGSSGSSGSS
13,563
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAKGSS
13,564
MLVAV_P03356_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,565
AVIRE_P03360_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,566
PERV_Q4VFZ2_3mut





GGSEAAAKGGG
13,567
PERV_Q4VFZ2_3mutA_WS





GGSGGGGSS
13,568
MLVFF_P26809_3mutA





PAPEAAAKGSS
13,569
MLVCB_P08361_3mut





GGG

PERV_Q4VFZ2_3mutA_WS





GGSGGGEAAAK
13,571
MLVMS_P03355_3mut





EAAAKGGGGSS
13,572
WMSV_P03359_3mut





GSSPAPGGG
13,573
WMSV_P03359_3mutA





EAAAKGSSGGG
13,574
PERV_Q4VFZ2_3mut





GGSGGGEAAAK
13,575
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGSGGSGGS
13,576
PERV_Q4VFZ2_3mutA_WS





EAAAKPAPGGS
13,577
PERV_Q4VFZ2_3mutA_WS





GGGGGSEAAAK
13,578
PERV_Q4VFZ2_3mutA_WS





GSSPAP
13,579
MLVFF_P26809_3mut





GGGEAAAKPAP
13,580
AVIRE_P03360_3mut





GSSGGSEAAAK
13,581
MLVMS_P03355_PLV919





EAAAKPAPGGS
13,582
WMSV_P03359_3mutA





PAPGGG
13,583
KORV_Q9TTC1_3mutA





EAAAKGSSPAP
13,584
KORV_Q9TTC1-Pro_3mut





GSSPAPEAAAK
13,585
MLVFF_P26809_3mut





GGSGGGEAAAK
13,586
MLVFF_P26809_3mutA





GSSGSSGSS
13,587
WMSV_P03359_3mutA





EAAAKGGS
13,588
BAEVM_P10272_3mut





EAAAKPAPGGS
13,589
KORV_Q9TTC1_3mutA





EAAAKPAPGGS
13,590
BAEVM_P10272_3mutA





GSSGGGGGS
13,591
PERV_Q4VFZ2_3mut





PAPGGGGSS
13,592
PERV_Q4VFZ2_3mut





GSSGSSGSS
13,593
WMSV_P03359_3mut





EAAAKEAAAKEAAAKEAAAK
13,594
WMSV_P03359_3mut





GGS

AVIRE_P03360_3mut





EAAAKPAPGSS
13,596
MLVFF_P26809_3mut





EAAAKGGG
13,597
KORV_Q9TTC1_3mut





PAPGSSEAAAK
13,598
MLVMS_P03355_3mut





PAPGSSGGS
13,599
MLVMS_P03355_PLV919





GSSPAPEAAAK
13,600
MLVMS_P03355_3mut





GSSGSSGSSGSSGSSGSS
13,601
WMSV_P03359_3mutA





GGGGS
13,602
BAEVM_P10272_3mut





GSSPAP
13,603
MLVMS_P03355_3mut





EAAAKGGGGSEAAAK
13,604
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAK
13,605
WMSV_P03359_3mutA





GGGGSSGGS
13,606
MLVCB_P08361_3mutA





PAPGGSEAAAK
13,607
BAEVM_P10272_3mut





EAAAKGGSPAP
13,608
MLVFF_P26809_3mut





GSSGGSGGG
13,609
MLVBM_Q7SVK7_3mutA_WS





GSSGGS
13,610
PERV_Q4VFZ2_3mut





PAPGGSGSS
13,611
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSGSS
13,612
KORV_Q9TTC1-Pro_3mutA





PAPAP
13,613
MLVCB_P08361_3mut





EAAAKGSSPAP
13,614
PERV_Q4VFZ2_3mutA_WS





EAAAKPAPGGG
13,615
MLVMS_P03355_PLV919





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,616
MLVBM_Q7SVK7_3mut





EAAAKGGGGSS
13,617
MLVMS_P03355_PLV919





PAPEAAAK
13,618
PERV_Q4VFZ2_3mut





EAAAKPAPGSS
13,619
BAEVM_P10272_3mutA





GGSPAP
13,620
PERV_Q4VFZ2_3mutA_WS





GGSGGS
13,621
BAEVM_P10272_3mutA





PAPEAAAKGSS
13,622
KORV_Q9TTC1_3mut





PAPGSS
13,623
MLVMS_P03355_PLV919





PAPAPAPAPAP
13,624
MLVAV_P03356_3mutA





GGG

XMRV6_A1Z651_3mutA





GGGPAP
13,626
PERV_Q4VFZ2_3mutA_WS





GSSPAPEAAAK
13,627
KORV_Q9TTC1_3mutA





PAP

BAEVM_P10272_3mutA





GGSPAP
13,629
BAEVM_P10272_3mutA





PAPEAAAKGGS
13,630
MLVMS_P03355_PLV919





PAPGSSGGS
13,631
PERV_Q4VFZ2_3mutA_WS





PAPAPAPAPAPAP
13,632
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAK
13,633
MLVCB_P08361_3mut





GGSGGSGGSGGSGGS
13,634
MLVMS_P03355_PLV919





EAAAKPAPGGS
13,635
MLVMS_P03355_3mut





GGSGGS
13,636
MLVMS_P03355_PLV919





EAAAKPAP
13,637
MLVMS_P03355_3mutA_WS





GGSEAAAK
13,638
XMRV6_A1Z651_3mutA





GGSGGG
13,639
KORV_Q9TTC1_3mut





GGSGGGEAAAK
13,640
PERV_Q4VFZ2_3mut





PAPEAAAKGGG
13,641
AVIRE_P03360





PAPAP
13,642
PERV_Q4VFZ2_3mut





GSS

KORV_Q9TTC1-Pro_3mutA





EAAAKGSSGGG
13,644
MLVAV_P03356_3mutA





GGSPAPGSS
13,645
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAK
13,646
MLVAV_P03356_3mut





EAAAKGGSPAP
13,647
BAEVM_P10272_3mutA





PAPAPAPAP
13,648
WMSV_P03359_3mutA





PAPGGSEAAAK
13,649
MLVMS_P03355_3mut





GGSGGSGGSGGS
13,650
WMSV_P03359_3mut





GGGGGSGSS
13,651
XMRV6_A1Z651_3mut





PAPGGSGGG
13,652
KORV_Q9TTC1_3mutA





GGS

MLVMS_P03355_3mut





EAAAK
13,654
WMSV_P03359_3mut





GGGEAAAKGSS
13,655
MLVBM_Q7SVK7_3mutA_WS





GGSPAPGSS
13,656
MLVCB_P08361_3mut





GGSEAAAKPAP
13,657
PERV_Q4VFZ2_3mut





GGGGSGGGGSGGGGSGGGGSGGGGS
13,658
MLVCB_P08361_3mutA





GGSGSS
13,659
BAEVM_P10272_3mutA





GGGEAAAKGSS
13,660
WMSV_P03359_3mutA





EAAAKGGSPAP
13,661
WMSV_P03359_3mut





GSSPAPEAAAK
13,662
MLVMS_P03355_3mut





GGSGGSGGSGGS
13,663
MLVMS_P03355_PLV919





GSSPAPEAAAK
13,664
WMSV_P03359_3mut





GSSGSSGSSGSS
13,665
PERV_Q4VFZ2





GGSGSSEAAAK
13,666
WMSV_P03359_3mutA





GGSGGG
13,667
MLVFF_P26809_3mut





GGSPAPGGG
13,668
MLVFF_P26809_3mut





GGSGGSGGS
13,669
BAEVM_P10272_3mutA





GGGGSSEAAAK
13,670
MLVBM_Q7SVK7_3mut





GGSPAPGSS
13,671
MLVMS_P03355_3mut





EAAAKPAPGSS
13,672
AVIRE_P03360_3mut





GGGGSSGGS
13,673
FLV_P10273_3mutA





GGSPAPEAAAK
13,674
PERV_Q4VFZ2_3mut





GGSEAAAK
13,675
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSS
13,676
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
13,677
MLVMS_P03355_PLV919





GGGGG
13,678
PERV_Q4VFZ2_3mut





GGSEAAAKGSS
13,679
MLVCB_P08361_3mutA





GSSGGG
13,680
MLVBM_Q7SVK7_3mutA_WS





PAPGSSGGG
13,681
KORV_Q9TTC1-Pro_3mutA





GGSGGS
13,682
BAEVM_P10272_3mut





EAAAKGGGGGS
13,683
MLVBM_Q7SVK7_3mutA_WS





GGSGSSPAP
13,684
MLVCB_P08361_3mut





PAPGSSGGG
13,685
KORV_Q9TTC1





PAPGGSGGG
13,686
MLVMS_P03355_3mut





GGGG
13,687
WMSV_P03359_3mutA





EAAAKGGSPAP
13,688
MLVCB_P08361_3mut





GSSGSS
13,689
FLV_P10273_3mutA





GGSEAAAKPAP
13,690
SFV3L_P27401_2mut





EAAAKGSSGGS
13,691
MLVAV_P03356_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,692
MLVAV_P03356_3mutA





EAAAKGGSGSS
13,693
PERV_Q4VFZ2_3mutA_WS





GGGGG
13,694
MLVCB_P08361_3mut





GGGEAAAK
13,695
BAEVM_P10272_3mut





GGSGGSGGSGGS
13,696
MLVCB_P08361_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,697
PERV_Q4VFZ2





PAPAPAPAPAP
13,698
MLVMS_P03355_3mutA_WS





EAAAKEAAAK
13,699
XMRV6_A1Z651_3mut





GSSGGSEAAAK
13,700
PERV_Q4VFZ2_3mutA_WS





PAPGGSEAAAK
13,701
KORV_Q9TTC1-Pro_3mutA





EAAAKGGGPAP
13,702
MLVBM_Q7SVK7_3mutA_WS





PAPGGSGSS
13,703
PERV_Q4VFZ2





SGSETPGTSESATPES
13,704
MLVMS_P03355_3mut





GGSGGS
13,705
MLVMS_P03355_PLV919





EAAAKGGS
13,706
FLV_P10273_3mut





GGSPAPGSS
13,707
MLVMS_P03355_3mutA_WS





EAAAKEAAAKEAAAKEAAAK
13,708
FFV_O93209_2mut





GSSGGSGGG
13,709
MLVMS_P03355_3mutA_WS





PAPGSSEAAAK
13,710
WMSV_P03359_3mut





PAPAPAPAPAPAP
13,711
KORV_Q9TTC1_3mutA





GGGGSS
13,712
BAEVM_P10272_3mut





GGGGSEAAAKGGGGS
13,713
AVIRE_P03360_3mut





GSSPAPEAAAK
13,714
KORV_Q9TTC1-Pro_3mutA





PAPEAAAKGGG
13,715
MLVBM_Q7SVK7_3mut





EAAAKEAAAK
13,716
WMSV_P03359_3mut





EAAAK
13,717
SFV3L_P27401-Pro_2mutA





GSSGGSGGG
13,718
XMRV6_A1Z651_3mutA





GGGEAAAKPAP
13,719
WMSV_P03359_3mutA





GGSGGS
13,720
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,721
FOAMV_P14350_2mutA





GGGGG
13,722
MLVAV_P03356_3mutA





GSSGGSEAAAK
13,723
BAEVM_P10272_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,724
SFV1_P23074





GGSGGGPAP
13,725
MLVCB_P08361_3mut





GGSGSS
13,726
PERV_Q4VFZ2_3mut





SGSETPGTSESATPES
13,727
MLVFF_P26809_3mut





EAAAKGGSPAP
13,728
MLVMS_P03355_3mut





PAPAP
13,729
PERV_Q4VFZ2_3mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,730
MLVBM_Q7SVK7_3mut





GGGGGS
13,731
BAEVM_P10272_3mutA





EAAAKEAAAK
13,732
AVIRE_P03360_3mut





GSSGGSEAAAK
13,733
PERV_Q4VFZ2_3mut





GGGEAAAK
13,734
WMSV_P03359_3mut





GSSGGGEAAAK
13,735
AVIRE_P03360_3mutA





GGG

XMRV6_A1Z651_3mut





GGGGSEAAAKGGGGS
13,737
BAEVM_P10272_3mut





GGGG
13,738
MLVMS_P03355_3mut





GGSGGS
13,739
MLVMS_P03355_3mutA_WS





GGSGGGGSS
13,740
MLVBM_Q7SVK7_3mutA_WS





GSSPAPGGS
13,741
PERV_Q4VFZ2_3mut





GSSPAPEAAAK
13,742
PERV_Q4VFZ2_3mutA_WS





EAAAKGGS
13,743
WMSV_P03359_3mut





GGSGGSGGSGGS
13,744
PERV_Q4VFZ2_3mut





GGGGSSEAAAK
13,745
KORV_Q9TTC1-Pro_3mut





PAPAPAPAPAPAP
13,746
MLVAV_P03356_3mut





EAAAKGSSGGG
13,747
MLVMS_P03355_PLV919





GGGGG
13,748
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,749
FFV_O93209_2mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,750
KORV_Q9TTC1-Pro_3mut





GGSPAPGGG
13,751
MLVMS_P03355_3mutA_WS





GGGEAAAKGGS
13,752
MLVMS_P03355_3mut





GGGEAAAK
13,753
PERV_Q4VFZ2_3mut





PAPEAAAKGGG
13,754
MLVMS_P03355_3mut





GSSGSSGSSGSSGSSGSS
13,755
BAEVM_P10272_3mutA





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,756
GALV_P21414_3mutA





EAAAKGGSPAP
13,757
FFV_O93209-Pro





EAAAKEAAAK
13,758
MLVFF_P26809_3mut





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,759
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGSGGS
13,760
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
13,761
SFV3L_P27401_2mutA





GSSGSSGSSGSSGSSGSS
13,762
BAEVM_P10272_3mut





GGGGS
13,763
MLVMS_P03355_PLV919





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
13,764
SFV1_P23074





GGGGSGGGGS
13,765
KORV_Q9TTC1-Pro_3mutA





GGGGSGGGGS
13,766
MLVMS_P03355_3mut





GGSGSS
13,767
KORV_Q9TTC1_3mutA





GSSPAPGGG
13,768
PERV_Q4VFZ2_3mut





GSSGGSPAP
13,769
PERV_Q4VFZ2_3mutA_WS





PAPGGS
13,770
PERV_Q4VFZ2_3mutA_WS





GGSPAPEAAAK
13,771
FOAMV_P14350_2mutA





GGGPAPGGS
13,772
SFV3L_P27401_2mut





PAPGSSGGG
13,773
MLVCB_P08361_3mut





GSSGGGEAAAK
13,774
AVIRE_P03360_3mut





GSSGGG
13,775
XMRV6_A1Z651_3mut





GSSGSS
13,776
PERV_Q4VFZ2_3mut





GSSGGG
13,777
MLVAV_P03356_3mutA





PAPGGGGGS
13,778
PERV_Q4VFZ2_3mut





GSSEAAAK
13,779
MLVMS_P03355_3mut





PAPGGG
13,780
FLV_P10273_3mutA





GGGGSGGGGS
13,781
PERV_Q4VFZ2_3mut





GSSGGS
13,782
MLVMS_P03355_PLV919





GGGGSGGGGS
13,783
SFV3L_P27401_2mut





EAAAKGGSGSS
13,784
FLV_P10273_3mutA





GSSEAAAKGGS
13,785
MLVMS_P03355_3mutA_WS





PAPGSSEAAAK
13,786
SFV3L_P27401_2mutA





GGGGSGGGGS
13,787
SFV3L_P27401-Pro_2mutA





PAPGSSEAAAK
13,788
PERV_Q4VFZ2_3mut





PAPGSSEAAAK
13,789
PERV_Q4VFZ2





GGSPAPGGG
13,790
AVIRE_P03360_3mut





GGGGGS
13,791
PERV_Q4VFZ2_3mutA_WS





GGGGSSGGS
13,792
PERV_Q4VFZ2_3mut





PAPAPAPAP
13,793
AVIRE_P03360_3mutA





GGSGGS
13,794
WMSV_P03359_3mutA





GGGPAPGGS
13,795
PERV_Q4VFZ2_3mut





GGSGGSGGSGGSGGS
13,796
MLVMS_P03355_PLV919





GGSGGG
13,797
PERV_Q4VFZ2_3mut





EAAAKEAAAK
13,798
SFV3L_P27401_2mut





PAPGSS
13,799
XMRV6_A1Z651_3mut





GSSEAAAK
13,800
MLVFF_P26809_3mut





GGSPAPGGG
13,801
MLVMS_P03355_3mut





EAAAKGGG
13,802
WMSV_P03359_3mutA





GSSEAAAKGGS
13,803
PERV_Q4VFZ2_3mutA_WS





GSSGGSPAP
13,804
FFV_O93209





GGGGGS
13,805
KORV_Q9TTC1-Pro_3mut





GSSGGG
13,806
MLVCB_P08361_3mut





GSSGSS
13,807
MLVCB_P08361_3mutA





GGSEAAAKPAP
13,808
BAEVM_P10272_3mut





EAAAKGGGGSS
13,809
MLVCB_P08361_3mut





EAAAKPAPGGS
13,810
KORV_Q9TTC1-Pro_3mutA





GSSGSSGSSGSSGSS
13,811
MLVAV_P03356_3mutA





GGGGSEAAAKGGGGS
13,812
PERV_Q4VFZ2_3mutA_WS





GGSGSS
13,813
KORV_Q9TTC1-Pro_3mut





GSS

SFV3L_P27401-Pro_2mutA





PAPAP
13,815
BAEVM_P10272_3mut





EAAAKPAP
13,816
BAEVM_P10272





EAAAKEAAAKEAAAKEAAAKEAAAK
13,817
KORV_Q9TTC1-Pro_3mut





GGGGGGG
13,818
PERV_Q4VFZ2_3mutA_WS





GGGGS
13,819
MLVMS_P03355_3mut





GSSGGG
13,820
FLV_P10273_3mutA





PAPAPAPAPAP
13,821
FLV_P10273_3mut





EAAAKEAAAKEAAAK
13,822
WMSV_P03359_3mutA





GSSGGS
13,823
MLVBM_Q7SVK7_3mutA_WS





EAAAKPAPGGG
13,824
MLVMS_P03355_3mut





GSSPAPGGS
13,825
WMSV_P03359_3mut





PAPGSSGGG
13,826
PERV_Q4VFZ2_3mutA_WS





GSSGGG
13,827
AVIRE_P03360_3mutA





PAPGGSGSS
13,828
MLVFF_P26809_3mut





PAPGSS
13,829
PERV_Q4VFZ2_3mut





GGGGGSGSS
13,830
WMSV_P03359_3mutA





EAAAKGGGGSS
13,831
MLVBM_Q7SVK7_3mutA_WS





GGGGGGG
13,832
BAEVM_P10272_3mut





PAPEAAAKGSS
13,833
MLVMS_P03355_3mut





GGSGGGEAAAK
13,834
MLVMS_P03355_PLV919





EAAAKGGGGGS
13,835
MLVCB_P08361_3mut





PAPGGS
13,836
KORV_Q9TTC1-Pro_3mut





GGGG
13,837
FLV_P10273_3mutA





EAAAKGGSGSS
13,838
MLVBM_Q7SVK7_3mutA_WS





GGGGSSGGS
13,839
MLVMS_P03355_3mutA_WS





GGGGGGGG
13,840
WMSV_P03359_3mut





GGSGSSGGG
13,841
MLVMS_P03355_PLV919





GSSEAAAKGGS
13,842
KORV_Q9TTC1-Pro_3mutA





EAAAKPAPGSS
13,843
MLVCB_P08361_3mut





GGSPAPGSS
13,844
KORV_Q9TTC1_3mutA





PAPGSSGGG
13,845
BAEVM_P10272_3mut





EAAAKPAPGSS
13,846
WMSV_P03359_3mut





GGSPAPEAAAK
13,847
XMRV6_A1Z651_3mutA





GSSPAP
13,848
FLV_P10273_3mutA





GSS

BAEVM_P10272_3mutA





EAAAKPAPGGS
13,850
FLV_P10273_3mutA





GGSGSSPAP
13,851
FLV_P10273_3mutA





PAPGSSGGS
13,852
MLVMS_P03355_3mut





GSAGSAAGSGEF
13,853
PERV_Q4VFZ2_3mutA_WS





GSSGGSEAAAK
13,854
KORV_Q9TTC1_3mutA





GSSGGS
13,855
MLVMS_P03355_3mutA_WS





EAAAKGGGGSEAAAK
13,856
SFV3L_P27401_2mut





GSSGGS
13,857
PERV_Q4VFZ2_3mutA_WS





GGSPAPEAAAK
13,858
FLV_P10273_3mut





GGSEAAAKGSS
13,859
PERV_Q4VFZ2_3mutA_WS





GSSPAPEAAAK
13,860
PERV_Q4VFZ2_3mutA_WS





GGSGSSGGG
13,861
PERV_Q4VFZ2_3mut





GGGG
13,862
AVIRE_P03360_3mutA





GGSEAAAKPAP
13,863
WMSV_P03359_3mut





GSSGGSPAP
13,864
MLVAV_P03356_3mutA





GSSGGSEAAAK
13,865
MLVMS_P03355_3mut





PAPEAAAKGGS
13,866
KORV_Q9TTC1-Pro_3mut





GGSPAP
13,867
PERV_Q4VFZ2_3mutA_WS





GGSEAAAK
13,868
MLVAV_P03356_3mutA





EAAAKGGGGSEAAAK
13,869
KORV_Q9TTC1-Pro_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,870
MLVMS_P03355_PLV919





GSSEAAAK
13,871
KORV_Q9TTC1_3mutA





GGG

AVIRE_P03360





GGSEAAAKGSS
13,873
MLVBM_Q7SVK7_3mut





GGSEAAAKGSS
13,874
MLVMS_P03355_3mut





GGSPAPEAAAK
13,875
MLVCB_P08361_3mut





GGSGGGEAAAK
13,876
MLVCB_P08361_3mut





GGSEAAAKPAP
13,877
MLVMS_P03355_3mutA_WS





EAAAKGGSGSS
13,878
KORV_Q9TTC1-Pro_3mut





GGGEAAAKGGS
13,879
MLVCB_P08361_3mut





EAAAKGGGGSEAAAK
13,880
FLV_P10273_3mutA





GGSPAP
13,881
MLVFF_P26809_3mut





GGGGSSGGS
13,882
XMRV6_A1Z651_3mutA





PAP

MLVCB_P08361_3mut





GGS

SFV3L_P27401-Pro_2mutA





GGGGSGGGGS
13,885
MLVMS_P03355_3mut





GGGEAAAKGGS
13,886
MLVAV_P03356_3mutA





GSSGSSGSSGSSGSSGSS
13,887
MLVMS_P03355_PLV919





PAPGSS
13,888
MLVCB_P08361_3mut





GGSGGSGGS
13,889
MLVMS_P03355_PLV919





PAPGGSGGG
13,890
FLV_P10273_3mutA





GGGGSGGGGSGGGGS
13,891
FLV_P10273_3mut





GGSGSSGGG
13,892
KORV_Q9TTC1-Pro_3mutA





GGSGGSGGS
13,893
GALV_P21414_3mutA





GGGEAAAKGGS
13,894
WMSV_P03359_3mut





SGSETPGTSESATPES
13,895
KORV_Q9TTC1_3mutA





EAAAKGGGGGS
13,896
KORV_Q9TTC1-Pro_3mut





EAAAKGSSPAP
13,897
BAEVM_P10272_3mut





GGGG
13,898
MLVCB_P08361_3mut





GGGGSGGGGSGGGGSGGGGSGGGGS
13,899
MLVBM_Q7SVK7_3mut





GSSGGSGGG
13,900
MLVMS_P03355_PLV919





GGSGSS
13,901
MLVFF_P26809_3mut





EAAAKGGS
13,902
AVIRE_P03360_3mutA





GSSEAAAKGGS
13,903
MLVBM_Q7SVK7_3mutA_WS





EAAAKPAPGGG
13,904
WMSV_P03359_3mut





PAPGSSGGG
13,905
MLVCB_P08361_3mutA





GGGGSSEAAAK
13,906
KORV_Q9TTC1-Pro_3mutA





GSSEAAAKPAP
13,907
BAEVM_P10272_3mutA





PAPGGGEAAAK
13,908
MLVBM_Q7SVK7_3mutA_WS





GGSGGGEAAAK
13,909
MLVCB_P08361_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
13,910
FFV_O93209





EAAAKGGGGGS
13,911
GALV_P21414_3mutA





GGSPAPGGG
13,912
MLVMS_P03355_3mut





GSSGSSGSS
13,913
FLV_P10273_3mutA





EAAAK
13,914
MLVBM_Q7SVK7_3mut





GGGGSSGGS
13,915
MLVMS_P03355_3mut





GGSGSSPAP
13,916
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAK
13,917
BAEVM_P10272_3mut





GGGPAPGSS
13,918
MLVMS_P03355_3mut





GSSPAPGGS
13,919
PERV_Q4VFZ2_3mutA_WS





PAPAP
13,920
FLV_P10273_3mutA





PAPAPAPAP
13,921
PERV_Q4VFZ2_3mut





GGGGGSEAAAK
13,922
GALV_P21414_3mutA





GGGGGSGSS
13,923
BAEVM_P10272_3mutA





GGGEAAAKGSS
13,924
KORV_Q9TTC1_3mutA





GGGGGSPAP
13,925
AVIRE_P03360_3mut





GGGGGSEAAAK
13,926
SFV3L_P27401_2mutA





GGS

KORV_Q9TTC1_3mutA





GGGGGGG
13,928
PERV_Q4VFZ2_3mut





SGSETPGTSESATPES
13,929
SFV3L_P27401_2mutA





EAAAKGGSGGG
13,930
MLVMS_P03355_3mut





GGGGS
13,931
MLVFF_P26809_3mut





EAAAKGSSGGG
13,932
BAEVM_P10272_3mut





EAAAKPAPGGS
13,933
MLVF5_P26810_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
13,934
SFV3L_P27401_2mutA





GGSPAPGGG
13,935
WMSV_P03359_3mutA





GSAGSAAGSGEF
13,936
MLVFF_P26809_3mut





GGGGSSGGS
13,937
MLVMS_P03355_3mutA_WS





GGGGGGG
13,938
MLVCB_P08361_3mut





GSSEAAAK
13,939
WMSV_P03359_3mut





PAPGSS
13,940
FLV_P10273_3mutA





GSSGGG
13,941
PERV_Q4VFZ2_3mutA_WS





PAPGGG
13,942
MLVFF_P26809_3mut





GGGGGSPAP
13,943
MLVMS_P03355_3mut





GGSEAAAK
13,944
XMRV6_A1Z651_3mut





GSSGGG
13,945
PERV_Q4VFZ2_3mut





GGSGGSGGSGGS
13,946
MLVMS_P03355_3mut





PAPAP
13,947
AVIRE_P03360_3mut





GGSEAAAK
13,948
PERV_Q4VFZ2_3mut





GGGGS
13,949
MLVMS_P03355_PLV919





GGGG
13,950
BAEVM_P10272_3mutA





EAAAKGGGGSS
13,951
MLVCB_P08361_3mutA





EAAAKEAAAKEAAAK
13,952
GALV_P21414_3mutA





PAPGGGEAAAK
13,953
KORV_Q9TTC1





EAAAKGGSPAP
13,954
MLVMS_P03355_3mut





GGSGSSEAAAK
13,955
MLVMS_P03355_3mut





GGSPAPEAAAK
13,956
FLV_P10273_3mutA





GGGGGGG
13,957
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
13,958
SFV1_P23074_2mutA





EAAAKGSSGGS
13,959
MLVMS_P03355_3mut





GSSEAAAKPAP
13,960
MLVFF_P26809_3mut





GGGGSS
13,961
FLV_P10273_3mutA





EAAAKGGSGGG
13,962
AVIRE_P03360_3mutA





GGSGGS
13,963
PERV_Q4VFZ2_3mutA_WS





GGGGGSPAP
13,964
AVIRE_P03360_3mutA





EAAAKEAAAKEAAAK
13,965
XMRV6_A1Z651_3mut





PAPEAAAKGGS
13,966
FLV_P10273_3mutA





GSSGGSEAAAK
13,967
MLVCB_P08361_3mut





EAAAKGGSGGG
13,968
MLVMS_P03355





GGSGGGPAP
13,969
MLVMS_P03355_3mut





GGS

XMRV6_A1Z651_3mut





GGSEAAAKPAP
13,971
MLVFF_P26809_3mut





EAAAKGGG
13,972
MLVMS_P03355_PLV919





GSSGSSGSSGSS
13,973
WMSV_P03359_3mut





GGSGSSPAP
13,974
PERV_Q4VFZ2_3mut





GGGEAAAK
13,975
MLVMS_P03355_3mutA_WS





GSSPAPGGS
13,976
KORV_Q9TTC1-Pro_3mutA





GSSEAAAKGGG
13,977
SFV3L_P27401_2mut





EAAAKPAPGGS
13,978
MLVCB_P08361_3mut





GGSGGGEAAAK
13,979
PERV_Q4VFZ2





GGSGSS
13,980
MLVCB_P08361_3mut





GGSGGGEAAAK
13,981
MLVBM_Q7SVK7_3mutA_WS





GGSGGSGGSGGSGGSGGS
13,982
FLV_P10273_3mut





PAPEAAAKGSS
13,983
MLVMS_P03355_3mut





EAAAKGSSGGS
13,984
WMSV_P03359_3mutA





GGSGSSEAAAK
13,985
MLVCB_P08361_3mut





GGSGSSEAAAK
13,986
KORV_Q9TTC1_3mutA





GSSGGSGGG
13,987
MLVMS_P03355_PLV919





EAAAKGGSGGG
13,988
SFV3L_P27401-Pro_2mutA





GGSGGS
13,989
AVIRE_P03360_3mutA





GSAGSAAGSGEF
13,990
MLVMS_P03355_PLV919





GGSGSS
13,991
GALV_P21414_3mutA





GGGG
13,992
MLVFF_P26809_3mutA





GGGGSGGGGSGGGGSGGGGS
13,993
WMSV_P03359_3mut





SGSETPGTSESATPES
13,994
BAEVM_P10272_3mut





EAAAKEAAAKEAAAKEAAAK
13,995
FOAMV_P14350_2mutA





GGGEAAAKGGS
13,996
FLV_P10273_3mutA





GSSGGSEAAAK
13,997
MLVFF_P26809_3mut





EAAAKGGGGSS
13,998
MLVAV_P03356_3mut





PAPGGSEAAAK
13,999
KORV_Q9TTC1-Pro_3mut





EAAAK
14,000
XMRV6_A1Z651_3mut





GSSGSSGSSGSSGSSGSS
14,001
PERV_Q4VFZ2_3mut





GGGG
14,002
MLVCB_P08361_3mutA





GSSGSS
14,003
WMSV_P03359_3mutA





GSSGGSPAP
14,004
AVIRE_P03360_3mut





GGSGGSGGS
14,005
MLVCB_P08361_3mut





EAAAKGGGPAP
14,006
FLV_P10273_3mutA





GGGGSGGGGS
14,007
MLVCB_P08361_3mut





GGSEAAAKGSS
14,008
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,009
SFV3L_P27401_2mutA





GGSGSSEAAAK
14,010
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAKEAAAK
14,011
SFV3L_P27401-Pro_2mutA





GSSEAAAKGGS
14,012
FLV_P10273_3mutA





GGSGSS
14,013
PERV_Q4VFZ2





GGSGSSEAAAK
14,014
SFV3L_P27401-Pro_2mutA





GSSGSSGSS
14,015
XMRV6_A1Z651_3mutA





EAAAKGSSPAP
14,016
KORV_Q9TTC1_3mutA





EAAAKPAP
14,017
FLV_P10273_3mutA





GGSGSSEAAAK
14,018
KORV_Q9TTC1-Pro_3mut





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
14,019
KORV_Q9TTC1_3mutA





GGGGSGGGGSGGGGS
14,020
KORV_Q9TTC1-Pro_3mutA





GGGGGGG
14,021
FLV_P10273_3mut





EAAAKGSS
14,022
WMSV_P03359_3mut





EAAAKGGGPAP
14,023
MLVCB_P08361_3mut





GSSGSS
14,024
MLVBM_Q7SVK7_3mutA_WS





EAAAKGGGGGS
14,025
MLVFF_P26809_3mut





GGSGGGEAAAK
14,026
FLV_P10273_3mutA





PAPGSS
14,027
MLVFF_P26809_3mutA





PAPGSS
14,028
BAEVM_P10272_3mutA





GGSPAPGSS
14,029
AVIRE_P03360_3mut





GGGGSSEAAAK
14,030
MLVMS_P03355_3mut





GSSGGGGGS
14,031
FFV_O93209-Pro





EAAAKGSSPAP
14,032
PERV_Q4VFZ2_3mut





GSSPAPGGS
14,033
PERV_Q4VFZ2_3mut





GGGGGG
14,034
BAEVM_P10272_3mut





EAAAKGGGGSS
14,035
PERV_Q4VFZ2_3mutA_WS





PAPGGSEAAAK
14,036
KORV_Q9TTC1_3mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,037
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSS
14,038
MLVMS_P03355_3mut





EAAAKGSSGGG
14,039
MLVMS_P03355_PLV919





GGSEAAAKPAP
14,040
AVIRE_P03360_3mutA





GSSGSSGSSGSSGSS
14,041
WMSV_P03359_3mutA





GGGEAAAKPAP
14,042
FLV_P10273_3mutA





PAPGSSGGG
14,043
KORV_Q9TTC1_3mutA





GSSGSS
14,044
MLVMS_P03355_3mutA_WS





PAPEAAAK
14,045
BAEVM_P10272_3mut





GGGPAPGSS
14,046
PERV_Q4VFZ2





GSSGGSPAP
14,047
MLVFF_P26809_3mut





GGGGSS
14,048
SFV3L_P27401_2mut





PAPEAAAKGSS
14,049
SFV3L_P27401_2mut





GGSGGGPAP
14,050
XMRV6_A1Z651_3mutA





PAPGGS
14,051
BAEVM_P10272_3mutA





EAAAKGGGGGS
14,052
AVIRE_P03360_3mut





GSSGGSPAP
14,053
KORV_Q9TTC1-Pro_3mutA





GSSGGGGGS
14,054
WMSV_P03359_3mut





GGGEAAAKGGS
14,055
AVIRE_P03360_3mut





GGGEAAAKGSS
14,056
BAEVM_P10272_3mut





PAPEAAAKGSS
14,057
MLVAV_P03356_3mutA





GSSGSSGSSGSSGSS
14,058
MLVCB_P08361_3mut





GGSPAPGSS
14,059
FLV_P10273_3mutA





EAAAKGSSPAP
14,060
BAEVM_P10272_3mutA





GGSGGSGGSGGSGGSGGS
14,061
PERV_Q4VFZ2





GGGGSSEAAAK
14,062
FLV_P10273_3mutA





GGGGSSPAP
14,063
FFV_O93209





GSSGGSPAP
14,064
MLVMS_P03355_3mut





GGGPAPGSS
14,065
MLVMS_P03355_PLV919





PAPGSSGGS
14,066
PERV_Q4VFZ2_3mut





GGGGGSPAP
14,067
MLVFF_P26809_3mut





SGSETPGTSESATPES
14,068
MLVMS_P03355_3mutA_WS





GSSGSSGSSGSSGSS
14,069
KORV_Q9TTC1_3mutA





GSSPAPGGG
14,070
WMSV_P03359_3mut





PAPAPAPAPAPAP
14,071
SFV3L_P27401_2mutA





GGGPAPGGS
14,072
MLVMS_P03355_3mut





PAPGGSEAAAK
14,073
WMSV_P03359_3mut





GGGGSSEAAAK
14,074
FFV_O93209-Pro





GGSPAPGGG
14,075
FLV_P10273_3mutA





GSSPAPEAAAK
14,076
AVIRE_P03360_3mut





GGGEAAAK
14,077
FLV_P10273_3mutA





PAPEAAAKGGG
14,078
MLVCB_P08361_3mut





GGSPAPGGG
14,079
MLVCB_P08361_3mut





GGSGGGGSS
14,080
BAEVM_P10272_3mutA





GSSPAPEAAAK
14,081
MLVCB_P08361_3mut





GGSPAPGGG
14,082
KORV_Q9TTC1-Pro_3mutA





PAPGGSGSS
14,083
KORV_Q9TTC1_3mutA





GSSPAP
14,084
KORV_Q9TTC1-Pro_3mutA





SGSETPGTSESATPES
14,085
MLVMS_P03355





GSSGSSGSS
14,086
MLVAV_P03356_3mutA





PAPGSSGGS
14,087
PERV_Q4VFZ2_3mutA_WS





PAPGGS
14,088
KORV_Q9TTC1-Pro_3mutA





PAPEAAAKGGG
14,089
SFV3L_P27401-Pro_2mutA





GGSGGSGGS
14,090
BAEVM_P10272_3mut





PAPGGS
14,091
MLVFF_P26809_3mut





GSSGGSPAP
14,092
MLVMS_P03355_PLV919





GSSGGGGGS
14,093
FLV_P10273_3mutA





GGGGGSPAP
14,094
KORV_Q9TTC1-Pro_3mut





EAAAKPAPGSS
14,095
SFV3L_P27401-Pro_2mutA





EAAAKGGSPAP
14,096
KORV_Q9TTC1-Pro





GGGPAPEAAAK
14,097
MLVMS_P03355_PLV919





GGSEAAAKGSS
14,098
MLVMS_P03355





PAPEAAAKGSS
14,099
KORV_Q9TTC1_3mutA





PAPEAAAKGGS
14,100
WMSV_P03359_3mutA





GSSGGG
14,101
PERV_Q4VFZ2_3mutA_WS





EAAAKGGGGSS
14,102
MLVMS_P03355_PLV919





EAAAKGGSPAP
14,103
AVIRE_P03360_3mutA





GGGGSSGGS
14,104
MLVMS_P03355_PLV919





PAPEAAAKGSS
14,105
PERV_Q4VFZ2_3mutA_WS





EAAAKGGGGGS
14,106
BAEVM_P10272_3mut





GSSGGGGGS
14,107
MLVMS_P03355_3mut





PAPAPAPAP
14,108
KORV_Q9TTC1_3mutA





GGSGGSGGSGGS
14,109
MLVAV_P03356_3mut





PAPAPAPAP
14,110
SFV3L_P27401_2mut





GSSEAAAKPAP
14,111
MLVMS_P03355_3mut





GGSGGGEAAAK
14,112
SFV3L_P27401_2mutA





GSSGGSGGG
14,113
MLVMS_P03355_3mutA_WS





GGGGGSPAP
14,114
MLVCB_P08361_3mutA





GGGEAAAKGSS
14,115
XMRV6_A1Z651_3mutA





GGGGSSPAP
14,116
BAEVM_P10272_3mut





GGSGGG
14,117
PERV_Q4VFZ2_3mut





GGGGSS
14,118
MLVBM_Q7SVK7_3mutA_WS





EAAAKGSSGGS
14,119
PERV_Q4VFZ2_3mutA_WS





GSSGGGGGS
14,120
PERV_Q4VFZ2





EAAAKGSSGGS
14,121
PERV_Q4VFZ2_3mut





EAAAKEAAAK
14,122
MLVAV_P03356_3mut





GSSGGGEAAAK
14,123
MLVAV_P03356_3mut





GSSPAPGGG
14,124
XMRV6_A1Z651_3mut





GGGGSGGGGSGGGGS
14,125
PERV_Q4VFZ2_3mut





EAAAKEAAAKEAAAKEAAAK
14,126
KORV_Q9TTC1_3mutA





EAAAKGGSGSS
14,127
MLVBM_Q7SVK7_3mut





PAPEAAAK
14,128
BLVJ_P03361





GSSGGG
14,129
FFV_O93209-Pro





GGSGGGEAAAK
14,130
KORV_Q9TTC1-Pro_3mutA





EAAAK
14,131
FLV_P10273_3mutA





GGGGSSPAP
14,132
MLVMS_P03355_3mut





GSS

SFV3L_P27401-Pro_2mut





PAPEAAAKGSS
14,134
BAEVM_P10272_3mut





GGGGGSPAP
14,135
PERV_Q4VFZ2_3mut





GSSGSSGSS
14,136
BAEVM_P10272_3mutA





GGGGSGGGGSGGGGSGGGGS
14,137
SFV1_P23074_2mut





GGGGSSEAAAK
14,138
SFV3L_P27401_2mutA





GGGGSGGGGSGGGGSGGGGS
14,139
FOAMV_P14350-Pro_2mut





PAPGSSEAAAK
14,140
MLVBM_Q7SVK7_3mutA_WS





GGGGGSGSS
14,141
MLVFF_P26809_3mutA





GGSEAAAKGGG
14,142
MLVBM_Q7SVK7_3mut





PAPGSSGGG
14,143
PERV_Q4VFZ2





GGS

PERV_Q4VFZ2_3mutA_WS





EAAAKGGSGSS
14,145
FLV_P10273_3mut





GGGEAAAK
14,146
WMSV_P03359_3mutA





GGSEAAAKPAP
14,147
MLVBM_Q7SVK7_3mut





SGSETPGTSESATPES
14,148
FOAMV_P14350-Pro_2mutA





EAAAKPAPGGS
14,149
AVIRE_P03360_3mut





EAAAKGGGGGS
14,150
KORV_Q9TTC1-Pro_3mutA





GGGGS
14,151
PERV_Q4VFZ2_3mut





GGSEAAAKGSS
14,152
MLVFF_P26809_3mutA





GGSEAAAKGGG
14,153
AVIRE_P03360





GGSGGSGGSGGSGGSGGS
14,154
SFV3L_P27401_2mut





GGSEAAAKGSS
14,155
SFV3L_P27401-Pro_2mutA





GGGEAAAKPAP
14,156
MLVCB_P08361_3mut





GGSEAAAK
14,157
MLVMS_P03355_PLV919





GGSPAPGSS
14,158
KORV_Q9TTC1-Pro_3mutA





GSSPAPEAAAK
14,159
WMSV_P03359_3mutA





GGSGSS
14,160
KORV_Q9TTC1-Pro_3mutA





PAPGGGGGS
14,161
AVIRE_P03360_3mut





PAPEAAAKGSS
14,162
FFV_O93209-Pro





GGSGGGEAAAK
14,163
WMSV_P03359_3mut





PAPGGG
14,164
MLVMS_P03355_3mut





EAAAKGGG
14,165
FLV_P10273_3mutA





GSSGSSGSSGSS
14,166
MLVCB_P08361_3mut





EAAAKGGSGGG
14,167
FFV_O93209





GSSPAPGGS
14,168
PERV_Q4VFZ2_3mutA_WS





GSSPAPGGS
14,169
MLVCB_P08361_3mut





GGGPAP
14,170
WMSV_P03359_3mutA





GGGPAP
14,171
KORV_Q9TTC1_3mutA





GGSPAPGSS
14,172
KORV_Q9TTC1-Pro_3mut





PAPAP
14,173
MLVMS_P03355_3mut





GGGGGGG
14,174
MLVMS_P03355_3mut





GGGGG
14,175
KORV_Q9TTC1-Pro_3mut





GSAGSAAGSGEF
14,176
FOAMV_P14350_2mutA





PAPAP
14,177
KORV_Q9TTC1-Pro_3mutA





GGSEAAAKGGG
14,178
SFV3L_P27401-Pro_2mutA





PAPAP
14,179
WMSV_P03359_3mut





GGGGSGGGGSGGGGS
14,180
SFV3L_P27401_2mut





PAPGGS
14,181
KORV_Q9TTC1_3mutA





GGGEAAAKPAP
14,182
FLV_P10273_3mut





GGGGGS
14,183
MLVAV_P03356_3mutA





GSSEAAAKGGG
14,184
WMSV_P03359_3mut





EAAAKGGGGSS
14,185
GALV_P21414_3mutA





GSSGGS
14,186
MLVAV_P03356_3mutA





GSSGGG
14,187
MLVBM_Q7SVK7_3mut





PAPAPAP
14,188
SFV3L_P27401-Pro_2mutA





GGGG
14,189
KORV_Q9TTC1_3mutA





EAAAKPAPGGS
14,190
MLVFF_P26809_3mut





GGGGSGGGGS
14,191
XMRV6_A1Z651_3mut





EAAAKGGG
14,192
MLVCB_P08361_3mut





GGGGSSPAP
14,193
KORV_Q9TTC1_3mutA





GSSEAAAKGGG
14,194
KORV_Q9TTC1-Pro_3mutA





GGGGG
14,195
BLVJ_P03361_2mutB





GGGEAAAKGSS
14,196
FFV_O93209-Pro





GSSGSSGSS
14,197
BAEVM_P10272_3mut





GSSGGSPAP
14,198
PERV_Q4VFZ2_3mut





EAAAKGGS
14,199
KORV_Q9TTC1_3mut





GGSPAPEAAAK
14,200
AVIRE_P03360_3mut





GGSEAAAK
14,201
WMSV_P03359_3mut





GSSGGS
14,202
KORV_Q9TTC1-Pro_3mutA





GGGPAPEAAAK
14,203
KORV_Q9TTC1_3mutA





PAPGSS
14,204
WMSV_P03359_3mutA





GGSEAAAKGSS
14,205
FLV_P10273_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
14,206
SFV3L_P27401





GSSEAAAKGGG
14,207
SFV3L_P27401-Pro_2mutA





GGGGSEAAAKGGGGS
14,208
KORV_Q9TTC1-Pro_3mutA





GGSGGSGGS
14,209
WMSV_P03359_3mut





GGGGGSGSS
14,210
KORV_Q9TTC1-Pro





GGGGSGGGGSGGGGSGGGGS
14,211
MLVMS_P03355_3mut





EAAAKGGG
14,212
PERV_Q4VFZ2





GGSEAAAKGGG
14,213
KORV_Q9TTC1-Pro_3mut





GSSGGSGGG
14,214
PERV_Q4VFZ2_3mutA_WS





GGGGGS
14,215
PERV_Q4VFZ2_3mut





GSAGSAAGSGEF
14,216
PERV_Q4VFZ2





PAPEAAAKGSS
14,217
BAEVM_P10272_3mutA





GSSPAPGGG
14,218
MLVCB_P08361_3mut





GGGGSSPAP
14,219
KORV_Q9TTC1-Pro_3mutA





PAPGGSGGG
14,220
MLVFF_P26809_3mut





GSSPAP
14,221
KORV_Q9TTC1_3mutA





PAPGSS
14,222
SFV3L_P27401-Pro_2mut





GGSGGGGSS
14,223
MLVMS_P03355_PLV919





GSSGGS
14,224
WMSV_P03359_3mutA





EAAAKGGGGGS
14,225
PERV_Q4VFZ2





GGGGG
14,226
KORV_Q9TTC1_3mutA





EAAAKGSS
14,227
MLVMS_P03355_PLV919





EAAAKEAAAKEAAAKEAAAKEAAAK
14,228
FLV_P10273_3mut





EAAAKEAAAKEAAAKEAAAK
14,229
SFV3L_P27401-Pro_2mut





GSAGSAAGSGEF
14,230
SFV3L_P27401_2mutA





GGGPAPGGS
14,231
FLV_P10273_3mutA





GGSEAAAKGGG
14,232
MLVCB_P08361_3mut





PAPGGGEAAAK
14,233
BAEVM_P10272_3mut





EAAAKPAPGSS
14,234
FOAMV_P14350_2mut





GGSEAAAK
14,235
KORV_Q9TTC1_3mutA





GGSGSS
14,236
AVIRE_P03360





GGSPAPEAAAK
14,237
MLVMS_P03355_PLV919





GGGGS
14,238
XMRV6_A1Z651_3mut





GGSPAPGGG
14,239
XMRV6_A1Z651_3mut





EAAAKPAPGGS
14,240
PERV_Q4VFZ2





GSSPAP
14,241
BAEVM_P10272_3mut





GGSGSSGGG
14,242
FLV_P10273_3mutA





PAPGGG
14,243
PERV_Q4VFZ2_3mutA_WS





GSSGGSEAAAK
14,244
MLVBM_Q7SVK7_3mut





GGSEAAAK
14,245
MLVMS_P03355_3mut





GGGPAPGGS
14,246
MLVFF_P26809_3mut





GSAGSAAGSGEF
14,247
MLVBM_Q7SVK7_3mutA_WS





EAAAKPAPGGS
14,248
SFVCP_Q87040





PAPGGG
14,249
PERV_Q4VFZ2_3mutA_WS





GSSPAPEAAAK
14,250
MLVBM_Q7SVK7





PAPEAAAK
14,251
MLVBM_Q7SVK7_3mut





PAPGGGGGS
14,252
AVIRE_P03360_3mutA





GGSEAAAKPAP
14,253
MLVBM_Q7SVK7_3mut





EAAAKGSS
14,254
WMSV_P03359_3mutA





GGGEAAAK
14,255
MLVFF_P26809_3mutA





EAAAKEAAAKEAAAK
14,256
MLVMS_P03355_3mut





PAPEAAAKGGG
14,257
BAEVM_P10272_3mut





PAPAPAP
14,258
MLVCB_P08361_3mut





EAAAKPAPGGS
14,259
BAEVM_P10272_3mut





GGGGSGGGGS
14,260
FLV_P10273_3mut





GGGGSEAAAKGGGGS
14,261
KORV_Q9TTC1_3mut





EAAAK
14,262
FLV_P10273_3mut





PAPAPAP
14,263
WMSV_P03359_3mut





GGGGSEAAAKGGGGS
14,264
FFV_O93209-Pro





GGSPAPEAAAK
14,265
MLVMS_P03355_3mut





GGSGSSGGG
14,266
XMRV6_A1Z651_3mut





GGSPAPGSS
14,267
PERV_Q4VFZ2_3mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,268
SFV3L_P27401-Pro_2mutA





EAAAKGGGPAP
14,269
BAEVM_P10272_3mutA





GSSGGSEAAAK
14,270
MLVMS_P03355_3mutA_WS





SGSETPGTSESATPES
14,271
PERV_Q4VFZ2_3mutA_WS





EAAAKEAAAKEAAAKEAAAKEAAAK
14,272
KORV_Q9TTC1-Pro_3mutA





GSSGSSGSS
14,273
KORV_Q9TTC1_3mutA





GSSPAPGGG
14,274
SFV3L_P27401-Pro_2mutA





GSSGGGEAAAK
14,275
KORV_Q9TTC1_3mutA





GGSGGGGSS
14,276
PERV_Q4VFZ2_3mutA_WS





GSSGGGEAAAK
14,277
MLVCB_P08361_3mut





GSSEAAAKGGG
14,278
MLVCB_P08361_3mut





GGSGGGGSS
14,279
KORV_Q9TTC1_3mutA





GGSGSSPAP
14,280
PERV_Q4VFZ2_3mutA_WS





GSSPAP
14,281
MLVMS_P03355_3mut





GGGGSSEAAAK
14,282
AVIRE_P03360





GGS

WMSV_P03359_3mut





EAAAKEAAAK
14,284
PERV_Q4VFZ2_3mut





PAPAPAPAP
14,285
MLVAV_P03356_3mut





GGSEAAAKGGG
14,286
KORV_Q9TTC1_3mutA





PAPGGG
14,287
MLVAV_P03356_3mut





EAAAKGSS
14,288
BAEVM_P10272_3mut





GGGGSGGGGS
14,289
WMSV_P03359_3mutA





GGSGGSGGS
14,290
SFV3L_P27401_2mut





EAAAK
14,291
MLVCB_P08361_3mut





GGGGSSGGS
14,292
WMSV_P03359_3mutA





GGGPAPEAAAK
14,293
MLVAV_P03356_3mutA





EAAAKEAAAKEAAAK
14,294
FFV_O93209





GSSEAAAKGGG
14,295
MLVBM_Q7SVK7_3mut





GGGPAPGGS
14,296
FLV_P10273_3mut





GGSEAAAKGGG
14,297
WMSV_P03359_3mut





EAAAKGGGGGS
14,298
XMRV6_A1Z651_3mutA





EAAAKGGSGGG
14,299
FLV_P10273_3mutA





GGSEAAAKGGG
14,300
SFV3L_P27401_2mutA





GGGGS
14,301
PERV_Q4VFZ2_3mutA_WS





GSSGGS
14,302
MLVMS_P03355_3mut





GSSGSS
14,303
MLVAV_P03356_3mutA





GGSPAPGGG
14,304
MLVBM_Q7SVK7_3mutA_WS





GSSGGGGGS
14,305
MLVF5_P26810_3mut





PAPAPAPAP
14,306
MLVCB_P08361_3mut





PAPAP
14,307
PERV_Q4VFZ2_3mutA_WS





PAPGSSGGS
14,308
KORV_Q9TTC1_3mut





PAPGSSGGG
14,309
PERV_Q4VFZ2_3mut





GGGEAAAK
14,310
MLVMS_P03355_PLV919





GGSGGSGGSGGSGGS
14,311
SFV3L_P27401-Pro_2mutA





GGSGGG
14,312
FLV_P10273_3mut





PAPEAAAKGGG
14,313
MLVFF_P26809_3mut





PAP

PERV_Q4VFZ2_3mutA_WS





PAPGGSGSS
14,315
FFV_O93209_2mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,316
FFV_O93209-Pro_2mut





GSSGSSGSSGSS
14,317
FFV_O93209-Pro





GSSGSSGSSGSSGSS
14,318
FLV_P10273_3mutA





GGGEAAAKPAP
14,319
PERV_Q4VFZ2





PAPGSSGGG
14,320
SFV3L_P27401_2mut





PAPGGSGSS
14,321
KORV_Q9TTC1-Pro_3mut





PAPAPAPAPAP
14,322
GALV_P21414_3mutA





GGSGGGEAAAK
14,323
PERV_Q4VFZ2_3mut





GSSPAP
14,324
MLVCB_P08361_3mut





EAAAKPAP
14,325
MLVF5_P26810_3mut





GGGGSGGGGSGGGGSGGGGS
14,326
MLVBM_Q7SVK7_3mut





GGSGGG
14,327
WMSV_P03359_3mut





GGSGGSGGS
14,328
KORV_Q9TTC1_3mut





GGGGGGGG
14,329
MLVFF_P26809_3mut





GGGGSS
14,330
MLVAV_P03356_3mut





GSSGGGGGS
14,331
SFV3L_P27401_2mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,332
GALV_P21414_3mutA





GSSGSSGSS
14,333
PERV_Q4VFZ2_3mut





GSSPAPGGS
14,334
MLVFF_P26809_3mut





PAPAPAP
14,335
AVIRE_P03360_3mutA





EAAAKEAAAKEAAAKEAAAK
14,336
WMSV_P03359_3mutA





PAPAPAPAP
14,337
SFV3L_P27401_2mutA





GGGGSS
14,338
MLVAV_P03356_3mutA





GSSGSSGSSGSSGSS
14,339
SFV3L_P27401_2mutA





PAPGGS
14,340
WMSV_P03359_3mutA





GSSEAAAKGGG
14,341
PERV_Q4VFZ2





GSSGGSPAP
14,342
MLVMS_P03355_PLV919





GSSGSSGSSGSSGSSGSS
14,343
SFV3L_P27401_2mutA





GGSGSSGGG
14,344
MLVCB_P08361_3mut





GGGPAPGSS
14,345
SFV3L_P27401-Pro_2mutA





GSSEAAAKGGS
14,346
WMSV_P03359_3mut





GSSEAAAKGGG
14,347
MLVAV_P03356_3mut





GGSGGGPAP
14,348
FFV_O93209-Pro





GSSGSS
14,349
PERV_Q4VFZ2_3mut





PAPGGGGGS
14,350
GALV_P21414_3mutA





EAAAKPAPGGS
14,351
MLVAV_P03356_3mut





GSSGSS
14,352
MLVMS_P03355_3mut





EAAAKPAPGGS
14,353
FFV_O93209-Pro





GGGPAPEAAAK
14,354
MLVMS_P03355_3mutA_WS





GSSEAAAKGGG
14,355
MLVBM_Q7SVK7_3mut





GGGEAAAKGGS
14,356
BAEVM_P10272_3mut





GSSGSS
14,357
KORV_Q9TTC1-Pro_3mutA





EAAAKEAAAKEAAAK
14,358
SFV1_P23074





PAPGSSGGS
14,359
KORV_Q9TTC1-Pro_3mut





PAPAPAPAPAP
14,360
MLVMS_P03355





GSSEAAAK
14,361
SFV3L_P27401_2mut





PAP

PERV_Q4VFZ2_3mut





GGSEAAAKGGG
14,363
MLVBM_Q7SVK7_3mut





GGSGGGPAP
14,364
MLVBM_Q7SVK7_3mutA_WS





GSSGSS
14,365
MLVMS_P03355_3mut





GGSEAAAK
14,366
MLVMS_P03355





GSSEAAAKGGS
14,367
MLVMS_P03355_PLV919





PAPGGGGGS
14,368
MLVFF_P26809_3mut





GSSGGG
14,369
PERV_Q4VFZ2_3mut





GSSGGS
14,370
PERV_Q4VFZ2_3mutA_WS





PAPGGG
14,371
BAEVM_P10272_3mut





PAPGSSGGG
14,372
MLVBM_Q7SVK7_3mut





GGSEAAAK
14,373
SFV3L_P27401_2mut





GSSPAPEAAAK
14,374
SFV3L_P27401-Pro_2mut





GSSGGSPAP
14,375
BAEVM_P10272_3mut





GGSPAPGSS
14,376
PERV_Q4VFZ2_3mutA_WS





GGSGGSGGS
14,377
PERV_Q4VFZ2





GGSGGGPAP
14,378
FLV_P10273_3mut





GGGPAPEAAAK
14,379
SFV3L_P27401_2mutA





GGGGS
14,380
FLV_P10273_3mutA





GSSGGSGGG
14,381
XMRV6_A1Z651_3mut





EAAAKGGGGSS
14,382
PERV_Q4VFZ2





GGSGSSGGG
14,383
SFV3L_P27401-Pro_2mutA





GGSGGSGGS
14,384
MLVFF_P26809_3mut





GGGPAPEAAAK
14,385
FLV_P10273_3mut





GSSGGGEAAAK
14,386
MLVMS_P03355_3mut





GGG

SFV3L_P27401_2mut





GSAGSAAGSGEF
14,388
WMSV_P03359_3mut





GSSGGGPAP
14,389
MLVMS_P03355_PLV919





GGGGSS
14,390
KORV_Q9TTC1-Pro_3mut





GGGGSSEAAAK
14,391
KORV_Q9TTC1





PAPGGSGGG
14,392
SFV3L_P27401_2mut





GSSGSSGSSGSSGSS
14,393
FFV_O93209





GSSGGSPAP
14,394
MLVMS_P03355_3mut





GGSEAAAK
14,395
KORV_Q9TTC1-Pro_3mutA





GGGGSGGGGS
14,396
BAEVM_P10272_3mut





GSSEAAAKGGG
14,397
AVIRE_P03360_3mut





EAAAKPAPGGG
14,398
FLV_P10273_3mut





EAAAKGGSPAP
14,399
SFV3L_P27401-Pro_2mutA





GSSEAAAKPAP
14,400
MLVBM_Q7SVK7_3mut





GGGPAPGGS
14,401
MLVCB_P08361_3mut





GGG

SFV3L_P27401_2mutA





EAAAKGGGGSEAAAK
14,403
SFV3L_P27401_2mutA





GGSGSSGGG
14,404
MLVBM_Q7SVK7_3mut





GSAGSAAGSGEF
14,405
BAEVM_P10272_3mut





GGGEAAAK
14,406
FOAMV_P14350_2mutA





PAPEAAAKGGS
14,407
WMSV_P03359_3mut





PAPAPAPAPAPAP
14,408
MLVF5_P26810_3mutA





GGSGGGGSS
14,409
FLV_P10273_3mutA





PAPGSSGGS
14,410
BAEVM_P10272_3mut





PAPEAAAK
14,411
WMSV_P03359_3mutA





GSSGSSGSSGSSGSSGSS
14,412
FFV_O93209-Pro_2mut





GGGGGSGSS
14,413
FFV_O93209-Pro





GGGGGGGG
14,414
SFV3L_P27401-Pro_2mutA





GGGGGG
14,415
FLV_P10273_3mut





GSSGGSGGG
14,416
MLVAV_P03356_3mutA





GGGGSS
14,417
SFV3L_P27401-Pro_2mutA





GGSGGGPAP
14,418
FOAMV_P14350_2mut





GSSGSS
14,419
AVIRE_P03360_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
14,420
SFV3L_P27401-Pro_2mutA





EAAAKEAAAK
14,421
BAEVM_P10272_3mut





GSSPAPEAAAK
14,422
GALV_P21414_3mutA





GGSEAAAKPAP
14,423
SFV3L_P27401_2mutA





GGSGGGEAAAK
14,424
SFV3L_P27401-Pro_2mutA





EAAAKGSSPAP
14,425
FOAMV_P14350_2mut





GGSGSSEAAAK
14,426
SFV3L_P27401_2mut





GGG

PERV_Q4VFZ2





GGGGGSGSS
14,428
FOAMV_P14350_2mut





GGSGGGEAAAK
14,429
KORV_Q9TTC1-Pro_3mut





GSSGGSGGG
14,430
AVIRE_P03360_3mutA





EAAAKPAPGGG
14,431
SFV3L_P27401_2mutA





PAPGGSGGG
14,432
KORV_Q9TTC1-Pro_3mut





PAPAPAP
14,433
WMSV_P03359_3mutA





GSSEAAAKPAP
14,434
SFV1_P23074





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,435
SRV2_P51517





GSSGGSGGG
14,436
PERV_Q4VFZ2_3mutA_WS





GSSGSSGSSGSSGSSGSS
14,437
FFV_O93209





GSSGGGPAP
14,438
WMSV_P03359_3mut





PAPAPAPAPAPAP
14,439
MLVBM_Q7SVK7_3mut





GGGGGSPAP
14,440
KORV_Q9TTC1-Pro_3mutA





PAPGSS
14,441
MLVBM_Q7SVK7_3mutA_WS





PAPEAAAKGGS
14,442
SFV3L_P27401-Pro_2mut





GGGGSSPAP
14,443
MLVMS_P03355_3mut





GGSEAAAK
14,444
FFV_O93209-Pro





EAAAKPAPGGS
14,445
AVIRE_P03360_3mutA





PAPGSS
14,446
WMSV_P03359_3mut





PAPGSSGGG
14,447
SFV3L_P27401-Pro_2mutA





EAAAKEAAAKEAAAK
14,448
SFV3L_P27401_2mut





GGS

MLVRD_P11227_3mut





GGGGS
14,450
KORV_Q9TTC1-Pro_3mut





GGSGGGGSS
14,451
KORV_Q9TTC1





GGSGGG
14,452
MLVMS_P03355_3mutA_WS





GGGEAAAKPAP
14,453
BAEVM_P10272_3mut





EAAAKEAAAKEAAAKEAAAKEAAAK
14,454
FLV_P10273





PAPGGSGGG
14,455
KORV_Q9TTC1-Pro_3mutA





GSSGSSGSSGSSGSSGSS
14,456
HTL1L_P0C211





GGGEAAAKPAP
14,457
WMSV_P03359





GSSGGSPAP
14,458
FFV_O93209-Pro





PAPAPAPAPAP
14,459
SFV3L_P27401-Pro_2mutA





GSSGGSEAAAK
14,460
SFV3L_P27401_2mutA





GGSPAPGSS
14,461
SFV3L_P27401_2mut





GGSGGSGGS
14,462
KORV_Q9TTC1-Pro_3mut





PAPEAAAKGSS
14,463
KORV_Q9TTC1-Pro_3mut





EAAAKGGS
14,464
KORV_Q9TTC1_3mutA





EAAAKGGGGSEAAAK
14,465
SFV3L_P27401-Pro_2mut





GGGGSSPAP
14,466
FFV_O93209-Pro





EAAAK
14,467
SFV3L_P27401_2mut





EAAAKGGGGSS
14,468
BAEVM_P10272_3mut





GGGGGSEAAAK
14,469
MLVBM_Q7SVK7_3mut





GGGG
14,470
PERV_Q4VFZ2





GGGGGSEAAAK
14,471
FLV_P10273_3mut





EAAAKGGGPAP
14,472
KORV_Q9TTC1-Pro





GGGGSGGGGSGGGGSGGGGS
14,473
FFV_O93209_2mutA





GSSGGSGGG
14,474
PERV_Q4VFZ2_3mut





GGGGSGGGGSGGGGS
14,475
GALV_P21414_3mutA





GGSGGGEAAAK
14,476
AVIRE_P03360_3mutA





PAPEAAAKGGG
14,477
SFV3L_P27401_2mut





GGGGSGGGGS
14,478
AVIRE_P03360





GSSGGGEAAAK
14,479
SFV3L_P27401_2mutA





GGGGG
14,480
AVIRE_P03360_3mutA





GGSGSS
14,481
KORV_Q9TTC1_3mut





PAPAPAPAPAPAP
14,482
FOAMV_P14350_2mut





GGSEAAAKPAP
14,483
KORV_Q9TTC1-Pro_3mut





GGGGGG
14,484
PERV_Q4VFZ2_3mut





GSSGGGEAAAK
14,485
MLVBM_Q7SVK7





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,486
MLVAV_P03356





GGSPAPGSS
14,487
BAEVM_P10272_3mut





GGGGSSPAP
14,488
BAEVM_P10272





GGGGSEAAAKGGGGS
14,489
SFV3L_P27401_2mut





GGGGGGGG
14,490
GALV_P21414_3mutA





PAPAP
14,491
MLVAV_P03356_3mut





GGGEAAAK
14,492
PERV_Q4VFZ2_3mutA_WS





GSSPAPGGG
14,493
FFV_O93209_2mut





GGSGGSGGSGGSGGS
14,494
BAEVM_P10272





GGGGGS
14,495
MLVF5_P26810_3mutA





PAPGGGGSS
14,496
FLV_P10273_3mutA





GGGEAAAK
14,497
MLVBM_Q7SVK7_3mut





PAPEAAAKGGG
14,498
WMSV_P03359_3mut





GSSEAAAK
14,499
MLVBM_Q7SVK7_3mut





EAAAKEAAAK
14,500
AVIRE_P03360





EAAAKGGGGGS
14,501
MLVBM_Q7SVK7_3mut





GGGEAAAKGGS
14,502
SFV3L_P27401-Pro_2mutA





PAPAPAPAPAP
14,503
MLVF5_P26810_3mut





PAPGSSEAAAK
14,504
SFV3L_P27401-Pro_2mutA





EAAAKEAAAKEAAAK
14,505
BAEVM_P10272_3mutA





GGSPAPGSS
14,506
MLVMS_P03355





PAPGSSGGS
14,507
FLV_P10273_3mutA





EAAAKEAAAKEAAAKEAAAK
14,508
FOAMV_P14350-Pro_2mut





EAAAKGGG
14,509
KORV_Q9TTC1_3mutA





EAAAKGGSGGG
14,510
MLVBM_Q7SVK7_3mut





GGGGGS
14,511
KORV_Q9TTC1-Pro_3mutA





PAPGGSGGG
14,512
WMSV_P03359_3mut





GGGPAPGGS
14,513
KORV_Q9TTC1_3mutA





GSS

FFV_O93209





GGSGGSGGS
14,515
PERV_Q4VFZ2_3mut





GGGGS
14,516
GALV_P21414_3mutA





GGGG
14,517
MLVF5_P26810_3mut





GGSEAAAKPAP
14,518
FFV_O93209-Pro_2mut





PAPAPAPAP
14,519
FFV_O93209-Pro





PAP

MLVF5_P26810_3mut





EAAAKEAAAKEAAAK
14,521
FFV_O93209_2mut





EAAAKGSS
14,522
MLVCB_P08361_3mut





EAAAKGGG
14,523
MLVBM_Q7SVK7_3mut





PAPEAAAKGGG
14,524
FFV_O93209_2mut





GSSGGGEAAAK
14,525
SFV1_P23074-Pro_2mut





PAPGGGEAAAK
14,526
GALV_P21414_3mutA





GGGGSGGGGSGGGGSGGGGS
14,527
FOAMV_P14350-Pro_2mutA





GSSGGG
14,528
FOAMV_P14350_2mut





GGGGSGGGGSGGGGSGGGGS
14,529
SFV3L_P27401_2mutA





GGSGSS
14,530
AVIRE_P03360_3mut





GGSGSSEAAAK
14,531
MMTVB_P03365_WS





PAPAPAP
14,532
MLVAV_P03356_3mutA





GSSGGSPAP
14,533
SFV3L_P27401-Pro_2mut





GGSPAP
14,534
AVIRE_P03360





GGSGGGPAP
14,535
FFV_O93209





GSSEAAAK
14,536
PERV_Q4VFZ2





GSSGGGPAP
14,537
PERV_Q4VFZ2_3mutA_WS





GGGGSSEAAAK
14,538
KORV_Q9TTC1_3mutA





GGSEAAAKPAP
14,539
SFVCP_Q87040





GGSGGGPAP
14,540
FOAMV_P14350_2mutA





GGGGSGGGGSGGGGSGGGGS
14,541
BLVJ_P03361_2mutB





GGGGSSPAP
14,542
SFV3L_P27401_2mutA





EAAAKGGS
14,543
MLVF5_P26810_3mut





GGSEAAAKGSS
14,544
MLVCB_P08361_3mut





GGGGSSEAAAK
14,545
SFV3L_P27401_2mut





EAAAKGGSGGG
14,546
FOAMV_P14350_2mut





GGSGGS
14,547
FLV_P10273_3mut





EAAAKGGG
14,548
FFV_O93209-Pro





GSSGSSGSSGSSGSS
14,549
SFV3L_P27401





GSSGGGPAP
14,550
PERV_Q4VFZ2_3mutA_WS





PAPGGSEAAAK
14,551
SFV3L_P27401-Pro_2mutA





GGSPAP
14,552
KORV_Q9TTC1





EAAAKPAPGSS
14,553
KORV_Q9TTC1_3mutA





SGSETPGTSESATPES
14,554
SFV1_P23074





GSSPAP
14,555
SFV3L_P27401-Pro_2mutA





GSSPAPGGG
14,556
SFV3L_P27401_2mut





GGGEAAAKGSS
14,557
SFV1_P23074_2mut





GGGPAPGGS
14,558
BAEVM_P10272_3mut





EAAAKGGG
14,559
KORV_Q9TTC1-Pro_3mutA





GSSGGG
14,560
SFV3L_P27401-Pro_2mut





GGSPAPEAAAK
14,561
BAEVM_P10272_3mut





EAAAKGSSPAP
14,562
FFV_O93209





EAAAKGGGGSEAAAK
14,563
SFV3L_P27401-Pro_2mutA





GSSGSSGSSGSSGSS
14,564
SFV1_P23074_2mut





EAAAKGGSPAP
14,565
FOAMV_P14350_2mut





GGSGGS
14,566
KORV_Q9TTC1-Pro_3mutA





EAAAKGSSGGS
14,567
GALV_P21414





GSSGGGPAP
14,568
MLVAV_P03356





PAPEAAAKGGS
14,569
FOAMV_P14350_2mut





EAAAKPAPGGG
14,570
AVIRE_P03360_3mut





GGSPAP
14,571
SFV3L_P27401_2mutA





GGGGSGGGGS
14,572
SFV3L_P27401_2mutA





GGGGSS
14,573
AVIRE_P03360_3mutA





GGSPAPGGG
14,574
SFV3L_P27401-Pro_2mutA





EAAAKPAPGSS
14,575
SFV3L_P27401





EAAAKPAP
14,576
FOAMV_P14350-Pro_2mut





PAPEAAAKGSS
14,577
PERV_Q4VFZ2_3mutA_WS





EAAAKGGSGSS
14,578
SFV3L_P27401_2mutA





GGGEAAAKGSS
14,579
GALV_P21414_3mutA





GGGGSEAAAKGGGGS
14,580
PERV_Q4VFZ2_3mut





PAPGGSGSS
14,581
FFV_O93209-Pro_2mutA





GGSEAAAKPAP
14,582
GALV_P21414_3mutA





GGSGGSGGSGGSGGS
14,583
FFV_O93209-Pro





GSSGGSEAAAK
14,584
SFV3L_P27401-Pro_2mut





GGS

GALV_P21414_3mutA





PAPGGSEAAAK
14,586
MLVMS_P03355





PAPEAAAKGGS
14,587
BAEVM_P10272_3mutA





GGSGSSPAP
14,588
SFV3L_P27401-Pro_2mutA





GSSPAP
14,589
WMSV_P03359_3mut





GGGEAAAK
14,590
MMTVB_P03365





GGGGSS
14,591
PERV_Q4VFZ2_3mut





GGSPAPGSS
14,592
SFV3L_P27401-Pro_2mut





PAPGGS
14,593
MLVBM_Q7SVK7_3mut





EAAAKGSSPAP
14,594
MLVBM_Q7SVK7_3mut





GGGGSSGGS
14,595
PERV_Q4VFZ2_3mut





PAPAPAPAPAPAP
14,596
SFV1_P23074





GGSEAAAKGGG
14,597
SFV3L_P27401-Pro_2mut





GGSGGS
14,598
SFV1_P23074_2mut





GSSGGGGGS
14,599
MLVF5_P26810_3mutA





EAAAKGGGPAP
14,600
SFV3L_P27401





EAAAKEAAAKEAAAKEAAAK
14,601
FOAMV_P14350-Pro_2mutA





GGGPAPGSS
14,602
SFV3L_P27401_2mutA





GGGGSGGGGSGGGGSGGGGS
14,603
SFV3L_P27401_2mut





EAAAKEAAAKEAAAKEAAAK
14,604
MMTVB_P03365_WS





PAPGSSGGS
14,605
KORV_Q9TTC1-Pro_3mutA





PAPGSSEAAAK
14,606
FOAMV_P14350-Pro_2mut





GSSPAPEAAAK
14,607
BAEVM_P10272_3mut





EAAAKGGGGSEAAAK
14,608
FFV_O93209-Pro





GGSPAP
14,609
PERV_Q4VFZ2





GGSGSSEAAAK
14,610
XMRV6_A1Z651_3mut





GGSEAAAKGGG
14,611
GALV_P21414_3mutA





PAPGGGGSS
14,612
AVIRE_P03360_3mutA





GGSGGSGGSGGS
14,613
PERV_Q4VFZ2





GGGGSSGGS
14,614
PERV_Q4VFZ2_3mutA_WS





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,615
BAEVM_P10272_3mutA





GGGPAP
14,616
MLVAV_P03356_3mut





GGGGSGGGGSGGGGSGGGGS
14,617
FFV_O93209_2mut





GSSEAAAK
14,618
FFV_O93209





GGSPAPEAAAK
14,619
FOAMV_P14350_2mut





GGGGGSEAAAK
14,620
FOAMV_P14350_2mut





GSSPAPGGS
14,621
MLVBM_Q7SVK7_3mut





GSS

SFVCP_Q87040_2mut





EAAAKPAP
14,623
FOAMV_P14350-Pro





EAAAKGGG
14,624
SFV3L_P27401_2mut





GGGEAAAK
14,625
AVIRE_P03360_3mutA





PAPGSSGGG
14,626
WMSV_P03359_3mut





EAAAKGGSPAP
14,627
SFV3L_P27401





GSSGGSGGG
14,628
SFV3L_P27401-Pro_2mutA





GSSGGGEAAAK
14,629
GALV_P21414_3mutA





GGGPAPGSS
14,630
MLVBM_Q7SVK7_3mutA_WS





PAPGGGEAAAK
14,631
FFV_O93209-Pro_2mut





GSSGSSGSSGSS
14,632
SFV1_P23074_2mut





GGSEAAAK
14,633
PERV_Q4VFZ2_3mutA_WS





GGGEAAAKPAP
14,634
SFV3L_P27401_2mut





EAAAKGGGPAP
14,635
SFV3L_P27401_2mut





GGGGSSPAP
14,636
FLV_P10273_3mut





EAAAKPAPGSS
14,637
FFV_O93209_2mut





GGGGSSPAP
14,638
SFV3L_P27401_2mut





GSSGSS
14,639
KORV_Q9TTC1_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGS
14,640
BLVJ_P03361_2mut





GGGGSSGGS
14,641
GALV_P21414_3mutA





EAAAKGGSGSS
14,642
FFV_O93209-Pro





EAAAKPAP
14,643
PERV_Q4VFZ2





GSSGGGEAAAK
14,644
MLVBM_Q7SVK7_3mut





PAPGGSGGG
14,645
BAEVM_P10272





EAAAKGGGPAP
14,646
MLVF5_P26810





GSSGSSGSS
14,647
MLVBM_Q7SVK7_3mut





GSSGGS
14,648
AVIRE_P03360_3mutA





GGSEAAAKGGG
14,649
FOAMV_P14350_2mut





EAAAKGGS
14,650
MLVF5_P26810_3mutA





GGSGSSGGG
14,651
WMSV_P03359_3mut





EAAAK
14,652
SFV1_P23074_2mut





GSSGGSPAP
14,653
SFV3L_P27401-Pro_2mutA





GGGGSSGGS
14,654
KORV_Q9TTC1_3mut





PAPGGSGGG
14,655
FFV_O93209-Pro_2mut





GGGPAPGGS
14,656
SFV3L_P27401_2mutA





GSSPAPEAAAK
14,657
FLV_P10273_3mut





GGSGSSPAP
14,658
SFV3L_P27401_2mut





GSSEAAAKGGS
14,659
SFV3L_P27401_2mut





PAPGGG
14,660
SFV3L_P27401_2mutA





SGSETPGTSESATPES
14,661
KORV_Q9TTC1-Pro_3mut





GGGGS
14,662
SFV1_P23074-Pro_2mutA





GSSGGGEAAAK
14,663
WMSV_P03359





EAAAKGGGGSEAAAK
14,664
MLVF5_P26810_3mutA





GSSEAAAKPAP
14,665
FFV_O93209





GGGGGG
14,666
SFV1_P23074_2mutA





EAAAKEAAAKEAAAK
14,667
MMTVB_P03365-Pro





EAAAKPAPGSS
14,668
MLVBM_Q7SVK7_3mut





GGSGSSEAAAK
14,669
SFV3L_P27401_2mutA





GGSEAAAK
14,670
MLVMS_P03355_3mut





GGSPAPEAAAK
14,671
SFV3L_P27401_2mut





GGGPAPGSS
14,672
SFV1_P23074





GGGGGSEAAAK
14,673
MLVBM_Q7SVK7_3mutA_WS





EAAAKPAPGSS
14,674
KORV_Q9TTC1-Pro





GSSGSSGSSGSS
14,675
SFV3L_P27401_2mut





EAAAKPAP
14,676
SFV3L_P27401_2mut





GGGEAAAK
14,677
PERV_Q4VFZ2_3mut





GGSGGS
14,678
SFV3L_P27401_2mutA





EAAAKGSSGGS
14,679
MMTVB_P03365





SGSETPGTSESATPES
14,680
SFV3L_P27401





EAAAKGSSGGG
14,681
PERV_Q4VFZ2





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,682
MMTVB_P03365





GGSGGGPAP
14,683
KORV_Q9TTC1_3mutA





PAPAPAPAP
14,684
SFV3L_P27401





GGGEAAAKGGS
14,685
SFV1_P23074_2mut





GSSGGSGGG
14,686
PERV_Q4VFZ2_3mut





PAPEAAAKGGS
14,687
FOAMV_P14350_2mutA





GGGEAAAKGSS
14,688
SFV3L_P27401_2mut





GGGGSGGGGSGGGGSGGGGS
14,689
MLVBM_Q7SVK7





PAPGSSGGG
14,690
FLV_P10273





GGSGSSGGG
14,691
FFV_O93209





EAAAKPAPGSS
14,692
MLVBM_Q7SVK7





GSSEAAAKGGG
14,693
SFV3L_P27401_2mutA





GGSGGSGGSGGSGGS
14,694
MLVF5_P26810





GGSEAAAKPAP
14,695
SFV3L_P27401-Pro_2mutA





EAAAKGGSPAP
14,696
SFV3L_P27401_2mutA





EAAAKGGGGGS
14,697
SFV3L_P27401_2mut





GSSPAPEAAAK
14,698
SFV3L_P27401_2mutA





PAPAP
14,699
MLVBM_Q7SVK7_3mut





PAPGGSEAAAK
14,700
KORV_Q9TTC1-Pro





GGSGSS
14,701
MLVF5_P26810_3mutA





GGSEAAAKPAP
14,702
FFV_O93209_2mut





GSS

MLVMS_P03355





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,704
SFV3L_P27401-Pro





PAPGGGEAAAK
14,705
SFV3L_P27401_2mut





PAPGGGGGS
14,706
SFV3L_P27401-Pro_2mut





PAPGGSGSS
14,707
BAEVM_P10272_3mut





GSSGGGEAAAK
14,708
FFV_O93209





GGSEAAAKPAP
14,709
SFV1_P23074_2mut





GGGGG
14,710
FLV_P10273_3mut





GGGEAAAKGSS
14,711
SFV3L_P27401





GSSGSSGSSGSSGSS
14,712
SFV1_P23074-Pro





SGSETPGTSESATPES
14,713
AVIRE_P03360





PAPGSSGGG
14,714
MLVBM_Q7SVK7_3mut





GGGGSSPAP
14,715
HTL3P_Q4U0X6_2mut





GGGEAAAK
14,716
SFV1_P23074





GGSGGG
14,717
AVIRE_P03360





EAAAKGSSGGG
14,718
SFV3L_P27401_2mutA





GSSPAPEAAAK
14,719
FOAMV_P14350-Pro_2mutA





GGGPAPGSS
14,720
WMSV_P03359





EAAAKGSSGGG
14,721
MLVMS_P03355





GGGGGSEAAAK
14,722
MLVMS_P03355





EAAAKPAPGGS
14,723
SFV3L_P27401





EAAAKGSSPAP
14,724
SFV3L_P27401





GGGGGGG
14,725
FOAMV_P14350_2mutA





EAAAKEAAAKEAAAK
14,726
SFV3L_P27401





GSSPAPGGS
14,727
FFV_O93209_2mutA





GGGGSSEAAAK
14,728
SFV3L_P27401-Pro_2mutA





GGSEAAAKGSS
14,729
GALV_P21414_3mutA





GGSEAAAKGSS
14,730
BAEVM_P10272_3mutA





EAAAKPAPGGG
14,731
MLVCB_P08361





GSSGSSGSSGSSGSSGSS
14,732
SFV1_P23074-Pro





GGGGSEAAAKGGGGS
14,733
FOAMV_P14350_2mut





GSSPAPGGS
14,734
MLVMS_P03355_PLV919





GGGGSGGGGS
14,735
FFV_O93209-Pro





GSSGGSPAP
14,736
KORV_Q9TTC1_3mutA





GGSGGS
14,737
GALV_P21414_3mutA





PAPGSSEAAAK
14,738
WMSV_P03359





PAPGGGGSS
14,739
MMTVB_P03365-Pro





GGGGSSGGS
14,740
PERV_Q4VFZ2_3mutA_WS





GGGGSGGGGS
14,741
FFV_O93209_2mut





GGGGSGGGGSGGGGSGGGGS
14,742
XMRV6_A1Z651





GGSGSSEAAAK
14,743
SFV1_P23074_2mut





GGSGGGGSS
14,744
GALV_P21414_3mutA





GGSEAAAKPAP
14,745
MLVBM_Q7SVK7





EAAAKGGSPAP
14,746
SFV1_P23074_2mutA





PAPAPAPAP
14,747
FFV_O93209





GSSGGSPAP
14,748
MMTVB_P03365-Pro





GGGGGSPAP
14,749
KORV_Q9TTC1_3mutA





EAAAKGGGPAP
14,750
PERV_Q4VFZ2





GSSGGSPAP
14,751
BAEVM_P10272





GGGGG
14,752
FFV_O93209





GGGGGS
14,753
FLV_P10273_3mutA





EAAAKEAAAKEAAAK
14,754
FOAMV_P14350





PAPGGG
14,755
MLVCB_P08361_3mut





GSSGGSEAAAK
14,756
FOAMV_P14350_2mutA





GGSPAPGGG
14,757
FLV_P10273_3mut





GSSGSSGSSGSSGSSGSS
14,758
SFV1_P23074-Pro_2mutA





GGSPAPEAAAK
14,759
SFV3L_P27401





PAPGGGGSS
14,760
HTL3P_Q4U0X6_2mutB





GGGGSSEAAAK
14,761
MMTVB_P03365_2mut_WS





PAPGGS
14,762
MLVRD_P11227_3mut





GGSGGSGGSGGSGGS
14,763
MMTVB_P03365





GSAGSAAGSGEF
14,764
AVIRE_P03360





GSSGGS
14,765
BAEVM_P10272_3mutA





GGSGGGGSS
14,766
MMTVB_P03365





GGSGGGGSS
14,767
WMSV_P03359





PAPEAAAKGSS
14,768
SFV1_P23074





GSSGSSGSSGSS
14,769
SFV1_P23074-Pro_2mutA





PAPAPAPAPAPAP
14,770
SFV3L_P27401





PAPGSSGGG
14,771
FLV_P10273_3mut





GGSGSSPAP
14,772
MLVMS_P03355





GGSGGGPAP
14,773
FOAMV_P14350





PAPGGGGGS
14,774
KORV_Q9TTC1_3mutA





EAAAKGSSPAP
14,775
GALV_P21414_3mutA





GGSGSSPAP
14,776
MLVBM_Q7SVK7_3mut





EAAAKGSS
14,777
SFV3L_P27401_2mut





GGGGGSEAAAK
14,778
WMSV_P03359





GGGGGGGG
14,779
SFV1_P23074-Pro





EAAAKEAAAK
14,780
MLVBM_Q7SVK7





GGGEAAAKGGS
14,781
MLVBM_Q7SVK7





EAAAKGGSPAP
14,782
SFV3L_P27401_2mut





GSSEAAAK
14,783
XMRV6_A1Z651





PAPGGGEAAAK
14,784
MMTVB_P03365_WS





GGSPAP
14,785
GALV_P21414_3mutA





GSSPAPGGG
14,786
MLVBM_Q7SVK7_3mutA_WS





GGSGSSPAP
14,787
SFV1_P23074_2mutA





GGS

HTL32_Q0R5R2_2mut





GGSGGGGSS
14,789
MMTVB_P03365-Pro





GGGGSGGGGSGGGGSGGGGS
14,790
SFVCP_Q87040_2mutA





EAAAKGGGPAP
14,791
FOAMV_P14350_2mut





GSSGGGEAAAK
14,792
MMTVB_P03365





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,793
MLVBM_Q7SVK7_3mutA_WS





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
14,794
MMTVB_P03365_WS





EAAAKEAAAK
14,795
FOAMV_P14350-Pro_2mut





GSSPAPEAAAK
14,796
FOAMV_P14350_2mutA





EAAAKPAPGGS
14,797
GALV_P21414_3mutA





GSSGGSPAP
14,798
KORV_Q9TTC1-Pro_3mut





GGGPAPEAAAK
14,799
MLVAV_P03356





GGGEAAAKPAP
14,800
SFV1_P23074-Pro_2mut





GGGGGSEAAAK
14,801
SFV3L_P27401_2mut





GGGPAPGSS
14,802
SFV3L_P27401_2mut





GGSEAAAKPAP
14,803
AVIRE_P03360





GSSGSSGSSGSSGSSGSS
14,804
SFV1_P23074-Pro_2mut





EAAAKGSSGGS
14,805
FOAMV_P14350_2mutA





GGGGGG
14,806
MLVBM_Q7SVK7_3mut





GSSPAPGGS
14,807
PERV_Q4VFZ2





GGSGSSPAP
14,808
GALV_P21414_3mutA





GGGPAPEAAAK
14,809
SFV3L_P27401





GGSGGGEAAAK
14,810
WMSV_P03359





GSAGSAAGSGEF
14,811
SFV1_P23074_2mut





GSSGGGEAAAK
14,812
MLVMS_P03355





GGG

MMTVB_P03365-Pro





PAPGSSGGS
14,814
FOAMV_P14350_2mut





GGGGSSPAP
14,815
FFV_O93209_2mut





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,816
MMTVB_P03365_WS





GGGGGGG
14,817
XMRV6_A1Z651





PAPAPAPAPAP
14,818
FOAMV_P14350





GGGGSGGGGSGGGGSGGGGS
14,819
MMTVB_P03365_2mut_WS





GGSGGGPAP
14,820
SFV3L_P27401_2mut





GGGGGG
14,821
SFV1_P23074-Pro





EAAAKPAPGSS
14,822
SFV3L_P27401_2mut





GGGGSSGGS
14,823
HTL3P_Q4U0X6_2mut





PAPGSSEAAAK
14,824
MMTVB_P03365-Pro





GGGGSSPAP
14,825
FOAMV_P14350-Pro_2mut





PAPGSSGGS
14,826
MMTVB_P03365





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
14,827
SRV2_P51517





PAPAPAP
14,828
MMTVB_P03365_2mut_WS





PAPGGGGGS
14,829
MMTVB_P03365_2mutB





GGGGSS
14,830
SFV1_P23074-Pro_2mutA





EAAAKEAAAKEAAAKEAAAK
14,831
SFV3L_P27401-Pro





GGSGGSGGSGGSGGS
14,832
MMTVB_P03365-Pro





GGGGGGG
14,833
SFV3L_P27401_2mut





PAPGGGEAAAK
14,834
SFV3L_P27401





PAPGSS
14,835
FOAMV_P14350_2mutA





GGGGSGGGGS
14,836
SFVCP_Q87040_2mutA





GSSGGSGGG
14,837
XMRV6_A1Z651





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
14,838
MLVBM_Q7SVK7





GSSEAAAKGGG
14,839
FFV_O93209-Pro_2mut





GGSEAAAKPAP
14,840
SFV3L_P27401-Pro





GSSGGSGGG
14,841
SFV1_P23074_2mut





EAAAKGGGGSS
14,842
FOAMV_P14350_2mutA





GGGGGG
14,843
SFV3L_P27401_2mut





GGGGG
14,844
MLVBM_Q7SVK7_3mut





PAPEAAAKGGG
14,845
SFV3L_P27401





EAAAKGGSPAP
14,846
KORV_Q9TTC1_3mutA





GGGEAAAKPAP
14,847
SFV1_P23074_2mut





GSSGSSGSSGSSGSSGSS
14,848
KORV_Q9TTC1-Pro





EAAAKEAAAKEAAAKEAAAK
14,849
SFVCP_Q87040





PAPGSSEAAAK
14,850
MLVBM_Q7SVK7





GSSGSSGSS
14,851
FFV_O93209-Pro_2mut





GSSGGGPAP
14,852
SFV3L_P27401-Pro_2mut





GGGPAPEAAAK
14,853
WMSV_P03359_3mut





GGGEAAAK
14,854
MMTVB_P03365-Pro





GSSGSSGSSGSS
14,855
SFV3L_P27401-Pro_2mutA





PAPAPAPAPAP
14,856
FFV_O93209-Pro





GGSPAPEAAAK
14,857
FFV_O93209-Pro_2mut





GSSGSSGSSGSSGSSGSS
14,858
GALV_P21414





EAAAKEAAAKEAAAKEAAAKEAAAK
14,859
FOAMV_P14350





GGGPAPEAAAK
14,860
MMTVB_P03365-Pro





PAPGGSGGG
14,861
MLVF5_P26810_3mutA





PAPGGSGGG
14,862
FLV_P10273_3mut





GGGEAAAKGGS
14,863
SFV3L_P27401





GSAGSAAGSGEF
14,864
MLVBM_Q7SVK7_3mut





GSSPAPGGG
14,865
MPMV_P07572_2mutB





GSSGSSGSSGSSGSSGSS
14,866
FOAMV_P14350





GGSGGGGSS
14,867
BLVJ_P03361_2mut





PAPEAAAKGSS
14,868
SFV1_P23074-Pro





GGG

FFV_O93209





EAAAKGGGGSS
14,870
SFV1_P23074_2mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,871
SRV2_P51517





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
14,872
MMTVB_P03365





GGGEAAAKGGS
14,873
MMTVB_P03365_WS





GSSGSS
14,874
SFV1_P23074





GSSGGGGGS
14,875
SFV3L_P27401





GGGGSSEAAAK
14,876
SFV1_P23074





EAAAKGSSGGS
14,877
HTL1A_P03362_2mutB





GSSEAAAKGGS
14,878
GALV_P21414_3mutA





EAAAKGSSPAP
14,879
SFV1_P23074





EAAAKPAPGSS
14,880
SFV3L_P27401_2mutA





PAPGSSGGG
14,881
SFV3L_P27401-Pro_2mut





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
14,882
SFV3L_P27401-Pro





EAAAKEAAAKEAAAKEAAAKEAAAK
14,883
MMTVB_P03365_WS





GGGGSSEAAAK
14,884
MLVF5_P26810_3mutA





EAAAKGGSPAP
14,885
GALV_P21414





PAPEAAAKGSS
14,886
MMTVB_P03365_WS





GSSGGGGGS
14,887
SFVCP_Q87040_2mut





GGGGSSPAP
14,888
SFV1_P23074





EAAAKGGGGSS
14,889
XMRV6_A1Z651





PAPAPAPAP
14,890
MMTVB_P03365





GGSEAAAKGSS
14,891
SFV3L_P27401_2mutA





GSSPAPGGG
14,892
MMTVB_P03365_WS





GGGGGG
14,893
SFV3L_P27401-Pro





GGSGGSGGS
14,894
FOAMV_P14350-Pro_2mut





PAPAPAPAPAPAP
14,895
WMSV_P03359





GSSPAP
14,896
MLVBM_Q7SVK7





GGGGGSGSS
14,897
MMTVB_P03365_2mut_WS





EAAAKGSSGGS
14,898
MMTVB_P03365_2mutB_WS





EAAAK
14,899
FFV_O93209_2mutA





PAPEAAAK
14,900
SFV1_P23074-Pro





EAAAKGGSGSS
14,901
SFV3L_P27401





GGSGGSGGS
14,902
FFV_O93209-Pro





GSSGGGEAAAK
14,903
MMTVB_P03365





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,904
MLVFF_P26809_3mutA





GGSGGSGGSGGSGGSGGS
14,905
HTL1L_P0C211_2mutB





GGGEAAAK
14,906
SFV3L_P27401-Pro_2mutA





GGGGGSGSS
14,907
MMTVB_P03365





GSSPAPGGS
14,908
FOAMV_P14350_2mutA





EAAAKGSS
14,909
MLVMS_P03355





GSSGGSGGG
14,910
FFV_O93209-Pro





GGSGGGGSS
14,911
MMTVB_P03365-Pro_2mut





GGSPAPGSS
14,912
FOAMV_P14350_2mut





GGSGGSGGSGGSGGSGGS
14,913
SFVCP_Q87040-Pro_2mut





GSSEAAAKGGG
14,914
FOAMV_P14350_2mutA





GGSGGSGGS
14,915
MMTVB_P03365-Pro





GSSGSSGSSGSSGSSGSS
14,916
MMTVB_P03365_2mut_WS





GSSGSSGSSGSSGSS
14,917
MMTVB_P03365-Pro





PAPEAAAK
14,918
WDSV_O92815





GSSGSSGSSGSSGSS
14,919
FFV_O93209-Pro_2mut





EAAAKGGGGSEAAAK
14,920
MMTVB_P03365-Pro





GGSPAPEAAAK
14,921
FOAMV_P14350





GSSGSS
14,922
PERV_Q4VFZ2





GGG

MMTVB_P03365-Pro





GGGGSGGGGSGGGGS
14,924
FFV_O93209_2mut





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,925
MMTVB_P03365-Pro





GGSGSSPAP
14,926
WMSV_P03359





GGGGGGGG
14,927
SFV3L_P27401_2mut





PAPGSSEAAAK
14,928
FOAMV_P14350-Pro_2mutA





GGGGSSPAP
14,929
FOAMV_P14350_2mut





GSSGGSPAP
14,930
MLVBM_Q7SVK7_3mut





GSSGGGGGS
14,931
GALV_P21414_3mutA





EAAAKEAAAKEAAAKEAAAKEAAAK
14,932
MMTVB_P03365





GSSGGGGGS
14,933
SFV1_P23074_2mut





GGGGSEAAAKGGGGS
14,934
SFV1_P23074





GGGEAAAKPAP
14,935
FFV_O93209





PAPGGGEAAAK
14,936
SFV1_P23074





GGSGGGEAAAK
14,937
PERV_Q4VFZ2_3mutA_WS





GSSGGG
14,938
MMTVB_P03365-Pro





EAAAKGSSGGS
14,939
FFV_093209_2mut





GGGGG
14,940
SFV1_P23074_2mut





GGGPAP
14,941
SFV3L_P27401





GSSGGSEAAAK
14,942
FFV_O93209





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
14,943
MMTVB_P03365-Pro





GSSGGGEAAAK
14,944
SFV1_P23074_2mutA





GSSGSSGSSGSSGSS
14,945
SFV3L_P27401_2mut





GGSEAAAKPAP
14,946
FLV_P10273





GGGGSGGGGS
14,947
FOAMV_P14350-Pro_2mutA





GSSEAAAKPAP
14,948
SFV3L_P27401





GGGGSEAAAKGGGGS
14,949
MMTVB_P03365-Pro





PAPGSSEAAAK
14,950
MLVF5_P26810_3mut





EAAAKGGSGGG
14,951
SFV3L_P27401





GGGPAPGGS
14,952
SFV3L_P27401





GSSEAAAKGGS
14,953
FOAMV_P14350_2mutA





EAAAKGGSGGG
14,954
HTL1L_P0C211





GSSGGSPAP
14,955
SFV3L_P27401_2mutA





PAPAP
14,956
FFV_O93209





PAPGGSGSS
14,957
MMTVB_P03365_WS





EAAAKGGGGGS
14,958
FOAMV_P14350_2mut





PAPEAAAKGGS
14,959
SFV3L_P27401_2mut





GSSEAAAKPAP
14,960
MMTVB_P03365-Pro





GGSGGS
14,961
PERV_Q4VFZ2_3mut





GSSEAAAKGGG
14,962
FFV_O93209-Pro_2mutA





EAAAK
14,963
HTL1L_P0C211





GSSPAP
14,964
MLVMS_P03355





EAAAKPAPGGG
14,965
FFV_O93209-Pro_2mut





GGGGSEAAAKGGGGS
14,966
SFV1_P23074-Pro_2mut





EAAAKGSSGGS
14,967
SFV3L_P27401





GSAGSAAGSGEF
14,968
FFV_O93209_2mutA





PAPEAAAKGGS
14,969
MMTVB_P03365_2mutB_WS





EAAAKEAAAKEAAAKEAAAKEAAAKEAAAK
14,970
MMTVB_P03365





GGS

MMTVB_P03365





GGSEAAAKPAP
14,972
SFV1_P23074





EAAAKGSSGGG
14,973
HTLV2_P03363_2mut





GGSEAAAKGGG
14,974
MMTVB_P03365_WS





GGSGGS
14,975
FFV_O93209-Pro





GSSEAAAKGGS
14,976
MMTVB_P03365-Pro





PAPAPAPAPAP
14,977
SFV1_P23074_2mutA





GGSEAAAKGGG
14,978
MMTVB_P03365_2mutB_WS





PAPAPAPAP
14,979
MMTVB_P03365_WS





GGGGSGGGGSGGGGSGGGGSGGGGS
14,980
HTL3P_Q4U0X6_2mut





PAPGGSEAAAK
14,981
SFV1_P23074-Pro_2mut





GGSGGGPAP
14,982
MMTVB_P03365





GSSGSSGSSGSSGSSGSS
14,983
MMTVB_P03365-Pro





GGSEAAAKPAP
14,984
SFV1_P23074-Pro





GGGEAAAKGSS
14,985
SFV3L_P27401_2mutA





GGGPAPGGS
14,986
AVIRE_P03360





PAPGGG
14,987
MLVRD_P11227





GGSEAAAKGSS
14,988
SFV3L_P27401_2mut





GGGEAAAKGSS
14,989
FOAMV_P14350_2mut





GGGEAAAKGSS
14,990
SFV1_P23074-Pro





EAAAKEAAAKEAAAKEAAAK
14,991
MLVAV_P03356





EAAAKGGGPAP
14,992
JSRV_P31623_2mutB





EAAAKGGGGSS
14,993
FOAMV_P14350_2mut





EAAAKEAAAKEAAAKEAAAKEAAAK
14,994
SRV2_P51517





GSSGGGGGS
14,995
FFV_O93209





PAPAPAP
14,996
FOAMV_P14350_2mutA





GGSGGSGGSGGS
14,997
FOAMV_P14350





GGGEAAAK
14,998
MMTVB_P03365_WS





GGGGGS
14,999
SFV1_P23074_2mutA





GGSGGS
15,000
WMSV_P03359_3mut





EAAAKGGS
15,001
MMTVB_P03365-Pro





GGGGSS
15,002
BLVJ_P03361_2mut





PAPAP
15,003
MMTVB_P03365-Pro_2mut





PAPGGG
15,004
SMRVH_P03364





EAAAKGGGGSS
15,005
SFV3L_P27401





PAPAPAPAPAP
15,006
MMTVB_P03365





GGGPAP
15,007
MMTVB_P03365-Pro





GSSGGSGGG
15,008
MMTVB_P03365





EAAAKGGGPAP
15,009
FOAMV_P14350_2mutA





GSSGSSGSSGSS
15,010
SFV1_P23074





GGGGSGGGGS
15,011
SFV3L_P27401





GSSGGSGGG
15,012
MLVF5_P26810





GGGEAAAKPAP
15,013
MMTVB_P03365-Pro





PAPEAAAK
15,014
HTLV2_P03363_2mut





GSSGSSGSSGSS
15,015
FOAMV_P14350_2mut





GSSEAAAKPAP
15,016
MMTVB_P03365-Pro





PAPEAAAKGGG
15,017
HTL3P_Q4U0X6_2mut





GGSEAAAKGSS
15,018
MMTVB_P03365-Pro





EAAAKPAPGGS
15,019
MMTVB_P03365_2mut_WS





GSSGGSEAAAK
15,020
MLVF5_P26810_3mutA





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
15,021
MLVF5_P26810_3mut





EAAAKGGGGSS
15,022
MMTVB_P03365-Pro





GGGGGSGSS
15,023
HTL1A_P03362_2mutB





PAPAP
15,024
FFV_O93209-Pro_2mut





GGGGGSPAP
15,025
HTL1C_P14078_2mut





GGGPAP
15,026
HTLV2_P03363_2mut





EAAAKGGGGSEAAAK
15,027
SFVCP_Q87040





GGSEAAAKGGG
15,028
FFV_O93209-Pro_2mutA





GSSPAPGGS
15,029
FOAMV_P14350-Pro_2mut





GGGGGGG
15,030
MMTVB_P03365-Pro





EAAAKGSS
15,031
SFV3L_P27401_2mutA





EAAAKGGGGSEAAAK
15,032
MMTVB_P03365-Pro





GGGGSEAAAKGGGGS
15,033
SFV1_P23074-Pro_2mutA





EAAAKGGGGSS
15,034
MMTVB_P03365





GGGEAAAKGGS
15,035
SFV1_P23074





PAPEAAAKGGG
15,036
MLVF5_P26810





GGGGSSGGS
15,037
MMTVB_P03365





GGSGSS
15,038
MMTVB_P03365





PAPAPAPAPAPAP
15,039
KORV_Q9TTC1





EAAAKGGG
15,040
SFV1_P23074-Pro_2mut





PAPAPAPAPAPAP
15,041
SRV2_P51517





GSSGSSGSSGSSGSS
15,042
FFV_O93209-Pro_2mutA





GGGGSS
15,043
FOAMV_P14350_2mut





PAPGGGEAAAK
15,044
MMTVB_P03365_WS





GGSGGGEAAAK
15,045
FFV_O93209-Pro_2mut





PAPAPAPAPAP
15,046
MMTVB_P03365_WS





GGGEAAAKGGS
15,047
MMTVB_P03365-Pro





GGGEAAAKGSS
15,048
MMTVB_P03365_2mutB





GSSPAPEAAAK
15,049
MMTVB_P03365_WS





EAAAKEAAAKEAAAKEAAAKEAAAK
15,050
SFV1_P23074-Pro_2mutA





PAPGGG
15,051
SFV3L_P27401





GSSEAAAKGGG
15,052
MMTVB_P03365_WS





GGGGSSEAAAK
15,053
FOAMV_P14350_2mut





PAPGSSGGS
15,054
SFV1_P23074-Pro_2mut





GSSGSSGSSGSSGSSGSS
15,055
SFV3L_P27401





EAAAKGSSGGG
15,056
MMTVB_P03365





PAPGGGGSS
15,057
WDSV_O92815_2mutA





GGSPAP
15,058
MMTVB_P03365-Pro





GGSGGSGGSGGSGGS
15,059
SFVCP_Q87040-Pro_2mut





PAPAPAPAP
15,060
MMTVB_P03365-Pro





GGGGG
15,061
HTL1A_P03362





GGSGGSGGSGGS
15,062
SFV1_P23074_2mutA





GSSGSSGSSGSSGSS
15,063
FOAMV_P14350-Pro_2mut





PAPGGSEAAAK
15,064
MMTVB_P03365_2mutB_WS





PAPAPAPAP
15,065
SFV1_P23074_2mut





PAPGGGGSS
15,066
MMTVB_P03365





GGSGSS
15,067
SFV3L_P27401_2mut





EAAAKEAAAKEAAAKEAAAK
15,068
MMTVB_P03365_2mut





EAAAKGGSGGG
15,069
HTL3P_Q4U0X6_2mut





PAPGGGGSS
15,070
SFVCP_Q87040-Pro_2mutA





EAAAKGGGGGS
15,071
MLVAV_P03356





GGGGGS
15,072
FOAMV_P14350_2mut





GGGEAAAKGGS
15,073
FFV_O93209-Pro_2mutA





EAAAKPAPGGG
15,074
MMTVB_P03365_2mutB





GGSGGGPAP
15,075
FFV_O93209_2mut





GSSEAAAKPAP
15,076
MMTVB_P03365





PAPAPAPAPAPAP
15,077
SFV1_P23074_2mut





GGSPAPGGG
15,078
MMTVB_P03365-Pro





GGSGGGEAAAK
15,079
MMTVB_P03365





PAPAP
15,080
SFVCP_Q87040





GSSEAAAK
15,081
SFVCP_Q87040





GGGGSGGGGSGGGGS
15,082
MMTVB_P03365-Pro





GSSGSSGSS
15,083
SFV3L_P27401





EAAAKGGSGGG
15,084
MMTVB_P03365-Pro





GSSPAP
15,085
SFV1_P23074_2mut





GGGEAAAK
15,086
SFV1_P23074-Pro





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
15,087
MMTVB_P03365-Pro





PAPGGS
15,088
HTL1C_P14078_2mut





PAPGSSGGS
15,089
SFV1_P23074_2mut





PAPEAAAK
15,090
MMTVB_P03365_WS





PAPAP
15,091
MMTVB_P03365-Pro





EAAAKGGS
15,092
HTL1A_P03362_2mut





GGGGSEAAAKGGGGS
15,093
HTL1C_P14078





EAAAKGSSGGS
15,094
FOAMV_P14350-Pro





PAPGGSGSS
15,095
MMTVB_P03365-Pro





PAPGGSEAAAK
15,096
SFV1_P23074_2mut





PAPGSSEAAAK
15,097
FFV_O93209-Pro_2mut





PAPGSSGGG
15,098
FOAMV_P14350-Pro_2mutA





GSSGGGEAAAK
15,099
AVIRE_P03360





GGGGGG
15,100
SMRVH_P03364_2mut





PAPEAAAKGGG
15,101
MMTVB_P03365-Pro





GGGEAAAKGGS
15,102
SFVCP_Q87040_2mutA





PAPAPAPAPAP
15,103
SRV2_P51517





GSSGSSGSSGSSGSSGSS
15,104
MMTVB_P03365





EAAAKGGGPAP
15,105
MLVAV_P03356





PAPAPAPAPAP
15,106
FOAMV_P14350-Pro_2mutA





PAPGGSEAAAK
15,107
FOAMV_P14350





GSSGGGPAP
15,108
HTL32_Q0R5R2_2mutB





GGGGGSPAP
15,109
HTL3P_Q4U0X6_2mutB





GSSGGSGGG
15,110
MMTVB_P03365-Pro





PAPAP
15,111
SFVCP_Q87040-Pro





GSSGGGPAP
15,112
MMTVB_P03365-Pro





GGSGSS
15,113
MMTVB_P03365-Pro_2mut





GGSPAPEAAAK
15,114
SFV1_P23074-Pro_2mut





EAAAKGGSGGG
15,115
SFV3L_P27401_2mut





GGGGSSEAAAK
15,116
MMTVB_P03365_WS





GGGGGSGSS
15,117
MMTVB_P03365_2mut





GGGGSSGGS
15,118
SFV1_P23074-Pro_2mutA





EAAAKGGGGSEAAAK
15,119
MMTVB_P03365_WS





PAPGGGEAAAK
15,120
SFV1_P23074-Pro





PAPEAAAKGGG
15,121
MMTVB_P03365





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
15,122
MMTVB_P03365





GSSGGSEAAAK
15,123
FOAMV_P14350-Pro_2mut





GGSPAP
15,124
MLVBM_Q7SVK7_3mut





GSSEAAAK
15,125
FOAMV_P14350





GSSEAAAK
15,126
MMTVB_P03365-Pro





EAAAKGSSGGS
15,127
HTL1A_P03362_2mut





GGGEAAAKPAP
15,128
FOAMV_P14350-Pro_2mut





EAAAKGGSPAP
15,129
FOAMV_P14350





GSSEAAAKPAP
15,130
MMTVB_P03365_WS





GSSGSSGSS
15,131
FOAMV_P14350_2mut





EAAAKEAAAKEAAAKEAAAK
15,132
MMTVB_P03365_WS





EAAAK
15,133
MMTVB_P03365





PAPGSS
15,134
BAEVM_P10272





PAPGGS
15,135
FFV_O93209-Pro_2mut





GGSGGS
15,136
SFV1_P23074-Pro_2mutA





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
15,137
HTLV2_P03363_2mut





GGSGGGEAAAK
15,138
MMTVB_P03365_WS





PAPGSSGGG
15,139
HTL1A_P03362





GGSGGS
15,140
SFV3L_P27401-Pro





GSSGSS
15,141
SFV1_P23074-Pro





PAPGGSEAAAK
15,142
MMTVB_P03365





GSAGSAAGSGEF
15,143
MMTVB_P03365-Pro





PAPGGG
15,144
FOAMV_P14350_2mut





EAAAKGGSGSS
15,145
MMTVB_P03365_WS





GSSGGGEAAAK
15,146
SFV3L_P27401-Pro





GGSGGGPAP
15,147
FOAMV_P14350-Pro_2mut





PAPAPAPAPAPAP
15,148
WDSV_O92815





SGSETPGTSESATPES
15,149
SFVCP_Q87040-Pro_2mutA





GGSGGSGGS
15,150
SFV1_P23074





GGGGSS
15,151
SFVCP_Q87040_2mut





GGGGGSEAAAK
15,152
MMTVB_P03365





SGSETPGTSESATPES
15,153
MMTVB_P03365_WS





PAPAPAP
15,154
SFV3L_P27401





PAPEAAAKGSS
15,155
MMTVB_P03365_2mutB_WS





GSSGSSGSSGSSGSS
15,156
SRV2_P51517





GGGPAPGSS
15,157
HTL32_Q0R5R2_2mutB





GGSGGGGSS
15,158
MMTVB_P03365-Pro





SGSETPGTSESATPES
15,159
SRV2_P51517





EAAAKGSSGGS
15,160
MMTVB_P03365-Pro





GSSPAPEAAAK
15,161
MMTVB_P03365-Pro





GSSPAPEAAAK
15,162
SRV2_P51517





GGGGSSPAP
15,163
MMTVB_P03365-Pro





PAPGGGEAAAK
15,164
SFV1_P23074-Pro_2mutA





PAPEAAAKGGS
15,165
MMTVB_P03365





GSSGSSGSSGSSGSSGSS
15,166
FOAMV_P14350-Pro





GGSPAPGSS
15,167
SFV3L_P27401





GGGPAPGGS
15,168
SFV1_P23074-Pro_2mutA





GGGPAPGSS
15,169
MMTVB_P03365-Pro





EAAAKPAP
15,170
MLVBM_Q7SVK7





EAAAKEAAAKEAAAK
15,171
HTL1C_P14078





GSSGGSEAAAK
15,172
SRV2_P51517





PAPGGGGGS
15,173
SRV2_P51517





GGGEAAAK
15,174
FFV_O93209-Pro_2mut





EAAAKGGGPAP
15,175
HTL32_Q0R5R2





GGSGSSGGG
15,176
MMTVB_P03365





PAPEAAAKGSS
15,177
MMTVB_P03365-Pro





PAPGGGGGS
15,178
MMTVB_P03365-Pro





EAAAKGGGGGS
15,179
MMTVB_P03365_WS





GGGGGS
15,180
MMTVB_P03365-Pro





GGGGSGGGGSGGGGSGGGGSGGGGS
15,181
HTL1C_P14078





EAAAKGGSPAP
15,182
MMTVB_P03365





GGGGSSPAP
15,183
FFV_O93209-Pro_2mut





GGGGSSGGS
15,184
MMTVB_P03365-Pro





PAPGSSGGS
15,185
MMTVB_P03365-Pro





GGGGGS
15,186
SRV2_P51517





GGSGSSGGG
15,187
MMTVB_P03365





GSSGGSEAAAK
15,188
MMTVB_P03365-Pro





EAAAKEAAAKEAAAKEAAAK
15,189
GALV_P21414





GGSEAAAKGGG
15,190
MMTVB_P03365-Pro





SGGSSGGSSGSETPGTSESATPESSGGSSGGSS
15,191
MMTVB_P03365-Pro





GSSEAAAKGGS
15,192
MMTVB_P03365





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
15,193
HTL3P_Q4U0X6_2mutB





GGGEAAAK
15,194
MMTVB_P03365-Pro





PAPAPAPAP
15,195
MMTVB_P03365-Pro





PAPGSSGGG
15,196
MMTVB_P03365





GSSGSSGSSGSSGSS
15,197
GALV_P21414





GGSPAP
15,198
MMTVB_P03365_WS





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
15,199
MMTVB_P03365-Pro





PAPEAAAK
15,200
MMTVB_P03365-Pro





PAPGSSGGG
15,201
SFV1_P23074-Pro_2mutA





GGGGGSEAAAK
15,202
MMTVB_P03365_2mutB_WS





PAPAPAPAPAP
15,203
MMTVB_P03365-Pro





EAAAKGGSGSS
15,204
MMTVB_P03365-Pro





EAAAKEAAAKEAAAKEAAAK
15,205
MLVRD_P11227_3mut





PAPAPAPAP
15,206
FOAMV_P14350_2mutA





GGGPAPGSS
15,207
SFVCP_Q87040_2mut





PAPEAAAKGSS
15,208
SFVCP_Q87040_2mut





GGSPAPGGG
15,209
MMTVB_P03365-Pro





GGGGSGGGGSGGGGSGGGGS
15,210
MMTVB_P03365





EAAAKGGS
15,211
HTL3P_Q4U0X6_2mut





PAPGSSGGS
15,212
MMTVB_P03365_WS





GGGGSGGGGS
15,213
MMTVB_P03365





GGSGGS
15,214
FOAMV_P14350





EAAAKGGGGSEAAAK
15,215
SFVCP_Q87040-Pro_2mut





EAAAKEAAAKEAAAKEAAAK
15,216
MMTVB_P03365-Pro_2mutB





PAPGGGEAAAK
15,217
SFVCP_Q87040-Pro





GSSGSS
15,218
JSRV_P31623_2mutB





EAAAKGGGGGS
15,219
MMTVB_P03365_2mut_WS





GSSPAPEAAAK
15,220
MMTVB_P03365-Pro





GGGEAAAK
15,221
HTL1C_P14078





PAPEAAAKGSS
15,222
HTL32_Q0R5R2_2mutB





GGGGSSEAAAK
15,223
MMTVB_P03365-Pro





PAPGSSGGS
15,224
MMTVB_P03365-Pro





EAAAKGGGGGS
15,225
MMTVB_P03365





GGGGSGGGGSGGGGSGGGGS
15,226
MMTVB_P03365





EAAAKGGGGSS
15,227
HTL3P_Q4U0X6_2mut





GGGEAAAKGGS
15,228
SFVCP_Q87040-Pro





GGGGGSPAP
15,229
MMTVB_P03365-Pro_2mutB





GGSGGGEAAAK
15,230
SFV3L_P27401-Pro





PAPGGGGGS
15,231
SFV3L_P27401-Pro





EAAAKGGGGSEAAAK
15,232
MMTVB_P03365





PAPEAAAKGSS
15,233
MMTVB_P03365-Pro





GGSEAAAKGGG
15,234
MMTVB_P03365-Pro





GGSGGSGGSGGSGGS
15,235
SMRVH_P03364_2mutB





GGSGGSGGSGGSGGS
15,236
HTL1L_P0C211_2mut





GGGGGG
15,237
WDSV_O92815





GGGGGSGSS
15,238
MMTVB_P03365-Pro





GGSEAAAKPAP
15,239
SFV3L_P27401-Pro_2mut





GGGPAPGSS
15,240
MMTVB_P03365_2mut_WS





GGGGGS
15,241
MMTVB_P03365_WS





GGSPAPEAAAK
15,242
MMTVB_P03365





PAPEAAAKGGS
15,243
HTL1A_P03362





EAAAKGGSGSS
15,244
MMTVB_P03365_2mut_WS





GGGPAPEAAAK
15,245
SFV3L_P27401-Pro_2mut





PAPGGGGSS
15,246
HTL32_Q0R5R2_2mut





GSSPAPGGG
15,247
HTL3P_Q4U0X6_2mut





GGGGSSGGS
15,248
BLVAU_P25059_2mut





EAAAKGGGGGS
15,249
HTL1L_P0C211





GGSEAAAKGSS
15,250
JSRV_P31623_2mutB





GSSGGG
15,251
JSRV_P31623





GGSGGSGGSGGS
15,252
MMTVB_P03365-Pro





EAAAKPAP
15,253
SFV1_P23074-Pro_2mutA





GGGGSSGGS
15,254
MMTVB_P03365_WS





GGSGGS
15,255
MMTVB_P03365_WS





EAAAKGGGGGS
15,256
MMTVB_P03365-Pro





GGGGSGGGGSGGGGSGGGGSGGGGSGGGGS
15,257
MMTVB_P03365





GGSGGSGGS
15,258
MMTVB_P03365





GGGGGSEAAAK
15,259
MLVBM_Q7SVK7





GGSGSSPAP
15,260
MMTVB_P03365_WS





EAAAKEAAAKEAAAK
15,261
JSRV_P31623





PAPEAAAKGGS
15,262
MMTVB_P03365-Pro





GGSGSSEAAAK
15,263
FOAMV_P14350





GGGGGSGSS
15,264
MMTVB_P03365-Pro_2mut





GGGPAPGGS
15,265
MMTVB_P03365





SGSETPGTSESATPES
15,266
SFVCP_Q87040_2mut





GSSPAPGGS
15,267
SFV1_P23074-Pro_2mutA





GSSGSSGSSGSSGSS
15,268
MMTVB_P03365





EAAAKGGGPAP
15,269
MMTVB_P03365





GSSGGG
15,270
MMTVB_P03365_2mut_WS





GGGEAAAKPAP
15,271
MMTVB_P03365





PAPGGSGGG
15,272
MMTVB_P03365-Pro





GSSGGSGGG
15,273
WDSV_O92815_2mut





GGSGGG
15,274
HTL32_Q0R5R2_2mut





EAAAKGGSPAP
15,275
HTLV2_P03363_2mut





GGSPAPEAAAK
15,276
MMTVB_P03365-Pro





GSSGGSEAAAK
15,277
MMTVB_P03365_2mut





GSAGSAAGSGEF
15,278
MMTVB_P03365_WS





PAPGGSGSS
15,279
FFV_O93209





GGSEAAAKGGG
15,280
MMTVB_P03365





GGSPAPGSS
15,281
MMTVB_P03365-Pro





GSSGGSGGG
15,282
SFV3L_P27401





PAPEAAAKGGG
15,283
HTL1A_P03362_2mutB





GGGEAAAKPAP
15,284
MMTVB_P03365-Pro





GGSEAAAK
15,285
HTL32_Q0R5R2_2mutB





GGGEAAAKGSS
15,286
MPMV_P07572





GGGGGSEAAAK
15,287
MMTVB_P03365-Pro





PAPAPAPAPAP
15,288
SFVCP_Q87040-Pro_2mutA





PAPAPAPAPAP
15,289
HTL1L_P0C211_2mut





GGGGSSGGS
15,290
HTL3P_Q4U0X6





PAPGGSEAAAK
15,291
MMTVB_P03365_2mut_WS





PAPAPAPAPAP
15,292
HTL1A_P03362





EAAAKPAPGGG
15,293
MMTVB_P03365_2mut_WS





GGSEAAAK
15,294
MMTVB_P03365_2mut_WS





GGGEAAAKGSS
15,295
SFV1_P23074-Pro_2mutA





GGSPAPGSS
15,296
MMTVB_P03365-Pro





GGSEAAAKPAP
15,297
MLVBM_Q7SVK7





PAPEAAAKGGG
15,298
MMTVB_P03365_2mut_WS





GSSEAAAKPAP
15,299
MMTVB_P03365-Pro_2mutB





GGGGSEAAAKGGGGS
15,300
MMTVB_P03365-Pro_2mut





GSSEAAAKGGS
15,301
MMTVB_P03365-Pro_2mutB





GSSGSSGSSGSSGSS
15,302
SRV2_P51517_2mutB





GGGGGSPAP
15,303
HTL1L_P0C211_2mut





GGSEAAAK
15,304
MMTVB_P03365





GSSPAPEAAAK
15,305
SMRVH_P03364_2mutB





GGGPAPGGS
15,306
HTL1C_P14078_2mut





GGSPAPEAAAK
15,307
MMTVB_P03365_WS





GGSEAAAKPAP
15,308
HTL1A_P03362_2mut





PAPAPAPAP
15,309
HTLV2_P03363_2mut





GSSPAPGGG
15,310
MMTVB_P03365





GSSGSSGSSGSS
15,311
MMTVB_P03365-Pro





GGSEAAAKGSS
15,312
MMTVB_P03365_WS





GGSGSSGGG
15,313
MMTVB_P03365_2mutB





GSSGSSGSSGSSGSSGSS
15,314
JSRV_P31623_2mutB





GGSEAAAKPAP
15,315
MMTVB_P03365-Pro





GSSGGSGGG
15,316
HTLV2_P03363_2mut





AEAAAKEAAAKEAAAKEAAAKALEAEAAAKEAAAKEAAAKEAAAKA
15,317
WDSV_O92815_2mut





GGSPAPEAAAK
15,318
MMTVB_P03365





GGGGSSEAAAK
15,319
MMTVB_P03365





GGSGGGEAAAK
15,320
SFV1_P23074-Pro_2mutA





GGGGSEAAAKGGGGS
15,321
WDSV_O92815_2mut





GGSGSSEAAAK
15,322
MMTVB_P03365_2mutB_WS





GGSEAAAKPAP
15,323
MMTVB_P03365_WS





GSSGGGEAAAK
15,324
SFVCP_Q87040-Pro





GSSGGS
15,325
SFVCP_Q87040-Pro_2mut





GGSEAAAKPAP
15,326
SFVCP_Q87040_2mut





GSSGGSEAAAK
15,327
SFVCP_Q87040_2mut





GSSPAPEAAAK
15,328
SRV2_P51517_2mutB





GGSGGSGGSGGSGGSGGS
15,329
BLVAU_P25059





GSSGSSGSSGSSGSS
15,330
HTL1C_P14078_2mut





EAAAKGGGGSS
15,331
MMTVB_P03365_2mutB





GGGEAAAKGSS
15,332
SFVCP_Q87040-Pro









Example 3: Screening Configurations of Template RNAs that Install the SCD Mutation into the Endogenous HBB Gene in Human Cells

This example describes the use of a gene modifying system containing a gene modifying polypeptide and template RNAs comprising varied lengths of heterologous object sequences and primer binding site sequences to identify favorable configurations for editing of the endogenous HBB gene in human cells. In this example, a template RNA contains:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.
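As a minimal sketch of how the four listed elements combine, the Python snippet below concatenates them 5' to 3' in the listed order. The function names `to_rna` and `assemble_template_rna` are hypothetical. The spacer and heterologous object sequence are the HBB8 installation sequences given in this example (SEQ ID NOs: 21674 and 21676); the scaffold and 17 nt PBS strings are an assumption, transcribed from the first HBB8 entry of Table E with the chemical-modification marks removed, and should be treated as illustrative.

```python
def to_rna(seq: str) -> str:
    """Return an upper-case RNA string, converting any DNA 'T' to 'U'."""
    return seq.upper().replace("T", "U")


def assemble_template_rna(spacer: str, scaffold: str, hos: str, pbs: str) -> str:
    """Join the four template RNA elements 5'->3' in the order listed above."""
    return "".join(to_rna(part) for part in (spacer, scaffold, hos, pbs))


# HBB8 installation spacer and RT template as given in this example (DNA alphabet).
HBB8_SPACER = "GTAACGGCAGACTTCTCCTC"
HBB8_RT_TEMPLATE = "TGGTGCACCTGACTCCTGTG"
# Assumption: scaffold and 17 nt PBS transcribed from the first HBB8 row of Table E.
SCAFFOLD = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCC"
    "GUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
)
HBB8_PBS17 = "GAGAAGUCUGCCGUUAC"

template = assemble_template_rna(HBB8_SPACER, SCAFFOLD, HBB8_RT_TEMPLATE, HBB8_PBS17)
print(template)
```

The sketch only concatenates unmodified nucleotides; it does not model the terminal chemical modifications shown in Table E.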


The template RNAs were designed to contain PBS sequences of 8-17 nt and heterologous object sequences of 9-20 nt. Two different gRNA spacer sequences were used to target sites proximal to the SCD mutation in the endogenous HBB genomic site. The heterologous object sequences and PBS sequences were designed to install the SCD mutation (an E6V mutation) into the endogenous gene by replacing an “A” nucleotide with a “T” nucleotide at the mutation site using a gene modifying system described herein. The template RNA sequences used are those shown in Table A (HBB5 sequences) and Table B (HBB8 sequences), with the following exceptions. First, the mutation region of the RT template sequence was designed to install the mutation (A->T) rather than to correct it back to the wild-type sequence. In particular, RT template regions for SCD mutation installation using template HBB5 comprise at least a portion of the following sequence: Install RT Template (PAM-kill): AACGGCAGACTTCTCTACAG (SEQ ID NO: 21672), of which the equivalent without the PAM-kill mutation would be: Install RT Template (no PAM-kill): AACGGCAGACTTCTCCACAG (SEQ ID NO: 21673). Second, because the SCD mutation falls within the target protospacer, the installation version of the HBB8 spacer differed from the HBB8 mutation correction spacer by a single nucleotide, corresponding to the WT sequence without the SCD mutation: GTAACGGCAGACTTCTCCTC (SEQ ID NO: 21674). In particular, RT template regions for SCD mutation installation using template HBB8 comprise at least a portion of the following sequence: Install RT Template (293T SNP): TGGTGCACCTGACTCCTGTG (SEQ ID NO: 21676), of which the equivalent template lacking the 293T SNP and targeted to the hg38 reference sequence would be: Install RT Template (no SNP): TGGTGCATCTGACTCCTGTG (SEQ ID NO: 21675).
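The paired RT template sequences above each differ at a single position. The following sketch, using a hypothetical helper named `mismatch_positions`, locates that position for the PAM-kill and 293T SNP pairs; the sequences are copied from the paragraph above and nothing else is assumed.

```python
def mismatch_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


HBB5_PAM_KILL = "AACGGCAGACTTCTCTACAG"     # SEQ ID NO: 21672
HBB5_NO_PAM_KILL = "AACGGCAGACTTCTCCACAG"  # SEQ ID NO: 21673
HBB8_293T_SNP = "TGGTGCACCTGACTCCTGTG"     # SEQ ID NO: 21676
HBB8_NO_SNP = "TGGTGCATCTGACTCCTGTG"       # SEQ ID NO: 21675

hbb5_diff = mismatch_positions(HBB5_PAM_KILL, HBB5_NO_PAM_KILL)
hbb8_diff = mismatch_positions(HBB8_293T_SNP, HBB8_NO_SNP)
print(hbb5_diff)  # [(16, 'T', 'C')]
print(hbb8_diff)  # [(8, 'C', 'T')]
```

Positions are counted from the 5' end of each sequence as written.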


A gene modifying system comprising a gene modifying polypeptide (see Table C) and a template RNA was transfected into HEK293T cells. The gene modifying polypeptide and the template RNAs were delivered by nucleofection in RNA format. Specifically, 1 μg of gene modifying polypeptide mRNA was combined with 10 μM template RNAs. The mRNA and template RNAs were added to 25 μL SF buffer containing 250,000 HEK293T cells, and the cells were nucleofected using program DS-150. After nucleofection, cells were grown at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the HBB genomic target site were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Gene editing activity with high editing efficiency was detected in the configurations with a 9-12 nt PBS sequence and a 13-16 nt heterologous object sequence. These results indicate that template RNAs comprising gRNA spacers and gRNA scaffolds described herein successfully directed a gene modifying polypeptide to the endogenous HBB gene in human cells, such that specific gene editing occurred. Results are shown in Table E.
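The length screen described in this example can be pictured as a grid of RT template and PBS lengths. The sketch below enumerates such a grid for the HBB5 installation template. It rests on two assumptions inferred from the Table E entries rather than stated in the text: shorter RT templates drop nucleotides from the 5' end, and shorter PBS sequences drop nucleotides from the 3' end. The function name `configurations`, the name suffix format, and the 17 nt PBS string (read from the first HBB5 row of Table E) are likewise illustrative, and the full 9-20 nt by 8-17 nt grid may be larger than the set of template RNAs actually tested.

```python
from itertools import product


def configurations(rt_full, pbs_full, rt_lengths, pbs_lengths):
    """Yield (name_suffix, rt_template, pbs) for each length combination.

    Assumption: RT templates are shortened from the 5' end and PBS
    sequences from the 3' end, as the Table E entries suggest.
    """
    for rt_len, pbs_len in product(rt_lengths, pbs_lengths):
        if rt_len > len(rt_full) or pbs_len > len(pbs_full):
            raise ValueError("requested length exceeds the full-length sequence")
        rt = rt_full[len(rt_full) - rt_len:]  # keep the 3'-most rt_len nucleotides
        pbs = pbs_full[:pbs_len]              # keep the 5'-most pbs_len nucleotides
        yield f"RT{rt_len}_PBS{pbs_len}", rt, pbs


HBB5_RT20 = "AACGGCAGACUUCUCUACAG"  # SEQ ID NO: 21672 written as RNA
HBB5_PBS17 = "GAGUCAGGUGCACCAUG"    # assumption: read from the first HBB5 row of Table E

# 9-20 nt heterologous object sequences and 8-17 nt PBS sequences, longest first.
grid = list(configurations(HBB5_RT20, HBB5_PBS17, range(20, 8, -1), range(17, 7, -1)))
print(len(grid))
print(grid[0])
print(grid[-1])
```

Each tuple could then be combined with a spacer and scaffold to give a full template RNA for that configuration.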


Although this experiment demonstrates installation of the mutation rather than correction of the mutation, it indicates that editing may be performed at the native HBB locus.
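The template sequences in Table E are written one nucleotide at a time, with an "m" or "r" prefix before each base and an optional "*" after it. The sketch below converts that notation into a plain RNA string so that entries can be compared with the spacer and RT template sequences given in this example. Reading "r" as an unmodified ribonucleotide, "m" as a 2'-O-methyl nucleotide, and "*" as a phosphorothioate linkage is a common convention and an assumption here; the definitions elsewhere in the disclosure govern. The helper names are hypothetical.

```python
import re

# One token = prefix (m or r), base (A, C, G or U), optional trailing '*'.
TOKEN = re.compile(r"([mr])([ACGU])(\*?)")


def parse_modified_rna(notation: str) -> list[tuple[str, str, str]]:
    """Split a string such as 'mC*mA*rG' into (prefix, base, star) tokens."""
    compact = "".join(notation.split())  # tolerate line breaks inside an entry
    tokens = TOKEN.findall(compact)
    # Reject input containing anything the token pattern did not consume.
    if "".join(p + b + s for p, b, s in tokens) != compact:
        raise ValueError("unrecognised characters in modified-RNA notation")
    return tokens


def plain_sequence(notation: str) -> str:
    """Return only the bases, dropping prefixes and linkage marks."""
    return "".join(base for _, base, _ in parse_modified_rna(notation))


five_prime = plain_sequence("mC*mA*mU*rGrGrUrGrCrArC")
three_prime = plain_sequence("rArCrC*mA*mU*mG")
print(five_prime)   # CAUGGUGCAC
print(three_prime)  # ACCAUG
```

Because whitespace is removed before parsing, a table entry that wraps across several lines can be passed in as a single joined string.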









TABLE E







HBB5 and HBB8 sequences for installing the mutation. The columns indicate, from left to right: 1) name of the HBB5 template RNA, 2) full HBB5 template RNA sequence depicted as RNA, further showing chemical modifications as used in Example 3, 3) observed activity of the template RNA of column 2 as defined in Example 3, 4) name of the HBB8 template RNA, 5) full HBB8 template RNA sequence depicted as RNA, further showing chemical modifications as used in Example 3, 6) observed activity of the template RNA of column 5 as defined in Example 3.
















Name
Template Sequence
SEQ ID NO
Activity
Name
Template Sequence
SEQ ID NO
Activity





HBB
mC*mA*mU*rGrGrUrGrCrArCr
20637
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20727
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrUrGrCr



UrGrGrArGrArArGrUrCrUrGrCr





ArCrC*mA*mU*mG



CrGrU*mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20638
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20728
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrUrGrCr



UrGrGrArGrArArGrUrCrUrGrCr





ArC*mC*mA*mU



CrG*mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20639
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20729
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrUrGrCr



UrGrGrArGrArArGrUrCrUrGrCr





A*mC*mC*mA



C*mG*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20640
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20730
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrUrGrC



UrGrGrArGrArArGrUrCrUrGrC





*mA*mC*mC



*mC*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20641
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20731
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrUrG*m



UrGrGrArGrArArGrUrCrUrG*m





C*mA*mC



C*mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20642
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20732
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrGrU*mG*



UrGrGrArGrArArGrUrCrU*mG*





mC*mA



mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20643
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20733
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArGrG*mU*m



UrGrGrArGrArArGrUrC*mU*m





G*mC



G*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20644
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20734
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrArG*mG*mU*



UrGrGrArGrArArGrU*mC*mU*





mG



mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20645
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20735
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrCrA*mG*mG*m



UrGrGrArGrArArG*mU*mC*m





U



U







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20646
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20736
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT2
UrUrArGrArGrCrUrArGrArArAr


_RT2
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArArCrGrGr



GrUrCrGrGrUrGrCrUrGrGrUrGr





CrArGrArCrUrUrCrUrCrUrArCr



CrArCrCrUrGrArCrUrCrCrUrGr





ArGrGrArGrUrC*mA*mG*mG



UrGrGrArGrArA*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20647
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20737
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrUrGrCrAr



GrGrArGrArArGrUrCrUrGrCrCr





CrC*mA*mU*mG



GrU*mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20648
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20738
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrUrGrCrAr



GrGrArGrArArGrUrCrUrGrCrCr





C*mC*mA*mU



G*mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20649
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20739
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrUrGrCrA



GrGrArGrArArGrUrCrUrGrCrC*





*mC*mC*mA



mG*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20650
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20740
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrUrGrC*m



GrGrArGrArArGrUrCrUrGrC*m





A*mC*mC



C*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20651
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20741
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrUrG*mC*



GrGrArGrArArGrUrCrUrG*mC*





mA*mC



mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20652
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20742
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrGrU*mG*m



GrGrArGrArArGrUrCrU*mG*m





C*mA



C*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20653
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20743
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArGrG*mU*mG*



GrGrArGrArArGrUrC*mU*mG*





mC



mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20654
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20744
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrArG*mG*mU*m



GrGrArGrArArGrU*mC*mU*m





G



G







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20655
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20745
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrCrA*mG*mG*mU



GrGrArGrArArG*mU*mC*mU




HBB
mC*mA*mU*rGrGrUrGrCrArCr
20656
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20746
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




9_PB
UrArGrCrArArGrUrUrArArArAr


9_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrGrGrCr



GrUrCrGrGrUrGrCrGrGrUrGrCr





ArGrArCrUrUrCrUrCrUrArCrAr



ArCrCrUrGrArCrUrCrCrUrGrUr





GrGrArGrUrC*mA*mG*mG



GrGrArGrArA*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20657
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20747
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrUrGrCrArCrC*



GrArGrArArGrUrCrUrGrCrCrGr





mA*mU*mG



U*mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20658
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20748
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrUrGrCrArC*m



GrArGrArArGrUrCrUrGrCrCrG*





C*mA*mU



mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20659
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20749
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrUrGrCrA*mC*



GrArGrArArGrUrCrUrGrCrC*m





mC*mA



G*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20660
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20750
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrUrGrC*mA*m



GrArGrArArGrUrCrUrGrC*mC*





C*mC



mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20661
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20751
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrUrG*mC*mA*



GrArGrArArGrUrCrUrG*mC*m





mC



C*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20662
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20752
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrGrU*mG*mC*m



GrArGrArArGrUrCrU*mG*mC*





A



mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20663
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20753
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArGrG*mU*mG*mC



GrArGrArArGrUrC*mU*mG*m









C







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20664
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20754
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrArG*mG*mU*mG



GrArGrArArGrU*mC*mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20665
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20755
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrCrA*mG*mG*mU



GrArGrArArG*mU*mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20666
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20756
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




7_PB
UrArGrCrArArGrUrUrArArArAr


8_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrGrCrArGr



GrUrCrGrGrUrGrCrGrUrGrCrAr





ArCrUrUrCrUrCrUrArCrArGrGr



CrCrUrGrArCrUrCrCrUrGrUrGr





ArGrUrC*mA*mG*mG



GrArGrArA*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20667
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20757
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrUrGrCrArCrC*m



ArGrArArGrUrCrUrGrCrCrGrU*





A*mU*mG



mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20668
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20758
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrUrGrCrArC*mC*



ArGrArArGrUrCrUrGrCrCrG*m





mA*mU



U*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20669
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20759
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrUrGrCrA*mC*m



ArGrArArGrUrCrUrGrCrC*mG*





C*mA



mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20670
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20760
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrUrGrC*mA*mC*



ArGrArArGrUrCrUrGrC*mC*m





mC



G*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20671
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20761
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrUrG*mC*mA*m



ArGrArArGrUrCrUrG*mC*mC*





C



mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20672
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20762
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrGrU*mG*mC*mA



ArGrArArGrUrCrU*mG*mC*m









C







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20673
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20763
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArGrG*mU*mG*mC



ArGrArArGrUrC*mU*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20674
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20764
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrArG*mG*mU*mG



ArGrArArGrU*mC*mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20675
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20765
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrCrA*mG*mG*mU



ArGrArArG*mU*mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20676
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20766
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




6_PB
UrArGrCrArArGrUrUrArArArAr


7_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrCrArGrAr



GrUrCrGrGrUrGrCrUrGrCrArCr





CrUrUrCrUrCrUrArCrArGrGrAr



CrUrGrArCrUrCrCrUrGrUrGrGr





GrUrC*mA*mG*mG



ArGrArA*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20677
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20767
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrUrGrCrArCrC*mA*m



GrArArGrUrCrUrGrCrCrGrU*m





U*mG



U*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20678
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20768
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrUrGrCrArC*mC*mA*



GrArArGrUrCrUrGrCrCrG*mU*





mU



mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20679
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20769
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrUrGrCrA*mC*mC*m



GrArArGrUrCrUrGrCrC*mG*m





A



U*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20680
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20770
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrUrGrC*mA*mC*mC



GrArArGrUrCrUrGrC*mC*mG*









mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20681
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20771
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrUrG*mC*mA*mC



GrArArGrUrCrUrG*mC*mC*m







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20682
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20772
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrGrU*mG*mC*mA



GrArArGrUrCrU*mG*mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20683
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20773
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArGrG*mU*mG*mC



GrArArGrUrC*mU*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20684
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20774
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrArG*mG*mU*mG



GrArArGrU*mC*mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20685
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20775
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





CrA*mG*mG*mU



GrArArG*mU*mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20686
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20776
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




4_PB
UrArGrCrArArGrUrUrArArArAr


6_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArGrArCrUr



GrUrCrGrGrUrGrCrGrCrArCrCr





UrCrUrCrUrArCrArGrGrArGrUr



UrGrArCrUrCrCrUrGrUrGrGrAr





C*mA*mG*mG



GrArA*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20687
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20777
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr





UrArGrCrArArGrUrUrArArArAr



UrArGrCrArArGrUrUrArArArAr




3_PB
UrArArGrGrCrUrArGrUrCrCrGr


4_PB
UrArArGrGrCrUrArGrUrCrCrGr




S17
UrUrArUrCrArArCrUrUrGrArAr


S17
UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrUrGrCrArCrC*mA*mU*



ArGrUrCrUrGrCrCrGrU*mU*m





mG



A*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20688
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20778
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrUrGrCrArC*mC*mA*m



ArGrUrCrUrGrCrCrG*mU*mU*





U



mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20689
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20779
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrUrGrCrA*mC*mC*mA



ArGrUrCrUrGrCrC*mG*mU*m









U







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20690
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20780
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrUrGrC*mA*mC*mC



ArGrUrCrUrGrC*mC*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20691
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20781
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrUrG*mC*mA*mC



ArGrUrCrUrG*mC*mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20692
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20782
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrGrU*mG*mC*mA



ArGrUrCrU*mG*mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20693
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20783
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArGrG*mU*mG*mC



ArGrUrC*mU*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20694
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20784
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





ArG*mG*mU*mG



ArGrU*mC*mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20695
+++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20785
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrCr



ArCrUrCrCrUrGrUrGrGrArGrAr





A*mG*mG*mU



ArG*mU*mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20696
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20786
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




3_PB
UrArGrCrArArGrUrUrArArArAr


4_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrGrArCrUrUr



GrUrCrGrGrUrGrCrArCrCrUrGr





CrUrCrUrArCrArGrGrArGrUrC*



ArCrUrCrCrUrGrUrGrGrArGrAr





mA*mG*mG



A*mG*mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20697
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20787
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrUrGrCrArCrC*mA*mU*m



CrUrGrCrCrGrU*mU*mA*mC





G








HBB
mC*mA*mU*rGrGrUrGrCrArCr
20698
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20788
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrUrGrCrArC*mC*mA*mU



CrUrGrCrCrG*mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20699
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20789
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrUrGrCrA*mC*mC*mA



CrUrGrCrC*mG*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20700
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20790
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrUrGrC*mA*mC*mC



CrUrGrC*mC*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20701
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20791
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrUrG*mC*mA*mC



CrUrG*mC*mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20702
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20792
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrGrU*mG*mC*mA



CrU*mG*mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20703
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20793
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrUr





GrG*mU*mG*mC



C*mU*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20704
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20794
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrAr



CrCrUrGrUrGrGrArGrArArGrU





G*mG*mU*mG



*mC*mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20705
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20795
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrCrA*



CrCrUrGrUrGrGrArGrArArG*m





mG*mG*mU



U*mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20706
++
HBB
mG*mU*mA*rArCrGrGrCrArGr
20796
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




2_PB
UrArGrCrArArGrUrUrArArArAr


1_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrArCrUrUrCr



GrUrCrGrGrUrGrCrUrGrArCrUr





UrCrUrArCrArGrGrArGrUrC*m



CrCrUrGrUrGrGrArGrArA*mG*





A*mG*mG



mU*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20707
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20797
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S17
UrArArGrGrCrUrArGrUrCrCrGr


S17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





UrGrCrArCrC*mA*mU*mG



UrGrCrCrGrU*mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20708
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20798
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S16
UrArArGrGrCrUrArGrUrCrCrGr


S16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





UrGrCrArC*mC*mA*mU



UrGrCrCrG*mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20709
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20799
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S15
UrArArGrGrCrUrArGrUrCrCrGr


S15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





UrGrCrA*mC*mC*mA



UrGrCrC*mG*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20710
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20800
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S14
UrArArGrGrCrUrArGrUrCrCrGr


S14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





UrGrC*mA*mC*mC



UrGrC*mC*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20711
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20801
++


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S13
UrArArGrGrCrUrArGrUrCrCrGr


S13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





UrG*mC*mA*mC



UrG*mC*mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20712
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20802
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S12
UrArArGrGrCrUrArGrUrCrCrGr


S12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrGr



CrUrGrUrGrGrArGrArArGrUrCr





U*mG*mC*mA



U*mG*mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20713
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20803
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S11
UrArArGrGrCrUrArGrUrCrCrGr


S11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArGrG



CrUrGrUrGrGrArGrArArGrUrC





*mU*mG*mC



*mU*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20714
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20804
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S10
UrArArGrGrCrUrArGrUrCrCrGr


S10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrArG*m



CrUrGrUrGrGrArGrArArGrU*m





G*mU*mG



C*mU*mG




HBB
mC*mA*mU*rGrGrUrGrCrArCr
20715
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20805
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S9
UrArArGrGrCrUrArGrUrCrCrGr


S9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrCrA*mG*



CrUrGrUrGrGrArGrArArG*mU*





mG*mU



mC*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20716
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20806
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT1
UrUrArGrArGrCrUrArGrArArAr


_RT1
UrUrArGrArGrCrUrArGrArArAr




0_PB
UrArGrCrArArGrUrUrArArArAr


0_PB
UrArGrCrArArGrUrUrArArArAr




S8
UrArArGrGrCrUrArGrUrCrCrGr


S8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrUrCrUrCr



GrUrCrGrGrUrGrCrGrArCrUrCr





UrArCrArGrGrArGrUrC*mA*m



CrUrGrUrGrGrArGrArA*mG*m





G*mG



U*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20717
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20807
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




17
UrArArGrGrCrUrArGrUrCrCrGr


17
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrUr



UrGrUrGrGrArGrArArGrUrCrUr





GrCrArCrC*mA*mU*mG



GrCrCrGrU*mU*mA*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20718
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20808
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




16
UrArArGrGrCrUrArGrUrCrCrGr


16
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrUr



UrGrUrGrGrArGrArArGrUrCrUr





GrCrArC*mC*mA*mU



GrCrCrG*mU*mU*mA







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20719
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20809
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




15
UrArArGrGrCrUrArGrUrCrCrGr


15
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrUr



UrGrUrGrGrArGrArArGrUrCrUr





GrCrA*mC*mC*mA



GrCrC*mG*mU*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20720
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20810
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




14
UrArArGrGrCrUrArGrUrCrCrGr


14
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrUr



UrGrUrGrGrArGrArArGrUrCrUr





GrC*mA*mC*mC



GrC*mC*mG*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20721
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20811
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




13
UrArArGrGrCrUrArGrUrCrCrGr


13
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrUr



UrGrUrGrGrArGrArArGrUrCrUr





G*mC*mA*mC



G*mC*mC*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20722
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20812
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




12
UrArArGrGrCrUrArGrUrCrCrGr


12
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrGrU



UrGrUrGrGrArGrArArGrUrCrU





*mG*mC*mA



*mG*mC*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20723
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20813
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




11
UrArArGrGrCrUrArGrUrCrCrGr


11
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArGrG*m



UrGrUrGrGrArGrArArGrUrC*m





U*mG*mC



U*mG*mC







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20724
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20814



5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




10
UrArArGrGrCrUrArGrUrCrCrGr


10
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrArG*mG*



UrGrUrGrGrArGrArArGrU*mC*





mU*mG



mU*mG







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20725
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20815
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




9
UrArArGrGrCrUrArGrUrCrCrGr


9
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrCrA*mG*m



UrGrUrGrGrArGrArArG*mU*m





G*mU



C*mU







HBB
mC*mA*mU*rGrGrUrGrCrArCr
20726
+
HBB
mG*mU*mA*rArCrGrGrCrArGr
20816
+


5_inst
CrUrGrArCrUrCrCrUrGrGrUrUr


8_inst
ArCrUrUrCrUrCrCrUrCrGrUrUr




_RT9
UrUrArGrArGrCrUrArGrArArAr


_RT9
UrUrArGrArGrCrUrArGrArArAr




_PBS
UrArGrCrArArGrUrUrArArArAr


_PBS
UrArGrCrArArGrUrUrArArArAr




8
UrArArGrGrCrUrArGrUrCrCrGr


8
UrArArGrGrCrUrArGrUrCrCrGr





UrUrArUrCrArArCrUrUrGrArAr



UrUrArUrCrArArCrUrUrGrArAr





ArArArGrUrGrGrCrArCrCrGrAr



ArArArGrUrGrGrCrArCrCrGrAr





GrUrCrGrGrUrGrCrUrCrUrCrUr



GrUrCrGrGrUrGrCrArCrUrCrCr





ArCrArGrGrArGrUrC*mA*mG*



UrGrUrGrGrArGrArA*mG*mU*





mG



mC









Example 4: Screening Configurations of Template RNAs that Correct the SCD Mutation in a Genomic Landing Pad in Human Cells

This example describes the use of a gene modifying system containing a gene modifying polypeptide and template RNAs comprising varied lengths of heterologous object sequences and PBS sequences to identify favorable configurations for correction of the SCD mutation. In this example, a template RNA contains:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


The template RNAs were designed to contain 8-17 nt PBS sequences and 9-20 nt heterologous object sequences (Tables A and B). Two different gRNA spacer sequences, designated HBB5 (see Table A) and HBB8 (see Table B), were used to target sites proximal to the SCD mutation in the custom genomic landing pad in human cells. The heterologous object sequences and PBS sequences were designed to correct the SCD mutation in the landing pad by replacing a “T” nucleotide with an “A” nucleotide at the mutation site using a gene modifying system described herein.
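The four-part layout listed above can be illustrated with a minimal sketch that joins the elements 5' to 3' in the listed order. The component sequences are copied from the HBB5_RT20_PBS17 entry of Table A (DNA notation); the function name `assemble_template` is a hypothetical helper introduced only for illustration and is not part of the disclosure.

```python
# Components copied from the HBB5_RT20_PBS17 entry of Table A (DNA notation).
SPACER = "CATGGTGCACCTGACTCCTG"            # gRNA spacer, 20 nt
SCAFFOLD = (                               # SpCas9 gRNA scaffold, 76 nt
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCC"
    "GTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)
RT_TEMPLATE = "AACGGCAGACTTCTCTTCAG"       # heterologous object sequence, 20 nt
PBS = "GAGTCAGGTGCACCATG"                  # primer binding site, 17 nt


def assemble_template(spacer, scaffold, rt_template, pbs):
    """Join the four elements 5' to 3' in the order listed in this example."""
    return spacer + scaffold + rt_template + pbs


template = assemble_template(SPACER, SCAFFOLD, RT_TEMPLATE, PBS)
print(len(template))
print(template)
```

Shorter PBS variants in the screen can be obtained in the same sketch by passing a 3'-truncated PBS, for example `PBS[:8]` for the 8 nt configuration.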


A cell line was created to have a “landing pad”, that is, a stable integration that mimics a region of the HBB gene containing the sequences flanking the SCD mutation site. The DNA for the landing pad was chemically synthesized and cloned into the pLenti-N-tGFP vector. Cloning of the landing pad into the lentiviral expression vector was confirmed, and the landing pad sequence was verified by Sanger sequencing. The sequence-verified plasmids (9 μg), along with the lentiviral packaging mix (9 μg, obtained from Biosettia), were transfected using Lipofectamine 2000™ according to the manufacturer's instructions into a packaging cell line, LentiX-293T (Takara Bio). The transfected cells were incubated at 37° C., 5% CO2 for 48 hours (including one medium change at 24 hours), and the viral particle-containing medium was collected from the cell culture dish. The collected medium was filtered through a 0.2 μm filter to remove cell debris and prepared for transduction of HEK293T cells. The virus-containing medium was diluted in DMEM and mixed with polybrene to prepare a dilution series for transduction of HEK293T cells, where the final concentration of polybrene was 8 μg/ml. The HEK293T cells were grown in virus-containing medium for 48 hours and then split with fresh medium. The split cells were grown to confluence, and the transduction efficiency of the different dilutions of virus was measured by GFP expression via flow cytometry and by ddPCR detection of the genomically integrated lentivirus that contained GFP and the HBB landing pads.


A gene modifying system comprising a gene modifying polypeptide (see Table C) and a template RNA was transfected into the HEK293T landing pad cell line. The gene modifying polypeptide and the template RNAs were delivered by nucleofection in RNA format. Specifically, 1 μg of gene modifying polypeptide mRNA was combined with 10 μM template RNAs. The mRNA and template RNAs were added to 25 μL SF buffer containing 250,000 HEK293T landing pad cells, and the cells were nucleofected using program DS-150. After nucleofection, cells were grown at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the HBB genomic target site were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. High editing efficiency was detected in the configurations with a 9-12 nt PBS sequence and a 12-14 nt heterologous object sequence; the gene editing activity is shown in Tables A and B. In particular, in Tables A and B, “+” indicates an editing frequency of <3%, “++” indicates an editing frequency of 3-7%, and “+++” indicates an editing frequency of >=7%.
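The activity symbols used in Tables A and B can be expressed as a small binning function. This is an illustrative sketch only: the function name `activity_symbol` is hypothetical, and because the stated ranges “3-7%” and “>=7%” meet at exactly 7%, the sketch assumes that a frequency of exactly 7% is reported as “+++”.

```python
def activity_symbol(editing_percent: float) -> str:
    """Map an editing frequency (in percent) to the symbol used in Tables A and B.

    Assumption: exactly 7% is binned as "+++", since the text gives ">=7%"
    for that symbol; the boundary is otherwise ambiguous in the source.
    """
    if editing_percent < 0:
        raise ValueError("editing frequency cannot be negative")
    if editing_percent < 3.0:
        return "+"
    if editing_percent < 7.0:
        return "++"
    return "+++"


for value in (1.2, 3.0, 6.9, 7.0, 25.0):
    print(value, activity_symbol(value))
```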


It is understood that the template RNA sequences shown in Tables A and B may be customized depending on the cell being targeted. For example, HEK293T cells have a SNP in the HBB gene (NC_000011.10: g.5227013A>G (T>C in HBB coding strand) relative to human hg38 reference genome), and thus the template RNA sequences shown in Tables A and B are suitable for use in a cell with that SNP. Template RNAs suitable for use in a cell with a different sequence at that SNP position (“no SNP”) may utilize the sequences below, wherein capital letters indicate core sequences, lower case letters indicate flanking sequences, and underlining indicates the mutation region. Similarly, in some embodiments it is desired to inactivate a PAM sequence upon editing (“PAM-kill”), and in other embodiments it is preferred to leave the PAM sequence intact (“no PAM-kill”). The RT template can be designed as a “PAM-kill” or “no PAM-kill” version, for example, as shown below.











HBB5 Spacer (no SNP): (SEQ ID NO: 21668) CATGGTGCATCTGACTCCTG

HBB5_PBS (no SNP): (SEQ ID NO: 21669) GAGTCAGAtgcaccatg

HBB5 RT template (no PAM-kill): (SEQ ID NO: 21670) aacggcagactTCTCGTCAG

HBB8 RT template (no SNP): (SEQ ID NO: 21671) tggtgcatctgACTCCTGAG
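As a quick consistency check on the “no SNP” sequences, the sketch below locates the single position at which the HBB5 spacer of Table A (HEK293T SNP) differs from SEQ ID NO: 21668, and confirms that, for these particular sequences, the 17 nt PBS of SEQ ID NO: 21669 equals the reverse complement of the first 17 nt of the no-SNP spacer. The helper names `reverse_complement` and `mismatches` are hypothetical and introduced only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (case is ignored)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str):
    """List (1-based position, base in a, base in b) where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


SPACER_SNP = "CATGGTGCACCTGACTCCTG"     # HBB5 spacer in Table A (HEK293T SNP)
SPACER_NO_SNP = "CATGGTGCATCTGACTCCTG"  # SEQ ID NO: 21668
PBS_NO_SNP = "GAGTCAGAtgcaccatg"        # SEQ ID NO: 21669

print(mismatches(SPACER_SNP, SPACER_NO_SNP))
print(reverse_complement(SPACER_NO_SNP[:17]) == PBS_NO_SNP.upper())
```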













TABLE A







HBB5 Sequences. The columns indicate, from left to right: 1) Name of the template RNA, 2) gRNA spacer sequence of the template RNA, which contains a SNP relative to hg38 that is present in HEK293T cells, 3) SpCas9 gRNA scaffold sequence of the template RNA, 4) PBS sequence of the template RNA, which contains a SNP relative to hg38 that is present in HEK293T cells, 5) RT template sequence of the template RNA, wherein the PAM-kill mutation is bolded and the mutation region is underlined, 6) full template RNA sequence comprising HEK293T SNP and PAM-kill edit, 7) full template RNA sequence depicted as RNA corresponding to column 6, further showing chemical modifications as used in Example 4, 8) alternative template RNA sequence designed relative to hg38 reference genome (lacking HEK293T SNP) and comprising PAM-kill edit, 9) alternative template RNA sequence designed relative to hg38 reference genome (lacking HEK293T SNP) and lacking PAM-kill edit, and 10) observed activity of template RNA of column 7 as defined in Example 4.
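The relationship between columns 6 and 7 can be sketched by parsing the modified-RNA notation back into a plain sequence. The reading of the notation used here is an assumption, since it is not defined in this section: “r” is taken as an unmodified ribonucleotide, “m” as a 2'-O-methyl nucleotide, and “*” as a phosphorothioate linkage following the nucleotide. The string is the column 7 entry for HBB5_RT20_PBS17, and `parse_modified_rna` is a hypothetical helper.

```python
import re

# Column 7 entry for HBB5_RT20_PBS17 (SEQ ID NO: 19701), split only for line length.
MODIFIED_RNA = (
    "mC*mA*mU*rGrGr" "UrGrCrArCrCrUrGr" "ArCrUrCrCrUrGrGr"
    "UrUrUrUrArGrArGr" "CrUrArGrArArArUr" "ArGrCrArArGrUrUr"
    "ArArArArUrArArGr" "GrCrUrArGrUrCrCr" "GrUrUrArUrCrArAr"
    "CrUrUrGrArArArAr" "ArGrUrGrGrCrArCr" "CrGrArGrUrCrGrGr"
    "UrGrCrArArCrGrGr" "CrArGrArCrUrUrCr" "UrCrUrUrCrArGrGr"
    "ArGrUrCrArGrGrUr" "GrCrArCrC*mA*mU" "*mG"
)

# One token = sugar prefix (m or r) + base + optional "*" after the base.
TOKEN = re.compile(r"([mr])([ACGU])(\*?)")


def parse_modified_rna(text):
    """Return (prefix, base, linkage) tuples; fail if any character is left over."""
    tokens = TOKEN.findall(text)
    if "".join("".join(token) for token in tokens) != text:
        raise ValueError("unrecognized notation in input")
    return tokens


tokens = parse_modified_rna(MODIFIED_RNA)
plain_rna = "".join(base for _, base, _ in tokens)
n_methyl = sum(1 for prefix, _, _ in tokens if prefix == "m")
n_starred = sum(1 for _, _, link in tokens if link == "*")
print(plain_rna)
print(len(plain_rna), n_methyl, n_starred)
```

Replacing U with T in `plain_rna` gives a string that can be compared directly with the column 6 entry of the same row.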































Name | Spacer | SEQ ID NO | gRNA Scaffold | SEQ ID NO | PBS | SEQ ID NO | RT Template: (PAM-kill; Correction) | SEQ ID NO | Template Sequence (+SNP +PAM-kill) | SEQ ID NO | Template Sequence (+SNP +PAM-kill) (RNA) | SEQ ID NO | Template Sequence (no SNP +PAM-kill) | SEQ ID NO | Template Sequence (no SNP no PAM-kill) | SEQ ID NO | Activity
HBB5_RT20_PBS17 | CATGGTGCACCTGACTCCTG | 19251 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19341 | GAGTCAGGTGCACCATG | 19431 | AACGGCAGACTTCTCTTCAG | 19521 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCATG | 19611 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrArCrC*mA*mU*mG | 19701 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCACCATG | 19791 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCACCATG | 19881 | ++
HBB5_RT20_PBS16 | CATGGTGCACCTGACTCCTG | 19252 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19342 | GAGTCAGGTGCACCAT | 19432 | AACGGCAGACTTCTCTTCAG | 19522 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCAT | 19612 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrArC*mC*mA*mU | 19702 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCACCAT | 19792 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCACCAT | 19882 | +++
HBB5_RT20_PBS15 | CATGGTGCACCTGACTCCTG | 19253 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19343 | GAGTCAGGTGCACCA | 19433 | AACGGCAGACTTCTCTTCAG | 19523 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCA | 19613 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrA*mC*mC*mA | 19703 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCACCA | 19793 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCACCA | 19883 | +++
HBB5_RT20_PBS14 | CATGGTGCACCTGACTCCTG | 19254 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19344 | GAGTCAGGTGCACC | 19434 | AACGGCAGACTTCTCTTCAG | 19524 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCACC | 19614 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrC*mA*mC*mC | 19704 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCACC | 19794 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCACC | 19884 | +++
HBB5_RT20_PBS13 | CATGGTGCACCTGACTCCTG | 19255 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19345 | GAGTCAGGTGCAC | 19435 | AACGGCAGACTTCTCTTCAG | 19525 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCAC | 19615 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrG*mC*mA*mC | 19705 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCAC | 19795 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCAC | 19885 | +++
HBB5_RT20_PBS12 | CATGGTGCACCTGACTCCTG | 19256 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19346 | GAGTCAGGTGCA | 19436 | AACGGCAGACTTCTCTTCAG | 19526 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGCA | 19616 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrU*mG*mC*mA | 19706 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGCA | 19796 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGCA | 19886 | +++
HBB5_RT20_PBS11 | CATGGTGCACCTGACTCCTG | 19257 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19347 | GAGTCAGGTGC | 19437 | AACGGCAGACTTCTCTTCAG | 19527 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTGC | 19617 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 19707 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATGC | 19797 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATGC | 19887 | +++
HBB5_RT20_PBS10 | CATGGTGCACCTGACTCCTG | 19258 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19348 | GAGTCAGGTG | 19438 | AACGGCAGACTTCTCTTCAG | 19528 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGTG | 19618 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArG*mG*mU*mG | 19708 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGATG | 19798 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGATG | 19888 | +++
HBB5_RT20_PBS9 | CATGGTGCACCTGACTCCTG | 19259 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19349 | GAGTCAGGT | | AACGGCAGACTTCTCTTCAG | 19529 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGGT | 19619 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrA*mG*mG*mU | 19709 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGAT | 19799 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGAT | 19889 | +++
HBB5_RT20_PBS8 | CATGGTGCACCTGACTCCTG | 19260 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19350 | GAGTCAGG | | AACGGCAGACTTCTCTTCAG | 19530 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGG | 19620 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrC*mA*mG*mG | 19710 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCTTCAGGAGTCAGA | 19800 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACGGCAGACTTCTCGTCAGGAGTCAGA | 19890 | +++
HBB5_RT19_PBS17 | CATGGTGCACCTGACTCCTG | 19261 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19351 | GAGTCAGGTGCACCATG | 19441 | ACGGCAGACTTCTCTTCAG | 19531 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCATG | 19621 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrArCrC*mA*mU*mG | 19711 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGATGCACCATG | 19801 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCGTCAGGAGTCAGATGCACCATG | 19891 | +++
HBB5_RT19_PBS16 | CATGGTGCACCTGACTCCTG | 19262 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19352 | GAGTCAGGTGCACCAT | 19442 | ACGGCAGACTTCTCTTCAG | 19532 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCAT | 19622 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrArC*mC*mA*mU | 19712 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGATGCACCAT | 19802 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCGTCAGGAGTCAGATGCACCAT | 19892 | +++
HBB5_RT19_PBS15 | CATGGTGCACCTGACTCCTG | 19263 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 19353 | GAGTCAGGTGCACCA | 19443 | ACGGCAGACTTCTCTTCAG | 19533 | CATGGTGCACCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGGTGCACCA | 19623 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrArCrGrGrCrArGrArCrUrUrCrUrCrUrUrCrArGrGrArGrUrCrArGrGrUrGrCrA*mC*mC*mA | 19713 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCTTCAGGAGTCAGATGCACCA | 19803 | CATGGTGCATCTGACTCCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCACGGCAGACTTCTCGTCAGGAGTCAGATGCACCA | 19893 | +++
HBB
CAT
19264
GTTT
19354
GAGT
19444
ACGGC
19534
CATGGTGCACCT
19624
mC*mA*mU*rGrGr
19714
CATGGTG
19804
CATGG
19894
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

TGCA

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC


T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGTGCACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrArGrGrUrGr

GGCAGA

AAAGT







GGC







C*mA*mC*mC

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG









GCACC

CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGATG



















CACC







HBB
CAT
19265
GTTT
19355
GAGT
19445
ACGGC
19535
CATGGTGCACCT
19625
mC*mA*mU*rGrGr
19715
CATGGTG
19805
CATGG
19895
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

TGCA

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C


T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGTGCAC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrArGrGrUrG*

GGCAGA

AAAGT







GGC







mC*mA*mC

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG









GCAC

CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGATG



















CAC







HBB
CAT
19266
GTTT
19356
GAGT
19446
ACGGC
19536
CATGGTGCACCT
19626
mC*mA*mU*rGrGr
19716
CATGGTG
19806
CATGG
19896
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

TGCA

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
AC

GAA




T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGTGCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrArGrGrU*m

GGCAGA

AAAGT







GGC







G*mC*mA

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG









GCA

CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGATG



















CA




HBB
CAT
19267
GTTT
19357
GAGT
19447
ACGGC
19537
CATGGTGCACCT
19627
mC*mA*mU*rGrGr
19717
CATGGTG
19807
CATGG
19897
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

TGC

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA




T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGTGC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrArGrG*mU*

GGCAGA

AAAGT







GGC







mG*mC

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG









GC

CACGG



















CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGATG



















C

















C
























HBB
CAT
19268
GTTT
19358
GAGT
19448
ACGGC
19538
CATGGTGCACCT
19628
mC*mA*mU*rGrGr
19718
CATGGTG
19808
CATGG
19898
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

TG

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA




T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGTG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrArG*mG*mU

GGCAGA

AAAGT







GGC







*mG

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG









G

CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGATG







HBB
CAT
19269
GTTT
19359
GAGT
19449
ACGGC
19539
CATGGTGCACCT
19629
mC*mA*mU*rGrGr
19719
CATGGTG
19809
CATGG
19899
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA

T

TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA




T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGGT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrCrA*mG*mG*

GGCAGA

AAAGT







GGC







mU

CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGAT

CGGTG







GGTG











CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGAT







HBB
CAT
19270
GTTT
19360
GAGT
19450
ACGGC
19540
CATGGTGCACCT
19630
mC*mA*mU*rGrGr
19720
CATGGTG
19810
CATGG
19900
+++


5_RT
GG

TAGA

CAGG

AGACT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




19_P
TGC

GCTA



TCTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA




T
TCAG


TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACGGCAGACTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTCTTCAGGAGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAGG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrGrGrCr

TGGCACC

CGTTA







CTTG







ArGrArCrUrUrCrUr

GAGTCG

TCAAC







AAA







CrUrUrCrArGrGrAr

GTGCAC

TTGAA







AAGT







GrUrC*mA*mG*mG

GGCAGA

AAAGT







GGC









CTTCTCT

GGCAC







ACCG









TCAGGA

CGAGT







AGTC









GTCAGA

CGGTG







GGTG











CACGG







C











CAGAC



















TTCTC



















GTCAG



















GAGTC



















AGA







HBB
CAT
19271
GTTT
19361
GAGT
19451
GGCAG
19541
CATGGTGCACCT
19631
mC*mA*mU*rGrGr
19721
CATGGTG
19811
CATGG
19901
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT

AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCACCATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrUrGrCrAr

CAGACTT

AAAGT







GGC







CrC*mA*mU*mG

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









ACCATG

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCATG







HBB
CAT
19272
GTTT
19362
GAGT
19452
GGCAG
19542
CATGGTGCACCT
19632
mC*mA*mU*rGrGr
19722
CATGGTG
19812
CATGG
19902
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT

AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCACCAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrUrGrCrAr

CAGACTT

AAAGT







GGC







C*mC*mA*mU

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









ACCAT

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCAT







HBB
CAT
19273
GTTT
19363
GAGT
19453
GGCAG
19543
CATGGTGCACCT
19633
mC*mA*mU*rGrGr
19723
CATGGTG
19813
CATGG
19903
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA

AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCACCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrUrGrCrA*

CAGACTT

AAAGT







GGC







mC*mC*mA

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









ACCA

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCA







HBB
CAT
19274
GTTT
19364
GAGT
19454
GGCAG
19544
CATGGTGCACCT
19634
mC*mA*mU*rGrGr
19724
CATGGTG
19814
CATGG
19904
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC

AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrUrGrC*mA

CAGACTT

AAAGT







GGC







*mC*mC

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









ACC

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA



















CC







HBB
CAT
19275
GTTT
19365
GAGT
19455
GGCAG
19545
CATGGTGCACCT
19635
mC*mA*mU*rGrGr
19725
CATGGTG
19815
CATGG
19905
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C

AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCAC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrUrG*mC*

CAGACTT

AAAGT







GGC







mA*mC

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









AC

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA



















C







HBB
CAT
19276
GTTT
19366
GAGT
19456
GGCAG
19546
CATGGTGCACCT
19636
mC*mA*mU*rGrGr
19726
CATGGTG
19816
CATGG
19906
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGCA

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
AC

GAA



AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrGrU*mG*mC

CAGACTT

AAAGT







GGC







*mA

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG









A

CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGCA







HBB
CAT
19277
GTTT
19367
GAGT
19457
GGCAG
19547
CATGGTGCACCT
19637
mC*mA*mU*rGrGr
19727
CATGGTG
19817
CATGG
19907
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TGC

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA



AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTGC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArGrG*mU*mG*

CAGACTT

AAAGT







GGC







mC

CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATGC

CGGTG







GGTG











CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATGC







HBB
CAT
19278
GTTT
19368
GAGT
19458
GGCAG
19548
CATGGTGCACCT
19638
mC*mA*mU*rGrGr
19728
CATGGTG
19818
CATGG
19908
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

TG

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA



AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGTG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrArG*mG*mU*mG

CAGACTT

AAAGT







GGC









CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGATG

CGGTG







GGTG











CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















ATG







HBB
CAT
19279
GTTT
19369
GAGT
19459
GGCAG
19549
CATGGTGCACCT
19639
mC*mA*mU*rGrGr
19729
CATGGTG
19819
CATGG
19909
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA

T

TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA



AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GGT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







CrA*mG*mG*mU

CAGACTT

AAAGT







GGC









CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGAT

CGGTG







GGTG











CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















AT







HBB
CAT
19280
GTTT
19370
GAGT
19460
GGCAG
19550
CATGGTGCACCT
19640
mC*mA*mU*rGrGr
19730
CATGGTG
19820
CATGG
19910
+++


5_RT
GG

TAGA

CAGG

ACTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




17_P
TGC

GCTA



TCTTC

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA



AG

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGGCAGACTTCT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CTTCAGGAGTCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrGrCrArGr

TGGCACC

CGTTA







CTTG







ArCrUrUrCrUrCrUr

GAGTCG

TCAAC







AAA







UrCrArGrGrArGrUr

GTGCGG

TTGAA







AAGT







C*mA*mG*mG

CAGACTT

AAAGT







GGC









CTCTTCA

GGCAC







ACCG









GGAGTC

CGAGT







AGTC









AGA

CGGTG







GGTG











CGGCA







C











GACTT



















CTCGT



















CAGGA



















GTCAG



















A







HBB
CAT
19281
GTTT
19371
GAGT
19461
GCAGA
19551
CATGGTGCACCT
19641
mC*mA*mU*rGrGr
19731
CATGGTG
19821
CATGG
19911
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT

G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCACCATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrUrGrCrArCrC*

AGACTTC

AAAGT







GGC







mA*mU*mG

TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG









CCATG

CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCAC



















CATG







HBB
CAT
19282
GTTT
19372
GAGT
19462
GCAGA
19552
CATGGTGCACCT
19642
mC*mA*mU*rGrGr
19732
CATGGTG
19822
CATGG
19912
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT

G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCACCAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrUrGrCrArC*mC

AGACTTC

AAAGT







GGC







*mA*mU

TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG









CCAT

CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCAC



















CAT







HBB
CAT
19283
GTTT
19373
GAGT
19463
GCAGA
19553
CATGGTGCACCT
19643
mC*mA*mU*rGrGr
19733
CATGGTG
19823
CATGG
19913
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA

G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCACCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrUrGrCrA*mC*

AGACTTC

AAAGT







GGC







mC*mA

TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG









CCA

CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCAC



















CA







HBB
CAT
19284
GTTT
19374
GAGT
19464
GCAGA
19554
CATGGTGCACCT
19644
mC*mA*mU*rGrGr
19734
CATGGTG
19824
CATGG
19914
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC

G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrUrGrC*mA*mC

AGACTTC

AAAGT







GGC







*mC

TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG









CC

CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCAC



















C







HBB
CAT
19285
GTTT
19375
GAGT
19465
GCAGA
19555
CATGGTGCACCT
19645
mC*mA*mU*rGrGr
19735
CATGGTG
19825
CATGG
19915
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C

G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCAC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrUrG*mC*mA*

AGACTTC

AAAGT







GGC







mC

TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG









C

CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCAC







HBB
CAT
19286
GTTT
19376
GAGT
19466
GCAGA
19556
CATGGTGCACCT
19646
mC*mA*mU*rGrGr
19736
CATGGTG
19826
CATGG
19916
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGCA

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
AC

GAA



G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrGrU*mG*mC*mA

AGACTTC

AAAGT







GGC









TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGCA

CGGTG







GGTG











CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGCA







HBB
CAT
19287
GTTT
19377
GAGT
19467
GCAGA
19557
CATGGTGCACCT
19647
mC*mA*mU*rGrGr
19737
CATGGTG
19827
CATGG
19917
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TGC

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA



G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTGC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







GrG*mU*mG*mC

AGACTTC

AAAGT







GGC









TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATGC

CGGTG







GGTG











CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TGC







HBB
CAT
19288
GTTT
19378
GAGT
19468
GCAGA
19558
CATGGTGCACCT
19648
mC*mA*mU*rGrGr
19738
CATGGTG
19828
CATGG
19918
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

TG

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA



G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GTG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrAr

GTGCGC

TTGAA







AAGT







G*mG*mU*mG

AGACTTC

AAAGT







GGC









TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GATG

CGGTG







GGTG











CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















TG







HBB
CAT
19289
GTTT
19379
GAGT
19469
GCAGA
19559
CATGGTGCACCT
19649
mC*mA*mU*rGrGr
19739
CATGGTG
19829
CATGG
19919
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA

T

CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA



G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrCrA*

GTGCGC

TTGAA







AAGT







mG*mG*mU

AGACTTC

AAAGT







GGC









TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GAT

CGGTG







GGTG











CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA



















T







HBB
CAT
19290
GTTT
19380
GAGT
19470
GCAGA
19560
CATGGTGCACCT
19650
mC*mA*mU*rGrGr
19740
CATGGTG
19830
CATGG
19920
+++


5_RT
GG

TAGA

CAGG

CTTCT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




16_P
TGC

GCTA



CTTCA

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA



G

TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGCAGACTTCTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





TTCAGGAGTCAG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





G

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrCrArGrAr

TGGCACC

CGTTA







CTTG







CrUrUrCrUrCrUrUrC

GAGTCG

TCAAC







AAA







rArGrGrArGrUrC*m

GTGCGC

TTGAA







AAGT







A*mG*mG

AGACTTC

AAAGT







GGC









TCTTCAG

GGCAC







ACCG









GAGTCA

CGAGT







AGTC









GA

CGGTG







GGTG











CGCAG







C











ACTTC



















TCGTC



















AGGAG



















TCAGA







HBB
CAT
19291
GTTT
19381
GAGT
19471
AGACT
19561
CATGGTGCACCT
19651
mC*mA*mU*rGrGr
19741
CATGGTG
19831
CATGG
19921
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCACCATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrUrGrCrArCrC*mA

ACTTCTC

AAAGT







GGC







*mU*mG

TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCAT

CGGTG







GGTG









G

CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CACCA



















TG







HBB
CAT
19292
GTTT
19382
GAGT
19472
AGACT
19562
CATGGTGCACCT
19652
mC*mA*mU*rGrGr
19742
CATGGTG
19832
CATGG
19922
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCACCAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrUrGrCrArC*mC*

ACTTCTC

AAAGT







GGC







mA*mU

TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCAT

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CACCA



















T







HBB
CAT
19293
GTTT
19383
GAGT
19473
AGACT
19563
CATGGTGCACCT
19653
mC*mA*mU*rGrGr
19743
CATGGTG
19833
CATGG
19923
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCACCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrUrGrCrA*mC*mC

ACTTCTC

AAAGT







GGC







*mA

TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCA

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CACCA







HBB
CAT
19294
GTTT
19384
GAGT
19474
AGACT
19564
CATGGTGCACCT
19654
mC*mA*mU*rGrGr
19744
CATGGTG
19834
CATGG
19924
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrUrGrC*mA*mC*

ACTTCTC

AAAGT







GGC







mC

TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACC

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CACC







HBB
CAT
19295
GTTT
19385
GAGT
19475
AGACT
19565
CATGGTGCACCT
19655
mC*mA*mU*rGrGr
19745
CATGGTG
19835
CATGG
19925
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCAC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrUrG*mC*mA*mC

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCAC

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CAC







HBB
CAT
19296
GTTT
19386
GAGT
19476
AGACT
19566
CATGGTGCACCT
19656
mC*mA*mU*rGrGr
19746
CATGGTG
19836
CATGG
19926
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGCA


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







GrU*mG*mC*mA

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCA

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















CA







HBB
CAT
19297
GTTT
19387
GAGT
19477
AGACT
19567
CATGGTGCACCT
19657
mC*mA*mU*rGrGr
19747
CATGGTG
19837
CATGG
19927
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TGC


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





GC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArGr

GTGCAG

TTGAA







AAGT







G*mU*mG*mC

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GC

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG



















C







HBB
CAT
19298
GTTT
19388
GAGT
19478
AGACT
19568
CATGGTGCACCT
19658
mC*mA*mU*rGrGr
19748
CATGGTG
19838
CATGG
19928
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

TG


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





G

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrArG*

GTGCAG

TTGAA







AAGT







mG*mU*mG

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









G

CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGATG







HBB
CAT
19299
GTTT
19389
GAGT

AGACT
19569
CATGGTGCACCT
19659
mC*mA*mU*rGrGr
19749
CATGGTG
19839
CATGG
19929
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA

T


T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrCrA*m

GTGCAG

TTGAA







AAGT







G*mG*mU

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC











CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGAT







HBB
CAT
19300
GTTT
19390
GAGT

AGACT
19570
CATGGTGCACCT
19660
mC*mA*mU*rGrGr
19750
CATGGTG
19840
CATGG
19930
+++


5_RT
GG

TAGA

CAGG

TCTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




14_P
TGC

GCTA




T
TCAG


TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CAGACTTCTCTT

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





CAGGAGTCAGG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArGrArCrUr

TGGCACC

CGTTA







CTTG







UrCrUrCrUrUrCrAr

GAGTCG

TCAAC







AAA







GrGrArGrUrC*mA*

GTGCAG

TTGAA







AAGT







mG*mG

ACTTCTC

AAAGT







GGC









TTCAGGA

GGCAC







ACCG









GTCAGA

CGAGT







AGTC











CGGTG







GGTG











CAGAC







C











TTCTC



















GTCAG



















GAGTC



















AGA







HBB
CAT
19301
GTTT
19391
GAGT
19481
GACTT
19571
CATGGTGCACCT
19661
mC*mA*mU*rGrGr
19751
CATGGTG
19841
CATGG
19931
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CACCATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







UrGrCrArCrC*mA*

CTTCTCT

AAAGT







GGC







mU*mG

TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCAT

CGGTG







GGTG









G

CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















ACCAT



















G







HBB
CAT
19302
GTTT
19392
GAGT
19482
GACTT
19572
CATGGTGCACCT
19662
mC*mA*mU*rGrGr
19752
CATGGTG
19842
CATGG
19932
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CACCAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







UrGrCrArC*mC*mA

CTTCTCT

AAAGT







GGC







*mU

TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCAT

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















ACCAT







HBB
CAT
19303
GTTT
19393
GAGT
19483
GACTT
19573
CATGGTGCACCT
19663
mC*mA*mU*rGrGr
19753
CATGGTG
19843
CATGG
19933
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CACCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







UrGrCrA*mC*mC*

CTTCTCT

AAAGT







GGC







mA

TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACCA

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















ACCA







HBB
CAT
19304
GTTT
19394
GAGT
19484
GACTT
19574
CATGGTGCACCT
19664
mC*mA*mU*rGrGr
19754
CATGGTG
19844
CATGG
19934
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







UrGrC*mA*mC*mC

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCACC

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















ACC







HBB
CAT
19305
GTTT
19395
GAGT
19485
GACTT
19575
CATGGTGCACCT
19665
mC*mA*mU*rGrGr
19755
CATGGTG
19845
CATGG
19935
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







UrG*mC*mA*mC

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCAC

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















AC







HBB
CAT
19306
GTTT
19396
GAGT
19486
GACTT
19576
CATGGTGCACCT
19666
mC*mA*mU*rGrGr
19756
CATGGTG
19846
CATGG
19936
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT





13_P
TGC

GCTA

TGCA

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrGr

GTGCGA

TTGAA







AAGT







U*mG*mC*mA

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GCA

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC



















A







HBB
CAT
19307
GTTT
19397
GAGT
19487
GACTT
19577
CATGGTGCACCT
19667
mC*mA*mU*rGrGr
19757
CATGGTG
19847
CATGG
19937
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TGC

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





C

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArGrG*

GTGCGA

TTGAA







AAGT







mU*mG*mC

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









GC

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATGC







HBB
CAT
19308
GTTT
19398
GAGT
19488
GACTT
19578
CATGGTGCACCT
19668
mC*mA*mU*rGrGr
19758
CATGGTG
19848
CATGG
19938
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

TG

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrArG*m

GTGCGA

TTGAA







AAGT







G*mU*mG

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC









G

CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GATG







HBB
CAT
19309
GTTT
19399
GAGT

GACTT
19579
CATGGTGCACCT
19669
mC*mA*mU*rGrGr
19759
CATGGTG
19849
CATGG
19939
+++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA

T

CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrCrA*mG*

GTGCGA

TTGAA







AAGT







mG*mU

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGAT

CGAGT







AGTC











CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GAT







HBB
CAT
19310
GTTT
19400
GAGT

GACTT
19580
CATGGTGCACCT
19670
mC*mA*mU*rGrGr
19760
CATGGTG
19850
CATGG
19940
++


5_RT
GG

TAGA

CAGG

CTCTT

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




13_P
TGC

GCTA



CAG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CGACTTCTCTTC

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGGAGTCAGG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrGrArCrUrUr

TGGCACC

CGTTA







CTTG







CrUrCrUrUrCrArGr

GAGTCG

TCAAC







AAA







GrArGrUrC*mA*mG

GTGCGA

TTGAA







AAGT







*mG

CTTCTCT

AAAGT







GGC









TCAGGA

GGCAC







ACCG









GTCAGA

CGAGT







AGTC











CGGTG







GGTG











CGACT







C











TCTCG



















TCAGG



















AGTCA



















GA







HBB
CAT
19311
GTTT
19401
GAGT
19491
ACTTC
19581
CATGGTGCACCT
19671
mC*mA*mU*rGrGr
19761
CATGGTG
19851
CATGG
19941
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





ACCATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrUr

GTGCACT

TTGAA







AAGT







GrCrArCrC*mA*mU

TCTCTTC

AAAGT







GGC







*mG

AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CACCATG

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCATG







HBB
CAT
19312
GTTT
19402
GAGT
19492
ACTTC
19582
CATGGTGCACCT
19672
mC*mA*mU*rGrGr
19762
CATGGTG
19852
CATGG
19942
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





ACCAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrUr

GTGCACT

TTGAA







AAGT







GrCrArC*mC*mA*

TCTCTTC

AAAGT







GGC







mU

AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CACCAT

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCAT







HBB
CAT
19313
GTTT
19403
GAGT
19493
ACTTC
19583
CATGGTGCACCT
19673
mC*mA*mU*rGrGr
19763
CATGGTG
19853
CATGG
19943
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





ACCA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrUr

GTGCACT

TTGAA







AAGT







GrCrA*mC*mC*mA

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CACCA

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA



















CCA







HBB
CAT
19314
GTTT
19404
GAGT
19494
ACTTC
19584
CATGGTGCACCT
19674
mC*mA*mU*rGrGr
19764
CATGGTG
19854
CATGG
19944
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
AC

GAA

CC



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





ACC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrUr

GTGCACT

TTGAA







AAGT







GrC*mA*mC*mC

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CACC

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA



















CC







HBB
CAT
19315
GTTT
19405
GAGT
19495
ACTTC
19585
CATGGTGCACCT
19675
mC*mA*mU*rGrGr
19765
CATGGTG
19855
CATGG
19945
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC





CAT

AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





AC

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrUr

GTGCACT

TTGAA







AAGT







G*mC*mA*mC

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CAC

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA



















C







HBB
GG
19316
GTTT
19406
GAGT
19496
ACTTC
19586
CATGGTGCACCT
19676
mC*mA*mU*rGrGr
19766
CATGGTG
19856
CATGG
19946
+++


5_RT
TGC

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
AC

GCTA

TGCA

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS12
CTG

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





ACT

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





CCT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





G

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA







AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





A

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrGrU*

GTGCACT

TTGAA







AAGT







mG*mC*mA

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









CA

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGCA







HBB
CAT
19317
GTTT
19407
GAGT
19497
ACTTC
19587
CATGGTGCACCT
19677
mC*mA*mU*rGrGr
19767
CATGGTG
19857
CATGG
19947
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TGC

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArGrG*m

GTGCACT

TTGAA







AAGT







U*mG*mC

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC









C

CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATGC







HBB
CAT
19318
GTTT
19408
GAGT
19498
ACTTC
19588
CATGGTGCACCT
19678
mC*mA*mU*rGrGr
19768
CATGGTG
19858
CATGG
19948
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

TG

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrArG*mG*

GTGCACT

TTGAA







AAGT







mU*mG

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGATG

CGAGT







AGTC











CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















ATG







HBB
CAT
19319
GTTT
19409
GAGT

ACTTC
19589
CATGGTGCACCT
1967
mC*mA*mU*rGrGr
19769
CATGGTG
19859
CATGG
19949
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA

T

AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrCrA*mG*mG

GTGCACT

TTGAA







AAGT







*mU

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGAT

CGAGT







AGTC











CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















AT




HBB
CAT
19320
GTTT
19410
GAGT

ACTTC
19590
CATGGTGCACCT
19680
mC*mA*mU*rGrGr
19770
CATGGTG
19860
CATGG
19950
+++


5_RT
GG

TAGA

CAGG

TCTTC

GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




12_P
TGC

GCTA



AG

TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CACTTCTCTTCA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GGAGTCAGG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrArCrUrUrCr

TGGCACC

CGTTA







CTTG







UrCrUrUrCrArGrGr

GAGTCG

TCAAC







AAA







ArGrUrC*mA*mG*

GTGCACT

TTGAA







AAGT







mG

TCTCTTC

AAAGT







GGC









AGGAGT

GGCAC







ACCG









CAGA

CGAGT







AGTC











CGGTG







GGTG











CACTT







C











CTCGT



















CAGGA



















GTCAG



















A







HBB
CAT
19321
GTTT
19411
GAGT
19501
TTCTC
19591
CATGGTGCACCT
19681
mC*mA*mU*rGrGr
19771
CATGGTG
19861
CATGG
19951
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS17
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrUrGrCr

GTGCTTC

TTGAA







AAGT







ArCrC*mA*mU*mG

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC









CCATG

CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA



















TG







HBB
CAT
19322
GTTT
19412
GAGT
19502
TTCTC
19592
CATGGTGCACCT
19682
mC*mA*mU*rGrGr
19772
CATGGTG
19862
CATGG
19952
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS16
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CAT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrUrGrCr

GTGCTTC

TTGAA







AAGT







ArC*mC*mA*mU

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC









CCAT

CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA



















T




HBB
CAT
19323
GTTT
19413
GAGT
19503
TTCTC
19593
CATGGTGCACCT
19683
mC*mA*mU*rGrGr
19773
CATGGTG
19863
CATGG
19953
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS15
AC

GAA

CCA



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC





CAT

AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT





GG

GCTA





AGTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





CA

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrUrGrCr

GTGCTTC

TTGAA







AAGT







A*mC*mC*mA

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC









CCA

CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA







HBB
TGC
19324
GTTT
19414
GAGT
19504
TTCTC
19594
CATGGTGCACCT
19684
mC*mA*mU*rGrGr
19774
CATGGTG
19864
CATGG
19954
+++


5_RT
AC

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
CTG

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS14
ACT

GAA

CC



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CCT

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





G

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC







GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA







AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





C

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrUrGrC*

GTGCTTC

TTGAA







AAGT







mA*mC*mC

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC









CC

CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CACC




HBB
CAT
19325
GTTT
19415
GAGT
19505
TTCTC
19595
CATGGTGCACCT
19685
mC*mA*mU*rGrGr
19775
CATGGTG
19865
CATGG
19955
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS13
AC

GAA

C



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrUrG*mC

GTGCTTC

TTGAA







AAGT







*mA*mC

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC









C

CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CAC







HBB
CAT
19326
GTTT
19416
GAGT
19506
TTCTC
19596
CATGGTGCACCT
19686
mC*mA*mU*rGrGr
19776
CATGGTG
19866
CATGG
19956
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT





TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




10_P
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG




BS12
CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrGrU*mG*

GTGCTTC

TTGAA







AAGT







mC*mA

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGCA

CGAGT







AGTC











CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















CA







HBB
CAT
19327
GTTT
19417
GAGT
19507
TTCTC
19597
CATGGTGCACCT
19687
mC*mA*mU*rGrGr
19777
CATGGTG
19867
CATGG
19957
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TGC



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS11
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArGrG*mU*mG

GTGCTTC

TTGAA







AAGT







*mC

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATGC

CGAGT







AGTC











CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG



















C







HBB
CAT
19328
GTTT
19418
GAGT
19508
TTCTC
19598
CATGGTGCACCT
19688
mC*mA*mU*rGrGr
19778
CATGGTG
19868
CATGG
19958
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

TG



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS10
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrArG*mG*mU*

GTGCTTC

TTGAA







AAGT







mG

TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GATG

CGAGT







AGTC











CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGATG







HBB
CAT
19329
GTTT
19419
GAGT

TTCTC
19599
CATGGTGCACCT
19689
mC*mA*mU*rGrGr
19779
CATGGTG
19869
CATGG
19959
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA

T



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS9
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrCrA*mG*mG*mU

GTGCTTC

TTGAA







AAGT









TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GAT

CGAGT







AGTC











CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGAT







HBB
CAT
19330
GTTT
19420
GAGT

TTCTC
19600
CATGGTGCACCT
19690
mC*mA*mU*rGrGr
19780
CATGGTG
19870
CATGG
19960
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




10_P
TGC

GCTA





TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




BS8
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTTCTCTTCAGG

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





AGTCAGG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrUrCrUrCr

TGGCACC

CGTTA







CTTG







UrUrCrArGrGrArGr

GAGTCG

TCAAC







AAA







UrC*mA*mG*mG

GTGCTTC

TTGAA







AAGT









TCTTCAG

AAAGT







GGC









GAGTCA

GGCAC







ACCG









GA

CGAGT







AGTC











CGGTG







GGTG











CTTCTC







C











GTCAG



















GAGTC



















AGA







HBB
CAT
19331
GTTT
19421
GAGT
19511
TCTC

CATGGTGCACCT
19691
mC*mA*mU*rGrGr
19781
CATGGTG
19871
CATGG
19961
+


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S17
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG

G



ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC







GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





CCT

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC





G

AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCACC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





ATG

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrUrGrCrAr

GTGCTCT

TTGAA







AAGT







CrC*mA*mU*mG

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCACC

CGAGT







AGTC









ATG

CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA



















TG







HBB
CAT
19332
GTTT
19422
GAGT
19512
TCTC

CATGGTGCACCT
19692
mC*mA*mU*rGrGr
19782
CATGGTG
19872
CATGG
19962
++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S16
AC

GAA

CCAT



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCACC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





AT

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrUrGrCrAr

GTGCTCT

TTGAA







AAGT







C*mC*mA*mU

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCACC

CGAGT







AGTC









AT

CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA



















T




HBB
CAT
19333
GTTT
19423
GAGT
19513
TCTC

CATGGTGCACCT
19693
mC*mA*mU*rGrGr
19783
CATGGTG
19873
CATGG
19963
++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S15
AC

GAA

CCA



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCACC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC





A

ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrUrGrCrA*

GTGCTCT

TTGAA







AAGT







mC*mC*mA

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCACC

CGAGT







AGTC









A

CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CACCA







HBB
CAT
19334
GTTT
19424
GAGT
19514
TCTC

CATGGTGCACCT
19594
mC*mA*mU*rGrGr
19784
CATGGTG
19874
CATGG
19964
++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S14
AC

GAA

CC



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCACC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrUrGrC*mA

GTGCTCT

TTGAA







AAGT







*mC*mC

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCACC

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CACC




HBB
CAT
19335
GTTT
19425
GAGT
19515
TCTC

CATGGTGCACCT
19695
mC*mA*mU*rGrGr
19785
CATGGTG
19875
CATGG
19965
++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S13
AC

GAA

C



TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCAC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrUrG*mC*

GTGCTCT

TTGAA







AAGT







mA*mC

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCAC

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CAC







HBB
CAT
19336
GTTT
19426
GAGT
19516
TCTC

CATGGTGCACCT
19696
mC*mA*mU*rGrGr
19786
CATGGTG
19876
CATGG
19966
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGCA



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S12
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGCA

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrGrU*mG*mC

GTGCTCT

TTGAA







AAGT







*mA

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGCA

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG



















CA







HBB
CAT
19337
GTTT
19427
GAGT
19517
TCTC

CATGGTGCACCT
19697
mC*mA*mU*rGrGr
19787
CATGGTG
19877
CATGG
19967
++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TGC



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S11
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTGC

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArGrG*mU*mG*

GTGCTCT

TTGAA







AAGT







mC

CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATGC

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C































GTCAG



















GAGTC



















AGATG



















C







HBB
CAT
19338
GTTT
19428
GAGT
19518
TCTC

CATGGTGCACCT
19698
mC*mA*mU*rGrGr
19788
CATGGTG
19878
CATGG
19968
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

TG



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S10
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGTG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrArG*mG*mU*mG

GTGCTCT

TTGAA







AAGT









CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









ATG

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGATG







HBB
CAT
19339
GTTT
19429
GAGT

TCTC

CATGGTGCACCT
19699
mC*mA*mU*rGrGr
19789
CATGGTG
19879
CATGG
19969
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA

T



TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S9
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGGT

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







CrA*mG*mG*mU

GTGCTCT

TTGAA







AAGT









CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









AT

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGAT




HBB
CAT
19340
GTTT
19430
GAGT

TCTC

CATGGTGCACCT
19700
mC*mA*mU*rGrGr
19790
CATGGTG
19880
CATGG
19970
+++


5_RT
GG

TAGA

CAGG


T
TCAG


GACTCCTGGTTT

UrGrCrArCrCrUrGr

CATCTGA

TGCAT




9_PB
TGC

GCTA





TAGAGCTAGAAA

ArCrUrCrCrUrGrGr

CTCCTGG

CTGAC




S8
AC

GAA





TAGCAAGTTAAA

UrUrUrUrArGrArGr

TTTTAGA

TCCTG





CTG

ATAG





ATAAGGCTAGTC

CrUrArGrArArArUr

GCTAGA

GTTTT





ACT

CAA





CGTTATCAACTT

ArGrCrArArGrUrUr

AATAGC

AGAGC





CCT

GTTA





GAAAAAGTGGC

ArArArArUrArArGr

AAGTTA

TAGAA





G

AAAT





ACCGAGTCGGTG

GrCrUrArGrUrCrCr

AAATAA

ATAGC







AAG





CTCTCTTCAGGA

GrUrUrArUrCrArAr

GGCTAGT

AAGTT







GCTA





GTCAGG

CrUrUrGrArArArAr

CCGTTAT

AAAAT







GTCC







ArGrUrGrGrCrArCr

CAACTTG

AAGGC







GTTA







CrGrArGrUrCrGrGr

AAAAAG

TAGTC







TCAA







UrGrCrUrCrUrCrUr

TGGCACC

CGTTA







CTTG







UrCrArGrGrArGrUr

GAGTCG

TCAAC







AAA







C*mA*mG*mG

GTGCTCT

TTGAA







AAGT









CTTCAGG

AAAGT







GGC









AGTCAG

GGCAC







ACCG









A

CGAGT







AGTC











CGGTG







GGTG











CTCTC







C











GTCAG



















GAGTC



















AGA
















TABLE AA







Table A Sequences Reproduced without Nucleotide Modifications.


The Template Sequence (+SNP +PAM-kill) (RNA) sequences from Table A are reproduced below without nucleotide modifications. In some embodiments, the sequences in this table can be used without chemical modifications.











Name | Template Sequence (+SNP +PAM-kill) (RNA) | SEQ ID NO
HBB5_RT20_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21677
HBB5_RT20_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21678
HBB5_RT20_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCA | 21679
HBB5_RT20_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCACC | 21680
HBB5_RT20_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCAC | 21681
HBB5_RT20_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGCA | 21682
HBB5_RT20_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUGC | 21683
HBB5_RT20_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGUG | 21684
HBB5_RT20_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGGU | 21685
HBB5_RT20_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAACGGCAGACUUCUCUUCAGGAGUCAGG | 21686
HBB5_RT19_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21687
HBB5_RT19_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21688
HBB5_RT19_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCACCA | 21689
HBB5_RT19_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCACC | 21690
HBB5_RT19_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCAC | 21691
HBB5_RT19_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGCA | 21692
HBB5_RT19_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUGC | 21693
HBB5_RT19_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGUG | 21694
HBB5_RT19_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGGU | 21695
HBB5_RT19_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACGGCAGACUUCUCUUCAGGAGUCAGG | 21696
HBB5_RT17_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21697
HBB5_RT17_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21698
HBB5_RT17_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCACCA | 21699
HBB5_RT17_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCACC | 21700
HBB5_RT17_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCAC | 21701
HBB5_RT17_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGCA | 21702
HBB5_RT17_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUGC | 21703
HBB5_RT17_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGUG | 21704
HBB5_RT17_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGGU | 21705
HBB5_RT17_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGGCAGACUUCUCUUCAGGAGUCAGG | 21706
HBB5_RT16_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21707
HBB5_RT16_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21708
HBB5_RT16_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCACCA | 21709
HBB5_RT16_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCACC | 21710
HBB5_RT16_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCAC | 21711
HBB5_RT16_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGCA | 21712
HBB5_RT16_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUGC | 21713
HBB5_RT16_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGUG | 21714
HBB5_RT16_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGGU | 21715
HBB5_RT16_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCAGACUUCUCUUCAGGAGUCAGG | 21716
HBB5_RT14_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21717
HBB5_RT14_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21718
HBB5_RT14_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCACCA | 21719
HBB5_RT14_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCACC | 21720
HBB5_RT14_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCAC | 21721
HBB5_RT14_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGCA | 21722
HBB5_RT14_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUGC | 21723
HBB5_RT14_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUG | 21724
HBB5_RT14_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGGU | 21725
HBB5_RT14_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCAGGAGUCAGG | 21726
HBB5_RT13_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21727
HBB5_RT13_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCACCAU | 21728
HBB5_RT13_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCACCA | 21729
HBB5_RT13_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCACC | 21730
HBB5_RT13_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCAC | 21731
HBB5_RT13_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGCA | 21732
HBB5_RT13_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUGC | 21733
HBB5_RT13_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGUG | 21734
HBB5_RT13_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGGU | 21735
HBB5_RT13_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGACUUCUCUUCAGGAGUCAGG | 21736
HBB5_RT12_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCACCAUG | 21737
HBB5_RT12_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCACCAU | 21738
HBB5_RT12_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCACCA | 21739
HBB5_RT12_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCACC | 21740
HBB5_RT12_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCAC | 21741
HBB5_RT12_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGCA | 21742
HBB5_RT12_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUGC | 21743
HBB5_RT12_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGUG | 21744
HBB5_RT12_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGGU | 21745
HBB5_RT12_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACUUCUCUUCAGGAGUCAGG | 21746
HBB5_RT10_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCACCAUG | 21747
HBB5_RT10_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCACCAU | 21748
HBB5_RT10_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCACCA | 21749
HBB5_RT10_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCACC | 21750
HBB5_RT10_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCAC | 21751
HBB5_RT10_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGCA | 21752
HBB5_RT10_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUGC | 21753
HBB5_RT10_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGUG | 21754
HBB5_RT10_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGGU | 21755
HBB5_RT10_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUCUCUUCAGGAGUCAGG | 21756
HBB5_RT9_PBS17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCACCAUG | 21757
HBB5_RT9_PBS16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCACCAU | 21758
HBB5_RT9_PBS15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCACCA | 21759
HBB5_RT9_PBS14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCACC | 21760
HBB5_RT9_PBS13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCAC | 21761
HBB5_RT9_PBS12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGCA | 21762
HBB5_RT9_PBS11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUGC | 21763
HBB5_RT9_PBS10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGUG | 21764
HBB5_RT9_PBS9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGGU | 21765
HBB5_RT9_PBS8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUCUCUUCAGGAGUCAGG | 21766
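The relationship between the chemically modified sequences of Table A and the unmodified sequences of Table AA can be illustrated with a minimal sketch. It assumes the common oligonucleotide notation in which a lowercase "m" prefix marks a 2'-O-methyl nucleotide, a lowercase "r" prefix marks a ribonucleotide, and "*" marks a phosphorothioate linkage; that reading of the symbols is an assumption and is not defined in this section. The helper name strip_modifications is hypothetical.

```python
import re


def strip_modifications(annotated: str) -> str:
    """Drop modification markup and keep only the base letters.

    Assumed notation (not defined in this section): a lowercase 'm' or 'r'
    prefixes each base and '*' marks a modified linkage. Base letters are
    uppercase, so removing these three characters leaves the plain sequence.
    """
    return re.sub(r"[mr*]", "", annotated)


# 5' end of the modified HBB5 template RNAs as listed in Table A.
annotated_5p = "mC*mA*mU*rGrGrUrGrCrArCrCrUrG"
plain_5p = strip_modifications(annotated_5p)
print(plain_5p)
```

The result matches the first 13 nucleotides of every HBB5 entry in Table AA.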
















TABLE B







HBB8 Sequences. The columns indicate, from left to right: 1) name of the template RNA, 2) gRNA spacer sequence of the template RNA, 3) SpCas9 gRNA scaffold sequence of the template RNA, 4) PBS sequence of the template RNA, 5) RT template sequence of the template RNA, wherein a SNP relative to hg38 that is present in HEK293T cells is bolded, and wherein the mutation region is underlined, 6) full template RNA sequence comprising HEK293T SNP, 7) full template RNA sequence depicted as RNA corresponding to column 6, further showing chemical modifications as used in Example 4, 8) alternative template RNA sequence designed relative to hg38 reference genome (lacking HEK293T SNP), and 9) observed activity of template RNA of column 7 as defined in Example 4.
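As an illustration of how the component columns relate to the full template RNA, the following sketch joins the spacer, scaffold, RT template, and PBS of the first entry (HBB8_RT20_PBS17) in that 5' to 3' order and converts the result to RNA letters. The component order is inferred by comparing the component columns with the full-sequence column of that entry rather than stated in the caption, and the function names are hypothetical.

```python
def to_rna(dna: str) -> str:
    """Write a DNA-alphabet sequence in RNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")


def assemble_template(spacer: str, scaffold: str, rt_template: str, pbs: str) -> str:
    """Join the parts 5'->3' as spacer, scaffold, RT template, PBS.

    The order is an inference from the table entry, not a stated rule.
    """
    return to_rna(spacer + scaffold + rt_template + pbs)


# Component values read from the HBB8_RT20_PBS17 entry of Table B.
spacer = "GTAACGGCAGACTTCTCCAC"
scaffold = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCC"
    "GTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)
rt_template = "TGGTGCACCTGACTCCTGAG"
pbs = "GAGAAGTCTGCCGTTAC"

template_rna = assemble_template(spacer, scaffold, rt_template, pbs)
print(template_rna)
print(len(template_rna))
```

Shortening the PBS argument from its 3' end gives the PBS16 to PBS8 variants of the same RT20 design.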





























Name | Spacer | SEQ ID NO | gRNA Scaffold (SpCas9 scaffold) | SEQ ID NO | PBS | SEQ ID NO | RT Template (293T SNP; Correction) | SEQ ID NO | Template Sequence (+SNP) | SEQ ID NO | Template Sequence (+SNP) (RNA) | SEQ ID NO | Template Sequence (no SNP) | SEQ ID NO | Activity





HBB8_
GTAACG
19971
GTTTTAGAG
20061
GAG
20151
TGGTGC
20241
GTAACGGCAGA
20331
mG*mU*mA*rAr
20421
GTAACGG
20511
+


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S17
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

AC



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCCGT

ArArCrUrUrGrAr

AAAAGTG













TAC

ArArArArGrUrGr

GCACCGA







HBB8_
GTAACG
19972
GTTTTAGAG
20062
GAG
20152
TGGTGC
20242
GTAACGGCAGA
20332
mG*mU*mA*rAr
20422
GTAACGG
20512
+


RT20_
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS16
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

A



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCCGT

ArArCrUrUrGrAr

AAAAGTG













TA

ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGCCGTTA















CrUrGrCrCrG*m

















U*mU*mA









HBB8_
GTAACG
19973
GTTTTAGAG
20063
GAG
20153
TGGTGC
20243
GTAACGGCAGA
20333
mG*mU*mA*rAr
20423
GTAACGG
20513
+


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S15
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCCGT

ArArCrUrUrGrAr

AAAAGTG













T

ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGCCGTT















CrUrGrCrC*mG*

















mU*mU









HBB8_
GTAACG
19974
GTTTTAGAG
20064
GAG
20154
TGGTGC
20244
GTAACGGCAGA
20334
mG*mU*mA*rAr
20424
GTAACGG
20514
+


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S14
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCCGT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGCCGT















CrUrGrC*mC*m

















G*mU









HBB8_
GTAACG
19975
GTTTTAGAG
20065
GAG
20155
TGGTGC
20245
GTAACGGCAGA
20335
mG*mU*mA*rAr
20425
GTAACGG
20515
+


RT20_
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S13
CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCCG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGCCG















CrUrG*mC*mC*

















mG









HBB8_
GTAACG
19976
GTTTTAGAG
20066
GAG
20156
TGGTGC
20246
GTAACGGCAGA
20336
mG*mU*mA*rAr
20426
GTAACGG
20516
++


RT20_
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S12
CAC

AAAATAAG

GCC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGCC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGCC















CrU*mG*mC*m

















C









HBB8_
GTAACG
19977
GTTTTAGAG
20067
GAG
20157
TGGTGC
20247
GTAACGGCAGA
20337
mG*mU*mA*rAr
20427
GTAACGG
20517
+++


RT20_
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S11
CAC

AAAATAAG

GC


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTGC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrUr

TGC















C*mU*mG*mC









HBB8_
GTAACG
19978
GTTTTAGAG
20068
GAG
20158
TGGTGC
20248
GTAACGGCAGA
20338
mG*mU*mA*rAr
20428
GTAACGG
20518
++


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S10
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

G


AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCTG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArGrU*

TG















mC*mU*mG









HBB8_
GTAACG
19979
GTTTTAGAG
20069
GAG

TGGTGC
20249
GTAACGGCAGA
20339
mG*mU*mA*rAr
20429
GTAACGG
20519
++


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S9
CTTCTC

AGCAAGTT

TCT

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG




AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTCT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArArG*m

T















U*mC*mU









HBB8_
GTAACG
19980
GTTTTAGAG
20070
GAG

TGGTGC
20250
GTAACGGCAGA
20340
mG*mU*mA*rAr
20430
GTAACGG
20520
++


RT20_PB
GCAGA

CTAGAAAT

AAG

ACCTGA

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




S8
CTTCTC

AGCAAGTT

TC

CTCCTG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG




AG


ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGGTGCACCT

GrCrUrArGrUrCr

CCGTTATC







C





GACTCCTGAGGA

CrGrUrUrArUrCr

AACTTGA













GAAGTC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGGTGC















CrUrGrGrUrGrCr

ATCTGACT















ArCrCrUrGrArCr

CCTGAGG















UrCrCrUrGrArGr

AGAAGTC















GrArGrArA*mG*

















mU*mC









HBB8_RT19_PBS17 | GTAACGGCAGACTTCTCCAC | 19981 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20071 | GAGAAGTCTGCCGTTAC | 20161 | GGTGCACCTGACTCCTGAG | 20251 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20341 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrGrU*mU*mA*mC | 20431 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20521 | +
HBB8_RT19_PBS16 | GTAACGGCAGACTTCTCCAC | 19982 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20072 | GAGAAGTCTGCCGTTA | 20162 | GGTGCACCTGACTCCTGAG | 20252 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20342 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrG*mU*mU*mA | 20432 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20522 | +
HBB8_RT19_PBS15 | GTAACGGCAGACTTCTCCAC | 19983 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20073 | GAGAAGTCTGCCGTT | 20163 | GGTGCACCTGACTCCTGAG | 20253 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTT | 20343 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrC*mG*mU*mU | 20433 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTT | 20523 | +
HBB8_RT19_PBS14 | GTAACGGCAGACTTCTCCAC | 19984 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20074 | GAGAAGTCTGCCGT | 20164 | GGTGCACCTGACTCCTGAG | 20254 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGT | 20344 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrC*mC*mG*mU | 20434 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGT | 20524 | +
HBB8_RT19_PBS13 | GTAACGGCAGACTTCTCCAC | 19985 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20075 | GAGAAGTCTGCCG | 20165 | GGTGCACCTGACTCCTGAG | 20255 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCCG | 20345 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrG*mC*mC*mG | 20435 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCCG | 20525 | ++
HBB8_RT19_PBS12 | GTAACGGCAGACTTCTCCAC | 19986 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20076 | GAGAAGTCTGCC | 20166 | GGTGCACCTGACTCCTGAG | 20256 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGCC | 20346 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrU*mG*mC*mC | 20436 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGCC | 20526 | ++
HBB8_RT19_PBS11 | GTAACGGCAGACTTCTCCAC | 19987 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20077 | GAGAAGTCTGC | 20167 | GGTGCACCTGACTCCTGAG | 20257 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTGC | 20347 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrC*mU*mG*mC | 20437 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTGC | 20527 | +++
HBB8_RT19_PBS10 | GTAACGGCAGACTTCTCCAC | 19988 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20078 | GAGAAGTCTG | 20168 | GGTGCACCTGACTCCTGAG | 20258 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCTG | 20348 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrU*mC*mU*mG | 20438 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCTG | 20528 | ++
HBB8_RT19_PBS9 | GTAACGGCAGACTTCTCCAC | 19989 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20079 | GAGAAGTCT | | GGTGCACCTGACTCCTGAG | 20259 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTCT | 20349 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArG*mU*mC*mU | 20439 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTCT | 20529 | ++
HBB8_RT19_PBS8 | GTAACGGCAGACTTCTCCAC | 19990 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20080 | GAGAAGTC | | GGTGCACCTGACTCCTGAG | 20260 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCACCTGACTCCTGAGGAGAAGTC | 20350 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArA*mG*mU*mC | 20440 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGGTGCATCTGACTCCTGAGGAGAAGTC | 20530 | ++
HBB8_RT18_PBS17 | GTAACGGCAGACTTCTCCAC | 19991 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20081 | GAGAAGTCTGCCGTTAC | 20171 | GTGCACCTGACTCCTGAG | 20261 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20351 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrGrU*mU*mA*mC | 20441 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20531 | +
HBB8_RT18_PBS16 | GTAACGGCAGACTTCTCCAC | 19992 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20082 | GAGAAGTCTGCCGTTA | 20172 | GTGCACCTGACTCCTGAG | 20262 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20352 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrG*mU*mU*mA | 20442 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20532 | ++
HBB8_RT18_PBS15 | GTAACGGCAGACTTCTCCAC | 19993 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20083 | GAGAAGTCTGCCGTT | 20173 | GTGCACCTGACTCCTGAG | 20263 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTT | 20353 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrC*mG*mU*mU | 20443 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTT | 20533 | ++
HBB8_RT18_PBS14 | GTAACGGCAGACTTCTCCAC | 19994 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20084 | GAGAAGTCTGCCGT | 20174 | GTGCACCTGACTCCTGAG | 20264 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCCGT | 20354 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrC*mC*mG*mU | 20444 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCCGT | 20534 | ++
HBB8_RT18_PBS13 | GTAACGGCAGACTTCTCCAC | 19995 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20085 | GAGAAGTCTGCCG | 20175 | GTGCACCTGACTCCTGAG | 20265 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCCG | 20355 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrG*mC*mC*mG | 20445 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCCG | 20535 | ++
HBB8_RT18_PBS12 | GTAACGGCAGACTTCTCCAC | 19996 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20086 | GAGAAGTCTGCC | 20176 | GTGCACCTGACTCCTGAG | 20266 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGCC | 20356 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrU*mG*mC*mC | 20446 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGCC | 20536 | +++
HBB8_RT18_PBS11 | GTAACGGCAGACTTCTCCAC | 19997 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20087 | GAGAAGTCTGC | 20177 | GTGCACCTGACTCCTGAG | 20267 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTGC | 20357 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrC*mU*mG*mC | 20447 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTGC | 20537 | +++
HBB8_RT18_PBS10 | GTAACGGCAGACTTCTCCAC | 19998 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20088 | GAGAAGTCTG | 20178 | GTGCACCTGACTCCTGAG | 20268 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCTG | 20358 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrU*mC*mU*mG | 20448 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCTG | 20538 | +++
HBB8_RT18_PBS9 | GTAACGGCAGACTTCTCCAC | 19999 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20089 | GAGAAGTCT | | GTGCACCTGACTCCTGAG | 20269 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTCT | 20359 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArG*mU*mC*mU | 20449 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTCT | 20539 | ++
HBB8_RT18_PBS8 | GTAACGGCAGACTTCTCCAC | 20000 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20090 | GAGAAGTC | | GTGCACCTGACTCCTGAG | 20270 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCACCTGACTCCTGAGGAGAAGTC | 20360 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArA*mG*mU*mC | 20450 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTGCATCTGACTCCTGAGGAGAAGTC | 20540 | ++
HBB8_RT17_PBS17 | GTAACGGCAGACTTCTCCAC | 20001 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20091 | GAGAAGTCTGCCGTTAC | 20181 | TGCACCTGACTCCTGAG | 20271 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20361 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrGrU*mU*mA*mC | 20451 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20541 | +
HBB8_RT17_PBS16 | GTAACGGCAGACTTCTCCAC | 20002 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20092 | GAGAAGTCTGCCGTTA | 20182 | TGCACCTGACTCCTGAG | 20272 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20362 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrG*mU*mU*mA | 20452 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20542 | +
HBB8_RT17_PBS15 | GTAACGGCAGACTTCTCCAC | 20003 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20093 | GAGAAGTCTGCCGTT | 20183 | TGCACCTGACTCCTGAG | 20273 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCCGTT | 20363 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrC*mG*mU*mU | 20453 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCCGTT | 20543 | +
HBB8_RT17_PBS14 | GTAACGGCAGACTTCTCCAC | 20004 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20094 | GAGAAGTCTGCCGT | 20184 | TGCACCTGACTCCTGAG | 20274 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCCGT | 20364 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrC*mC*mG*mU | 20454 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCCGT | 20544 | ++
HBB8_RT17_PBS13 | GTAACGGCAGACTTCTCCAC | 20005 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20095 | GAGAAGTCTGCCG | 20185 | TGCACCTGACTCCTGAG | 20275 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCCG | 20365 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrG*mC*mC*mG | 20455 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCCG | 20545 | +
HBB8_RT17_PBS12 | GTAACGGCAGACTTCTCCAC | 20006 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20096 | GAGAAGTCTGCC | 20186 | TGCACCTGACTCCTGAG | 20276 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGCC | 20366 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrU*mG*mC*mC | 20456 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGCC | 20546 | ++
HBB8_RT17_PBS11 | GTAACGGCAGACTTCTCCAC | 20007 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20097 | GAGAAGTCTGC | 20187 | TGCACCTGACTCCTGAG | 20277 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTGC | 20367 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrC*mU*mG*mC | 20457 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTGC | 20547 | +++
HBB8_RT17_PBS10 | GTAACGGCAGACTTCTCCAC | 20008 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20098 | GAGAAGTCTG | 20188 | TGCACCTGACTCCTGAG | 20278 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCTG | 20368 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrU*mC*mU*mG | 20458 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCTG | 20548 | ++
HBB8_RT17_PBS9 | GTAACGGCAGACTTCTCCAC | 20009 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20099 | GAGAAGTCT | | TGCACCTGACTCCTGAG | 20279 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTCT | 20369 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArG*mU*mC*mU | 20459 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTCT | 20549 | ++
HBB8_RT17_PBS8 | GTAACGGCAGACTTCTCCAC | 20010 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20100 | GAGAAGTC | | TGCACCTGACTCCTGAG | 20280 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCACCTGACTCCTGAGGAGAAGTC | 20370 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrUrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArA*mG*mU*mC | 20460 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTGCATCTGACTCCTGAGGAGAAGTC | 20550 | ++
HBB8_RT16_PBS17 | GTAACGGCAGACTTCTCCAC | 20011 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20101 | GAGAAGTCTGCCGTTAC | 20191 | GCACCTGACTCCTGAG | 20281 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20371 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrGrU*mU*mA*mC | 20461 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCATCTGACTCCTGAGGAGAAGTCTGCCGTTAC | 20551 | +
HBB8_RT16_PBS16 | GTAACGGCAGACTTCTCCAC | 20012 | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC | 20102 | GAGAAGTCTGCCGTTA | 20192 | GCACCTGACTCCTGAG | 20282 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCACCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20372 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCrGrCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrCrUrGrCrCrG*mU*mU*mA | 20462 | GTAACGGCAGACTTCTCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGCATCTGACTCCTGAGGAGAAGTCTGCCGTTA | 20552 | ++
HBB8_
GTAACG
20013
GTTTTAGAG
20103
GAG
20193
GCACCT
20283
GTAACGGCAGA
20373
mG*mU*mA*rAr
20463
GTAACGG
20553
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S15
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTGCCGTT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTGCCGT















ArGrUrCrUrGrCr

T















C*mG*mU*mU









HBB8_
GTAACG
20014
GTTTTAGAG
20104
GAG
20194
GCACCT
20284
GTAACGGCAGA
20374
mG*mU*mA*rAr
20464
GTAACGG
20554
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S14
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTGCCGT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTGCCGT















ArGrUrCrUrGrC*

















mC*mG*mU









HBB8_
GTAACG
20015
GTTTTAGAG
20105
GAG
20195
GCACCT
20285
GTAACGGCAGA
20375
mG*mU*mA*rAr
20465
GTAACGG
20555
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S13
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTGCCG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTGCCG















ArGrUrCrUrG*m

















C*mC*mG









HBB8_
GTAACG
20016
GTTTTAGAG
20106
GAG
20196
GCACCT
20286
GTAACGGCAGA
20376
mG*mU*mA*rAr
20466
GTAACGG
20556
+++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S12
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTGCC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTGCC















ArGrUrCrU*mG*

















mC*mC









HBB8_
GTAACG
20017
GTTTTAGAG
20107
GAG
20197
GCACCT
20287
GTAACGGCAGA
20377
mG*mU*mA*rAr
20467
GTAACGG
20557
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S11
CAC

AAAATAAG

GC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTGC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTGC















ArGrUrC*mU*m

















G*mC









HBB8_
GTAACG
20018
GTTTTAGAG
20108
GAG
20198
GCACCT
20288
GTAACGGCAGA
20378
mG*mU*mA*rAr
20468
GTAACGG
20558
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S10
CAC

AAAATAAG

G



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCTG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCTG















ArGrU*mC*mU*

















mG









HBB8_
GTAACG
20019
GTTTTAGAG
20109
GAG

GCACCT
20289
GTAACGGCAGA
20379
mG*mU*mA*rAr
20469
GTAACGG
20559
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S9
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTCT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TCT















ArG*mU*mC*m

















U









HBB8_
GTAACG
20020
GTTTTAGAG
20110
GAG

GCACCT
20290
GTAACGGCAGA
20380
mG*mU*mA*rAr
20470
GTAACGG
20560
++


RT16_
GCAGA

CTAGAAAT

AAG

GACTCC

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TC

TGAG

TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S8
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGCACCTGACT

GrCrUrArGrUrCr

CCGTTATC







C





CCTGAGGAGAA

CrGrUrUrArUrCr

AACTTGA













GTC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGCATCTG















CrGrCrArCrCrUr

ACTCCTGA















GrArCrUrCrCrUr

GGAGAAG















GrArGrGrArGrAr

TC















A*mG*mU*mC









HBB8_
GTAACG
20021
GTTTTAGAG
20111
GAG
20201
ACCTGA
20291
GTAACGGCAGA
20381
mG*mU*mA*rAr
20471
GTAACGG
20561



RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S17
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

AC



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCCGTTAC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCCGTT















UrCrUrGrCrCrGr

AC















U*mU*mA*mC









HBB8_
GTAACG
20022
GTTTTAGAG
20112
GAG
20202
ACCTGA
20292
GTAACGGCAGA
20382
mG*mU*mA*rAr
20472
GTAACGG
20562
++


RT1
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




4_
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




PB
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT




S16


GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

A



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCCGTTA

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCCGTT















UrCrUrGrCrCrG*

A















mU*mU*mA









HBB8_
GTAACG
20023
GTTTTAGAG
20113
GAG
20203
ACCTGA
20293
GTAACGGCAGA
20383
mG*mU*mA*rAr
20473
GTAACGG
20563
++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S15
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCCGTT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCCGTT















UrCrUrGrCrC*m

















G*mU*mU









HBB8_
GTAACG
20024
GTTTTAGAG
20114
GAG
20204
ACCTGA
20294
GTAACGGCAGA
20384
mG*mU*mA*rAr
20474
GTAACGG
20564
+++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S14
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCCGT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCCGT















UrCrUrGrC*mC*

















mG*mU









HBB8_
GTAACG
20025
GTTTTAGAG
20115
GAG
20205
ACCTGA
20295
GTAACGGCAGA
20385
mG*mU*mA*rAr
20475
GTAACGG
20565
+++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S13
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCCG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCCG















UrCrUrG*mC*m

















C*mG









HBB8_
GTAACG
20026
GTTTTAGAG
20116
GAG
20206
ACCTGA
20296
GTAACGGCAGA
20386
mG*mU*mA*rAr
20476
GTAACGG
20566
+++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S12
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGCC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGCC















UrCrU*mG*mC*

















mC









HBB8_
GTAACG
20027
GTTTTAGAG
20117
GAG
20207
ACCTGA
20297
GTAACGGCAGA
20387
mG*mU*mA*rAr
20477
GTAACGG
20567
++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S11
CAC

AAAATAAG

GC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTGC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTGC















UrC*mU*mG*m

















C









HBB8_
GTAACG
20028
GTTTTAGAG
20118
GAG
20208
ACCTGA
20298
GTAACGGCAGA
20388
mG*mU*mA*rAr
20478
GTAACGG
20568
++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S10
CAC

AAAATAAG

G



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CTG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArGr

CTG















U*mC*mU*mG









HBB8_
GTAACG
20029
GTTTTAGAG
20119
GAG

ACCTGA
20299
GTAACGGCAGA
20389
mG*mU*mA*rAr
20479
GTAACGG
20569
++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S9
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













CT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArArG*

CT















mU*mC*mU









HBB8_
GTAACG
20030
GTTTTAGAG
20120
GAG

ACCTGA
20300
GTAACGGCAGA
20390
mG*mU*mA*rAr
20480
GTAACGG
20570
++


RT14_
GCAGA

CTAGAAAT

AAG

CTCCTG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TC


AG


TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S8
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACCTGACTCC

GrCrUrArGrUrCr

CCGTTATC







C





TGAGGAGAAGT

CrGrUrUrArUrCr

AACTTGA













C

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CATCTGAC















CrArCrCrUrGrAr

TCCTGAG















CrUrCrCrUrGrAr

GAGAAGT















GrGrArGrArA*m

C















G*mU*mC









HBB8
GTAACG
20031
GTTTTAGAG
20121
GAG
20211
TGACTC
20301
GTAACGGCAGA
20391
mG*mU*mA*rAr
20481
GTAACGG
20571
+


_RT1
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




1_
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




PB
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT




S17


GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

AC



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CCGTTAC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrCrUr

CCGTTAC















GrCrCrGrU*mU*

















mA*mC









HBB8
GTAACG
20032
GTTTTAGAG
20122
GAG
20212
TGACTC
20302
GTAACGGCAGA
20392
mG*mU*mA*rAr
20482
GTAACGG
20572
++


_RT1
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




1_PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S16
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

A



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CCGTTA

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

















GrArArGrUrCrUr

AAGTCTG















GrCrCrG*mU*m

CCGTTA















U*mA









HBB8
GTAACG
20033
GTTTTAGAG
20123
GAG
20213
TGACTC
20303
GTAACGGCAGA
20393
mG*mU*mA*rAr
20483
GTAACGG
20573
++


_RT1
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




1_
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




PB
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT




S15


GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CCGTT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrCrUr

CCGTT















GrCrC*mG*mU*

















mU









HBB8_
GTAACG
20034
GTTTTAGAG
20124
GAG
20214
TGACTC
20304
GTAACGGCAGA
20394
mG*mU*mA*rAr
20484
GTAACGG
20574
++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S14
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CCGT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrCrUr

CCGT















GrC*mC*mG*m

















U









HBB8_
GTAACG
20035
GTTTTAGAG
20125
GAG
20215
TGACTC
20305
GTAACGGCAGA
20395
mG*mU*mA*rAr
20485
GTAACGG
20575
+++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S13
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CCG

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrCrUr

CCG















G*mC*mC*mG









HBB8_
GTAACG
20036
GTTTTAGAG
20126
GAG
20216
TGACTC
20306
GTAACGGCAGA
20396
mG*mU*mA*rAr
20486
GTAACGG
20576
+++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S12
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













CC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrCrU*

CC















mG*mC*mC









HBB8_
GTAACG
20037
GTTTTAGAG
20127
GAG
20217
TGACTC
20307
GTAACGGCAGA
20397
mG*mU*mA*rAr
20487
GTAACGG
20577
+++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S11
CAC

AAAATAAG

GC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA













C

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrUrC*m

C















U*mG*mC









HBB8_
GTAACG
20038
GTTTTAGAG
20128
GAG
20218
TGACTC
20308
GTAACGGCAGA
20398
mG*mU*mA*rAr
20488
GTAACGG
20578
++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S10
CAC

AAAATAAG

G



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCTG















GrArArGrU*mC*

















mU*mG









HBB8_
GTAACG
20039
GTTTTAGAG
20129
GAG

TGACTC
20309
GTAACGGCAGA
20399
mG*mU*mA*rAr
20489
GTAACGG
20579
++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S9
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTCT

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTCT















GrArArG*mU*m

















C*mU









HBB8_
GTAACG
20040
GTTTTAGAG
20130
GAG

TGACTC
20310
GTAACGGCAGA
20400
mG*mU*mA*rAr
20490
GTAACGG
20580
++


RT11_
GCAGA

CTAGAAAT

AAG

CTGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TC



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S8
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCTGACTCCTGA

GrCrUrArGrUrCr

CCGTTATC







C





GGAGAAGTC

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CTGACTCC















CrUrGrArCrUrCr

TGAGGAG















CrUrGrArGrGrAr

AAGTC















GrArA*mG*mU*

















mC









HBB8_
GTAACG
20041
GTTTTAGAG
20131
GAG
20221
GACTCC
20311
GTAACGGCAGA
20401
mG*mU*mA*rAr
20491
GTAACGG
20581
+


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S17
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

AC



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCTGCC

CrGrUrUrArUrCr

AACTTGA













GTTAC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGCC















ArArGrUrCrUrGr

GTTAC















CrCrGrU*mU*m

















A*mC









HBB8_
GTAACG
20042
GTTTTAGAG
20132
GAG
20222
GACTCC
20312
GTAACGGCAGA
20402
mG*mU*mA*rAr
20492
GTAACGG
20582
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS16
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

A



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCTGCC

CrGrUrUrArUrCr

AACTTGA













GTTA

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGCC















ArArGrUrCrUrGr

GTTA















CrCrG*mU*mU*

















mA









HBB8_
GTAACG
20043
GTTTTAGAG
20133
GAG
20223
GACTCC
20313
GTAACGGCAGA
20403
mG*mU*mA*rAr
20493
GTAACGG
20583
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS15
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C







CrGrUrUrArUrCr

AACTTGA













GAGAAGTCTGCC

ArArCrUrUrGrAr

AAAAGTG













GTT

ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGCC















ArArGrUrCrUrGr

GTT















CrC*mG*mU*m

















U









HBB8_
GTAACG
20044
GTTTTAGAG
20134
GAG
20224
GACTCC
20314
GTAACGGCAGA
20404
mG*mU*mA*rAr
20494
GTAACGG
20584
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS14
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCTGCC

CrGrUrUrArUrCr

AACTTGA













GT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGCC















ArArGrUrCrUrGr

GT















C*mC*mG*mU









HBB8_
GTAACG
20045
GTTTTAGAG
20135
GAG
20225
GACTCC
20315
GTAACGGCAGA
20405
mG*mU*mA*rAr
20495
GTAACGG
20585
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS13
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCTGCC

CrGrUrUrArUrCr

AACTTGA













G

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGCC















ArArGrUrCrUrG*

G















mC*mC*mG









HBB8_
GTAACG
20046

20136
GAG
20226
GACTCC
20316

20406
mG*mU*mA*rAr


20586
++


RT10_
GCAGA

GTTTTAGAG

AAG

TGAG

GTAACGGCAGA

CrGrGrCrArGrAr
20496
GTAACGG




PBS12
CTTCTC

CTAGAAAT

TCT



CTTCTCCACGTT

CrUrUrCrUrCrCr

CAGACTTC





CAC

AGCAAGTT

GCC



TTAGAGCTAGAA

ArCrGrUrUrUrUr

TCCACGTT







AAAATAAG





ATAGCAAGTTAA

ArGrArGrCrUrAr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

GrArArArUrArGr

AGAAATA







TTATCAACT





CCGTTATCAACT

CrArArGrUrUrAr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

ArArArUrArArGr

AAAATAA















GrCrUrArGrUrCr

GGCTAGT















CrGrUrUrArUrCr

CCGTTATC















ArArCrUrUrGrAr

AACTTGA







TGGCACCG





CACCGAGTCGGT

ArArArArGrUrGr

AAAAGTG







AGTCGGTG





GCGACTCCTGAG

GrCrArCrCrGrAr

GCACCGA







C





GAGAAGTCTGCC

GrUrCrGrGrUrGr

GTCGGTG















CrGrArCrUrCrCr

CGACTCCT















UrGrArGrGrArGr

GAGGAGA















ArArGrUrCrU*m

AGTCTGCC















G*mC*mC









HBB8_
GTAACG
20047

20137
GAG
20227
GACTCC
20317
GTAACGGCAGA
20407
mG*mU*mA*rAr
20497
GTAACGG
20587
++


RT10_
GCAGA

GTTTTAGAG

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

CTAGAAAT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S11
CAC

AGCAAGTT

GC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







AAAATAAG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







GCTAGTCCG





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TTATCAACT





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGAAAAAG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT













GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC













GAGAAGTCTGC

CrGrUrUrArUrCr

AACTTGA







TGGCACCG







ArArCrUrUrGrAr

AAAAGTG







AGTCGGTG







ArArArArGrUrGr

GCACCGA







C







GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTGC















ArArGrUrC*mU*

















mG*mC









HBB8_
GTAACG
20048
GTTTTAGAG
20138
GAG
20228
GACTCC
20318
GTAACGGCAGA
20408
mG*mU*mA*rAr
20498
GTAACGG
20588
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PB
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




S10
CAC

AAAATAAG

G



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCTG

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCTG















ArArGrU*mC*m

















U*mG









HBB8_
GTAACG
20049
GTTTTAGAG
20139
GAG

GACTCC
20319
GTAACGGCAGA
20409
mG*mU*mA*rAr
20499
GTAACGG
20589
+


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS9
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTCT

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTCT















ArArG*mU*mC*

















mU









HBB8_
GTAACG
20050
GTTTTAGAG
20140
GAG

GACTCC
20320
GTAACGGCAGA
20410
mG*mU*mA*rAr
20500
GTAACGG
20590
++


RT10_
GCAGA

CTAGAAAT

AAG

TGAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS8
CTTCTC

AGCAAGTT

TC



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT





CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCGACTCCTGAG

GrCrUrArGrUrCr

CCGTTATC







C





GAGAAGTC

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CGACTCCT















CrGrArCrUrCrCr

GAGGAGA















UrGrArGrGrArGr

AGTC















ArA*mG*mU*m

















C









HBB8_
GTAACG
20051
GTTTTAGAG
20141
GAG
20231
ACTCCT

GTAACGGCAGA
20411
mG*mU*mA*rAr
20501
GTAACGG
20591
+


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




17
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

AC



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCCG

CrGrUrUrArUrCr

AACTTGA













TTAC

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCCG















ArGrUrCrUrGrCr

TTAC















CrGrU*mU*mA*

















mC









HBB8_
GTAACG
20052
GTTTTAGAG
20142
GAG
20232
ACTCCT

GTAACGGCAGA
20412
mG*mU*mA*rAr
20502
GTAACGG
20592
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS


AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




16
CTTCTC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT





CAC

GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT

A



CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCCG

CrGrUrUrArUrCr

AACTTGA













TTA

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCCG















ArGrUrCrUrGrCr

TTA















CrG*mU*mU*m

















A









HBB8_
GTAACG
20053
GTTTTAGAG
20143
GAG
20233
ACTCCT

GTAACGGCAGA
20413
mG*mU*mA*rAr
20503
GTAACGG
20593
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




15
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GTT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCCG

CrGrUrUrArUrCr

AACTTGA













TT

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCCG















ArGrUrCrUrGrCr

TT















C*mG*mU*mU









HBB8_
GTAACG
20054
GTTTTAGAG
20144
GAG
20234
ACTCCT

GTAACGGCAGA
20414
mG*mU*mA*rAr
20504
GTAACGG
20594
+++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




14
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

GT



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCCG

CrGrUrUrArUrCr

AACTTGA













T

ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCCG















ArGrUrCrUrGrC*

T















mC*mG*mU









HBB8_
GTAACG
20055
GTTTTAGAG
20145
GAG
20235
ACTCCT

GTAACGGCAGA
20415
mG*mU*mA*rAr
20505
GTAACGG
20595
+++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




13
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG

G



AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCCG

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCCG















ArGrUrCrUrG*m

















C*mC*mG









HBB8_
GTAACG
20056
GTTTTAGAG
20146
GAG
20236
ACTCCT

GTAACGGCAGA
20416
mG*mU*mA*rAr
20506
GTAACGG
20596
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




12
CAC

AAAATAAG

GCC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGCC

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGCC















ArGrUrCrU*mG*

















mC*mC









HBB8_
GTAACG
20057
GTTTTAGAG
20147
GAG
20237
ACTCCT

GTAACGGCAGA
20417
mG*mU*mA*rAr
20507
GTAACGG
20597
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




11
CAC

AAAATAAG

GC



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTGC

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTGC















ArGrUrC*mU*m

















G*mC









HBB8_
GTAACG
20058
GTTTTAGAG
20148
GAG
20238
ACTCCT

GTAACGGCAGA
20418
mG*mU*mA*rAr
20508
GTAACGG
20598
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




10
CAC

AAAATAAG

G



ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCTG

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCTG















ArGrU*mC*mU*

















mG









HBB8_
GTAACG
20059
GTTTTAGAG
20149
GAG

ACTCCT

GTAACGGCAGA
20419
mG*mU*mA*rAr
20509
GTAACGG
20599
+


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TCT



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




9
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTCT

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTCT















ArG*mU*mC*m

















U









HBB8_
GTAACG
20060
GTTTTAGAG
20150
GAG

ACTCCT

GTAACGGCAGA
20420
mG*mU*mA*rAr
20510
GTAACGG
20600
++


RT9_
GCAGA

CTAGAAAT

AAG

GAG

CTTCTCCACGTT

CrGrGrCrArGrAr

CAGACTTC




PBS
CTTCTC

AGCAAGTT

TC



TTAGAGCTAGAA

CrUrUrCrUrCrCr

TCCACGTT




8
CAC

AAAATAAG





ATAGCAAGTTAA

ArCrGrUrUrUrUr

TTAGAGCT







GCTAGTCCG





AATAAGGCTAGT

ArGrArGrCrUrAr

AGAAATA







TTATCAACT





CCGTTATCAACT

GrArArArUrArGr

GCAAGTT







TGAAAAAG





TGAAAAAGTGG

CrArArGrUrUrAr

AAAATAA







TGGCACCG





CACCGAGTCGGT

ArArArUrArArGr

GGCTAGT







AGTCGGTG





GCACTCCTGAGG

GrCrUrArGrUrCr

CCGTTATC







C





AGAAGTC

CrGrUrUrArUrCr

AACTTGA















ArArCrUrUrGrAr

AAAAGTG















ArArArArGrUrGr

GCACCGA















GrCrArCrCrGrAr

GTCGGTG















GrUrCrGrGrUrGr

CACTCCTG















CrArCrUrCrCrUr

AGGAGAA















GrArGrGrArGrAr

GTC















A*mG*mU*mC
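The modified sequences in the table above are written with per-nucleotide annotations (for example, mG*mU*mA*rArCrGrG). As an illustration only, the following Python sketch removes those annotations to recover the plain base sequence. It assumes, because this section does not define the notation, that a leading "m" or "r" marks the sugar type of the nucleotide, that a trailing "*" marks a modified linkage, and that the bases are A, C, G, or U. The function name strip_modifications is hypothetical and is not part of the disclosure.

```python
import re

# One annotated nucleotide: optional "m" or "r" prefix, a base, optional "*".
# The meaning of the prefix and the asterisk is an assumption (see lead-in).
_TOKEN = re.compile(r"[mr]?([ACGU])\*?")


def strip_modifications(annotated: str) -> str:
    """Return the bare base sequence of an annotated RNA string."""
    bases = []
    position = 0
    for match in _TOKEN.finditer(annotated):
        # finditer silently skips text it cannot match, so check for gaps.
        if match.start() != position:
            raise ValueError(f"unexpected text at index {position}")
        bases.append(match.group(1))
        position = match.end()
    if position != len(annotated):
        raise ValueError(f"unexpected text at index {position}")
    return "".join(bases)


if __name__ == "__main__":
    # Fragments taken from the 5' and 3' ends of an annotated entry.
    print(strip_modifications("mG*mU*mA*rArCrGrG"))
    print(strip_modifications("rCrCrG*mU*mU*mA"))
```

The parser rejects any character it does not recognize rather than dropping it, so a transcription error in an annotated sequence surfaces as an exception instead of a silently shortened result.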
















TABLE B1







Table B Sequences Reproduced without Nucleotide Modifications. The Template Sequence (+SNP +PAM-kill) (RNA) sequences from Table B are reproduced below without nucleotide modifications. In some embodiments, the sequences used in this table can be used without chemical modifications.











SEQ


Name
Template Sequence (+SNP) (RNA)
ID NO





HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21907


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21908


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21909


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21910


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21911


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21912


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21913


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21914


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21915


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21916


RT20_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGGUGCACCUGACUCCUGAGGAGA




AGUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21917


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21918


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21919


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21920


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21921


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21922


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21923


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21924


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21925


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21926


RT19_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGGUGCACCUGACUCCUGAGGAGAA




GUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21927


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21928


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21929


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21930


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21931


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21932


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21933


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21934


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21935


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21936


RT18_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGUGCACCUGACUCCUGAGGAGAAG




UC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21937


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21938


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21939


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21940


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21941


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21942


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21943


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21944


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21945


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




CU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21946


RT17_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGCACCUGACUCCUGAGGAGAAGU




C






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21947


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21948


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21949


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21950


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21951


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21952


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21953


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21954


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




UG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21955


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC




U






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21956


RT16_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGCACCUGACUCCUGAGGAGAAGUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21957


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CCGUUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21958


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CCGUUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21959


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21960


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21961


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21962


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




CC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21963


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG




C






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21964


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21965


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21966


RT14_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACCUGACUCCUGAGGAGAAGUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21967


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCCG




UUAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21968


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCCG




UUA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21969


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCCG




UU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21970


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCCG




U






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21971


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21972


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21973


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21974


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21975


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21976


RT11_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCUGACUCCUGAGGAGAAGUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21977


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCCGU




UAC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21978


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCCGU




UA






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21979


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCCGU




U






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21980


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21981


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21982


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21983


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21984


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21985


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21986


RT10_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCGACUCCUGAGGAGAAGUC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21987


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS17
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCCGUU




AC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21988


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS16
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCCGUU




A






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21989


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS15
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCCGUU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21990


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS14
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCCGU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21991


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS13
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCCG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21992


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS12
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGCC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21993


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS11
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUGC






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21994


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS10
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCUG






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21995


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS9
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUCU






HBB8_
GUAACGGCAGACUUCUCCACGUUUUAGAGC
21996


RT9_P
UAGAAAUAGCAAGUUAAAAUAAGGCUAGUC



BS8
CGUUAUCAACUUGAAAAAGUGGCACCGAGU




CGGUGCACUCCUGAGGAGAAGUC
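In the rows above, entries that share an RT label and differ only in the PBS number differ only at the 3' end: each decrease of one in the PBS number removes one 3' nucleotide. A minimal Python sketch of that relationship follows. It assumes that the PBS number is the length in nucleotides of a primer binding site located at the 3' end of the template. The helper trim_pbs is hypothetical, and the input string is only the 3' portion of the HBB8_RT20_PBS17 entry, not the full sequence.

```python
def trim_pbs(template: str, full_pbs_len: int, pbs_len: int) -> str:
    """Shorten the 3'-terminal PBS of a template from full_pbs_len to pbs_len."""
    if not 0 < pbs_len <= full_pbs_len:
        raise ValueError("pbs_len must be between 1 and full_pbs_len")
    if full_pbs_len > len(template):
        raise ValueError("template is shorter than its stated PBS length")
    # Number of nucleotides to remove from the 3' end.
    drop = full_pbs_len - pbs_len
    return template[: len(template) - drop]


if __name__ == "__main__":
    # 3' portion of the HBB8_RT20_PBS17 entry (last two lines of that row).
    rt20_pbs17_3prime = "CGGUGCUGGUGCACCUGACUCCUGAGGAGAAGUCUGCCGUUAC"
    for n in range(17, 7, -1):
        print(f"PBS{n}", trim_pbs(rt20_pbs17_3prime, 17, n))
```

Under these assumptions, the PBS16 through PBS8 outputs end with the same 3' residues as the corresponding HBB8_RT20 rows in the table.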
















TABLE C





Gene Modifying Polypeptide

















Nucleic Acid Sequence (SEQ ID NO: 20601)
ATGAAACGGACAGCCGACGGAAGCGAGTTC
GAGTCACCAAAGAAGAAGCGGAAAGTCGAC
AAGAAGTACAGCATCGGCCTGGACATCGGC



ACCAACTCTGTGGGCTGGGCCGTGATCACC




GACGAGTACAAGGTGCCCAGCAAGAAATTC




AAGGTGCTGGGCAACACCGACCGGCACAGC




ATCAAGAAGAACCTGATCGGAGCCCTGCTG




TTCGACAGCGGCGAAACAGCCGAGGCCACC




CGGCTGAAGAGAACCGCCAGAAGAAGATAC




ACCAGACGGAAGAACCGGATCTGCTATCTG




CAAGAGATCTTCAGCAACGAGATGGCCAAG




GTGGACGACAGCTTCTTCCACAGACTGGAA




GAGTCCTTCCTGGTGGAAGAGGATAAGAAG




CACGAGCGGCACCCCATCTTCGGCAACATC




GTGGACGAGGTGGCCTACCACGAGAAGTAC




CCCACCATCTACCACCTGAGAAAGAAACTG




GTGGACAGCACCGACAAGGCCGACCTGCGG




CTGATCTATCTGGCCCTGGCCCACATGATC




AAGTTCCGGGGCCACTTCCTGATCGAGGGC




GACCTGAACCCCGACAACAGCGACGTGGAC




AAGCTGTTCATCCAGCTGGTGCAGACCTAC




AACCAGCTGTTCGAGGAAAACCCCATCAAC




GCCAGCGGCGTGGACGCCAAGGCCATCCTG




TCTGCCAGACTGAGCAAGAGCAGACGGCTG




GAAAATCTGATCGCCCAGCTGCCCGGCGAG




AAGAAGAATGGCCTGTTCGGAAACCTGATT




GCCCTGAGCCTGGGCCTGACCCCCAACTTC




AAGAGCAACTTCGACCTGGCCGAGGATGCC




AAACTGCAGCTGAGCAAGGACACCTACGAC




GACGACCTGGACAACCTGCTGGCCCAGATC




GGCGACCAGTACGCCGACCTGTTTCTGGCC




GCCAAGAACCTGTCCGACGCCATCCTGCTG




AGCGACATCCTGAGAGTGAACACCGAGATC




ACCAAGGCCCCCCTGAGCGCCTCTATGATC




AAGAGATACGACGAGCACCACCAGGACCTG




ACCCTGCTGAAAGCTCTCGTGCGGCAGCAG




CTGCCTGAGAAGTACAAAGAGATTTTCTTC




GACCAGAGCAAGAACGGCTACGCCGGCTAC




ATTGACGGCGGAGCCAGCCAGGAAGAGTTC




TACAAGTTCATCAAGCCCATCCTGGAAAAG




ATGGACGGCACCGAGGAACTGCTCGTGAAG




CTGAACAGAGAGGACCTGCTGCGGAAGCAG




CGGACCTTCGACAACGGCAGCATCCCCCAC




CAGATCCACCTGGGAGAGCTGCACGCCATT




CTGCGGCGGCAGGAAGATTTTTACCCATTC




CTGAAGGACAACCGGGAAAAGATCGAGAAG




ATCCTGACCTTCCGCATCCCCTACTACGTG




GGCCCTCTGGCCAGGGGAAACAGCAGATTC




GCCTGGATGACCAGAAAGAGCGAGGAAACC




ATCACCCCCTGGAACTTCGAGGAAGTGGTG




GACAAGGGCGCTTCCGCCCAGAGCTTCATC




GAGCGGATGACCAACTTCGATAAGAACCTG




CCCAACGAGAAGGTGCTGCCCAAGCACAGC




CTGCTGTACGAGTACTTCACCGTGTATAAC




GAGCTGACCAAAGTGAAATACGTGACCGAG




GGAATGAGAAAGCCCGCCTTCCTGAGCGGC




GAGCAGAAAAAGGCCATCGTGGACCTGCTG




TTCAAGACCAACCGGAAAGTGACCGTGAAG




CAGCTGAAAGAGGACTACTTCAAGAAAATC




GAGTGCTTCGACTCCGTGGAAATCTCCGGC




GTGGAAGATCGGTTCAACGCCTCCCTGGGC




ACATACCACGATCTGCTGAAAATTATCAAG




GACAAGGACTTCCTGGACAATGAGGAAAAC




GAGGACATTCTGGAAGATATCGTGCTGACC




CTGACACTGTTTGAGGACAGAGAGATGATC




GAGGAACGGCTGAAAACCTATGCCCACCTG




TTCGACGACAAAGTGATGAAGCAGCTGAAG




CGGCGGAGATACACCGGCTGGGGCAGGCTG




AGCCGGAAGCTGATCAACGGCATCCGGGAC




AAGCAGTCCGGCAAGACAATCCTGGATTTC




CTGAAGTCCGACGGCTTCGCCAACAGAAAC




TTCATGCAGCTGATCCACGACGACAGCCTG




ACCTTTAAAGAGGACATCCAGAAAGCCCAG




GTGTCCGGCCAGGGCGATAGCCTGCACGAG




CACATTGCCAATCTGGCCGGCAGCCCCGCC




ATTAAGAAGGGCATCCTGCAGACAGTGAAG




GTGGTGGACGAGCTCGTGAAAGTGATGGGC




CGGCACAAGCCCGAGAACATCGTGATCGAA




ATGGCCAGAGAGAACCAGACCACCCAGAAG




GGACAGAAGAACAGCCGCGAGAGAATGAAG




CGGATCGAAGAGGGCATCAAAGAGCTGGGC




AGCCAGATCCTGAAAGAACACCCCGTGGAA




AACACCCAGCTGCAGAACGAGAAGCTGTAC




CTGTACTACCTGCAGAATGGGCGGGATATG




TACGTGGACCAGGAACTGGACATCAACCGG




CTGTCCGACTACGATGTGGACCATATCGTG




CCTCAGAGCTTTCTGAAGGACGACTCCATC




GACAACAAGGTGCTGACCAGAAGCGACAAG




GCCCGGGGCAAGAGCGACAACGTGCCCTCC




GAAGAGGTCGTGAAGAAGATGAAGAACTAC




TGGCGGCAGCTGCTGAACGCCAAGCTGATT




ACCCAGAGAAAGTTCGACAATCTGACCAAG




GCCGAGAGAGGCGGCCTGAGCGAACTGGAT




AAGGCCGGCTTCATCAAGAGACAGCTGGTG




GAAACCCGGCAGATCACAAAGCACGTGGCA




CAGATCCTGGACTCCCGGATGAACACTAAG




TACGACGAGAATGACAAGCTGATCCGGGAA




GTGAAAGTGATCACCCTGAAGTCCAAGCTG




GTGTCCGATTTCCGGAAGGATTTCCAGTTT




TACAAAGTGCGCGAGATCAACAACTACCAC




CACGCCCACGACGCCTACCTGAACGCCGTC




GTGGGAACCGCCCTGATCAAAAAGTACCCT




AAGCTGGAAAGCGAGTTCGTGTACGGCGAC




TACAAGGTGTACGACGTGCGGAAGATGATC




GCCAAGAGCGAGCAGGAAATCGGCAAGGCT




ACCGCCAAGTACTTCTTCTACAGCAACATC




ATGAACTTTTTCAAGACCGAGATTACCCTG




GCCAACGGCGAGATCCGGAAGCGGCCTCTG




ATCGAGACAAACGGCGAAACCGGGGAGATC




GTGTGGGATAAGGGCCGGGATTTTGCCACC




GTGCGGAAAGTGCTGAGCATGCCCCAAGTG




AATATCGTGAAAAAGACCGAGGTGCAGACA




GGCGGCTTCAGCAAAGAGTCTATCCTGCCC




AAGAGGAACAGCGATAAGCTGATCGCCAGA




AAGAAGGACTGGGACCCTAAGAAGTACGGC




GGCTTCGACAGCCCCACCGTGGCCTATTCT




GTGCTGGTGGTGGCCAAAGTGGAAAAGGGC




AAGTCCAAGAAACTGAAGAGTGTGAAAGAG




CTGCTGGGGATCACCATCATGGAAAGAAGC




AGCTTCGAGAAGAATCCCATCGACTTTCTG




GAAGCCAAGGGCTACAAAGAAGTGAAAAAG




GACCTGATCATCAAGCTGCCTAAGTACTCC




CTGTTCGAGCTGGAAAACGGCCGGAAGAGA




ATGCTGGCCTCTGCCGGCGAACTGCAGAAG




GGAAACGAACTGGCCCTGCCCTCCAAATAT




GTGAACTTCCTGTACCTGGCCAGCCACTAT




GAGAAGCTGAAGGGCTCCCCCGAGGATAAT




GAGCAGAAACAGCTGTTTGTGGAACAGCAC




AAGCACTACCTGGACGAGATCATCGAGCAG




ATCAGCGAGTTCTCCAAGAGAGTGATCCTG




GCCGACGCTAATCTGGACAAAGTGCTGTCC




GCCTACAACAAGCACCGGGATAAGCCCATC




AGAGAGCAGGCCGAGAATATCATCCACCTG




TTTACCCTGACCAATCTGGGAGCCCCTGCC




GCCTTCAAGTACTTTGACACCACCATCGAC




CGGAAGAGGTACACCAGCACCAAAGAGGTG




CTGGACGCCACCCTGATCCACCAGAGCATC




ACCGGCCTGTACGAGACACGGATCGACCTG




TCTCAGCTGGGAGGTGACTCTGGAGGATCT




AGCGGAGGATCCTCTGGCAGCGAGACACCA




GGAACAAGCGAGTCAGCAACACCAGAGAGC




AGTGGCGGCAGCAGCGGCGGCAGCAGCACC




CTAAATATAGAAGATGAGTATCGGCTACAT




GAGACCTCAAAAGAGCCAGATGTTTCTCTA




GGGTCCACATGGCTGTCTGATTTTCCTCAG




GCCTGGGCGGAAACCGGGGGCATGGGACTG




GCAGTTCGCCAAGCTCCTCTGATCATACCT




CTGAAAGCAACCTCTACCCCCGTGTCCATA




AAACAATACCCCATGTCACAAGAAGCCAGA




CTGGGGATCAAGCCCCACATACAGAGACTG




TTGGACCAGGGAATACTGGTACCCTGCCAG




TCCCCCTGGAACACGCCCCTGCTACCCGTT




AAGAAACCAGGGACTAATGATTATAGGCCT




GTCCAGGATCTGAGAGAAGTCAACAAGCGG




GTGGAAGACATCCACCCCACCGTGCCCAAC




CCTTACAACCTCTTGAGCGGGCTCCCACCG




TCCCACCAGTGGTACACTGTGCTTGATTTA




AAGGATGCCTTTTTCTGCCTGAGACTCCAC




CCCACCAGTCAGCCTCTCTTCGCCTTTGAG




TGGAGAGATCCAGAGATGGGAATCTCAGGA




CAATTGACCTGGACCAGACTCCCACAGGGT




TTCAAAAACAGTCCCACCCTGTTTAATGAG




GCACTGCACAGAGACCTAGCAGACTTCCGG




ATCCAGCACCCAGACTTGATCCTGCTACAG




TACGTGGATGACTTACTGCTGGCCGCCACT




TCTGAGCTAGACTGCCAACAAGGTACTCGG




GCCCTGTTACAAACCCTAGGGAACCTCGGG




TATCGGGCCTCGGCCAAGAAAGCCCAAATT




TGCCAGAAACAGGTCAAGTATCTGGGGTAT




CTTCTAAAAGAGGGTCAGAGATGGCTGACT




GAGGCCAGAAAAGAGACTGTGATGGGGCAG




CCTACTCCGAAGACCCCTCGACAACTAAGG




GAGTTCCTAGGGAAGGCAGGCTTCTGTCGC




CTCTTCATCCCTGGGTTTGCAGAAATGGCA




GCCCCCCTGTACCCTCTCACCAAACCGGGG




ACTCTGTTTAATTGGGGCCCAGACCAACAA




AAGGCCTATCAAGAAATCAAGCAAGCTCTT




CTAACTGCCCCAGCCCTGGGGTTGCCAGAT




TTGACTAAGCCCTTTGAACTCTTTGTCGAC




GAGAAGCAGGGCTACGCCAAAGGTGTCCTA




ACGCAAAAACTGGGACCTTGGCGTCGGCCG




GTGGCCTACCTGTCCAAAAAGCTAGACCCA




GTAGCAGCTGGGTGGCCCCCTTGCCTACGG




ATGGTAGCAGCCATTGCCGTACTGACAAAG




GATGCAGGCAAGCTAACCATGGGACAGCCA




CTAGTCATTCTGGCCCCCCATGCAGTAGAG




GCACTAGTCAAACAACCCCCCGACCGCTGG




CTTTCCAACGCCCGGATGACTCACTATCAG




GCCTTGCTTTTGGACACGGACCGGGTCCAG




TTCGGACCGGTGGTAGCCCTGAACCCGGCT




ACGCTGCTCCCACTGCCTGAGGAAGGGCTG




CAACACAACTGCCTTGATATCCTGGCCGAA




GCCCACGGAACCCGACCCGACCTAACGGAC




CAGCCGCTCCCAGACGCCGACCACACCTGG




TACACGGATGGAAGCAGTCTCTTACAAGAG




GGACAGCGTAAGGCGGGAGCTGCGGTGACC




ACCGAGACCGAGGTAATCTGGGCTAAAGCC




CTGCCAGCCGGGACATCCGCTCAGCGGGCT




GAACTGATAGCACTCACCCAGGCCCTAAAG




ATGGCAGAAGGTAAGAAGCTAAATGTTTAT




ACTGATAGCCGTTATGCTTTTGCTACTGCC




CATATCCATGGAGAAATATACAGAAGGCGT




GGGTGGCTCACATCAGAAGGCAAAGAGATC




AAAAATAAAGACGAGATCTTGGCCCTACTA




AAAGCCCTCTTTCTGCCCAAAAGACTTAGC




ATAATCCATTGTCCAGGACATCAAAAGGGA




CACAGCGCCGAGGCTAGAGGCAACCGGATG




GCTGACCAAGCGGCCCGAAAGGCAGCCATC




ACAGAGACTCCAGACACCTCTACCCTCCTC




ATAGAAAATTCATCACCCTCTGGCGGCTCA




AAAAGAACCGCCGACGGCAGCGAATTCGAG




CCCAAGAAGAAGAGGAAAGTC






Amino Acid Sequence (SEQ ID NO: 20602)
MKRTADGSEFESPKKKRKVDKKYSIGLDIG
TNSVGWAVITDEYKVPSKKFKVLGNTDRHS
IKKNLIGALLFDSGETAEATRLKRTARRRY



TRRKNRICYLQEIFSNEMAKVDDSFFHRLE




ESFLVEEDKKHERHPIFGNIVDEVAYHEKY




PTIYHLRKKLVDSTDKADLRLIYLALAHMI




KFRGHFLIEGDLNPDNSDVDKLFIQLVQTY




NQLFEENPINASGVDAKAILSARLSKSRRL




ENLIAQLPGEKKNGLFGNLIALSLGLTPNF




KSNFDLAEDAKLQLSKDTYDDDLDNLLAQI




GDQYADLFLAAKNLSDAILLSDILRVNTEI




TKAPLSASMIKRYDEHHQDLTLLKALVRQQ




LPEKYKEIFFDQSKNGYAGYIDGGASQEEF




YKFIKPILEKMDGTEELLVKLNREDLLRKQ




RTFDNGSIPHQIHLGELHAILRRQEDFYPF




LKDNREKIEKILTFRIPYYVGPLARGNSRF




AWMTRKSEETITPWNFEEVVDKGASAQSFI




ERMTNFDKNLPNEKVLPKHSLLYEYFTVYN




ELTKVKYVTEGMRKPAFLSGEQKKAIVDLL




FKTNRKVTVKQLKEDYFKKIECFDSVEISG




VEDRFNASLGTYHDLLKIIKDKDFLDNEEN




EDILEDIVLTLTLFEDREMIEERLKTYAHL




FDDKVMKQLKRRRYTGWGRLSRKLINGIRD




KQSGKTILDFLKSDGFANRNFMQLIHDDSL




TFKEDIQKAQVSGQGDSLHEHIANLAGSPA




IKKGILQTVKVVDELVKVMGRHKPENIVIE




MARENQTTQKGQKNSRERMKRIEEGIKELG




SQILKEHPVENTQLQNEKLYLYYLQNGRDM




YVDQELDINRLSDYDVDHIVPQSFLKDDSI




DNKVLTRSDKARGKSDNVPSEEVVKKMKNY




WRQLLNAKLITQRKFDNLTKAERGGLSELD




KAGFIKRQLVETRQITKHVAQILDSRMNTK




YDENDKLIREVKVITLKSKLVSDFRKDFQF




YKVREINNYHHAHDAYLNAVVGTALIKKYP




KLESEFVYGDYKVYDVRKMIAKSEQEIGKA




TAKYFFYSNIMNFFKTEITLANGEIRKRPL




IETNGETGEIVWDKGRDFATVRKVLSMPQV




NIVKKTEVQTGGFSKESILPKRNSDKLIAR




KKDWDPKKYGGFDSPTVAYSVLVVAKVEKG




KSKKLKSVKELLGITIMERSSFEKNPIDFL




EAKGYKEVKKDLIIKLPKYSLFELENGRKR




MLASAGELQKGNELALPSKYVNFLYLASHY




EKLKGSPEDNEQKQLFVEQHKHYLDEIIEQ




ISEFSKRVILADANLDKVLSAYNKHRDKPI




REQAENIIHLFTLTNLGAPAAFKYFDTTID




RKRYTSTKEVLDATLIHQSITGLYETRIDL




SQLGGDSGGSSGGSSGSETPGTSESATPES




SGGSSGGSSTLNIEDEYRLHETSKEPDVSL




GSTWLSDFPQAWAETGGMGLAVRQAPLIIP




LKATSTPVSIKQYPMSQEARLGIKPHIQRL




LDQGILVPCQSPWNTPLLPVKKPGTNDYRP




VQDLREVNKRVEDIHPTVPNPYNLLSGLPP




SHQWYTVLDLKDAFFCLRLHPTSQPLFAFE




WRDPEMGISGQLTWTRLPQGFKNSPTLFNE




ALHRDLADFRIQHPDLILLQYVDDLLLAAT




SELDCQQGTRALLQTLGNLGYRASAKKAQI




CQKQVKYLGYLLKEGQRWLTEARKETVMGQ




PTPKTPRQLREFLGKAGFCRLFIPGFAEMA




APLYPLTKPGTLFNWGPDQQKAYQEIKQAL




LTAPALGLPDLTKPFELFVDEKQGYAKGVL




TQKLGPWRRPVAYLSKKLDPVAAGWPPCLR




MVAAIAVLTKDAGKLTMGQPLVILAPHAVE




ALVKQPPDRWLSNARMTHYQALLLDTDRVQ




FGPVVALNPATLLPLPEEGLQHNCLDILAE




AHGTRPDLTDQPLPDADHTWYTDGSSLLQE




GQRKAGAAVTTETEVIWAKALPAGTSAQRA




ELIALTQALKMAEGKKLNVYTDSRYAFATA




HIHGEIYRRRGWLTSEGKEIKNKDEILALL




KALFLPKRLSIIHCPGHQKGHSAEARGNRM




ADQAARKAAITETPDTSTLLIENSSPSGGS




KRTADGSEFEPKKKRKV
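The correspondence between the nucleic acid and amino acid sequences of Table C can be spot-checked by translation. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and do not come from the disclosure. It translates only the first 30 nucleotides of SEQ ID NO: 20601 and compares them with the first 10 residues of SEQ ID NO: 20602.

```python
from itertools import product

# Standard genetic code built in TCAG order (codon -> one-letter amino acid).
# Assumption: the standard code applies; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon.

    Any trailing partial codon (1 or 2 bases) is ignored.
    """
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO: 20601 and first 10 residues of SEQ ID NO: 20602.
nucleic_start = "ATGAAACGGACAGCCGACGGAAGCGAGTTC"
protein_start = "MKRTADGSEF"

print(translate(nucleic_start) == protein_start)
```

The same function could be applied to the full concatenated sequence, provided the 30-nucleotide rows are joined in order first.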









Example 5: Quantifying Activity of a Gene Editing Polypeptide and Template for Rewriting the Endogenous B-Globin Locus Achieved in 293T Cells and CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing a gene modifying polypeptide and a template RNA to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine (GCA), thereby rewriting a non-pathogenic sequence into position 7. This conversion comprises a change of two base pairs (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNAs comprised the following sequences from 5′ to 3′, wherein the first 3 and last 3 bases have 2′-O-methyl phosphorothioate chemical modifications.











FYF tgRNA11



(SEQ ID NO: 20603)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCGACUUCUCUGCAGGAGUCAGGU







FYF tgRNA12



(SEQ ID NO: 20604)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCGACUUCUCUGCAGGAGUCAGGUG







FYF tgRNA13



(SEQ ID NO: 20605)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCGACUUCUCUGCAGGAGUCAGGUGCAC







FYF tgRNA14



(SEQ ID NO: 20606)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCAGACUUCUCUGCAGGAGUCAGGUG







FYF tgRNA15



(SEQ ID NO: 20607)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCAGACUUCUCUGCAGGAGUCAGGUGC







FYF tgRNA16



(SEQ ID NO: 20608)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCAGACUUCUCUGCAGGAGUCAGGUGCA







FYF tgRNA17



(SEQ ID NO: 20609)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCAGACUUCUCUGCAGGAGUCAGGUGCAC







FYF tgRNA18



(SEQ ID NO: 20610)



GCAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGU







UAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAG







UCGGUGCAGACUUCUCUGCAGGAGUCAGGUGCAC







FYF tgRNA19



(SEQ ID NO: 20611)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCGGCAGACUUCUCUGCAGGAGUCAGGUGC







FYF tgRNA20



(SEQ ID NO: 20612)



CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU







AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU







CGGUGCGGCAGACUUCUCUGCAGGAGUCAGGUGCAC
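The four-part layout of the template RNAs listed above can be illustrated by splitting one of them at its scaffold. The following is a minimal sketch, not the method used in the disclosure. It assumes the gRNA scaffold is the 76-nucleotide stretch shared by all of the listed sequences; the names `SCAFFOLD`, `split_template_rna` and `reverse_complement_as_dna` are illustrative. The boundary between the heterologous object sequence and the PBS is not stated in this passage, so the two are kept together as a single 3′ extension.

```python
# Assumption: the scaffold is the stretch common to every template RNA listed above.
SCAFFOLD = (
    "GUUUUAGAGCUAGAAAUAGCAAGUU"
    "AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU"
    "CGGUGC"
)

# FYF tgRNA14 (SEQ ID NO: 20606), joined from the three rows shown in the listing.
TGRNA14 = (
    "CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUU"
    "AAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGU"
    "CGGUGCAGACUUCUCUGCAGGAGUCAGGUG"
)


def split_template_rna(template):
    """Return (spacer, scaffold, 3' extension) for a template RNA."""
    spacer, scaffold, extension = template.partition(SCAFFOLD)
    if not scaffold:
        raise ValueError("scaffold not found in template RNA")
    return spacer, scaffold, extension


def reverse_complement_as_dna(rna):
    """Reverse-complement an RNA string and write the result with DNA letters."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


spacer, scaffold, extension = split_template_rna(TGRNA14)
print("spacer:      ", spacer, len(spacer))
print("3' extension:", extension, len(extension))
# The reverse complement of the 3' extension reads in the sense orientation
# and carries the GCA (alanine) codon described in this example.
print("as sense DNA:", reverse_complement_as_dna(extension))
```

Reading the 3′ extension as its reverse complement places the alanine codon (GCA) in frame, which is one way to check a template against the intended edit.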






The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into 293T cells and human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 1000 or 2000 ng of gene modifying polypeptide RNA were combined with 1000 or 2000 ng of template RNA in RNA format, all at a 1 to 1 ratio. The RNA mixture was added to 200,000 293T cells or 200,000 primary human HSCs in a total of 20 μL of Lonza SF buffer (293T) or Lonza P3 buffer (HSC), and cells were nucleofected in 16-well nucleofection cassettes using program DS-150 (293T) or DZ-100 (HSC). After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of DMEM+10% fetal bovine serum (293T) or 500 μL of StemSpan-XF+SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL (HSC) in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively, downstream of the transcriptional start site within the endogenous B-globin locus indicates successful editing.


The gene modifying systems tested achieved up to 18.5% average perfect rewrite in 293T cells and up to 7.9% perfect rewrite in primary human HSCs. As shown in FIG. 2, average perfect rewrite levels of 4%-18% were detected in 293T cells (single nick at 2000 ng per RNA) and 0%-2.5% in primary human HSCs (single nick at 2000 ng per RNA) when screened with template gRNAs. As shown in FIG. 3, average perfect rewrite levels of 6%-18.5% were detected in 293T cells (single nick at 2000 ng per RNA) and 0%-7.9% in primary human HSCs (single nick at 2000 ng per RNA) using the gRNAs shown. These results demonstrate the use of a gene modifying system to rewrite a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus in primary human HSCs.


Example 6: Quantifying Activity of a Gene Editing Polypeptide and Template for Rewriting the Endogenous B-Globin Locus Achieved in Human Primary Fibroblasts

This example demonstrates the use of a gene modifying system containing a gene modifying polypeptide and a template RNA to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in human primary fibroblasts to alanine (GCA). This conversion comprises a change of two base pairs (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNA comprised the sequence of tgRNA14 as described in the previous example.


The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The system further comprised a gRNA designed to produce a second nick, wherein the first 3 and last 3 bases have 2′-O-methyl phosphorothioate modifications, and which comprised the following sequence:











(SEQ ID NO: 20613)



5′-CCUUGAUACCAACCUGCCCAGUUUUAGAGCUAGAAAUAGCAA







GUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCG







AGUCGGUGCUUUU-3′.






The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was electroporated into human primary fibroblasts. The gene modifying polypeptide and the template RNA were delivered by electroporation in RNA format and comprised the sequences detailed above. Specifically, two doses were delivered (1000 ng and 2000 ng), wherein each gene editing component was delivered as RNA at the specified dose. For example, for the 1000 ng dose, 1000 ng of gene modifying polypeptide RNA were combined with 1000 ng of template RNA in RNA format and 1000 ng of second nick gRNA in RNA format, in a 10 μL electroporation mixture comprising 200,000 primary human fibroblasts resuspended in Buffer R (Invitrogen). The electroporation mixture was then aspirated into a 10 μL Neon electroporation tip (Invitrogen), transferred to the Neon electroporation system (Invitrogen), and electroporated with one pulse at 1700 mV for 20 ms. Cells were then transferred to one well of a 12-well plate (Corning), cultured in 1 mL of Glutamax-containing DMEM supplemented with 15% fetal bovine serum, 1% non-essential amino acids, 1% sodium pyruvate, and 1% HEPES (all Gibco), and cultured for 3 days at 37° C., 5% CO2 prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively, downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.


As shown in FIG. 4, perfect rewrite levels of 3.7% and 10.6% were detected at the 1000 ng and 2000 ng doses, respectively, when the gene editing polypeptide was combined with the template guide RNA and no second nick gRNA was added. Addition of a second nick increased perfect rewriting from 3.7% to 44.5% at the 1000 ng dose and from 10.6% to 56.5% at the 2000 ng dose. In this experiment, indel levels of 1.5% and 1.65% (single nick; 1000 ng and 2000 ng, respectively) and 7.9% and 5.8% (second nick; 1000 ng and 2000 ng, respectively) were observed. These results demonstrate the use of a gene modifying system to rewrite a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus in human primary fibroblasts. Furthermore, introduction of a second nick gRNA increased perfect rewriting approximately 5-fold to 12-fold, depending on the dose administered.
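The fold increases attributable to the second nick follow directly from the perfect rewrite percentages reported for FIG. 4. The following is a minimal sketch of that arithmetic; the function name `fold_change` and the dictionary layout are illustrative only.

```python
def fold_change(single_nick_pct, second_nick_pct):
    """Ratio of perfect rewrite with a second nick to perfect rewrite without one."""
    if single_nick_pct <= 0:
        raise ValueError("single-nick value must be positive")
    return second_nick_pct / single_nick_pct


# Perfect rewrite levels (%) reported for FIG. 4: (single nick, with second nick).
perfect_rewrite = {
    "1000 ng": (3.7, 44.5),
    "2000 ng": (10.6, 56.5),
}

for dose, (single, dual) in perfect_rewrite.items():
    print(f"{dose}: {fold_change(single, dual):.1f}-fold increase")
```

With the reported values, the ratio is larger at the 1000 ng dose than at the 2000 ng dose because the single-nick baseline is lower there.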


Example 7: Comparing the Activity of a Gene Editing Polypeptide and Multiple Templates for Rewriting Different Sequences into the Same Location within the Endogenous B-Globin Locus in Wild-Type Human Primary Fibroblasts and Fibroblasts Containing the Sickle Cell Mutation

This example demonstrates similar efficacy when installing different mutations into the same genomic locus by changing the sequences within the reverse transcriptase (RT) domain of a template guide RNA while holding the design of the gene modifying polypeptide, the template RNA primer binding site (PBS), and the template guide RNA scaffold constant. In this example, two adjacent DNA bases, one of which is positioned at the site mutated in sickle cell disease within the B-globin locus, were substituted in wild-type fibroblasts, converting the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in human primary fibroblasts to alanine (GCA). In parallel, at the same amino acid position, the valine codon (GTG) present in sickle mutation-containing fibroblasts was converted to a synonymous glutamic acid codon (GAA).
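The two codon conversions described above can be written out base by base. The following is a minimal sketch, assuming nucleotide positions are counted from the first base of the start codon so that codon 7 spans positions 19 to 21; the lookup table is limited to the four codons named in this example, and the names `AMINO_ACID` and `codon_changes` are illustrative.

```python
# Only the codons named in this example (standard genetic code).
AMINO_ACID = {"GAG": "Glu", "GCA": "Ala", "GTG": "Val", "GAA": "Glu"}


def codon_changes(before, after, codon_number=7):
    """List (nucleotide position, old base, new base) for each base that differs.

    Assumption: positions are 1-based and counted from the first base of codon 1.
    """
    if len(before) != 3 or len(after) != 3:
        raise ValueError("codons must be three bases long")
    first_position = 3 * (codon_number - 1) + 1
    return [
        (first_position + offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


for label, before, after in [("wild type", "GAG", "GCA"), ("sickle", "GTG", "GAA")]:
    changes = codon_changes(before, after)
    print(f"{label}: {before} ({AMINO_ACID[before]}) -> {after} ({AMINO_ACID[after]}), changes {changes}")
```

Both conversions alter the same two nucleotide positions, which is why a single spacer, scaffold and PBS design can serve both templates.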


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNA utilized in wild type fibroblasts comprised the sequence of tgRNA14 as described in the previous example.


The template RNA utilized in sickle fibroblasts comprised the following sequence and contained 2′-O-methyl phosphorothioate modifications at the first 3 and last 3 bases:











(SEQ ID NO: 20614)



5′-CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAA







GUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCG







AGUCGGUGCAGACUUCUCUUCAGGAGUCAGGUG-3′






The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The system further comprised a gRNA sequence designed to produce a second nick, wherein the gRNA has a sequence of











(SEQ ID NO: 20615)



5′-CCUUGAUACCAACCUGCCCAGUUUUAGAGCUAGAAAUAGCAA







GUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCG







AGUCGGUGCUUUU-3′.






The same gRNA, comprising the sequence above, was utilized for both the wild type and sickle cell second nick conditions within this example.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was electroporated into wild type and sickle mutation-containing human primary fibroblasts. The gene modifying polypeptide and the template RNA were delivered by electroporation in RNA format and comprised the sequences detailed above. One dose was delivered (1000 ng), wherein each gene editing component was delivered as RNA at the specified dose. Specifically, 1000 ng of gene modifying polypeptide RNA were combined with 1000 ng of template RNA in RNA format and 1000 ng of second nick gRNA in RNA format, in a 10 μL electroporation mixture comprising 200,000 primary human fibroblasts resuspended in Buffer R (Invitrogen). The electroporation mixture was then aspirated into a 10 μL Neon electroporation tip (Invitrogen), transferred to the Neon electroporation system (Invitrogen), and electroporated with one pulse at 1700 mV for 20 ms. Cells were then transferred to one well of a 12-well plate (Corning), cultured in 1 mL of Glutamax-containing DMEM supplemented with 15% fetal bovine serum, 1% non-essential amino acids, 1% sodium pyruvate, and 1% HEPES (all Gibco), and cultured for 3 days at 37° C., 5% CO2 prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed by Sanger sequencing followed by analysis using the TIDER algorithm. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively, downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing in wild type cells. Conversely, replacement of the DNA bases thymine and guanine at nucleotide positions 20 and 21 with the base adenine at each position, downstream of the transcriptional start site within the endogenous B-globin locus, indicated successful editing in sickle mutation-containing fibroblasts.


As shown in FIG. 5, perfect rewrite levels of 10.8% and 6.1% were detected in wild type and sickle fibroblasts, respectively, when the gene editing polypeptide was combined with the template guide RNA. Addition of a second nick increased perfect rewriting to 75.6% in wild type cells and to 74.6% in sickle fibroblasts. These results demonstrate the use of a gene modifying system to correct a pathogenic mutation in sickle mutation-bearing human primary fibroblasts and to install a non-pathogenic mutation into wild-type human primary fibroblasts. Furthermore, introduction of a second nick gRNA increased perfect rewriting approximately 7-fold in wild type primary fibroblasts and over ten-fold in sickle primary fibroblasts.


Example 8: Quantifying Activity of a Gene Editing Polypeptide and Template for Rewriting the Endogenous FAH Locus Achieved in Primary Mouse Hepatocytes

This example demonstrates the use of a gene modifying system containing a gene modifying polypeptide and a template RNA to convert an A nucleotide to a G nucleotide in the endogenous Fah locus in mouse primary hepatocytes derived from a Fah5981SB mouse. The Fah5981SB mouse model harbors a G to A point mutation in the last nucleotide of exon 8 of the Fah gene, leading to aberrant mRNA splicing and subsequent mRNA degradation without the production of Fah protein, and thus serves as a mouse model of hereditary tyrosinemia type I.


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNA (including chemical modification pattern) comprised the following sequences:











FAH1_R14_P12 Heavy



RNACS048



(SEQ ID NO: 20616)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrUrArCrCrGrCrUrCrCrArGrUrCrG







rUrUrCrArUrGrArG*mG*mA*mC







FAH1_R15_P10_Heavy



RNACS049



(SEQ ID NO: 20617)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrArUrUrArCrCrGrCrUrCrCrArGrUrC







rGrUrUrCrArUrG*mA*mG*mG







FAH2_R19_P11_MUT_Heavy



RNACS052



(SEQ ID NO: 20618)



mU*mC*mA*rGrArGrGrArArGrCrUrGrGrGrCrCrArCrCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrGrGrArGrCrGrGrUrArArUrGrGrC







rUrGrGrUrGrGrCrCrCrArGrC*mU*mU*mC







FAH2_R19_P13_MUT_Heavy



RNACS053



(SEQ ID NO: 20619)



mU*mC*mA*rGrArGrGrArArGrCrUrGrGrGrCrCrArCrCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrGrGrArGrCrGrGrUrArArUrGrGrC







rUrGrGrUrGrGrCrCrCrArGrCrUrU*mC*mC*mU







Additional exemplary template RNAs that could be utilized in this experiment include the following:











FAH1



RNACS050



(SEQ ID NO: 20620)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrArGrGrCrArUrUrArCrCrGrCrUrCrC







rArGrUrCrGrUrUrCrArUrGrArG*mG*mA*mC







FAH1



RNACS051



(SEQ ID NO: 20621)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrArGrGrCrArUrUrArCrCrGrCrUrCrC







rArGrUrCrGrUrUrCrArUrG*mA*mG*mG






In the sequences above, m=2′-O-methyl ribonucleotide, r=ribose, and *=phosphorothioate bond.
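The modification notation used in these sequences can be read mechanically. The following is a minimal sketch of a parser, assuming each nucleotide is written as a prefix (m or r), a base letter (A, C, G or U) and an optional trailing * for a phosphorothioate bond to the next nucleotide; the names `TOKEN` and `parse_modified_rna` are illustrative. Because the listing wraps rows in the middle of a token in places, rows should be joined before parsing; the example below uses only the first row of RNACS048, which ends on a complete token.

```python
import re

# One nucleotide: prefix (m = 2'-O-methyl, r = ribose), base, optional '*'.
TOKEN = re.compile(r"([mr])([ACGU])(\*?)")


def parse_modified_rna(notation):
    """Parse strings such as 'mG*mG*mA*rUrG' into per-nucleotide tuples.

    Returns a list of (base, is_2_o_methyl, has_phosphorothioate).
    Raises ValueError if any part of the string does not fit the notation.
    """
    nucleotides = []
    position = 0
    for match in TOKEN.finditer(notation):
        if match.start() != position:
            raise ValueError(f"unexpected text at offset {position}")
        prefix, base, bond = match.groups()
        nucleotides.append((base, prefix == "m", bond == "*"))
        position = match.end()
    if position != len(notation):
        raise ValueError(f"unexpected text at offset {position}")
    return nucleotides


# First row of FAH1_R14_P12 Heavy (RNACS048, SEQ ID NO: 20616).
first_row = "mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG"
parsed = parse_modified_rna(first_row)

print("bases:", "".join(base for base, _, _ in parsed))
print("2'-O-methyl count:", sum(is_m for _, is_m, _ in parsed))
print("phosphorothioate count:", sum(has_ps for _, _, has_ps in parsed))
```

Stripping the modifications in this way recovers the plain RNA sequence, which makes the modified listings comparable with the unmodified template RNAs elsewhere in the document.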


The gene modifying polypeptides tested comprised the sequences of RNAV209 (nCas9-RT) and RNAV214 (wtCas9-RT). Specifically, the nCas9-RT and the wtCas9-RT had the following amino acid sequences:











nCas9-RT (RNAV209):



(SEQ ID NO: 20622)



MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFK







VLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNR







ICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI







VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRG







HFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI







LSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNF







DLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAIL







LSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLP







EKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEEL







LVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLK







DNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFE







EVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNEL







TKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYF







KKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENED







ILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGW







GRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTF







KEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELV







KVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGS







QILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDV







DHIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYW







RQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQIT







KHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFY







KVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDV







RKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI







ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKE







SILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGK







SKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKL







PKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYE







KLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLD







KVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDR







KRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSSGGSSG







SETPGTSESATPESSGGSSGGSSTLNIEDEYRLHETSKEPDVSLG







STWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMS







QEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPV







QDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFC







LRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFNEA







LHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTL







GNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQP







TPKTPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWG







PDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLT







QKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLT







MGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQF







GPVVALNPATLLPLPEEGLQHNCLDILAEAHGTRPDLTDQPLPDA







DHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAE







LIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRRRGWLTSE







GKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMA







DQAARKAAITETPDTSTLLIENSSPSGGSKRTADGSEFEKRTADG







SEFESPKKKAKVE







wtCas9-RT (RNAV214):



(SEQ ID NO: 20623)



MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKF







KVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKN







RICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI







VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRG







HFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI







LSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNF







DLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAIL







LSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPE







KYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL







VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKD







NREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEE







VVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELT







KVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFK







KIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDI







LEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWG







RLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFK







EDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV







MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI







LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDH







IVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQ







LLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKH







VAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKV







REINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRK







MIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIET







NGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESI







LPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSK







KLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPK







YSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKL







KGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKV







LSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKR







YTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSSGGSSGSE







TPGTSESATPESSGGSSGGSSTLNIEDEYRLHETSKEPDVSLGST







WLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQE







ARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQ







DLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCL







RLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFNEAL







HRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLG







NLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPT







PKTPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGP







DQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQ







KLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTM







GQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFG







PVVALNPATLLPLPEEGLQHNCLDILAEAHGTRPDLTDQPLPDAD







HTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAEL







IALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRRRGWLTSEG







KEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMAD







QAARKAAITETPDTSTLLIENSSPSGGSKRTADGSEFEKRTADGS







EFESPKKKAKVE







Underlining indicates the residue that differs between the nickase and wild-type sequences.


The gene modifying system, comprising the gene modifying polypeptides listed above and the template RNA described above, was transfected into primary mouse hepatocytes. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 4 μg of gene modifying polypeptide mRNA was combined with 10 μg of chemically synthesized template RNA in 5 μL of water. The transfection mix was added to 100,000 mouse primary hepatocytes in Buffer P3 [Lonza], and cells were nucleofected using program DG-138. After nucleofection, cells were grown at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target site were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Conversion of the terminal nucleotide of exon 8 of the Fah gene from A to G indicates successful editing.


As shown in FIG. 2, for FAH2 templates, perfect rewrite levels (conversion of A to G with no unwanted mutations detected) of 4-8% were detected with RNAV209 but not with RNAV214-040. Indel levels of 4.4 to 6.6% were observed with RNAV209. Furthermore, the amount of WT Fah mRNA was measured using quantitative RT-PCR using primers that bind to exons 7 and 8. As shown in FIG. 3, FAH2 templates result in an increase in the abundance of Fah mRNA relative to WT by up to 12% when FAH2 template is tested with RNAV209-013 mRNA. These results demonstrate the use of a gene modifying system to reverse a mutation in the Fah gene, resulting in partial restoration of the expression of wild-type Fah mRNA.


Example 9: Quantifying Activity of a Gene Editing Polypeptide and Template In Vivo for Rewriting the Endogenous FAH Locus Achieved in Mouse Liver

This example demonstrates the use of a gene modifying system containing a gene modifying polypeptide and a template RNA to convert an A nucleotide to a G nucleotide at the endogenous Fah locus in the liver of the Fah5981SB mouse model. The Fah5981SB mouse model harbors a G to A point mutation in the last nucleotide of exon 8 of the Fah gene, which leads to aberrant mRNA splicing and subsequent mRNA degradation without production of Fah protein; it serves as a mouse model of hereditary tyrosinemia type I.


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNA comprised the following sequences:











FAH1_R14_P12_Heavy



RNACS048



(SEQ ID NO: 20624)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrUrArCrCrGrCrUrCrCrArGrUrCrG







rUrUrCrArUrGrArG*mG*mA*mC







FAH1_R15_P10_Heavy



RNACS049



(SEQ ID NO: 20625)



mG*mG*mA*rUrGrGrUrCrCrUrCrArUrGrArArCrGrArCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrArUrUrArCrCrGrCrUrCrCrArGrUrC







rGrUrUrCrArUrG*mA*mG*mG







FAH2_R19_P11_MUT_Heavy



RNACS052



(SEQ ID NO: 20626)



mU*mC*mA*rGrArGrGrArArGrCrUrGrGrGrCrCrArCrCrG







rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrGrGrArGrCrGrGrUrArArUrGrGrC







rUrGrGrUrGrGrCrCrCrArGrC*mU*mU*mC







FAH2_R19_P13_MUT_Heavy



RNACS053



(SEQ ID NO: 20627)



mU*mC*mA*rGrArGrGrArArGrCrUrGrGrGrCrCrArCrCrG



rUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAm







GmUmCmGmGmUmGmCrUrGrGrArGrCrGrGrUrArArUrGrGrC







rUrGrGrUrGrGrCrCrCrArGrCrUrU*mC*mC*mU






The gene modifying polypeptides tested comprised a sequence of: RNAV209 and RNAV214, the sequences of which are each provided in Example 3.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was formulated in LNPs and delivered to mice. Specifically, 2 mg/kg of total RNA equivalent formulated in LNPs, combining template RNA and mRNA at 1:1 (w/w), was dosed intravenously in 7- to 9-week-old, mixed-gender Fah5981SB mice. Six hours or 6 days post-dosing, animals were sacrificed and their livers were collected for analysis. To determine the expression distribution of the gene modifying polypeptide in the liver, 6-hr liver samples were subjected to immunohistochemistry using an anti-Cas9 antibody. After staining, Cas9-positive hepatocytes were quantified using QuPath Markup. As shown in FIG. 4, expression of the gene modifying polypeptide was observed in 82-91% of hepatocytes.


To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus in the genomic DNA of liver samples collected 6 days post-dosing. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Conversion of an A nucleotide to a G nucleotide indicates successful editing. As shown in FIG. 5, perfect rewrite levels (conversion of A to G with no unwanted mutations detected) of 0.1%-1.9% were detected across the different groups. Indel levels were in the range of 0.2%-0.4%.
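By way of illustration only, the read-level scoring described above (perfect rewrite versus indel versus unedited) can be sketched as follows. This is a simplified sketch and not the analysis pipeline used in this example: it assumes reads have already been trimmed to a fixed window around the target site, and the function names `classify_read` and `summarize`, the category labels, and the toy window sequences are hypothetical.

```python
from collections import Counter


def classify_read(read: str, reference: str, edited: str) -> str:
    """Assign one trimmed amplicon read to a coarse editing category.

    reference: unedited window sequence (illustrative input).
    edited:    the same window carrying only the intended A-to-G change.
    """
    if read == edited:
        return "perfect_rewrite"    # intended change, no other differences
    if read == reference:
        return "unedited"
    if len(read) != len(reference):
        return "indel"              # a length change implies an insertion or deletion
    return "other_substitution"     # same length, unexpected mismatch(es)


def summarize(reads, reference, edited):
    """Return the percentage of reads in each category (empty dict if no reads)."""
    counts = Counter(classify_read(r, reference, edited) for r in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Toy window (hypothetical sequence, not taken from the Fah locus).
reference = "CCTGAGAAGT"
edited = "CCTGAGGAGT"            # A -> G at position 7 of the window
reads = [reference] * 95 + [edited] * 3 + ["CCTGAGAGT"] * 2
print(summarize(reads, reference, edited))
```

Exact string matching is used here only to keep the sketch short; an alignment-based comparison would be needed for reads that are not pre-trimmed to identical coordinates.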


To determine the phenotypic correction caused by the gene editing activity, the restoration of wild-type FAH mRNA was determined by real-time qRT-PCR, and the restoration of Fah protein expression determined by immunohistochemistry using an anti-Fah antibody. As shown in FIG. 6, wild-type mRNA restoration of 0.1%-6%, relative to littermate heterozygous mice, was detected across the different groups. As shown in FIG. 7, Fah protein was detected in 0.1%-7% of liver cross-sectional area across the different groups. These results demonstrate the use of a gene modifying system to reverse a mutation in the Fah gene in an in vivo mouse model for hereditary tyrosinemia type I, resulting in partial restoration of expression of wild-type Fah mRNA and Fah protein.


Example 10. Gene Editing at the TTR Locus in an In Vivo Mouse Model

This Example demonstrates successful delivery of an mRNA encoding a gene modifying polypeptide and a guide RNA, and Cas9-mediated gene editing with the protospacer sequence ACACAAAUACCAGUCCAGCG (SEQ ID NO: 20630) that targets the TTR locus, in a C57Blk/6 mouse.


RNAs were prepared as follows. An mRNA encoding a gene modifying polypeptide having the sequence shown in Table 10A below was produced by in vitro transcription and the purified mRNA was dissolved in 1 mM sodium citrate, pH 6, to a final concentration of RNA of 1-2 mg/mL. Similarly, a guide RNA having a sequence shown in Table 10A below was produced by chemical synthesis and dissolved in water or aqueous buffer, to a final concentration of RNA of 1-2 mg/mL.









TABLE 10A







Sequences of Example 10









Name
Nucleic acid sequence
SEQ ID NO





Cas9-RT gene modifying polypeptide
AUGCCUGCGGCUAAGCGGGU
20628



AAAAUUGGAUGGUGGGGACA




AGAAGUACAGCAUCGGCCUG




GACAUCGGCACCAACUCUGU




GGGCUGGGCCGUGAUCACCG




ACGAGUACAAGGUGCCCAGC




AAGAAAUUCAAGGUGCUGGG




CAACACCGACCGGCACAGCA




UCAAGAAGAACCUGAUCGGA




GCCCUGCUGUUCGACAGCGG




CGAAACAGCCGAGGCCACCC




GGCUGAAGAGAACCGCCAGA




AGAAGAUACACCAGACGGAA




GAACCGGAUCUGCUAUCUGC




AAGAGAUCUUCAGCAACGAG




AUGGCCAAGGUGGACGACAG




CUUCUUCCACAGACUGGAAG




AGUCCUUCCUGGUGGAAGAG




GAUAAGAAGCACGAGCGGCA




CCCCAUCUUCGGCAACAUCG




UGGACGAGGUGGCCUACCAC




GAGAAGUACCCCACCAUCUA




CCACCUGAGAAAGAAACUGG




UGGACAGCACCGACAAGGCC




GACCUGCGGCUGAUCUAUCU




GGCCCUGGCCCACAUGAUCA




AGUUCCGGGGCCACUUCCUG




AUCGAGGGCGACCUGAACCC




CGACAACAGCGACGUGGACA




AGCUGUUCAUCCAGCUGGUG




CAGACCUACAACCAGCUGUU




CGAGGAAAACCCCAUCAACG




CCAGCGGCGUGGACGCCAAG




GCCAUCCUGUCUGCCAGACU




GAGCAAGAGCAGACGGCUGG




AAAAUCUGAUCGCCCAGCUG




CCCGGCGAGAAGAAGAAUGG




CCUGUUCGGAAACCUGAUUG




CCCUGAGCCUGGGCCUGACC




CCCAACUUCAAGAGCAACUU




CGACCUGGCCGAGGAUGCCA




AACUGCAGCUGAGCAAGGAC




ACCUACGACGACGACCUGGA




CAACCUGCUGGCCCAGAUCG




GCGACCAGUACGCCGACCUG




UUUCUGGCCGCCAAGAACCU




GUCCGACGCCAUCCUGCUGA




GCGACAUCCUGAGAGUGAAC




ACCGAGAUCACCAAGGCCCC




CCUGAGCGCCUCUAUGAUCA




AGAGAUACGACGAGCACCAC




CAGGACCUGACCCUGCUGAA




AGCUCUCGUGCGGCAGCAGC




UGCCUGAGAAGUACAAAGAG




AUUUUCUUCGACCAGAGCAA




GAACGGCUACGCCGGCUACA




UUGACGGCGGAGCCAGCCAG




GAAGAGUUCUACAAGUUCAU




CAAGCCCAUCCUGGAAAAGA




UGGACGGCACCGAGGAACUG




CUCGUGAAGCUGAACAGAGA




GGACCUGCUGCGGAAGCAGC




GGACCUUCGACAACGGCAGC




AUCCCCCACCAGAUCCACCU




GGGAGAGCUGCACGCCAUUC




UGCGGCGGCAGGAAGAUUUU




UACCCAUUCCUGAAGGACAA




CCGGGAAAAGAUCGAGAAGA




UCCUGACCUUCCGCAUCCCC




UACUACGUGGGCCCUCUGGC




CAGGGGAAACAGCAGAUUCG




CCUGGAUGACCAGAAAGAGC




GAGGAAACCAUCACCCCCUG




GAACUUCGAGGAAGUGGUGG




ACAAGGGCGCUUCCGCCCAG




AGCUUCAUCGAGCGGAUGAC




CAACUUCGAUAAGAACCUGC




CCAACGAGAAGGUGCUGCCC




AAGCACAGCCUGCUGUACGA




GUACUUCACCGUGUAUAACG




AGCUGACCAAAGUGAAAUAC




GUGACCGAGGGAAUGAGAAA




GCCCGCCUUCCUGAGCGGCG




AGCAGAAAAAGGCCAUCGUG




GACCUGCUGUUCAAGACCAA




CCGGAAAGUGACCGUGAAGC




AGCUGAAAGAGGACUACUUC




AAGAAAAUCGAGUGCUUCGA




CUCCGUGGAAAUCUCCGGCG




UGGAAGAUCGGUUCAACGCC




UCCCUGGGCACAUACCACGA




UCUGCUGAAAAUUAUCAAGG




ACAAGGACUUCCUGGACAAU




GAGGAAAACGAGGACAUUCU




GGAAGAUAUCGUGCUGACCC




UGACACUGUUUGAGGACAGA




GAGAUGAUCGAGGAACGGCU




GAAAACCUAUGCCCACCUGU




UCGACGACAAAGUGAUGAAG




CAGCUGAAGCGGCGGAGAUA




CACCGGCUGGGGCAGGCUGA




GCCGGAAGCUGAUCAACGGC




AUCCGGGACAAGCAGUCCGG




CAAGACAAUCCUGGAUUUCC




UGAAGUCCGACGGCUUCGCC




AACAGAAACUUCAUGCAGCU




GAUCCACGACGACAGCCUGA




CCUUUAAAGAGGACAUCCAG




AAAGCCCAGGUGUCCGGCCA




GGGCGAUAGCCUGCACGAGC




ACAUUGCCAAUCUGGCCGGC




AGCCCCGCCAUUAAGAAGGG




CAUCCUGCAGACAGUGAAGG




UGGUGGACGAGCUCGUGAAA




GUGAUGGGCCGGCACAAGCC




CGAGAACAUCGUGAUCGAAA




UGGCCAGAGAGAACCAGACC




ACCCAGAAGGGACAGAAGAA




CAGCCGCGAGAGAAUGAAGC




GGAUCGAAGAGGGCAUCAAA




GAGCUGGGCAGCCAGAUCCU




GAAAGAACACCCCGUGGAAA




ACACCCAGCUGCAGAACGAG




AAGCUGUACCUGUACUACCU




GCAGAAUGGGCGGGAUAUGU




ACGUGGACCAGGAACUGGAC




AUCAACCGGCUGUCCGACUA




CGAUGUGGACCAUAUCGUGC




CUCAGAGCUUUCUGAAGGAC




GACUCCAUCGACAACAAGGU




GCUGACCAGAAGCGACAAGA




AUCGGGGCAAGAGCGACAAC




GUGCCCUCCGAAGAGGUCGU




GAAGAAGAUGAAGAACUACU




GGCGGCAGCUGCUGAACGCC




AAGCUGAUUACCCAGAGAAA




GUUCGACAAUCUGACCAAGG




CCGAGAGAGGCGGCCUGAGC




GAACUGGAUAAGGCCGGCUU




CAUCAAGAGACAGCUGGUGG




AAACCCGGCAGAUCACAAAG




CACGUGGCACAGAUCCUGGA




CUCCCGGAUGAACACUAAGU




ACGACGAGAAUGACAAGCUG




AUCCGGGAAGUGAAAGUGAU




CACCCUGAAGUCCAAGCUGG




UGUCCGAUUUCCGGAAGGAU




UUCCAGUUUUACAAAGUGCG




CGAGAUCAACAACUACCACC




ACGCCCACGACGCCUACCUG




AACGCCGUCGUGGGAACCGC




CCUGAUCAAAAAGUACCCUA




AGCUGGAAAGCGAGUUCGUG




UACGGCGACUACAAGGUGUA




CGACGUGCGGAAGAUGAUCG




CCAAGAGCGAGCAGGAAAUC




GGCAAGGCUACCGCCAAGUA




CUUCUUCUACAGCAACAUCA




UGAACUUUUUCAAGACCGAG




AUUACCCUGGCCAACGGCGA




GAUCCGGAAGCGGCCUCUGA




UCGAGACAAACGGCGAAACC




GGGGAGAUCGUGUGGGAUAA




GGGCCGGGAUUUUGCCACCG




UGCGGAAAGUGCUGAGCAUG




CCCCAAGUGAAUAUCGUGAA




AAAGACCGAGGUGCAGACAG




GCGGCUUCAGCAAAGAGUCU




AUCCUGCCCAAGAGGAACAG




CGAUAAGCUGAUCGCCAGAA




AGAAGGACUGGGACCCUAAG




AAGUACGGCGGCUUCGACAG




CCCCACCGUGGCCUAUUCUG




UGCUGGUGGUGGCCAAAGUG




GAAAAGGGCAAGUCCAAGAA




ACUGAAGAGUGUGAAAGAGC




UGCUGGGGAUCACCAUCAUG




GAAAGAAGCAGCUUCGAGAA




GAAUCCCAUCGACUUUCUGG




AAGCCAAGGGCUACAAAGAA




GUGAAAAAGGACCUGAUCAU




CAAGCUGCCUAAGUACUCCC




UGUUCGAGCUGGAAAACGGC




CGGAAGAGAAUGCUGGCCUC




UGCCGGCGAACUGCAGAAGG




GAAACGAACUGGCCCUGCCC




UCCAAAUAUGUGAACUUCCU




GUACCUGGCCAGCCACUAUG




AGAAGCUGAAGGGCUCCCCC




GAGGAUAAUGAGCAGAAACA




GCUGUUUGUGGAACAGCACA




AGCACUACCUGGACGAGAUC




AUCGAGCAGAUCAGCGAGUU




CUCCAAGAGAGUGAUCCUGG




CCGACGCUAAUCUGGACAAA




GUGCUGUCCGCCUACAACAA




GCACCGGGAUAAGCCCAUCA




GAGAGCAGGCCGAGAAUAUC




AUCCACCUGUUUACCCUGAC




CAAUCUGGGAGCCCCUGCCG




CCUUCAAGUACUUUGACACC




ACCAUCGACCGGAAGAGGUA




CACCAGCACCAAAGAGGUGC




UGGACGCCACCCUGAUCCAC




CAGAGCAUCACCGGCCUGUA




CGAGACACGGAUCGACCUGU




CUCAGCUGGGAGGUGACUCU




GGAGGAUCUAGCGGAGGAUC




CUCUGGCAGCGAGACACCAG




GAACAAGCGAGUCAGCAACA




CCAGAGAGCAGUGGCGGCAG




CAGCGGCGGCAGCAGCACCC




UAAAUAUAGAAGAUGAGUAU




CGGCUACAUGAGACCUCAAA




AGAGCCAGAUGUUUCUCUAG




GGUCCACAUGGCUGUCUGAU




UUUCCUCAGGCCUGGGCGGA




AACCGGGGGCAUGGGACUGG




CAGUUCGCCAAGCUCCUCUG




AUCAUACCUCUGAAAGCAAC




CUCUACCCCCGUGUCCAUAA




AACAAUACCCCAUGUCACAA




GAAGCCAGACUGGGGAUCAA




GCCCCACAUACAGAGACUGU




UGGACCAGGGAAUACUGGUA




CCCUGCCAGUCCCCCUGGAA




CACGCCCCUGCUACCCGUUA




AGAAACCAGGGACUAAUGAU




UAUAGGCCUGUCCAGGAUCU




GAGAGAAGUCAACAAGCGGG




UGGAGGACAUCCACCCCACC




GUGCCCAACCCUUACAACCU




CUUGAGCGGGCUCCCACCGU




CCCACCAGUGGUACACUGUG




CUUGAUUUAAAGGAUGCCUU




UUUCUGCCUGAGACUCCACC




CCACCAGUCAGCCUCUCUUC




GCCUUUGAGUGGAGAGAUCC




AGAGAUGGGAAUCUCAGGAC




AAUUGACCUGGACCAGACUC




CCACAGGGUUUCAAAAACAG




UCCCACCCUGUUUAAUGAGG




CACUGCACAGAGACCUAGCA




GACUUCCGGAUCCAGCACCC




AGACUUGAUCCUGCUACAGU




ACGUGGAUGACUUACUGCUG




GCCGCCACUUCUGAGCUAGA




CUGCCAACAAGGUACUCGGG




CCCUGUUACAAACCCUAGGG




AACCUCGGGUAUCGGGCCUC




GGCCAAGAAAGCCCAAAUUU




GCCAGAAACAGGUCAAGUAU




CUGGGGUAUCUUCUAAAAGA




GGGUCAGAGAUGGCUGACUG




AGGCCAGAAAAGAGACUGUG




AUGGGGCAGCCUACUCCGAA




GACCCCUCGACAACUAAGGG




AGUUCCUAGGGAAGGCAGGC




UUCUGUCGCCUCUUCAUCCC




UGGGUUUGCAGAAAUGGCAG




CCCCCCUGUACCCUCUCACC




AAACCGGGGACUCUGUUUAA




UUGGGGCCCAGACCAACAAA




AGGCCUAUCAAGAAAUCAAG




CAAGCCCUUCUAACUGCCCC




AGCCCUGGGGUUGCCAGAUU




UGACUAAGCCCUUUGAACUC




UUUGUCGACGAGAAGCAGGG




CUACGCCAAAGGUGUCCUAA




CGCAAAAACUGGGACCUUGG




CGUCGGCCGGUGGCCUACCU




GUCCAAAAAGCUAGACCCAG




UAGCAGCUGGGUGGCCCCCU




UGCCUACGGAUGGUAGCAGC




CAUUGCCGUACUGACAAAGG




AUGCAGGCAAGCUAACCAUG




GGACAGCCACUAGUCAUUCU




GGCCCCCCAUGCAGUAGAGG




CACUAGUCAAACAACCCCCC




GACCGCUGGCUUUCCAACGC




CCGGAUGACUCACUAUCAGG




CCUUGCUUUUGGACACGGAC




CGGGUCCAGUUCGGACCGGU




GGUAGCCCUGAACCCGGCUA




CGCUGCUCCCACUGCCUGAG




GAAGGGCUGCAACACAACUG




CCUUGAUAUCCUGGCCGAAG




CCCACGGAACCCGACCCGAC




CUAACGGACCAGCCGCUCCC




AGACGCCGACCACACCUGGU




ACACGGAUGGAAGCAGUCUC




UUACAAGAGGGACAGCGUAA




GGCGGGAGCUGCGGUGACCA




CCGAGACCGAGGUAAUCUGG




GCUAAAGCCCUGCCAGCCGG




GACAUCCGCUCAGCGGGCUG




AACUGAUAGCACUCACCCAG




GCCCUAAAGAUGGCAGAAGG




UAAGAAGCUAAAUGUUUAUA




CUGAUAGCCGUUAUGCUUUU




GCUACUGCCCAUAUCCAUGG




AGAAAUAUACAGAAGGCGUG




GGUGGCUCACAUCAGAAGGC




AAAGAGAUCAAAAAUAAAGA




CGAGAUCUUGGCCCUACUAA




AAGCCCUCUUUCUGCCCAAA




AGACUUAGCAUAAUCCAUUG




UCCAGGACAUCAAAAGGGAC




ACAGCGCCGAGGCUAGAGGC




AACCGGAUGGCUGACCAAGC




GGCCCGAAAGGCAGCCAUCA




CAGAGACUCCAGACACCUCU




ACCCUCCUCAUAGAAAAUUC




AUCACCCUCUGGCGGCUCAA




AAAGAACCGCCGACGGCAGC




GAAUUCGAGAAAAGGACGGC




GGAUGGUAGCGAAUUCGAGA




GCCCUAAAAAGAAGGCCAAG




GUAGAGUAA






guide RNA
mA*mC*mA*CAAAUACCAGU
20629



CCAGCGGUUUUAGAmGmCmU




mAmGmAmAmAmUmAmGmCAA




GUUAAAAUAAGGCUAGUCCG




UUAUCAmAmCmUmUmGmAmA




mAmAmAmGmUmGmGmCmAmC




mCmGmAmGmUmCmGmGmUmG




mCmU*mU*mU*mU




m = 2′-O-methyl; * = phosphorothioate linkage









Lipid nanoparticle (LNP) components (ionizable lipid, helper lipid, sterol, PEG) were dissolved in 100% ethanol with the lipid component molar ratios of 47:8:43.5:1.5, respectively. RNA (guide and mRNA) was combined in a 1:1 weight ratio and diluted to a concentration of 0.05-0.2 mg/mL in sodium acetate buffer, pH 5. RNA was formulated into distinct LNPs with a lipid amine to total RNA phosphate (N:P) molar ratio of 4.0. The LNPs were formed by microfluidic or turbulent mixing of the lipid and RNA solutions. A 3:1 ratio of aqueous to organic solvent was maintained during mixing using differential flow rates. After mixing, the LNPs were diluted, collected, and buffer exchanged into 50 mM Tris, 9% sucrose buffer using tangential flow filtration. Formulations were concentrated to 1.0 mg/mL or higher and then filtered through a 0.2 μm sterile filter. The final LNPs were stored at −80° C. until further use.
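By way of illustration only, the formulation arithmetic described above (molar ratios of 47:8:43.5:1.5, an N:P ratio of 4.0, and a 3:1 aqueous to organic mixing ratio) can be sketched as follows. The sketch rests on assumptions not stated in this example: one ionizable amine per ionizable lipid molecule, one phosphate per nucleotide, and an average nucleotide mass of about 330 g/mol. The function names and the 12 mL/min total flow rate are hypothetical.

```python
# Lipid component molar ratios taken from the formulation description.
MOLAR_RATIO = {"ionizable": 47.0, "helper": 8.0, "sterol": 43.5, "peg": 1.5}

# Assumption: average mass of one RNA nucleotide (one phosphate each), in g/mol.
AVG_NT_MASS_G_PER_MOL = 330.0


def lipid_micromoles(rna_mg: float, n_to_p: float = 4.0) -> dict:
    """Micromoles of each lipid needed for `rna_mg` of RNA at the given N:P ratio.

    Assumes one ionizable amine per ionizable lipid molecule.
    """
    phosphate_umol = (rna_mg * 1e-3 / AVG_NT_MASS_G_PER_MOL) * 1e6
    ionizable_umol = n_to_p * phosphate_umol
    # Scale the remaining components from the ionizable lipid by molar ratio.
    return {
        name: ionizable_umol * part / MOLAR_RATIO["ionizable"]
        for name, part in MOLAR_RATIO.items()
    }


def flow_rates(total_ml_per_min: float, aqueous_to_organic: float = 3.0):
    """Split a total flow rate into (aqueous, organic) streams."""
    organic = total_ml_per_min / (aqueous_to_organic + 1.0)
    return total_ml_per_min - organic, organic


print(lipid_micromoles(1.0))   # lipid amounts for 1 mg of total RNA
print(flow_rates(12.0))        # hypothetical 12 mL/min total flow
```

Under these assumptions, 1 mg of RNA corresponds to roughly 3 μmol of phosphate and therefore roughly 12 μmol of ionizable lipid; the actual amounts depend on the sequence composition and on the specific lipids used.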


The LNP formulations were delivered intravenously by bolus tail vein injection to C57Blk/6 mice that were approximately 8 weeks old, at doses ranging from 0.1 to 1 mg/kg. To measure expression of the Cas9-RT, animals were euthanized 6 hours after injection and livers were collected during necropsy. To assess gene editing activity at the TTR locus, animals were euthanized 5 days after injection and livers were collected at necropsy. Expression of the Cas9-RT gene editing polypeptide in liver was measured by Western blot, in which Cas9 was detected with a mouse monoclonal antibody (7A9-3A3, Cell Signaling Technology) and GAPDH (Cell Signaling Technology) was used as a loading control (FIG. 12). Editing of the TTR locus was quantified by Sanger sequencing followed by TIDE analysis of an amplicon of the TTR locus near the binding site of the protospacer. Editing of the TTR locus was observed, as shown in FIG. 13. TTR protein levels in serum were quantified by an ELISA using a standard curve (Aviva Biosciences). TTR protein levels in serum declined in treated animals, as shown in FIG. 14. These experiments demonstrate that the Cas9-RT polypeptide can be expressed in vivo and can edit the TTR locus, resulting in a decrease in TTR protein levels in serum.


Example 11. Gene Editing at the TTR Locus in an In Vivo Cynomolgus Macaque Model

This Example demonstrates successful delivery of an mRNA encoding a gene modifying polypeptide and a guide RNA, and Cas9-mediated gene editing with the protospacer sequence ACACAAAUACCAGUCCAGCG (SEQ ID NO: 20630) that targets the TTR locus, in a cynomolgus macaque model.


RNAs were prepared as follows. An mRNA encoding a gene modifying polypeptide having the sequence shown in Table 11A below was produced by in vitro transcription and the purified mRNA was dissolved in 1 mM sodium citrate, pH 6, to a final concentration of RNA of 1-2 mg/mL. Similarly, a guide RNA having a sequence shown in Table 11A below was produced by chemical synthesis and dissolved in water or aqueous buffer, to a final concentration of RNA of 1-2 mg/mL.









TABLE 11A







Sequences of Example 11









Name
Nucleic acid sequence
SEQ ID NO





Cas9-RT gene
AUGCCUGCGGCUAAGCGGGUAAAAU
20631


modifying
UGGAUGGUGGGGACAAGAAGUACAG



polypeptide
CAUCGGCCUGGACAUCGGCACCAAC




UCUGUGGGCUGGGCCGUGAUCACCG




ACGAGUACAAGGUGCCCAGCAAGAA




AUUCAAGGUGCUGGGCAACACCGAC




CGGCACAGCAUCAAGAAGAACCUGA




UCGGAGCCCUGCUGUUCGACAGCGG




CGAAACAGCCGAGGCCACCCGGCUG




AAGAGAACCGCCAGAAGAAGAUACA




CCAGACGGAAGAACCGGAUCUGCUA




UCUGCAAGAGAUCUUCAGCAACGAG




AUGGCCAAGGUGGACGACAGCUUCU




UCCACAGACUGGAAGAGUCCUUCCU




GGUGGAAGAGGAUAAGAAGCACGAG




CGGCACCCCAUCUUCGGCAACAUCG




UGGACGAGGUGGCCUACCACGAGAA




GUACCCCACCAUCUACCACCUGAGA




AAGAAACUGGUGGACAGCACCGACA




AGGCCGACCUGCGGCUGAUCUAUCU




GGCCCUGGCCCACAUGAUCAAGUUC




CGGGGCCACUUCCUGAUCGAGGGCG




ACCUGAACCCCGACAACAGCGACGU




GGACAAGCUGUUCAUCCAGCUGGUG




CAGACCUACAACCAGCUGUUCGAGG




AAAACCCCAUCAACGCCAGCGGCGU




GGACGCCAAGGCCAUCCUGUCUGCC




AGACUGAGCAAGAGCAGACGGCUGG




AAAAUCUGAUCGCCCAGCUGCCCGG




CGAGAAGAAGAAUGGCCUGUUCGGA




AACCUGAUUGCCCUGAGCCUGGGCC




UGACCCCCAACUUCAAGAGCAACUU




CGACCUGGCCGAGGAUGCCAAACUG




CAGCUGAGCAAGGACACCUACGACG




ACGACCUGGACAACCUGCUGGCCCA




GAUCGGCGACCAGUACGCCGACCUG




UUUCUGGCCGCCAAGAACCUGUCCG




ACGCCAUCCUGCUGAGCGACAUCCU




GAGAGUGAACACCGAGAUCACCAAG




GCCCCCCUGAGCGCCUCUAUGAUCA




AGAGAUACGACGAGCACCACCAGGA




CCUGACCCUGCUGAAAGCUCUCGUG




CGGCAGCAGCUGCCUGAGAAGUACA




AAGAGAUUUUCUUCGACCAGAGCAA




GAACGGCUACGCCGGCUACAUUGAC




GGCGGAGCCAGCCAGGAAGAGUUCU




ACAAGUUCAUCAAGCCCAUCCUGGA




AAAGAUGGACGGCACCGAGGAACUG




CUCGUGAAGCUGAACAGAGAGGACC




UGCUGCGGAAGCAGCGGACCUUCGA




CAACGGCAGCAUCCCCCACCAGAUC




CACCUGGGAGAGCUGCACGCCAUUC




UGCGGCGGCAGGAAGAUUUUUACCC




AUUCCUGAAGGACAACCGGGAAAAG




AUCGAGAAGAUCCUGACCUUCCGCA




UCCCCUACUACGUGGGCCCUCUGGC




CAGGGGAAACAGCAGAUUCGCCUGG




AUGACCAGAAAGAGCGAGGAAACCA




UCACCCCCUGGAACUUCGAGGAAGU




GGUGGACAAGGGCGCUUCCGCCCAG




AGCUUCAUCGAGCGGAUGACCAACU




UCGAUAAGAACCUGCCCAACGAGAA




GGUGCUGCCCAAGCACAGCCUGCUG




UACGAGUACUUCACCGUGUAUAACG




AGCUGACCAAAGUGAAAUACGUGAC




CGAGGGAAUGAGAAAGCCCGCCUUC




CUGAGCGGCGAGCAGAAAAAGGCCA




UCGUGGACCUGCUGUUCAAGACCAA




CCGGAAAGUGACCGUGAAGCAGCUG




AAAGAGGACUACUUCAAGAAAAUCG




AGUGCUUCGACUCCGUGGAAAUCUC




CGGCGUGGAAGAUCGGUUCAACGCC




UCCCUGGGCACAUACCACGAUCUGC




UGAAAAUUAUCAAGGACAAGGACUU




CCUGGACAAUGAGGAAAACGAGGAC




AUUCUGGAAGAUAUCGUGCUGACCC




UGACACUGUUUGAGGACAGAGAGAU




GAUCGAGGAACGGCUGAAAACCUAU




GCCCACCUGUUCGACGACAAAGUGA




UGAAGCAGCUGAAGCGGCGGAGAUA




CACCGGCUGGGGCAGGCUGAGCCGG




AAGCUGAUCAACGGCAUCCGGGACA




AGCAGUCCGGCAAGACAAUCCUGGA




UUUCCUGAAGUCCGACGGCUUCGCC




AACAGAAACUUCAUGCAGCUGAUCC




ACGACGACAGCCUGACCUUUAAAGA




GGACAUCCAGAAAGCCCAGGUGUCC




GGCCAGGGCGAUAGCCUGCACGAGC




ACAUUGCCAAUCUGGCCGGCAGCCC




CGCCAUUAAGAAGGGCAUCCUGCAG




ACAGUGAAGGUGGUGGACGAGCUCG




UGAAAGUGAUGGGCCGGCACAAGCC




CGAGAACAUCGUGAUCGAAAUGGCC




AGAGAGAACCAGACCACCCAGAAGG




GACAGAAGAACAGCCGCGAGAGAAU




GAAGCGGAUCGAAGAGGGCAUCAAA




GAGCUGGGCAGCCAGAUCCUGAAAG




AACACCCCGUGGAAAACACCCAGCU




GCAGAACGAGAAGCUGUACCUGUAC




UACCUGCAGAAUGGGCGGGAUAUGU




ACGUGGACCAGGAACUGGACAUCAA




CCGGCUGUCCGACUACGAUGUGGAC




CAUAUCGUGCCUCAGAGCUUUCUGA




AGGACGACUCCAUCGACAACAAGGU




GCUGACCAGAAGCGACAAGAAUCGG




GGCAAGAGCGACAACGUGCCCUCCG




AAGAGGUCGUGAAGAAGAUGAAGAA




CUACUGGCGGCAGCUGCUGAACGCC




AAGCUGAUUACCCAGAGAAAGUUCG




ACAAUCUGACCAAGGCCGAGAGAGG




CGGCCUGAGCGAACUGGAUAAGGCC




GGCUUCAUCAAGAGACAGCUGGUGG




AAACCCGGCAGAUCACAAAGCACGU




GGCACAGAUCCUGGACUCCCGGAUG




AACACUAAGUACGACGAGAAUGACA




AGCUGAUCCGGGAAGUGAAAGUGAU




CACCCUGAAGUCCAAGCUGGUGUCC




GAUUUCCGGAAGGAUUUCCAGUUUU




ACAAAGUGCGCGAGAUCAACAACUA




CCACCACGCCCACGACGCCUACCUG




AACGCCGUCGUGGGAACCGCCCUGA




UCAAAAAGUACCCUAAGCUGGAAAG




CGAGUUCGUGUACGGCGACUACAAG




GUGUACGACGUGCGGAAGAUGAUCG




CCAAGAGCGAGCAGGAAAUCGGCAA




GGCUACCGCCAAGUACUUCUUCUAC




AGCAACAUCAUGAACUUUUUCAAGA




CCGAGAUUACCCUGGCCAACGGCGA




GAUCCGGAAGCGGCCUCUGAUCGAG




ACAAACGGCGAAACCGGGGAGAUCG




UGUGGGAUAAGGGCCGGGAUUUUGC




CACCGUGCGGAAAGUGCUGAGCAUG




CCCCAAGUGAAUAUCGUGAAAAAGA




CCGAGGUGCAGACAGGCGGCUUCAG




CAAAGAGUCUAUCCUGCCCAAGAGG




AACAGCGAUAAGCUGAUCGCCAGAA




AGAAGGACUGGGACCCUAAGAAGUA




CGGCGGCUUCGACAGCCCCACCGUG




GCCUAUUCUGUGCUGGUGGUGGCCA




AAGUGGAAAAGGGCAAGUCCAAGAA




ACUGAAGAGUGUGAAAGAGCUGCUG




GGGAUCACCAUCAUGGAAAGAAGCA




GCUUCGAGAAGAAUCCCAUCGACUU




UCUGGAAGCCAAGGGCUACAAAGAA




GUGAAAAAGGACCUGAUCAUCAAGC




UGCCUAAGUACUCCCUGUUCGAGCU




GGAAAACGGCCGGAAGAGAAUGCUG




GCCUCUGCCGGCGAACUGCAGAAGG




GAAACGAACUGGCCCUGCCCUCCAA




AUAUGUGAACUUCCUGUACCUGGCC




AGCCACUAUGAGAAGCUGAAGGGCU




CCCCCGAGGAUAAUGAGCAGAAACA




GCUGUUUGUGGAACAGCACAAGCAC




UACCUGGACGAGAUCAUCGAGCAGA




UCAGCGAGUUCUCCAAGAGAGUGAU




CCUGGCCGACGCUAAUCUGGACAAA




GUGCUGUCCGCCUACAACAAGCACC




GGGAUAAGCCCAUCAGAGAGCAGGC




CGAGAAUAUCAUCCACCUGUUUACC




CUGACCAAUCUGGGAGCCCCUGCCG




CCUUCAAGUACUUUGACACCACCAU




CGACCGGAAGAGGUACACCAGCACC




AAAGAGGUGCUGGACGCCACCCUGA




UCCACCAGAGCAUCACCGGCCUGUA




CGAGACACGGAUCGACCUGUCUCAG




CUGGGAGGUGACUCUGGAGGAUCUA




GCGGAGGAUCCUCUGGCAGCGAGAC




ACCAGGAACAAGCGAGUCAGCAACA




CCAGAGAGCAGUGGCGGCAGCAGCG




GCGGCAGCAGCACCCUAAAUAUAGA




AGAUGAGUAUCGGCUACAUGAGACC




UCAAAAGAGCCAGAUGUUUCUCUAG




GGUCCACAUGGCUGUCUGAUUUUCC




UCAGGCCUGGGCGGAAACCGGGGGC




AUGGGACUGGCAGUUCGCCAAGCUC




CUCUGAUCAUACCUCUGAAAGCAAC




CUCUACCCCCGUGUCCAUAAAACAA




UACCCCAUGUCACAAGAAGCCAGAC




UGGGGAUCAAGCCCCACAUACAGAG




ACUGUUGGACCAGGGAAUACUGGUA




CCCUGCCAGUCCCCCUGGAACACGC




CCCUGCUACCCGUUAAGAAACCAGG




GACUAAUGAUUAUAGGCCUGUCCAG




GAUCUGAGAGAAGUCAACAAGCGGG




UGGAGGACAUCCACCCCACCGUGCC




CAACCCUUACAACCUCUUGAGCGGG




CUCCCACCGUCCCACCAGUGGUACA




CUGUGCUUGAUUUAAAGGAUGCCUU




UUUCUGCCUGAGACUCCACCCCACC




AGUCAGCCUCUCUUCGCCUUUGAGU




GGAGAGAUCCAGAGAUGGGAAUCUC




AGGACAAUUGACCUGGACCAGACUC




CCACAGGGUUUCAAAAACAGUCCCA




CCCUGUUUAAUGAGGCACUGCACAG




AGACCUAGCAGACUUCCGGAUCCAG




CACCCAGACUUGAUCCUGCUACAGU




ACGUGGAUGACUUACUGCUGGCCGC




CACUUCUGAGCUAGACUGCCAACAA




GGUACUCGGGCCCUGUUACAAACCC




UAGGGAACCUCGGGUAUCGGGCCUC




GGCCAAGAAAGCCCAAAUUUGCCAG




AAACAGGUCAAGUAUCUGGGGUAUC




UUCUAAAAGAGGGUCAGAGAUGGCU




GACUGAGGCCAGAAAAGAGACUGUG




AUGGGGCAGCCUACUCCGAAGACCC




CUCGACAACUAAGGGAGUUCCUAGG




GAAGGCAGGCUUCUGUCGCCUCUUC




AUCCCUGGGUUUGCAGAAAUGGCAG




CCCCCCUGUACCCUCUCACCAAACC




GGGGACUCUGUUUAAUUGGGGCCCA




GACCAACAAAAGGCCUAUCAAGAAA




UCAAGCAAGCCCUUCUAACUGCCCC




AGCCCUGGGGUUGCCAGAUUUGACU




AAGCCCUUUGAACUCUUUGUCGACG




AGAAGCAGGGCUACGCCAAAGGUGU




CCUAACGCAAAAACUGGGACCUUGG




CGUCGGCCGGUGGCCUACCUGUCCA




AAAAGCUAGACCCAGUAGCAGCUGG




GUGGCCCCCUUGCCUACGGAUGGUA




GCAGCCAUUGCCGUACUGACAAAGG




AUGCAGGCAAGCUAACCAUGGGACA




GCCACUAGUCAUUCUGGCCCCCCAU




GCAGUAGAGGCACUAGUCAAACAAC




CCCCCGACCGCUGGCUUUCCAACGC




CCGGAUGACUCACUAUCAGGCCUUG




CUUUUGGACACGGACCGGGUCCAGU




UCGGACCGGUGGUAGCCCUGAACCC




GGCUACGCUGCUCCCACUGCCUGAG




GAAGGGCUGCAACACAACUGCCUUG




AUAUCCUGGCCGAAGCCCACGGAAC




CCGACCCGACCUAACGGACCAGCCG




CUCCCAGACGCCGACCACACCUGGU




ACACGGAUGGAAGCAGUCUCUUACA




AGAGGGACAGCGUAAGGCGGGAGCU




GCGGUGACCACCGAGACCGAGGUAA




UCUGGGCUAAAGCCCUGCCAGCCGG




GACAUCCGCUCAGCGGGCUGAACUG




AUAGCACUCACCCAGGCCCUAAAGA




UGGCAGAAGGUAAGAAGCUAAAUGU




UUAUACUGAUAGCCGUUAUGCUUUU




GCUACUGCCCAUAUCCAUGGAGAAA




UAUACAGAAGGCGUGGGUGGCUCAC




AUCAGAAGGCAAAGAGAUCAAAAAU




AAAGACGAGAUCUUGGCCCUACUAA




AAGCCCUCUUUCUGCCCAAAAGACU




UAGCAUAAUCCAUUGUCCAGGACAU




CAAAAGGGACACAGCGCCGAGGCUA




GAGGCAACCGGAUGGCUGACCAAGC




GGCCCGAAAGGCAGCCAUCACAGAG




ACUCCAGACACCUCUACCCUCCUCA




UAGAAAAUUCAUCACCCUCUGGCGG




CUCAAAAAGAACCGCCGACGGCAGC




GAAUUCGAGAAAAGGACGGCGGAUG




GUAGCGAAUUCGAGAGCCCUAAAAA




GAAGGCCAAGGUAGAGUAA






guide RNA
mA*mC*mA*CAAAUACCAGUCCAGC
20632



GGUUUUAGAmGmCmUmAmGmAmAmA




mUmAmGmCAAGUUAAAAUAAGGCUA




GUCCGUUAUCAmAmCmUmUmGmAmA




mAmAmAmGmUmGmGmCmAmCmCmGm




AmGmUmCmGmGmUmGmCmU*mU*mU




*mU




m = 2′-O-methyl; * = phosphorothioate linkage









Lipid nanoparticle (LNP) components (ionizable lipid, helper lipid, sterol, PEG) were dissolved in 100% ethanol with the lipid component molar ratios of 47:8:43.5:1.5, respectively. RNA (guide and mRNA) was combined in a 1:1 weight ratio and diluted to a concentration of 0.05-0.2 mg/mL in sodium acetate buffer, pH 5. RNA was formulated into distinct LNPs with a lipid amine to total RNA phosphate (N:P) molar ratio of 4.0. The LNPs were formed by microfluidic or turbulent mixing of the lipid and RNA solutions. A 3:1 ratio of aqueous to organic solvent was maintained during mixing using differential flow rates. After mixing, the LNPs were diluted, collected, and buffer exchanged into 50 mM Tris, 9% sucrose buffer using tangential flow filtration. Formulations were concentrated to 1.0 mg/mL or higher and then filtered through a 0.2 μm sterile filter. The final LNPs were stored at −80° C. until further use.


The LNP formulations were delivered by intravenous infusion over the course of 1 hour at a dose of 2 mg/kg and an infusion volume of 5 mL/kg. Cynomolgus macaques from mainland Asia were given a 2 mg/kg bolus of dexamethasone via intramuscular injection 1.5-2 h prior to intravenous infusion using a syringe pump. Animals were monitored after infusion, and expression of the Cas9-RT was measured in laparoscopic liver biopsies taken 8-12 h, 24 h, and 48 h after infusion. Animals were euthanized 14 days after infusion, and the liver was harvested and divided into 8 segments, in which gene editing activity at the TTR locus was assessed. Expression of the Cas9-RT gene editing polypeptide in liver was quantified by capillary electrophoresis western blot using the ProteinSimple Jess system (bio-techne), in which Cas9 was detected with a mouse monoclonal antibody (7A9-3A3, Cell Signaling Technology).
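By way of illustration only, the infusion parameters stated above (2 mg/kg delivered in 5 mL/kg over 1 hour) imply a dosing concentration and pump rate that can be worked out per animal. The following is a minimal sketch; the function name `infusion_plan` and the 3 kg body weight are hypothetical and are not taken from this example.

```python
def infusion_plan(body_weight_kg: float,
                  dose_mg_per_kg: float = 2.0,
                  volume_ml_per_kg: float = 5.0,
                  duration_h: float = 1.0) -> dict:
    """Derive per-animal infusion quantities from weight-normalized parameters."""
    dose_mg = dose_mg_per_kg * body_weight_kg
    volume_ml = volume_ml_per_kg * body_weight_kg
    return {
        "dose_mg": dose_mg,
        "volume_ml": volume_ml,
        # The concentration is independent of body weight: (mg/kg) / (mL/kg).
        "concentration_mg_per_ml": dose_mg / volume_ml,
        "rate_ml_per_h": volume_ml / duration_h,
    }


plan = infusion_plan(3.0)   # hypothetical 3 kg animal
print(plan)
```

Under these parameters the dosing solution works out to 0.4 mg/mL regardless of body weight, which is below the 1.0 mg/mL or higher concentration of the stored formulation and so would suggest dilution before dosing; that inference is not stated in the example itself.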


Relative expression of the Cas9-RT gene editing polypeptide was measured by an area under curve analysis, as shown in FIG. 15. Editing of the TTR locus was quantified by amplicon-sequencing of the TTR locus near the binding site of the protospacer. Editing of the TTR locus was observed, as shown in FIG. 16. These experiments demonstrate that the Cas9-RT polypeptide can be expressed in vivo in a non-human primate model and can edit the TTR locus.


Example 12: Quantifying Activity of a Gene Modifying Polypeptide and Template RNA for Editing the Endogenous B-Globin Locus Achieved in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing an exemplary gene modifying polypeptide and a template RNA to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to an alanine codon (GCG), thereby demonstrating targeting of the sequence position associated with sickle cell disease (SCD) and editing of the sequence to encode a non-pathogenic amino acid at position 7. The “C” residue installed by this process is referred to as the “Makassar” variant and is a non-pathogenic sequence variant that occurs in the human population. This conversion comprises the change of one base pair (i.e., replacement of the DNA base adenine at nucleotide position 20 of SEQ ID NO: 20633 with the base cytosine).











(SEQ ID NO: 20633)



ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTG







TGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC







AGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCC







TTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAG







GTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGC







CTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGT







GAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGG







CTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGC







AAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTG







GCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGC







TTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAA
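By way of illustration only, the effect of the single-base change described above can be checked computationally against the first 30 nucleotides of SEQ ID NO: 20633. The following is a minimal sketch using the standard genetic code; the function names `translate` and `substitute` are hypothetical helpers introduced for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-indexed position."""
    i = position - 1
    return dna[:i] + new_base + dna[i + 1:]


# First 30 nucleotides of SEQ ID NO: 20633 (codons 1 to 10).
wild_type = "ATGGTGCATCTGACTCCTGAGGAGAAGTCT"
edited = substitute(wild_type, 20, "C")   # adenine -> cytosine at position 20

print(wild_type[18:21], "->", edited[18:21])   # codon 7 before and after
print(translate(wild_type))
print(translate(edited))
```

Counting the initiator methionine as position 1, the seventh codon spans nucleotides 19 to 21, so the substitution at position 20 changes GAG (glutamic acid) to GCG (alanine) and leaves the other codons in this window unchanged.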






In this example, the template RNAs contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


The exemplary template RNAs comprised the following sequences from 5′ to 3′, wherein the first 3 and last 3 bases have 2′-O-methyl phosphorothioate chemical modifications as indicated. In the sequences below, m=2′-O-methyl ribonucleotide, r=ribose, and *=phosphorothioate bond.











tg34



(SEQ ID NO: 20634)



mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrG







rUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrAr







GrUrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrCrGrG







rArGrArArGrUrC*mU*mG*mC







tg35



(SEQ ID NO: 20635)



mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrG







rUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrAr







GrUrCrGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrCrGrG







rArGrArArGrUrCrU*mG*mC*mC







tg36



(SEQ ID NO: 20636)



mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrG







rUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGr







UrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArU







rCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrAr







GrUrCrGrGrUrGrCrGrCrArCrCrUrGrArCrUrCrCrUrGrC







rGrGrArGrArArGrUrC*mU*mG*mC







Unmodified versions of these sequences are shown in Table BB below. In some embodiments, the sequences used in this table can be used without chemical modifications.

TABLE BB
tg34, tg35, and tg36 without nucleotide modifications.

| Name | Sequence | SEQ ID NO |
| --- | --- | --- |
| tg34 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCGGAGAAGUCUGC | 21767 |
| tg35 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCGGAGAAGUCUGCC | 21768 |
| tg36 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCGCACCUGACUCCUGCGGAGAAGUCUGC | 21769 |
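
The relationship between the annotated sequences and the unmodified sequences of Table BB can be illustrated with a minimal sketch that removes the modification notation (m, r, and *). The function names are hypothetical; the sketch assumes only the notation defined in this example and is not a description of how the sequences were actually generated.

```python
import re

# One annotated residue: "m" (2'-O-methyl) or "r" (ribose) followed by a base.
RESIDUE = re.compile(r"[mr]([ACGU])")


def strip_modifications(annotated: str) -> str:
    """Drop m/r prefixes and '*' phosphorothioate marks, keeping only the bases."""
    cleaned = annotated.replace("*", "").replace(" ", "")
    return "".join(RESIDUE.findall(cleaned))


def count_phosphorothioates(annotated: str) -> int:
    """Count '*' marks, each denoting one phosphorothioate bond."""
    return annotated.count("*")


# 5' and 3' ends of tg34 (SEQ ID NO: 20634) as written in this example.
tg34_five_prime = "mG*mU*mA*rArCrGrG"
tg34_three_prime = "rCrUrGrCrGrGrArGrArArGrUrC*mU*mG*mC"

print(strip_modifications(tg34_five_prime))   # prints: GUAACGG
print(strip_modifications(tg34_three_prime))  # prints: CUGCGGAGAAGUCUGC
print(count_phosphorothioates(tg34_three_prime))  # prints: 3
```

The stripped ends match the first and last nucleotides of the tg34 entry in Table BB.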











The gene modifying polypeptide tested comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 2000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng of template RNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer, and the HSCs were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF+SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA base adenine at nucleotide position 20 with the base cytosine, downstream of the transcriptional start site within the endogenous B-globin locus, indicated successful editing.
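
The read-out described above can be sketched as a simple classification of amplicon reads. This is an illustrative simplification, not the analysis pipeline used in this example: the window boundaries, the function names, and the toy reads are all assumptions, and a real analysis would align full reads and also account for indels and sequencing errors.

```python
from collections import Counter

# Nucleotides 13-27 of SEQ ID NO: 20633, spanning the edited position 20.
REFERENCE_WINDOW = "ACTCCTGAGGAGAAG"
EDITED_WINDOW = "ACTCCTGCGGAGAAG"  # adenine at position 20 replaced with cytosine


def classify_read(read: str) -> str:
    """Label a read by which version of the window it contains, if any."""
    if EDITED_WINDOW in read:
        return "perfect_rewrite"
    if REFERENCE_WINDOW in read:
        return "unedited"
    return "other"


def perfect_rewrite_percent(reads: list[str]) -> float:
    """Percentage of reads carrying the intended edit in the window."""
    counts = Counter(classify_read(read) for read in reads)
    total = sum(counts.values())
    return 100.0 * counts["perfect_rewrite"] / total if total else 0.0


# Hypothetical toy reads, for illustration only (not experimental data).
toy_reads = [
    "GTGCATCTGACTCCTGAGGAGAAGTCTGCC",  # unedited
    "GTGCATCTGACTCCTGCGGAGAAGTCTGCC",  # intended A-to-C edit
    "GTGCATCTGACTCCTGAGGAGAAGTCTGCC",  # unedited
    "GTGCATCTGACTCCTGGAGAAGTCTGCC",    # two-base deletion at the target codon
]
print(perfect_rewrite_percent(toy_reads))  # prints: 25.0
```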


As shown in FIG. 17, average perfect rewrite levels, corresponding to replacement of the A nucleotide with a C at the SCD codon, of 1.3%-1.8% were detected in primary human HSCs when the primary human HSCs were treated with the exemplary template RNAs and mRNA encoding the exemplary gene modifying polypeptide. These results demonstrate the use of a gene modifying system to edit a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus in primary human HSCs. The results further demonstrate that several exemplary template RNAs can be used to achieve the desired editing.


Example 13: Comparing the Activity of Different Second Strand-Targeting gRNAs in Combination with a Gene Modifying Polypeptide and Template RNAs for Editing the Endogenous B-Globin Locus in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of exemplary gene modifying systems containing a gene modifying polypeptide, a template RNA, and one of several different exemplary second strand-targeting gRNAs to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to an alanine codon (GCA or GCG), thereby demonstrating targeting of the sequence position associated with sickle cell disease (SCD) and editing of the sequence to encode a non-pathogenic amino acid at position 7. This conversion comprises a change of two base pairs for exemplary template RNAs comprising the exemplary HBB5 spacer (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively) and a change of one base pair for exemplary template RNAs comprising the exemplary HBB8 spacer (i.e., replacement of the DNA base adenine at nucleotide position 20 with the base cytosine).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


The template RNAs comprised the nucleic acid sequence set out in Example 5 labeled FYF tgRNA14 for the exemplary HBB5 template RNA or tg34 for the exemplary HBB8 template RNA, respectively.


The gene modifying polypeptide comprised the amino acid sequence set out in Example 8 labeled RNAV209.


The second strand-targeting gRNA sequences, designed to produce a second nick, comprised the sequences listed in Table X1.

TABLE X1
Exemplary Second Strand-Targeting gRNAs

| Name | RNA sequence | SEQ ID NO |
| --- | --- | --- |
| HBB5_216rv | mC*mU*mU*rGrCrCrCrCrArCrArGrGrGrCrArGrUrArArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20817 |
| HBB5_24rv | mU*mG*mC*rArGrGrArGrUrCrArGrGrUrGrCrArCrCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20818 |
| HBB5_34rv | mC*mA*mG*rArCrUrUrCrUrCrUrGrCrArGrGrArGrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20819 |
| HBB5_34rv_hs1 | mC*mA*mG*rArCrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20820 |
| HBB5_41rv | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20821 |
| HBB5_122rv | mA*mA*mG*rCrArArArUrGrUrArArGrCrArArUrArGrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20822 |
| HBB5_92rv | mC*mU*mG*rArCrUrUrUrUrArUrGrCrCrCrArGrCrCrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20823 |
| HBB5_g27 | mC*mC*mU*rUrGrArUrArCrCrArArCrCrUrGrCrCrCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20824 |
| HBB5_g37 | mC*mA*mC*rGrUrUrCrArCrCrUrUrGrCrCrCrCrArCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20825 |
| HBB5_g38 | mC*mC*mA*rCrGrUrUrCrArCrCrUrUrGrCrCrCrCrArCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCmU*mU*mU*rU | 20826 |
| HBB5_g39 | mA*mC*mC*rUrUrGrArUrArCrCrArArCrCrUrGrCrCrCrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCmU*mU*mU*rU | 20827 |
| HBB5_g40 | mU*mC*mC*rArCrArUrGrCrCrCrArGrUrUrUrCrUrArUrGrUrUrUrUrArGrArGrCrUrArGrArArArUrArGrCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrArArCrUrUrGrArArArArArGrUrGrGrCrArCrCrGrArGrUrCrGrGrUrGrCmU*mU*mU*rU | 20828 |
| HBB8_gRNA1 | mC*mA*mG*rGrGrCrUrGrGrGrCrArUrArArArArGrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20829 |
| HBB8_gRNA2 | mA*mG*mG*rGrCrUrGrGrGrCrArUrArArArArGrUrCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20830 |
| HBB8_gRNA3 | mG*mC*mA*rArCrCrUrCrArArArCrArGrArCrArCrCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20831 |
| HBB8_gRNA4 | mG*mG*mA*rGrGrGrCrArGrGrArGrCrCrArGrGrGrCrUrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20832 |
| HBB8_231fw | mG*mU*mC*rUrGrCrCrGrUrUrArCrUrGrCrCrCrUrGrUrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20833 |
| HBB8_237fw | mC*mG*mU*rUrArCrUrGrCrCrCrUrGrUrGrGrGrGrCrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20834 |
| HBB8_246fw | mC*mC*mU*rGrUrGrGrGrGrCrArArGrGrUrGrArArCrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20835 |
| HBB8_256fw | mA*mA*mG*rGrUrGrArArCrGrUrGrGrArUrGrArArGrUrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20836 |
| HBB8_270fw | mU*mG*mA*rArGrUrUrGrGrUrGrGrUrGrArGrGrCrCrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20837 |
| HBB8_279fw | mU*mG*mG*rUrGrArGrGrCrCrCrUrGrGrGrCrArGrGrUrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20838 |
| HBB8_299fw | mU*mG*mG*rUrArUrCrArArGrGrUrUrArCrArArGrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20839 |
| HBB8_306fw | mA*mA*mG*rGrUrUrArCrArArGrArCrArGrGrUrUrUrArGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCmU*mU*mU*mU | 20840 |









Table X1A shows the sequences of Table X1 without modifications. In some embodiments, the sequences used in this table can be used without chemical modifications.

TABLE X1A
Table X1 Sequences without Modifications

| Name | RNA sequence | SEQ ID NO |
| --- | --- | --- |
| HBB5_216rv | CUUGCCCCACAGGGCAGUAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21770 |
| HBB5_24rv | UGCAGGAGUCAGGUGCACCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21771 |
| HBB5_34rv | CAGACUUCUCUGCAGGAGUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21772 |
| HBB5_34rv_hs1 | CAGACUUCUCUGCCGGAGUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21773 |
| HBB5_41rv | GUAACGGCAGACUUCUCUGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21774 |
| HBB5_122rv | AAGCAAAUGUAAGCAAUAGAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21775 |
| HBB5_92rv | CUGACUUUUAUGCCCAGCCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21776 |
| HBB5_g27 | CCUUGAUACCAACCUGCCCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21777 |
| HBB5_g37 | CACGUUCACCUUGCCCCACAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21778 |
| HBB5_g38 | CCACGUUCACCUUGCCCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21779 |
| HBB5_g39 | ACCUUGAUACCAACCUGCCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21780 |
| HBB5_g40 | UCCACAUGCCCAGUUUCUAUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21781 |
| HBB8_gRNA1 | CAGGGCUGGGCAUAAAAGUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21782 |
| HBB8_gRNA2 | AGGGCUGGGCAUAAAAGUCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21783 |
| HBB8_gRNA3 | GCAACCUCAAACAGACACCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21784 |
| HBB8_gRNA4 | GGAGGGCAGGAGCCAGGGCUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21785 |
| HBB8_231fw | GUCUGCCGUUACUGCCCUGUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21786 |
| HBB8_237fw | CGUUACUGCCCUGUGGGGCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21787 |
| HBB8_246fw | CCUGUGGGGCAAGGUGAACGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21788 |
| HBB8_256fw | AAGGUGAACGUGGAUGAAGUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21789 |
| HBB8_270fw | UGAAGUUGGUGGUGAGGCCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21790 |
| HBB8_279fw | UGGUGAGGCCCUGGGCAGGUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21791 |
| HBB8_299fw | UGGUAUCAAGGUUACAAGACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21792 |
| HBB8_306fw | AAGGUUACAAGACAGGUUUAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU | 21793 |
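
As an illustration of how a second strand-targeting gRNA relates to the target locus, the sketch below splits the unmodified HBB8_231fw sequence (SEQ ID NO: 21786) into spacer and scaffold and locates the spacer in the first 90 nucleotides of SEQ ID NO: 20633. The 20-nucleotide spacer length and the helper names are assumptions made for illustration. The sketch searches only the strand shown in SEQ ID NO: 20633, so it would not find spacers that match the opposite strand or regions outside this window. It also reports the three nucleotides following the match, which a reader can compare against the PAM requirement of whichever Cas domain is used.

```python
# First 90 nucleotides of SEQ ID NO: 20633.
REFERENCE = (
    "ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTG"
    "TGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC"
)
SPACER_LENGTH = 20  # assumption: the first 20 nt of each gRNA form the spacer


def split_grna(grna: str) -> tuple[str, str]:
    """Split an unmodified gRNA into (spacer, scaffold)."""
    return grna[:SPACER_LENGTH], grna[SPACER_LENGTH:]


def locate_protospacer(spacer_rna: str, reference: str) -> tuple[int, str]:
    """Return the 0-based start of the spacer match and the 3 nt that follow it."""
    spacer_dna = spacer_rna.replace("U", "T")
    start = reference.find(spacer_dna)
    if start < 0:
        raise ValueError("spacer not found on this strand of the reference")
    end = start + len(spacer_dna)
    return start, reference[end:end + 3]


hbb8_231fw = (
    "GUCUGCCGUUACUGCCCUGUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCA"
    "ACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU"
)
spacer, scaffold = split_grna(hbb8_231fw)
start, following = locate_protospacer(spacer, REFERENCE)
print(spacer, start + 1, following)
# prints: GUCUGCCGUUACUGCCCUGU 27 GGG
```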









The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng of template RNA and 2000 ng (for systems comprising HBB5 template RNA) or 3000 ng (for systems comprising HBB8 template RNA) of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer, and the HSCs were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF+SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively (HBB5 template RNA), or replacement of the DNA base adenine at nucleotide position 20 with the base cytosine (HBB8 template RNA), downstream of the transcriptional start site within the endogenous B-globin locus, indicated successful editing.


As shown in FIG. 18A, average perfect rewrite levels, corresponding to replacement of adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine (respectively) at the SCD codon, of 4.5%-21.3% were detected in primary human HSCs when the HSCs were treated with exemplary gene modifying systems comprising exemplary HBB5 template RNA tg14 and various second strand-targeting gRNAs. As shown in FIG. 18B, average perfect rewrite levels, corresponding to replacement of adenine at nucleotide position 20 with the base cytosine at the SCD codon, of 2.9%-24.6% were detected in primary human HSCs when the HSCs were treated with exemplary gene modifying systems comprising exemplary HBB8 template RNA tg34 and various second strand-targeting gRNAs.


These results demonstrate that use of a second strand-targeting gRNA increases the editing activity of exemplary gene modifying systems targeting a clinically relevant codon in the endogenous B-globin locus in primary human HSCs. The results further demonstrate that adjusting the positioning of a second strand-targeting gRNA (e.g., relative to the sequence targeted by a spacer of an exemplary template RNA) increases the enhancement to editing activity, e.g., to more than 9-fold higher than perfect rewriting in the absence of second strand-targeting gRNA.


Example 14: Characterizing Configurations of Template RNAs Including Silent Substitutions for Editing the Endogenous B-Globin Locus in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing an exemplary gene modifying polypeptide and various template RNAs comprising different silent substitutions to convert the glutamic acid codon at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to an alanine codon, thereby demonstrating targeting of the sequence position associated with sickle cell disease (SCD) and editing of the sequence to encode a non-pathogenic amino acid at position 7. This conversion comprises a change of two base pairs for exemplary HBB5 template RNAs (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively) plus the inclusion of additional silent substitutions (which alter the nucleic acid sequence of the DNA but not the protein sequence, through the use of synonymous codons).
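
The notion of a silent substitution can be illustrated with a minimal sketch that compares the 3′ portions (14 nt RT template plus 10 nt PBS, modification notation removed) of tg14_h and tg14_hs1 from Table X2. The sketch assumes that the reverse complement of this portion corresponds to the coding strand and that it is in frame with codon 3 of SEQ ID NO: 20633; both points are inferred from the sequences rather than stated in this example. The function names are hypothetical, and the codon table is limited to the codons that occur here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}

# Only the codons that occur in this illustration (not a full genetic code table).
CODON_TO_AMINO_ACID = {
    "CAC": "His", "CTG": "Leu", "ACT": "Thr", "CCT": "Pro", "CCG": "Pro",
    "GCA": "Ala", "GAG": "Glu", "AAG": "Lys", "TCT": "Ser",
}


def encoded_dna(three_prime_portion: str) -> str:
    """Reverse-complement the RNA (RT template + PBS) into coding-strand DNA."""
    return "".join(COMPLEMENT[base] for base in reversed(three_prime_portion))


def translate(dna: str) -> list[str]:
    """Translate consecutive codons, assuming the first base starts a codon."""
    return [CODON_TO_AMINO_ACID[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)]


# Last 24 nt of tg14_h (SEQ ID NO: 20841) and tg14_hs1 (SEQ ID NO: 20842).
tg14_h = "AGACUUCUCUGCAGGAGUCAGGUG"
tg14_hs1 = "AGACUUCUCUGCCGGAGUCAGGUG"

dna_h = encoded_dna(tg14_h)      # CACCTGACTCCTGCAGAGAAGTCT
dna_hs1 = encoded_dna(tg14_hs1)  # CACCTGACTCCGGCAGAGAAGTCT
print(translate(dna_h))
print(translate(dna_hs1))
print(translate(dna_h) == translate(dna_hs1))  # prints: True
```

Under these assumptions, both templates encode the same amino acids, including alanine in place of the glutamic acid at position 7, while tg14_hs1 additionally changes the preceding proline codon from CCT to the synonymous CCG.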


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the exemplary template RNAs comprised the following sequences from 5′ to 3′, wherein the first three and last three bases have 2′-O-methyl phosphorothioate chemical modifications. In the sequences below, m=2′-O-methyl ribonucleotide, r=ribose, and *=phosphorothioate bond. Different combinations of substitutions and RT/PBS length were included (Table X2).

TABLE X2
Exemplary Silent Substitution-Containing Template RNAs.

| Name | Sequence | SEQ ID NO | RT length | PBS length | Substitution |
| --- | --- | --- | --- | --- | --- |
| tg14_h | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArG*mG*mU*mG | 20841 | 14 | 10 | none |
| tg14_hs1 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20842 | 14 | 10 | sub 1 |
| tg14_hs2 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrGrCrGrGrGrArGrUrCrArG*mG*mU*mG | 20843 | 14 | 10 | sub 2 |
| tg14_hs3 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrGrCrUrGrGrArGrUrCrArG*mG*mU*mG | 20844 | 14 | 10 | sub 3 |
| tg14_hs4 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArG*mG*mU*mG | 20845 | 14 | 10 | sub 4 |
| tg14_hs5 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrUrUrCrUrGrCrArGrGrArGrUrCrArG*mG*mU*mG | 20846 | 14 | 10 | sub 5 |
| tg14_hs6 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrUrUrCrUrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20847 | 14 | 10 | sub 6 |
| tg14_hs7 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrUrUrCrUrGrCrGrGrGrArGrUrCrArG*mG*mU*mG | 20848 | 14 | 10 | sub 7 |
| tg14_hs8 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrUrUrCrUrGrCrUrGrGrArGrUrCrArG*mG*mU*mG | 20849 | 14 | 10 | sub 8 |
| tg14_hs9 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20850 | 14 | 10 | sub 9 |
| tg14_hs10 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrCrUrCrUrGrCrGrGrGrArGrUrCrArG*mG*mU*mG | 20851 | 14 | 10 | sub 10 |
| tg14_hs11 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrCrUrCrUrGrCrUrGrGrArGrUrCrArG*mG*mU*mG | 20852 | 14 | 10 | sub 11 |
| tg14_hs12 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrUrUrCrUrGrCrArGrGrArGrUrCrArG*mG*mU*mG | 20853 | 14 | 10 | sub 12 |
| tg14_hs13 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrUrUrCrUrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20854 | 14 | 10 | sub 13 |
| tg14_hs14 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrUrUrCrUrGrCrGrGrGrArGrUrCrArG*mG*mU*mG | 20855 | 14 | 10 | sub 14 |
| tg14_hs15 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArUrUrUrUrUrCrUrGrCrUrGrGrArGrUrCrArG*mG*mU*mG | 20856 | 14 | 10 | sub 15 |
| tg19_h | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20857 | 17 | 11 | none |
| tg19_hs1 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20858 | 17 | 11 | sub 1 |
| tg19_hs2 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20859 | 17 | 11 | sub 2 |
| tg19_hs3 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrUrGrGrArGrUrCrArGrG*mU*mG*mC | 20860 | 17 | 11 | sub 3 |
| tg19_hs4 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20861 | 17 | 11 | sub 4 |
| tg19_hs5 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20862 | 17 | 11 | sub 5 |
| tg19_hs6 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20863 | 17 | 11 | sub 6 |
| tg19_hs7 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20864 | 17 | 11 | sub 7 |
| tg19_hs8 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrUrGrGrArGrUrCrArGrG*mU*mG*mC | 20865 | 17 | 11 | sub 8 |
| tg19_hs9 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20866 | 17 | 11 | sub 9 |
| tg19_hs10 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrCrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20867 | 17 | 11 | sub 10 |
| tg19_hs11 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrCrUrCrUrGrCrUrGrGrArGrUrCrArGrG*mU*mG*mC | 20868 | 17 | 11 | sub 11 |
| tg19_hs12 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrUrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20869 | 17 | 11 | sub 12 |
| tg19_hs13 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrUrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20870 | 17 | 11 | sub 13 |
| tg19_hs14 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrUrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20871 | 17 | 11 | sub 14 |
| tg19_hs15 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrGrGrCrArGrArUrUrUrUrUrCrUrGrCrUrGrGrArGrUrCrArGrG*mU*mG*mC | 20872 | 17 | 11 | sub 15 |
| tg41_h | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20873 | 19 | 11 | none |
| tg41_hs1 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20874 | 19 | 11 | sub 1 |
| tg41_hs2 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20875 | 19 | 11 | sub 2 |
| tg41_hs3 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrCrUrCrUrGrCrUrGrGrArGrUrCrArGrG*mU*mG*mC | 20876 | 19 | 11 | sub 3 |
| tg41_hs4 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArUrUrUrCrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20877 | 19 | 11 | sub 4 |
| tg41_hs5 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrArGrGrArGrUrCrArGrG*mU*mG*mC | 20878 | 19 | 11 | sub 5 |
| tg41_hs6 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrCrGrGrArGrUrCrArGrG*mU*mG*mC | 20879 | 19 | 11 | sub 6 |
| tg41_hs7 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrGrGrCrArGrArCrUrUrUrUrCrUrGrCrGrGrGrArGrUrCrArGrG*mU*mG*mC | 20880 | 19 | 11 | sub 7 |









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20881
19
11
sub


hs8
rUrGrArCrUrCrCrUrGrGrUrUr



8



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArCrUrUrUrUrCrU







rGrCrUrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20882
19
11
sub


hs9
rUrGrArCrUrCrCrUrGrGrUrUr



9



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrCrUrCrU







rGrCrCrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20883
19
11
sub


hs10
rUrGrArCrUrCrCrUrGrGrUrUr



10



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrCrUrCrU







rGrCrGrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20884
19
11
sub


hs11
rUrGrArCrUrCrCrUrGrGrUrUr



11



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrCrUrCrU







rGrCrUrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20885
19
11
sub


hs12
rUrGrArCrUrCrCrUrGrGrUrUr



12



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrUrUrCrU







rGrCrArGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20886
19
11
sub


hs13
rUrGrArCrUrCrCrUrGrGrUrUr



13



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrUrUrCrU







rGrCrCrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20887
19
11
sub


hs14
rUrGrArCrUrCrCrUrGrGrUrUr



14



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrUrUrCrU







rGrCrGrGrGrArGrUrCrArGrG*







mU*mG*mC









tg41_
mC*mA*mU*rGrGrUrGrCrArCrC
20888
19
11
sub


hs15
rUrGrArCrUrCrCrUrGrGrUrUr



15



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArCr







GrGrCrArGrArUrUrUrUrUrCrU







rGrCrUrGrGrArGrUrCrArGrG*







mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20889
21
11
non


h
rUrGrArCrUrCrCrUrGrGrUrUr



e



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrCrU







rCrUrGrCrArGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20890
21
11
sub


hs1
rUrGrArCrUrCrCrUrGrGrUrUr



1



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrCrU







rCrUrGrCrCrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20891
21
11
sub


hs2
rUrGrArCrUrCrCrUrGrGrUrUr



2



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrCrU







rCrUrGrCrGrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20892
21
11
sub


hs3
rUrGrArCrUrCrCrUrGrGrUrUr



3



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrCrU







rCrUrGrCrUrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20893
21
11
sub


hs4
rUrGrArCrUrCrCrUrGrGrUrUr



4



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmA







mGmUmGmGmCmAmCm







CmGmAmGmUmCmGmGmUmGmCrUrA







rArCrGrGrCrArGrArUrUrUrCr







UrCrUrGrCrArGrGrArGrUrCrA







rGrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20894
21
11
sub


hs5
rUrGrArCrUrCrCrUrGrGrUrUr



5



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrUrU







rCrUrGrCrArGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20895
21
11
sub


hs6
rUrGrArCrUrCrCrUrGrGrUrUr



6



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrUrU







rCrUrGrCrCrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20896
21
11
sub


hs7
rUrGrArCrUrCrCrUrGrGrUrUr



7



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrUrU







rCrUrGrCrGrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20897
21
11
sub


hs8
rUrGrArCrUrCrCrUrGrGrUrUr



8



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArCrUrUrUrU







rCrUrGrCrUrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20898
21
11
sub


hs9
rUrGrArCrUrCrCrUrGrGrUrUr



9



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrCrU







rCrUrGrCrCrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20899
21
11
sub


hs10
rUrGrArCrUrCrCrUrGrGrUrUr



10



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrCrU







rCrUrGrCrGrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20900
21
11
sub


hs11
rUrGrArCrUrCrCrUrGrGrUrUr



11



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrCrU







rCrUrGrCrUrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20901
21
11
sub


hs12
rUrGrArCrUrCrCrUrGrGrUrUr



12



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrUrU







rCrUrGrCrArGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20902
21
11
sub


hs13
rUrGrArCrUrCrCrUrGrGrUrUr



13



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrUrU







rCrUrGrCrCrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20903
21
11
sub


hs14
rUrGrArCrUrCrCrUrGrGrUrUr



14



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrUrU







rCrUrGrCrGrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg42_
mC*mA*mU*rGrGrUrGrCrArCrC
20904
21
11
sub


hs15
rUrGrArCrUrCrCrUrGrGrUrUr



15



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrUrAr







ArCrGrGrCrArGrArUrUrUrUrU







rCrUrGrCrUrGrGrArGrUrCrAr







GrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20905
23
11
non


h
rUrGrArCrUrCrCrUrGrGrUrUr



e



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rCrUrCrUrGrCrArGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20906
23
11
sub


hs1
rUrGrArCrUrCrCrUrGrGrUrUr



1



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rCrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20907
23
11
sub


hs2
rUrGrArCrUrCrCrUrGrGrUrUr



2



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rCrUrCrUrGrCrGrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20908
23
11
sub


hs3
rUrGrArCrUrCrCrUrGrGrUrUr



3



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rCrUrCrUrGrCrUrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20909
23
11
sub


hs4
rUrGrArCrUrCrCrUrGrGrUrUr



4



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rCrUrCrUrGrCrArGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20910
23
11
sub


hs5
rUrGrArCrUrCrCrUrGrGrUrUr



5



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rUrUrCrUrGrCrArGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20911
23
11
sub


hs6
rUrGrArCrUrCrCrUrGrGrUrUr



6



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rUrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20912
23
11
sub


hs7
rUrGrArCrUrCrCrUrGrGrUrUr



7



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rUrUrCrUrGrCrGrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20913
23
11
sub


hs8
rUrGrArCrUrCrCrUrGrGrUrUr



8



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArCrUrU







rUrUrCrUrGrCrUrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20914
23
11
sub


hs9
rUrGrArCrUrCrCrUrGrGrUrUr



9



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rCrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20915
23
11
sub


hs10
rUrGrArCrUrCrCrUrGrGrUrUr



10



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rCrUrCrUrGrCrGrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20916
23
11
sub


hs11
rUrGrArCrUrCrCrUrGrGrUrUr



11



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rCrUrCrUrGrCrUrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20917
23
11
sub


hs12
rUrGrArCrUrCrCrUrGrGrUrUr



12



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rUrUrCrUrGrCrArGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20918
23
11
sub


hs13
rUrGrArCrUrCrCrUrGrGrUrUr



13



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rUrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20919
23
11
sub


hs14
rUrGrArCrUrCrCrUrGrGrUrUr



14



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rUrUrCrUrGrCrGrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20920
23
11
sub


hs15
rUrGrArCrUrCrCrUrGrGrUrUr



15



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrArGrArUrUrU







rUrUrCrUrGrCrUrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20921
23
11
sub


hs16
rUrGrArCrUrCrCrUrGrGrUrUr



16



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrUrGrArUrUrU







rUrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









tg43_
mC*mA*mU*rGrGrUrGrCrArCrC
20922
23
11
sub


hs17
rUrGrArCrUrCrCrUrGrGrUrUr



17



UrUrArGrAmGmCmUmAmGmAmAmA







mUmAmGmCrArArGrUrUrArArAr







ArUrArArGrGrCrUrArGrUrCrC







rGrUrUrArUrCrAmAmCmUmUmGm







AmAmAmAmAmGmUmGmGmCmAmCmC







mGmAmGmUmCmGmGmUmGmCrArGr







UrArArCrGrGrCrCrGrArUrUrU







rUrUrCrUrGrCrCrGrGrArGrUr







CrArGrG*mU*mG*mC









Table X2A shows the sequences of Table X2 without chemical modifications. In some embodiments, the sequences in this table can be used without chemical modifications.
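The relationship between a chemically annotated template RNA sequence and its unmodified counterpart can be illustrated with a short sketch. It assumes that the lowercase "m" and "r" prefixes and the "*" characters in the annotated sequences are modification annotations only, and that the bases themselves are the uppercase letters A, C, G, and U. The function name strip_modifications is hypothetical and is not part of the disclosure. Because an annotated sequence may wrap in the middle of a prefix (e.g., a fragment ending in "m" followed by a fragment beginning with "G"), the fragments are joined before the annotations are removed.

```python
import re

def strip_modifications(fragments):
    """Join wrapped fragments of an annotated RNA sequence and drop the annotations.

    Assumption: 'm' and 'r' prefixes and '*' marks are annotations only;
    the nucleotide identity is carried by the uppercase letters A, C, G, U.
    """
    joined = "".join(fragments)           # join first: a prefix may be split across fragments
    return re.sub(r"[mr*\s]", "", joined)  # remove prefixes, '*' marks, and stray whitespace

# Annotated fragments of tg43_h (SEQ ID NO 20905) as they appear in the modified table.
tg43_h_modified_fragments = [
    "mC*mA*mU*rGrGrUrGrCrArCrC",
    "rUrGrArCrUrCrCrUrGrGrUrUr",
    "UrUrArGrAmGmCmUmAmGmAmAmA",
    "mUmAmGmCrArArGrUrUrArArAr",
    "ArUrArArGrGrCrUrArGrUrCrC",
    "rGrUrUrArUrCrAmAmCmUmUmGm",
    "AmAmAmAmAmGmUmGmGmCmAmCmC",
    "mGmAmGmUmCmGmGmUmGmCrArGr",
    "UrArArCrGrGrCrArGrArCrUrU",
    "rCrUrCrUrGrCrArGrGrArGrUr",
    "CrArGrG*mU*mG*mC",
]

tg43_h_unmodified = strip_modifications(tg43_h_modified_fragments)
print(tg43_h_unmodified)
```

Under these assumptions, the output for tg43_h corresponds to the tg43_h entry of Table X2A (SEQ ID NO 21858).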









TABLE X2A







Table X2 Sequences without Modifications













Name
Sequence
SEQ ID NO






tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21794



h
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUCUCUGCAGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21795



hs1
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUCUCUGCCGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21796



hs2
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUCUCUGCGGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21797



hs3
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUCUCUGCUGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21798



hs4
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUCUCUGCAGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21799



hs5
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUUUCUGCAGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21800



hs6
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUUUCUGCCGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21801



hs7
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUUUCUGCGGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21802



hs8
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAC





UUUUCUGCUGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21803



hs9
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUCUCUGCCGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21804



hs10
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUCUCUGCGGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21805



hs11
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUCUCUGCUGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21806



hs12
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUUUCUGCAGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21807



hs13
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUUUCUGCCGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21808



hs14
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUUUCUGCGGGAGUCAGGUG







tg14_
CAUGGUGCACCUGACUCCUGGUUUU
21809



hs15
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGAU





UUUUCUGCUGGAGUCAGGUG







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21810



h
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUCUCUGCAGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21811



hs1
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUCUCUGCCGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21812



hs2
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUCUCUGCGGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21813



hs3
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUCUCUGCUGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21814



hs4
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUCUCUGCAGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21815



hs5
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUUUCUGCAGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21816



hs6
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUUUCUGCCGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21817



hs7
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUUUCUGCGGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21818



hs8
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GACUUUUCUGCUGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21819



hs9
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUCUCUGCCGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21820



hs10
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUCUCUGCGGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21821



hs11
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUCUCUGCUGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21822



hs12
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUUUCUGCAGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21823



hs13
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUUUCUGCCGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21824



hs14
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUUUCUGCGGGAGUCAGGUGC







tg19_
CAUGGUGCACCUGACUCCUGGUUUU
21825



hs15
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCGGCA





GAUUUUUCUGCUGGAGUCAGGUGC







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21826



h
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUCUCUGCAGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21827



hs1
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUCUCUGCCGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21828



hs2
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUCUCUGCGGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21829



hs3
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUCUCUGCUGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21830



hs4
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUCUCUGCAGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21831



hs5
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUUUCUGCAGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21832



hs6
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUUUCUGCCGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21833



hs7
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUUUCUGCGGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21834



hs8
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGACUUUUCUGCUGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21835



hs9
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUCUCUGCCGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21836



hs10
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUCUCUGCGGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21837



hs11
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUCUCUGCUGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21838



hs12
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUUUCUGCAGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21839



hs13
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUUUCUGCCGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21840



hs14
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUUUCUGCGGGAGUCAGGUG





C







tg41_
CAUGGUGCACCUGACUCCUGGUUUU
21841



hs15
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCACGG





CAGAUUUUUCUGCUGGAGUCAGGUG





C







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21842



h
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUCUCUGCAGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21843



hs1
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUCUCUGCCGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21844



hs2
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUCUCUGCGGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21845



hs3
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUCUCUGCUGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21846



hs4
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUCUCUGCAGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21847



hs5
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUUUCUGCAGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21848



hs6
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUUUCUGCCGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21849



hs7
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUUUCUGCGGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21850



hs8
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGACUUUUCUGCUGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21851



hs9
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUCUCUGCCGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21852



hs10
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUCUCUGCGGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21853



hs11
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUCUCUGCUGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21854



hs12
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUUUCUGCAGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21855



hs13
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUUUCUGCCGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21856



hs14
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUUUCUGCGGGAGUCAGG





UGC







tg42_
CAUGGUGCACCUGACUCCUGGUUUU
21857



hs15
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCUAAC





GGCAGAUUUUUCUGCUGGAGUCAGG





UGC







tg43_
CAUGGUGCACCUGACUCCUGGUUUU
21858



h
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGUA





ACGGCAGACUUCUCUGCAGGAGUCA





GGUGC







tg43_
CAUGGUGCACCUGACUCCUGGUUUU
21859



hs1
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGUA





ACGGCAGACUUCUCUGCCGGAGUCA





GGUGC







tg43_
CAUGGUGCACCUGACUCCUGGUUUU
21860



hs2
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGUA





ACGGCAGACUUCUCUGCGGGAGUCA





GGUGC







tg43_
CAUGGUGCACCUGACUCCUGGUUUU
21861



hs3
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGUA





ACGGCAGACUUCUCUGCUGGAGUCA





GGUGC







tg43_
CAUGGUGCACCUGACUCCUGGUUUU
21862



hs4
AGAGCUAGAAAUAGCAAGUUAAAAU





AAGGCUAGUCCGUUAUCAACUUGAA





AAAGUGGCACCGAGUCGGUGCAGUA





ACGGCAGAUUUCUCUGCAGGAGUCA





GGUGC







tg43_hs5 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGACUUUUCUGCAGGAGUCAGGUGC | 21863
tg43_hs6 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGACUUUUCUGCCGGAGUCAGGUGC | 21864
tg43_hs7 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGACUUUUCUGCGGGAGUCAGGUGC | 21865
tg43_hs8 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGACUUUUCUGCUGGAGUCAGGUGC | 21866
tg43_hs9 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUCUCUGCCGGAGUCAGGUGC | 21867
tg43_hs10 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUCUCUGCGGGAGUCAGGUGC | 21868
tg43_hs11 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUCUCUGCUGGAGUCAGGUGC | 21869
tg43_hs12 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUUUCUGCAGGAGUCAGGUGC | 21870
tg43_hs13 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUUUCUGCCGGAGUCAGGUGC | 21871
tg43_hs14 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUUUCUGCGGGAGUCAGGUGC | 21872
tg43_hs15 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCAGAUUUUUCUGCUGGAGUCAGGUGC | 21873
tg43_hs16 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCUGAUUUUUCUGCCGGAGUCAGGUGC | 21874
tg43_hs17 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGUAACGGCCGAUUUUUCUGCCGGAGUCAGGUGC | 21875









Select corresponding template RNA sequences not comprising silent substitutions are given in Example 5 (e.g., FYF tgRNA14 is a corresponding template RNA sequence to tg14h, FYF tgRNA19 is a corresponding template RNA sequence to tg19h, etc.).


The gene modifying polypeptide used comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng template RNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and the HSCs were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (HBB5 template RNA) plus the inclusion of the expected silent substitutions downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.


As shown in FIG. 19A, average perfect rewrite levels of 0.2%-7.3%, corresponding to replacement of adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (respectively) at the SCD codon, were detected in primary human HSCs when the HSCs were treated with exemplary gene modifying systems comprising exemplary HBB5 template RNAs containing various silent substitutions. The results show that in some cases a silent substitution or substitutions can increase editing activity across several different template RNAs, e.g., exemplary silent substitution(s) hs1. In particular, replacement of the codon encoding the 6th amino acid, counting the initial methionine, of the HBB gene (the proline) to either CCC or CCG resulted in increased editing.


These results demonstrate that introducing silent substitutions within an exemplary template RNA increases editing activity of a gene modifying system comprising said template RNAs up to 5-fold when targeting a clinically relevant codon in the endogenous B-globin locus in primary human HSCs. The results further demonstrate that adjusting the identity or identities of the silent substitution(s) can increase the enhancement to editing activity.


Example 15: Characterizing Configurations of Template RNAs Including Silent Substitutions for Editing the Endogenous B-Globin Locus in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing an exemplary gene modifying polypeptide and various template RNAs comprising different silent substitutions, to convert the glutamic acid codon at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine, thereby demonstrating targeting of the sequence position associated with sickle cell disease (SCD) and editing of the sequence to encode a non-pathogenic sequence at position 7. This conversion comprises a change of one base pair for exemplary HBB8 template RNAs (i.e., replacement of the DNA base adenine at nucleotide position 20 to the base cytosine) plus the inclusion of additional relevant silent substitutions (which alter the nucleic acid sequence of the DNA but not the protein sequence through the usage of different synonymous codons).
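The distinction drawn above between the codon-changing edit and the silent substitutions can be illustrated with a minimal Python sketch. It assumes a partial codon table (only the standard-code codons discussed here) and uses example codon pairs for amino acid positions 6 and 7 of the kind listed in Table X3A; the function name `translate` and the variable names are hypothetical and chosen for illustration only.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAG": "Glu",
}


def translate(codons):
    """Translate a list of RNA codons using the partial table above."""
    return [CODON_TABLE[codon] for codon in codons]


# Codons 6 and 7 (counting the initial methionine) before editing.
unedited = ["CCU", "GAG"]
# Codon 7 changed from GAG to GCG (A to C at position 20), no silent substitution.
edited_plain = ["CCU", "GCG"]
# Same amino acid change plus synonymous changes in both codons (CCC and GCC).
edited_silent = ["CCC", "GCC"]

for label, codons in [("unedited", unedited),
                      ("edited", edited_plain),
                      ("edited + silent", edited_silent)]:
    print(label, codons, translate(codons))
```

In this sketch the two edited variants differ at the nucleotide level but translate to the same Pro-Ala pair, which is what makes the additional substitutions silent.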


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the exemplary template RNAs comprised the following sequences from 5′ to 3′, wherein the first 3 and last 3 bases have 2′-O-methyl phosphorothioate chemical modifications. In the sequences below, m=2′-O-methyl ribonucleotide, r=ribose, and *=phosphorothioate bond. Different combinations of substitutions and RT/PBS lengths were included (Table X3).









TABLE X3
Exemplary Silent Substitution-Containing Template RNAs

Name | Sequence | SEQ ID NO | RT length | PBS length | Substitution
tg34 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrCrGrGrArGrArArGrUrC*mU*mG*mC | 22005 | 14 | 11 | none
tg34_HBB8hs | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrGrGrCrCrGrArGrArArGrUrC*mU*mG*mC | 20924 | 14 | 11 | sub
tg34_HBB8hs1 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrCrArGrArGrArArGrUrC*mU*mG*mC | 20925 | 14 | 11 | sub 1
tg34_HBB8hs2 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrCrUrGrArGrArArGrUrC*mU*mG*mC | 20926 | 14 | 11 | sub 2
tg34_HBB8hs3 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrCrCrGrArGrArArGrUrC*mU*mG*mC | 20927 | 14 | 11 | sub 3
tgRNA34_HBB8hs4 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrArGrArGrArArGrUrC*mU*mG*mC | 20928 | 14 | 11 | sub 4
tg34_HBB8hs5 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrArGrCrArGrArGrArArGrUrC*mU*mG*mC | 20929 | 14 | 11 | sub 5
tg34_HBB8hs6 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrGrGrCrArGrArGrArArGrUrC*mU*mG*mC | 20930 | 14 | 11 | sub 6
tg34_HBB8hs7 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrUrGrArGrArArGrUrC*mU*mG*mC | 20931 | 14 | 11 | sub 7
tg34_HBB8hs8 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrArGrCrUrGrArGrArArGrUrC*mU*mG*mC | 20932 | 14 | 11 | sub 8
tg34_HBB8hs9 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrGrGrCrUrGrArGrArArGrUrC*mU*mG*mC | 20933 | 14 | 11 | sub 9
tg34_HBB8hs10 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrCrGrArGrArArGrUrC*mU*mG*mC | 20934 | 14 | 11 | sub 10
tg34_HBB8hs11 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrArGrCrCrGrArGrArArGrUrC*mU*mG*mC | 20935 | 14 | 11 | sub 11
tg34_HBB8hs12 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrGrGrCrGrGrArGrArArGrUrC*mU*mG*mC | 20936 | 14 | 11 | sub 12
tg34_HBB8hs13 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrGrGrArGrArArGrUrC*mU*mG*mC | 20937 | 14 | 11 | sub 13
tg34_HBB8hs14 | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrUrCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrArGrCrGrGrArGrArArGrUrC*mU*mG*mC | 20938 | 14 | 11 | sub 14









Table X3A shows the sequences of Table X3 without chemical modifications. In some embodiments, the sequences in Table X3A can be used without chemical modifications.
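The relationship between the modified notation of Table X3 and the unmodified sequences of Table X3A can be sketched in Python as follows. This is a minimal illustration, assuming that every nucleotide in the modified notation is written as a sugar prefix (`m` or `r`), a base letter, and an optional `*` for a following phosphorothioate bond; the function names `parse_modified` and `strip_modifications` are hypothetical and are not taken from the source.

```python
import re

# tg34 from Table X3, kept in the same chunks as the table for easy checking.
TG34_MODIFIED = "".join([
    "mG*mU*mA*rArCrGrGrCr",
    "ArGrArCrUrUrCrUrCrCr",
    "UrCrGrUrUrUrUrArGrAm",
    "GmCmUmAmGmAmAmAmUmAm",
    "GmCrArArGrUrUrArArAr",
    "ArUrArArGrGrCrUrArGr",
    "UrCrCrGrUrUrArUrCrAm",
    "AmCmUmUmGmAmAmAmAmAm",
    "GmUmGmGmCmAmCmCmGmAm",
    "GmUmCmGmGmUmGmCrArCr",
    "CrUrGrArCrUrCrCrUrGr",
    "CrGrGrArGrArArGrUrC*",
    "mU*mG*mC",
])

# One token = sugar prefix (m or r), base, optional phosphorothioate marker.
TOKEN = re.compile(r"([mr])([ACGU])(\*?)")


def parse_modified(seq):
    """Return a list of (sugar, base, has_phosphorothioate) tuples."""
    tokens = TOKEN.findall(seq)
    consumed = sum(len(sugar) + len(base) + len(bond)
                   for sugar, base, bond in tokens)
    if consumed != len(seq):
        raise ValueError("unexpected character in modified sequence")
    return [(sugar, base, bond == "*") for sugar, base, bond in tokens]


def strip_modifications(seq):
    """Drop sugar prefixes and bond markers, keeping only the bases."""
    return "".join(base for _sugar, base, _bond in parse_modified(seq))


plain = strip_modifications(TG34_MODIFIED)
print(len(plain), plain)
```

Applied to the tg34 entry, the stripped sequence can be compared character by character against the tg34 entry of Table X3A.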









TABLE X3A
Table X3 Sequences without Modifications

Name | Sequence | SEQ ID NO
tg34 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCGGAGAAGUCUGC | 21876
tg34_HBB8hs | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCGGCCGAGAAGUCUGC | 21877
tg34_HBB8hs1 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCAGAGAAGUCUGC | 21878
tg34_HBB8hs2 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCUGAGAAGUCUGC | 21879
tg34_HBB8hs3 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCCGAGAAGUCUGC | 21880
tgRNA34_HBB8hs4 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCAGAGAAGUCUGC | 21881
tg34_HBB8hs5 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCAGCAGAGAAGUCUGC | 21882
tg34_HBB8hs6 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCGGCAGAGAAGUCUGC | 21883
tg34_HBB8hs7 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCUGAGAAGUCUGC | 21884
tg34_HBB8hs8 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCAGCUGAGAAGUCUGC | 21885
tg34_HBB8hs9 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCGGCUGAGAAGUCUGC | 21886
tg34_HBB8hs10 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCCGAGAAGUCUGC | 21887
tg34_HBB8hs11 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCAGCCGAGAAGUCUGC | 21888
tg34_HBB8hs12 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCGGCGGAGAAGUCUGC | 21889
tg34_HBB8hs13 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCGGAGAAGUCUGC | 21890
tg34_HBB8hs14 | GUAACGGCAGACUUCUCCUCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCAGCGGAGAAGUCUGC | 21891
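The four components listed for the template RNA in this example (gRNA spacer, gRNA scaffold, heterologous object sequence, PBS) can be located in an unmodified Table X3A sequence with a short Python sketch. It assumes a 20-nucleotide spacer (an assumption not stated in this example) and treats the RT length (14) and PBS length (11) columns of Table X3 as the lengths of the heterologous object sequence and the PBS, respectively; the function names are hypothetical.

```python
# tg34 from Table X3A, kept in the same chunks as the table for easy checking.
TG34 = "".join([
    "GUAACGGCAGACUUCUCCUCGUUUU",
    "AGAGCUAGAAAUAGCAAGUUAAAAU",
    "AAGGCUAGUCCGUUAUCAACUUGAA",
    "AAAGUGGCACCGAGUCGGUGCACCU",
    "GACUCCUGCGGAGAAGUCUGC",
])

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def split_template_rna(seq, spacer_len, rt_len, pbs_len):
    """Split a 5'-to-3' template RNA into its four components."""
    if spacer_len + rt_len + pbs_len >= len(seq):
        raise ValueError("component lengths exceed sequence length")
    rt_start = len(seq) - pbs_len - rt_len
    pbs_start = len(seq) - pbs_len
    return {
        "spacer": seq[:spacer_len],
        "scaffold": seq[spacer_len:rt_start],
        "heterologous_object_sequence": seq[rt_start:pbs_start],
        "pbs": seq[pbs_start:],
    }


# spacer_len=20 is an assumption; rt_len and pbs_len come from Table X3.
parts = split_template_rna(TG34, spacer_len=20, rt_len=14, pbs_len=11)
for name, value in parts.items():
    print(name, len(value), value)

# The PBS is expected to pair with the nicked target strand, so its reverse
# complement should occur inside the spacer.
print(reverse_complement(parts["pbs"]) in parts["spacer"])
```

Under these assumptions, the variable codons of the silent substitution variants fall in the heterologous object sequence, while the spacer, scaffold, and PBS are shared by all rows of Table X3A.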









The gene modifying polypeptides used comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 3000 ng template RNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and the HSCs were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA base adenine at nucleotide position 20 to the base cytosine (HBB8 template RNA) plus the inclusion of the expected silent substitutions downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.
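The read-level criterion described above (presence of the expected base replacement in amplicon reads) can be sketched as a simple tally. This is only an illustration under stated assumptions: the source does not describe its analysis pipeline, so exact substring matching over a short window stands in for alignment and quality filtering; the window sequences are derived from the tg34 template in Table X3A written as DNA; and the reads are made-up toy strings, not data from the source.

```python
from collections import Counter

# Window around codon 7, written as DNA. UNEDITED carries GAG, EDITED carries GCG.
UNEDITED = "ACTCCTGAGGAGAAGTCT"
EDITED = "ACTCCTGCGGAGAAGTCT"


def classify_read(read, unedited=UNEDITED, edited=EDITED):
    """Label a read by which window it contains (exact match only)."""
    if edited in read:
        return "perfect_rewrite"
    if unedited in read:
        return "unedited"
    return "other"


def percent_perfect_rewrite(reads):
    """Percentage of reads carrying the exact expected edit."""
    if not reads:
        return 0.0
    counts = Counter(classify_read(read) for read in reads)
    return 100 * counts["perfect_rewrite"] / len(reads)


# Toy reads (hypothetical): one edited, two unedited, one with a deletion.
toy_reads = [
    "CACCTG" + EDITED + "GCC",
    "CACCTG" + UNEDITED + "GCC",
    "CACCTG" + UNEDITED + "GCC",
    "CACCTGACTCCTGGGAGAAGTCTGCC",
]
print(percent_perfect_rewrite(toy_reads))
```

For templates carrying silent substitutions, the edited window would additionally contain the expected synonymous changes, so a separate `EDITED` string would be needed per template.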


As shown in FIG. 19B, average perfect rewrite levels of 0.1%-13.1%, corresponding to replacement of adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (respectively) at the SCD codon, were detected in primary human HSCs when the HSCs were treated with exemplary gene modifying systems comprising exemplary HBB8 template RNAs containing various silent substitutions. These results further demonstrate that introducing silent substitutions within an exemplary template gRNA increases editing activity of a gene modifying system comprising said template RNAs more than 9-fold when targeting a clinically relevant codon in the endogenous B-globin locus in primary human HSCs. The results further demonstrate that adjusting the identity or identities of the silent substitution(s) can increase the enhancement to editing activity.


Example 16: Evaluating the Effect of Second Strand-Targeting gRNA and Silent Substitution on Activity of a Gene Modifying Polypeptide and Template RNA for Editing the Endogenous B-Globin Locus Achieved in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing or not containing a variety of second strand-targeting gRNAs, an exemplary gene modifying polypeptide, and a template RNA, to convert the glutamic acid codon at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine, thereby rewriting a non-pathogenic sequence into position 7. This conversion comprises a change of 2 base pairs plus a silent substitution at position 18 for exemplary HBB5 template RNAs (i.e., replacement of the DNA bases thymidine, adenine and guanine at nucleotide positions 18, 20 and 21 to the bases guanine, cytosine and adenine). For exemplary HBB8 template RNAs, the conversion comprises the change of the DNA bases thymidine and adenine at nucleotide positions 18 and 20 to the bases cytosine and cytosine (e.g., using template RNA tg34_HBB8hs13) or replacement of the DNA bases thymidine, adenine and guanine at nucleotide positions 18, 20 and 21 to the bases cytosine, cytosine and cytosine, respectively (e.g., using tg34_HBB8hs10). This Example demonstrates the editing using systems comprising a variety of second strand-targeting gRNAs with: an exemplary HBB5 template RNA comprising a silent substitution (FIG. 20A), or either of two exemplary HBB8 template RNAs each comprising a different silent substitution (FIG. 20B).
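The position-by-position changes described above can be checked with a short Python sketch. It assumes that nucleotide positions are counted from the first base of the start codon, which is consistent with positions 16 to 18 forming the proline codon and positions 19 to 21 forming the glutamic acid codon; the reference string is reconstructed from the template RNA sequences in this document and the function names are hypothetical.

```python
# First eight codons of the coding sequence as DNA (assumed numbering: A of ATG = 1).
REFERENCE = "ATGGTGCACCTGACTCCTGAGGAG"


def with_codons_6_7(reference, codon6, codon7):
    """Return the reference with codons 6 and 7 (positions 16-21) replaced."""
    return reference[:15] + codon6 + codon7 + reference[21:]


def list_changes(reference, edited):
    """List (position, reference base, edited base), with 1-based positions."""
    if len(reference) != len(edited):
        raise ValueError("sequences must have equal length")
    return [(i + 1, ref, alt)
            for i, (ref, alt) in enumerate(zip(reference, edited))
            if ref != alt]


# Codon pairs as described in the text for each exemplary template RNA.
variants = {
    "tg14_hs1 (HBB5)": ("CCG", "GCA"),
    "tg34_HBB8hs13": ("CCC", "GCG"),
    "tg34_HBB8hs10": ("CCC", "GCC"),
}
for name, (codon6, codon7) in variants.items():
    edited = with_codons_6_7(REFERENCE, codon6, codon7)
    print(name, list_changes(REFERENCE, edited))
```

Under this numbering, position 18 is the third base of the proline codon and position 21 is the third base of the codon at position 7, which is where the silent substitutions of these templates fall.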


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNAs comprised the sequences set out in Example 14 labeled tg14_hs1 or Example 15 labeled tg34_HBB8hs10 and tg34_HBB8hs13.


The system further comprised a second strand-targeting gRNA comprising a sequence in Table X1.


The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 3000 ng template RNA with or without 2000 ng of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and the HSCs were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL in each well and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases thymidine, adenine and guanine at nucleotide positions 18, 20 and 21 to the bases guanine, cytosine and adenine indicated successful editing for the HBB5 spacer. Replacement of thymidine and adenine at nucleotide positions 18 and 20 to the bases cytosine and cytosine (tg34_HBB8hs13) or replacement of the DNA bases thymidine, adenine and guanine at nucleotide positions 18, 20 and 21 to the bases cytosine, cytosine and cytosine, respectively (tg34_HBB8hs10) indicated successful editing for the HBB8 spacer.
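The RNA amounts stated above can be restated per cell and per unit volume with simple arithmetic, sketched here in Python. The sketch assumes that the 20 μL figure is the final reaction volume and that the per-cell figures are nominal input amounts rather than measured uptake; the function names are hypothetical.

```python
def pg_per_cell(mass_ng, n_cells):
    """Nominal RNA input per cell in picograms (1 ng = 1000 pg)."""
    return mass_ng * 1000 / n_cells


def concentration_ng_per_ul(total_ng, volume_ul):
    """Total RNA concentration in the nucleofection reaction."""
    return total_ng / volume_ul


N_CELLS = 200_000
VOLUME_UL = 20
MRNA_NG, TEMPLATE_NG, GRNA_NG = 3000, 3000, 2000

print("mRNA pg/cell:", pg_per_cell(MRNA_NG, N_CELLS))
print("template RNA pg/cell:", pg_per_cell(TEMPLATE_NG, N_CELLS))
print("second strand-targeting gRNA pg/cell:", pg_per_cell(GRNA_NG, N_CELLS))
print("ng/uL without gRNA:",
      concentration_ng_per_ul(MRNA_NG + TEMPLATE_NG, VOLUME_UL))
print("ng/uL with gRNA:",
      concentration_ng_per_ul(MRNA_NG + TEMPLATE_NG + GRNA_NG, VOLUME_UL))
```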



FIG. 20A shows a graph of editing % in HSCs treated with gene modifying systems comprising the exemplary HBB5 template RNA tg14_hs1 (comprising an exemplary silent substitution) with or without various second strand-targeting gRNAs. The results demonstrate the additive effect of second strand-targeting gRNAs with template gRNAs for HBB5 template RNAs containing silent substitutions for rewriting of a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus.



FIG. 20B shows a graph of editing % in HSCs treated with gene modifying systems comprising either of two exemplary HBB8 template RNAs, tg34_HBB8hs13 or tg34_HBB8hs10 (each comprising a different exemplary silent substitution) with or without various second strand-targeting gRNAs. The results further demonstrate the additive effect of second strand-targeting gRNAs with template gRNAs for HBB8 template RNAs containing silent substitutions for rewriting of a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus.


Example 17: Evaluating the Effect of Second Strand-Targeting gRNA and Silent Substitution on Activity of a Gene Modifying Polypeptide and Template RNA for Editing the Endogenous B-Globin Locus Achieved in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates the use of a gene modifying system containing or not containing a second strand-targeting gRNA, an exemplary gene modifying polypeptide and various template RNAs (some comprising a silent substitution), to convert the glutamic acid codon at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine, thereby demonstrating targeting the sequence position associated with sickle cell disease (SCD) and editing of the sequence to encode a non-pathogenic sequence into position 7. This conversion comprises a change of 2 base pairs for exemplary HBB5 template RNAs (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine) plus or minus the additional replacement of thymidine to guanine at nucleotide position 18 (silent substitution).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNAs comprised the sequences set out in Example 14 labeled tg14h, tg14_hs1, tg19h or tg19_hs1.


The system further comprised a gRNA sequence designed to produce a second nick, wherein the gRNA has the sequence labeled HBB5_g37 in Table X1.


The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng template RNA with or without 2000 ng of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and cells were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (tg14h or tg19h) or replacement of the DNA bases thymidine, adenine and guanine at nucleotide positions 18, 20 and 21 to the bases guanine, cytosine and adenine, respectively (tg14_hs1 or tg19_hs1), downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.


As shown in FIG. 20C, average perfect rewrite levels of 1.8% and 3.4%, corresponding to replacement of adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (respectively) at the SCD codon, were detected in human HSCs when the HSCs were treated with exemplary gene modifying systems comprising exemplary HBB5 template RNAs tg14h or tg19h and no second strand-targeting gRNA was added. Inclusion of the hs1 silent substitution in the template gRNA (tg14_hs1 or tg19_hs1) increased perfect rewriting to 9.1% and 6.3%.


Addition of a second strand-targeting gRNA increased average perfect rewriting to 17.1% for tg14h and 30.2% for tg14_hs1. Similarly, addition of a second strand-targeting gRNA increased average perfect rewriting to 20.2% for tg19h and 32.2% for tg19_hs1.
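The averages quoted above can be compared as ratios with a short Python sketch. It uses only the average perfect rewrite percentages stated in the text for FIG. 20C; ratios computed from individual replicates or from values read off the figure may differ from these, and the dictionary layout and function name are hypothetical.

```python
# Average perfect rewrite (%) keyed by (template RNA, second strand-targeting gRNA added).
PERFECT_REWRITE = {
    ("tg14h", False): 1.8,
    ("tg14_hs1", False): 9.1,
    ("tg14h", True): 17.1,
    ("tg14_hs1", True): 30.2,
    ("tg19h", False): 3.4,
    ("tg19_hs1", False): 6.3,
    ("tg19h", True): 20.2,
    ("tg19_hs1", True): 32.2,
}


def fold_change(template, with_grna, baseline_template):
    """Ratio of a condition to the baseline template without second strand-targeting gRNA."""
    baseline = PERFECT_REWRITE[(baseline_template, False)]
    return PERFECT_REWRITE[(template, with_grna)] / baseline


for baseline, variant in [("tg14h", "tg14_hs1"), ("tg19h", "tg19_hs1")]:
    print(baseline, "-> silent substitution only:",
          round(fold_change(variant, False, baseline), 1))
    print(baseline, "-> second strand-targeting gRNA only:",
          round(fold_change(baseline, True, baseline), 1))
    print(baseline, "-> both:",
          round(fold_change(variant, True, baseline), 1))
```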


These results demonstrate that silent substitutions and second strand-targeting gRNAs can individually increase editing activity of gene modifying systems, and further show the additive effect of the second strand-targeting gRNA and silent substitutions within an exemplary HBB5 template RNA. The results show a cumulative increase in editing activity of more than 20-fold when using both a silent substitution and second strand-targeting gRNA in primary human HSCs to write a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus.


Example 18: Evaluating the Impact of Gene Modifying Systems Editing the Endogenous B-Globin Locus on Stemness Markers in CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates that editing using a gene modifying system containing an exemplary gene modifying polypeptide and a template RNA with or without a second strand-targeting gRNA to convert the glutamic acid codon at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine does not significantly affect the levels of stem cell markers and proportions of cell marker-characterized sub-populations.


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


More specifically, the template RNA comprised the nucleic acid sequences set out in Example 5 labeled FYF tgRNA14.


The system further comprised a second strand-targeting gRNA sequence designed to produce a second nick, wherein the gRNA has the sequence labeled HBB5_g37 in Table X1.


The gene modifying polypeptides tested comprised the amino acid sequence set out in Example 8 labeled RNAV209.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng template RNA with or without 2000 ng of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and cells were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL and cultured at 37° C., 5% CO2 for 3 days prior to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine, respectively, downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing. To analyze cell surface markers representative of different HSC subpopulations, cells were stained with fluorescently labeled anti-human CD90, CD133, and CD34 antibodies and analyzed by flow cytometry 3 days after nucleofection.


As shown in FIG. 21A, editing activity levels of 6.3% and 34.4%, corresponding to replacement of adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (respectively) at the SCD codon, were detected in the human HSCs when the HSCs were treated with an exemplary gene editing polypeptide combined with template guide RNA tg14 without or with a second strand-targeting gRNA, respectively. Analysis of the distribution of hematopoietic subpopulations (CD34+CD133+CD90+, a combination of markers enriched in HSCs with long term reconstitution potential; CD34+CD133+CD90-, a combination of markers enriched in early progenitors; CD34+CD133-, a combination of markers enriched in committed progenitors; CD34-, a phenotype enriched in differentiated cells) revealed no skewing of subpopulation proportions when comparing samples treated with exemplary gene modifying systems (with or without the addition of a second strand-targeting gRNA) to a mock treated control (FIG. 21B).
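The four marker-defined subpopulations described above can be expressed as a small decision rule, sketched here in Python. The sketch assumes that each cell has already been called positive or negative for CD34, CD133, and CD90 (the gating thresholds are not given in the source); the function names, labels, and toy cells are hypothetical.

```python
from collections import Counter


def classify_subpopulation(cd34, cd133, cd90):
    """Assign a cell to one of the four marker-defined groups described above."""
    if not cd34:
        return "CD34- (enriched in differentiated cells)"
    if not cd133:
        return "CD34+CD133- (enriched in committed progenitors)"
    if cd90:
        return "CD34+CD133+CD90+ (enriched in long term HSCs)"
    return "CD34+CD133+CD90- (enriched in early progenitors)"


def subpopulation_fractions(cells):
    """Fraction of cells in each group; cells are (cd34, cd133, cd90) tuples."""
    counts = Counter(classify_subpopulation(*cell) for cell in cells)
    total = len(cells)
    return {label: count / total for label, count in counts.items()}


# Toy marker calls (hypothetical), one tuple per cell.
toy_cells = [
    (True, True, True),
    (True, True, False),
    (True, False, False),
    (False, False, False),
]
for label, fraction in subpopulation_fractions(toy_cells).items():
    print(label, fraction)
```

Comparing the fractions returned for a treated sample and a mock treated control is one simple way to look for skewing of subpopulation proportions.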


These results demonstrate that editing that introduces a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus does not affect the phenotype of primary human HSCs, and specifically does not affect markers indicative of differentiation potential in HSCs.


Example 19: Evaluating Editing of Long-Term Reconstitution Capable HSC Subpopulations Using a Gene Modifying Polypeptide and Template RNA for Editing the Endogenous B-Globin Locus Achieved

This example demonstrates that editing using a gene modifying system containing an exemplary gene modifying polypeptide, a template RNA, and a second strand-targeting gRNA to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine (GCA or GCG) effectively targets HSC subpopulations associated with long term reconstitution as well as other subpopulations, thereby rewriting a non-pathogenic sequence into position 7 in stem cells having longevity and differentiation potential. This conversion comprises a change of two base pairs for exemplary HBB5 template RNAs (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine, respectively) and the change of one base pair for exemplary HBB8 template RNAs (i.e., replacement of the DNA base adenine at nucleotide position 20 to the base cytosine).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


The template RNAs comprised the sequence set out in Example 5 labeled FYF tgRNA14 for HBB5 template RNA or tgRNA34 for HBB8 template RNA, respectively.


The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The system further comprised a second strand-targeting gRNA comprising the sequence listed in Table X1 as HBB5_g37 and HBB8_256 fw.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng template RNA with or without 2000 ng of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer and cells were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes and were transferred to 24-well plates containing 500 μL of StemSpan-XF + SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL and cultured at 37° C., 5% CO2. Three days after nucleofection, cells were stained with fluorescently labeled anti-human CD90, CD133, and CD34 antibodies, and the CD34+CD133+CD90+ and CD34+CD133+CD90− fractions were FACS-sorted and subjected to cell lysis and genomic DNA extraction. To analyze gene editing activity, primers flanking the target insertion site locus were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine (HBB5 template RNA) or replacement of the DNA base adenine at nucleotide position 20 to the base cytosine (HBB8 template RNA) downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.


As shown in FIG. 22A, editing activity levels of 19.3% and 29.8% were detected in the CD34+CD133+CD90+ HSC subpopulation after treatment with gene modifying systems comprising HBB5 template RNA or HBB8 template RNA, respectively. CD34+CD133+CD90+ cells are enriched in HSCs with long-term reconstitution potential. Editing activity levels of 23.73% and 31.5% were detected in the remainder of the HSC population (not CD34+CD133+CD90+) treated with the same exemplary gene modifying systems comprising HBB5 template RNA and HBB8 template RNA, respectively. The experiment was repeated using the exemplary HBB5 template RNA tg14_hs1 (Table X1) and a second strand-targeting gRNA (FIG. 22B), and the results showed editing activity of 56% in the CD34+CD133+CD90+ HSC-enriched fraction and 52.9% in the CD34+CD90− progenitor-enriched fraction. This result showed that the addition of silent substitutions to a template RNA (compare tg14_hs1 in FIG. 22B to FYF tgRNA14 in FIG. 22A) significantly increased the editing activity of a gene modifying system when used in long-term primary human HSCs.


These results demonstrate that exemplary gene modifying systems can write a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus in phenotypically long-term primary human HSCs. The results further demonstrate that the editing activity levels in the phenotypically long-term primary human HSCs were comparable to the levels achieved in the rest of the HSC population. The results further demonstrate a high level of editing (greater than 50%) in long-term and progenitor HSCs.


Example 20: Evaluating the Impact on Differentiation Ability of Using a Gene Editing Polypeptide and Template RNA for Rewriting the Endogenous B-Globin Locus of CD34+ Primary Human Hematopoietic Stem Cells (HSCs)

This example demonstrates that editing using a gene modifying system containing an exemplary gene modifying polypeptide and a template RNA with or without a second strand-targeting gRNA to convert the glutamic acid codon (GAG) at amino acid position 7 in the endogenous B-globin locus in primary human HSCs to alanine (GCA or GCG) (thereby rewriting a non-pathogenic sequence into position 7) does not significantly alter the differentiation ability of human HSCs. This conversion comprises a change of two base pairs for exemplary HBB5 template RNA (i.e., replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 to the bases cytosine and adenine, respectively).


In this example, the template RNA contained:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.


The template RNAs comprised the sequence set out in Example 5 labeled FYF tgRNA14 for HBB5 template RNA.


The gene modifying polypeptides tested comprised the sequence set out in Example 8 labeled RNAV209.


The system further comprised a second strand-targeting gRNA comprising the sequence listed in Table X1 as HBB5_g27.


The gene modifying system comprising the gene modifying polypeptides and the template RNA described above was transfected into human HSCs. The gene modifying polypeptide and the template RNA were delivered by nucleofection in RNA format. Specifically, 3000 ng of mRNA encoding the gene modifying polypeptide were combined with 2000 ng of template RNA, with or without 2000 ng of second strand-targeting gRNA. The RNA mixture was added to 200,000 primary human HSCs in a total of 20 μL of Lonza P3 buffer, and the cells were nucleofected in 16-well nucleofection cassettes using program DZ-100. After nucleofection, cells were incubated at room temperature for 10 minutes, transferred to 24-well plates containing 500 μL of StemSpan-XF with SCF at 100 ng/mL, Flt3-L at 100 ng/mL, and TPO at 100 ng/mL, and cultured at 37° C., 5% CO2. Two days after nucleofection, cells were cultured in semi-solid Methcult media for a colony forming assay.


To analyze gene editing activity, primers flanking the target insertion site were used to amplify across the locus. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Replacement of the DNA bases adenine and guanine at nucleotide positions 20 and 21 with the bases cytosine and adenine, respectively (HBB5 template RNA), downstream of the transcriptional start site within the endogenous B-globin locus indicated successful editing.
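The read-out described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the example. It assumes that reads are already trimmed and aligned without gaps to a short reference window, and that positions are numbered from the A of the ATG start codon, a numbering under which positions 20 and 21 fall in the GAG codon. The reference window, the function names, and the reads are hypothetical and serve only as illustration.

```python
from collections import Counter

# Assumed reference: the first 33 bases of the wild-type HBB coding sequence,
# numbered so that position 1 is the A of the ATG start codon.
REFERENCE = "ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCC"

# Intended substitutions (1-based position -> new base), as described in the text.
INTENDED_EDITS = {
    "HBB5": {20: "C", 21: "A"},  # GAG -> GCA
    "HBB8": {20: "C"},           # GAG -> GCG
}


def edited_sequence(template):
    """Return REFERENCE with the intended substitutions for one template applied."""
    bases = list(REFERENCE)
    for position, base in INTENDED_EDITS[template].items():
        bases[position - 1] = base
    return "".join(bases)


def classify_read(read, template):
    """Label a full-length, gap-free read as 'edited', 'unedited', or 'other'."""
    if read == edited_sequence(template):
        return "edited"
    if read == REFERENCE:
        return "unedited"
    return "other"


def percent_edited(reads, template):
    """Percentage of reads carrying exactly the intended substitution(s)."""
    counts = Counter(classify_read(read, template) for read in reads)
    return 100.0 * counts["edited"] / len(reads)


# Hypothetical reads, for illustration only: three unedited, one edited.
reads = [REFERENCE] * 3 + [edited_sequence("HBB5")]
print(percent_edited(reads, "HBB5"))
```

A real amplicon analysis would also need to handle sequencing errors, indels, and partial edits, which this sketch groups under "other".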


As shown in FIG. 23A, total colony CFU numbers after treating HSCs obtained from 3 different donors with an exemplary gene modifying system, with or without second strand-targeting gRNA, were comparable to total colony CFU numbers when the HSCs received mock treatment. These results demonstrate that treatment with the exemplary gene modifying systems did not significantly decrease the viability of treated HSCs. As shown in FIG. 23B, the numbers of CFU-E, BFU-E, CFU-M, CFU-GM, and CFU-G produced from CD34+ cells transfected with exemplary gene modifying systems after 14 days of clonal growth in methylcellulose were comparable to the corresponding CFU numbers for CD34+ cells that received mock treatment. FIG. 23C shows a graph of the percent enucleated CD235+ cells after HSCs treated with exemplary gene modifying systems began in vitro differentiation. The results show that HSCs treated with exemplary gene modifying systems produced similar percentages of red blood cell-like cells at a similar rate as mock treated HSCs.


These results show that editing a non-pathogenic sequence into a clinically relevant codon in the endogenous B-globin locus using exemplary gene modifying systems described herein does not have a significant effect on the differentiation ability of human HSCs.


Example 21: Screening Configurations of Template RNAs that Correct the SCD Mutation in Human CD34+ Cells with the SCD Mutation

This example describes the use of an exemplary gene modifying system containing a gene modifying polypeptide and template RNAs comprising varied lengths of heterologous object sequences and PBS sequences to identify favorable configurations for correction of the SCD mutation. In this example, a template RNA contains:

    • (1) a gRNA spacer;
    • (2) a gRNA scaffold;
    • (3) a heterologous object sequence; and
    • (4) a primer binding site (PBS) sequence.
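The four-part layout listed above can be sketched in Python by slicing an unmodified template RNA into its parts. The sketch uses the sequence listed for tgRNA34_HBB8h-SCD-wt (SEQ ID NO: 21895) in Table X4A together with the RT length (14) and PBS length (11) listed for it in Table X4. It assumes a 20-nucleotide spacer and assumes that the "RT length" column gives the length of the heterologous object sequence. The scaffold is taken to be whatever remains between the spacer and those two 3′ segments. The names `TemplateParts` and `split_template` are hypothetical.

```python
from typing import NamedTuple


class TemplateParts(NamedTuple):
    spacer: str
    scaffold: str
    heterologous_object: str
    pbs: str


SPACER_LENGTH = 20  # assumption: 20-nucleotide gRNA spacer


def split_template(sequence, rt_length, pbs_length):
    """Split a 5'-to-3' template RNA into spacer, scaffold, heterologous object, and PBS."""
    tail = rt_length + pbs_length
    return TemplateParts(
        spacer=sequence[:SPACER_LENGTH],
        scaffold=sequence[SPACER_LENGTH:-tail],
        heterologous_object=sequence[-tail:-pbs_length],
        pbs=sequence[-pbs_length:],
    )


# tgRNA34_HBB8h-SCD-wt as listed in Table X4A, written in the table's 20-nucleotide chunks.
TG34_SCD_WT = (
    "GUAACGGCAGACUUCUCCAC"
    "GUUUUAGAGCUAGAAAUAGC"
    "AAGUUAAAAUAAGGCUAGUC"
    "CGUUAUCAACUUGAAAAAGU"
    "GGCACCGAGUCGGUGCACCU"
    "GACUCCUGAGGAGAAGUCUG"
    "C"
)

parts = split_template(TG34_SCD_WT, rt_length=14, pbs_length=11)
for name, value in parts._asdict().items():
    print(f"{name}: {value} ({len(value)} nt)")
```

Under these assumptions the last two segments come out as ACCUGACUCCUGAG and GAGAAGUCUGC, which match the heterologous object sequence and PBS sequence recited in the claims.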


The template RNAs were designed to contain 8-17 nucleotide PBS sequences and 9-20 nucleotide heterologous object sequences (Table X4). Template RNAs with two different exemplary gRNA spacer sequences, HBB5 and HBB8, were used to target the SCD mutation in CD34+ SCD human cells. The heterologous object sequences and PBS sequences were designed to correct the SCD mutation by replacing a “T” nucleotide with an “A” nucleotide (Wildtype) or with a “C” nucleotide (Makassar installation) at the mutation site using a gene modifying system described herein. Template RNAs were also designed to produce either or both of 1) PAM-kill mutations or 2) one or more silent substitutions.
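The effect of the two single-base substitutions at the mutation site can be shown with a short Python sketch. It assumes that the SCD allele carries the valine codon GTG in place of the glutamic acid codon GAG, and it models only the substitution at the middle base of that codon. Silent substitutions or PAM-kill mutations carried by individual template RNAs are not modeled. The three-entry codon table and the function name are illustrative.

```python
# Minimal codon table covering only the codons needed for this illustration.
CODON_TO_AMINO_ACID = {"GTG": "Val", "GAG": "Glu", "GCG": "Ala"}

SCD_CODON = "GTG"  # assumed sickle allele codon (valine)


def substitute_middle_base(codon, new_base):
    """Replace the second base of a three-base codon."""
    return codon[0] + new_base + codon[2]


for label, new_base in (("Wildtype", "A"), ("Makassar", "C")):
    corrected = substitute_middle_base(SCD_CODON, new_base)
    print(f"{label}: {SCD_CODON} ({CODON_TO_AMINO_ACID[SCD_CODON]}) -> "
          f"{corrected} ({CODON_TO_AMINO_ACID[corrected]})")
```

Replacing the "T" with "A" restores the glutamic acid codon, while replacing it with "C" gives an alanine codon, the non-pathogenic Makassar variant.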









TABLE X4

Exemplary Template RNAs Designed to Convert SCD Mutation to Wildtype or Makassar.

| Name | Sequence | SEQ ID NO | RT length | PBS length | WT/Makassar |
|---|---|---|---|---|---|
| tg14_hs1 | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20939 | 14 | 10 | Makassar |
| tg14_hs1-SCD-wt | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrUrUrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20940 | 14 | 10 | WT |
| tgRNA34_HBB8h-SCD-M | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrCrGrGrArGrArArGrUrC*mU*mG*mC | 20941 | 14 | 11 | Makassar |
| tgRNA34_HBB8h-SCD-wt | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrUrGrArGrGrArGrArArGrUrC*mU*mG*mC | 20942 | 14 | 11 | WT |
| tgRNA34_HBB8hs4-MK | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrArGrArGrArArGrUrC*mU*mG*mC | 20943 | 14 | 11 | Makassar |
| tgRNA34_HBB8hs4-WT | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrArArGrArGrArArGrUrC*mU*mG*mC | 20944 | 14 | 11 | WT |
| tgRNA34_HBB8hs7-MK | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrUrGrArGrArArGrUrC*mU*mG*mC | 20945 | 14 | 11 | Makassar |
| tgRNA34_HBB8hs7-WT | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrArUrGrArGrArArGrUrC*mU*mG*mC | 20946 | 14 | 11 | WT |
| tgRNA34_HBB8hs10-MK | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrCrGrArGrArArGrUrC*mU*mG*mC | 20947 | 14 | 11 | Makassar |
| tgRNA34_HBB8hs10-WT | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrArCrGrArGrArArGrUrC*mU*mG*mC | 20948 | 14 | 11 | WT |
| tgRNA34_HBB8hs13-MK | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrCrGrGrArGrArArGrUrC*mU*mG*mC | 20949 | 14 | 11 | Makassar |
| tgRNA34_HBB8hs13-WT | mG*mU*mA*rArCrGrGrCrArGrArCrUrUrCrUrCrCrArCrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArCrCrUrGrArCrUrCrCrCrGrArGrGrArGrArArGrUrC*mU*mG*mC | 20950 | 14 | 11 | WT |
| tg14_PAMT_hs1-SCD-wt | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrArGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20951 | 14 | 10 | WT |
| tg14PAMC_hs1-SCD-wt | mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrGrGrUrUrUrUrArGrAmGmCmUmAmGmAmAmAmUmAmGmCrArArGrUrUrArArArArUrArArGrGrCrUrArGrUrCrCrGrUrUrArUrCrAmAmCmUmUmGmAmAmAmAmAmGmUmGmGmCmAmCmCmGmAmGmUmCmGmGmUmGmCrArGrArCrUrUrCrUrCrGrGrCrCrGrGrArGrUrCrArG*mG*mU*mG | 20952 | 14 | 10 | WT |

Table X4A shows the sequences of Table X4 without chemical modifications. In some embodiments, the sequences in this table can be used without chemical modifications.
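The relationship between the two tables can be sketched in Python by reducing a chemically annotated sequence to its bare bases. The sketch assumes a commonly used notation in which a leading "m" marks a 2′-O-methyl nucleotide, a leading "r" marks an unmodified ribonucleotide, and a trailing "*" marks a phosphorothioate linkage. This section does not define the notation, so that reading is an assumption. The function names are hypothetical.

```python
import re

# One token = optional prefix ("m" or "r"), one base, optional trailing "*".
TOKEN = re.compile(r"([mr]?)([ACGU])(\*?)")


def strip_modifications(annotated):
    """Return only the bases of an annotated RNA sequence, in order."""
    return "".join(match.group(2) for match in TOKEN.finditer(annotated))


def count_modifications(annotated):
    """Count tokens with an 'm' prefix and tokens followed by '*'."""
    tokens = TOKEN.findall(annotated)
    return {
        "m_prefixed": sum(1 for prefix, _, _ in tokens if prefix == "m"),
        "star_suffixed": sum(1 for _, _, star in tokens if star == "*"),
    }


# The first 20 nucleotides (spacer) of tg14_hs1 as annotated in Table X4.
annotated_spacer = "mC*mA*mU*rGrGrUrGrCrArCrCrUrGrArCrUrCrCrUrG"
print(strip_modifications(annotated_spacer))
print(count_modifications(annotated_spacer))
```

Applied to this fragment, the function yields CAUGGUGCACCUGACUCCUG, which is the first 20 nucleotides listed for tg14_hs1 in Table X4A.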









TABLE X4A

Table X4 Sequences without Modifications

| Name | Sequence | SEQ ID NO |
|---|---|---|
| tg14_hs1 | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUGCCGGAGUCAGGUG | 21892 |
| tg14_hs1-SCD-wt | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCUUCCGGAGUCAGGUG | 21893 |
| tgRNA34_HBB8h-SCD-M | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGCGGAGAAGUCUGC | 21894 |
| tgRNA34_HBB8h-SCD-wt | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCUGAGGAGAAGUCUGC | 21895 |
| tgRNA34_HBB8hs4-MK | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCAGAGAAGUCUGC | 21896 |
| tgRNA34_HBB8hs4-WT | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGAAGAGAAGUCUGC | 21897 |
| tgRNA34_HBB8hs7-MK | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCUGAGAAGUCUGC | 21898 |
| tgRNA34_HBB8hs7-WT | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGAUGAGAAGUCUGC | 21899 |
| tgRNA34_HBB8hs10-MK | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCCGAGAAGUCUGC | 21900 |
| tgRNA34_HBB8hs10-WT | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGACGAGAAGUCUGC | 21901 |
| tgRNA34_HBB8hs13-MK | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGCGGAGAAGUCUGC | 21902 |
| tgRNA34_HBB8hs13-WT | GUAACGGCAGACUUCUCCACGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCACCUGACUCCCGAGGAGAAGUCUGC | 21903 |
| tg14_PAMThs1-SCD-wt | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCAGCCGGAGUCAGGUG | 21904 |
| tg14_PAMChs1-SCD-wt | CAUGGUGCACCUGACUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCAGACUUCUCGGCCGGAGUCAGGUG | 21905 |


Exemplary gene modifying systems comprising mRNA encoding the gene modifying polypeptide and a template RNA from Table X4 with or without second strand-targeting gRNA (e.g., from Table X1) are used to transfect human HSCs harboring the SCD mutation. The gene modifying system is used to correct the SCD mutation by replacing a “T” nucleotide with an “A” (wildtype) or “C” (Makassar) nucleotide at the mutation site in the endogenous B-globin locus in primary human HSCs. Amplicon sequencing will be used to show editing at the mutation site in the endogenous B-globin locus in primary human HSCs.


The results will show that exemplary gene modifying systems have editing activity when correcting the SCD mutation in the endogenous B-globin locus in primary human HSCs.


Herein, when an RNA sequence (e.g., a template RNA sequence) is said to comprise a particular sequence (e.g., a sequence of Table A or Table B or a portion thereof) that comprises thymine (T), it is of course understood that the RNA sequence may (and frequently does) comprise uracil (U) in place of T. For instance, the RNA sequence may comprise U at every position shown as T in the sequence in Table A or B. More specifically, the present disclosure provides an RNA sequence according to every template sequence shown in Table A and B, wherein the RNA sequence has a U in place of each T in the sequence of Table A and B.
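The T-to-U convention described above amounts to a simple character substitution, sketched here in Python. The function name is hypothetical, and the example input is the DNA-alphabet form of the PBS sequence GAGAAGUCUGC recited in the claims.

```python
def dna_alphabet_to_rna(sequence):
    """Return the sequence with U in place of every T, preserving case."""
    return sequence.replace("T", "U").replace("t", "u")


print(dna_alphabet_to_rna("GAGAAGTCTGC"))
```

A sequence that already uses U throughout is returned unchanged.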


It should be understood that for all numerical bounds describing some parameter in this application, such as “about,” “at least,” “less than,” and “more than,” the description also necessarily encompasses any range bounded by the recited values. Accordingly, for example, the description “at least 1, 2, 3, 4, or 5” also describes, inter alia, the ranges 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, and 4-5, et cetera.


For all patents, applications, or other references cited herein, such as non-patent literature and reference sequence information, it should be understood that they are incorporated by reference in their entirety for all purposes as well as for the proposition that is recited. Where any conflict exists between a document incorporated by reference and the present application, this application will control. All information associated with reference gene sequences disclosed in this application, such as GeneIDs or accession numbers (typically referencing NCBI accession numbers), including, for example, genomic loci, genomic sequences, functional annotations, allelic variants, and reference mRNA (including, e.g., exon boundaries or response elements) and protein sequences (such as conserved domain structures), as well as chemical references (e.g., PubChem compound, PubChem substance, or PubChem Bioassay entries, including the annotations therein, such as structures and assays, et cetera), are hereby incorporated by reference in their entirety.


Headings used in this application are for convenience only and do not affect the interpretation of this application.










LENGTHY TABLES




The patent application contains a lengthy table section. A copy of the table is available in electronic form from the USPTO web site. An electronic copy of the table will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).





Claims
  • 1. A template RNA comprising from 5′ to 3′: a) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer comprises a sequence according to SEQ ID NO: 20,027; b) a gRNA scaffold that binds a SpCas9; c) a heterologous object sequence comprising a mutation region to correct a mutation in a second portion of the human HBB gene; and d) a primer binding site (PBS) sequence comprising 8 bases with 100% identity to a third portion of the human HBB gene, wherein the PBS sequence comprises a nucleotide sequence comprising GAGAAGUCUGC.
  • 2. The template RNA of claim 1, wherein the mutation to be corrected in the human HBB gene is E6V.
  • 3. The template RNA of claim 1, wherein the gRNA spacer has a length of 20 nucleotides.
  • 4. The template RNA of claim 1, wherein the heterologous object sequence has a length of 10-20 nucleotides.
  • 5. The template RNA of claim 1, wherein the heterologous object sequence comprises, from 5′ to 3′, a post-edit homology region, a mutation region, and a pre-edit homology region.
  • 6. The template RNA of claim 1, wherein the heterologous object sequence has an RNA sequence of (i) ACCUGACUCCUGAG, (ii) ACCUGACUCCCGAG, or (iii) an RNA sequence having at least 90% identity thereto.
  • 7. The template RNA of claim 1, wherein the PBS sequence has a length of 11-16 nucleotides.
  • 8. The template RNA of claim 1, wherein the PBS sequence consists of an RNA sequence of GAGAAGUCUGC.
  • 9. The template RNA of claim 1, wherein the gRNA scaffold comprises an RNA sequence having at least 90% identity to SEQ ID NO: 20,117.
  • 10. The template RNA of claim 1, wherein the gRNA scaffold comprises an RNA sequence according to SEQ ID NO: 20,117.
  • 11. The template RNA of claim 1, which comprises an RNA sequence having at least 90% identity to SEQ ID NO: 21,963, SEQ ID NO: 20,567, or SEQ ID NO: 21,903.
  • 12. The template RNA of claim 1, which comprises an RNA sequence according to SEQ ID NO: 21,963, SEQ ID NO: 20,567, or SEQ ID NO: 21,903.
  • 13. The template RNA of claim 1, wherein the mutation region comprises a first region designed to correct a pathogenic mutation in the HBB gene and a second region designed to introduce a silent substitution.
  • 14. The template RNA of claim 1, which comprises one or more chemically modified nucleotides.
  • 15. The template RNA of claim 14, which comprises the RNA sequence and chemical modifications set out in SEQ ID NO: 20,942, SEQ ID NO: 20,477, or SEQ ID NO: 20,950.
  • 16. A gene modifying system comprising: a template RNA of claim 1, and a gene modifying polypeptide, or a nucleic acid encoding the gene modifying polypeptide.
  • 17. The gene modifying system of claim 16, which comprises the nucleic acid encoding the gene modifying polypeptide, wherein the nucleic acid comprises RNA.
  • 18. The gene modifying system of claim 16, wherein the gene modifying polypeptide comprises: a reverse transcriptase (RT) domain; a Cas domain; and a linker disposed between the RT domain and the Cas domain.
  • 19. The gene modifying system of claim 18, wherein the Cas domain is a SpCas9 domain.
  • 20. The gene modifying system of claim 18, wherein the RT domain is an RT domain from a murine leukemia virus (MMLV), a porcine endogenous retrovirus (PERV); Avian reticuloendotheliosis virus (AVIRE), a feline leukemia virus (FLV), simian foamy virus (SFV) (e.g., SFV3L), bovine leukemia virus (BLV), Mason-Pfizer monkey virus (MPMV), human foamy virus (HFV), or bovine foamy/syncytial virus (BFV/BSV).
  • 21. The gene modifying system of claim 16, which further comprises a second strand-targeting gRNA spacer that directs a second nick to the second strand of the human HBB gene.
  • 22. A pharmaceutical composition, comprising the gene modifying system of claim 16 and a pharmaceutically acceptable excipient or carrier.
  • 23. The pharmaceutical composition of claim 22, wherein the pharmaceutically acceptable excipient or carrier is selected from the group consisting of a plasmid vector, a viral vector, a vesicle, and a lipid nanoparticle.
  • 24. A method of making the template RNA of claim 1, the method comprising synthesizing the template RNA by in vitro transcription, solid-phase synthesis, or by introducing a DNA encoding the template RNA into a host cell under conditions that allow for production of the template RNA.
  • 25. A method for modifying a target site in the human HBB gene in a cell, the method comprising contacting the cell with the gene modifying system of claim 16, or DNA encoding the same, thereby modifying the target site in the human HBB gene in a cell.
  • 26. A method for treating a subject having a disease or condition associated with a mutation in the human HBB gene, the method comprising administering to the subject the gene modifying system of claim 16, or DNA encoding the same, thereby treating the subject having a disease or condition associated with a mutation in the human HBB gene.
  • 27. A template RNA comprising, e.g., from 5′ to 3′: (i) a gRNA spacer that is complementary to a first portion of the human HBB gene, wherein the gRNA spacer has a sequence comprising the core nucleotides of a gRNA spacer sequence of Table 1, or wherein the gRNA spacer has a sequence of a spacer chosen from Table A, Table AA, Table B, Table B1, Tables 5A-5D, Table X4, or Table X4A; (ii) a gRNA scaffold that binds a gene modifying polypeptide, (iii) a heterologous object sequence comprising a mutation region to introduce a mutation into a second portion of the human HBB gene, and (iv) a primer binding site (PBS) sequence comprising at least 3, 4, 5, 6, 7, or 8 bases with 100% identity to a third portion of the human HBB gene,
  • 28. A gene modifying system comprising: a template RNA of claim 27, and a gene modifying polypeptide, or a nucleic acid encoding the gene modifying polypeptide.
  • 29. A method for modifying a target site in the human HBB gene in a cell, the method comprising contacting the cell with the gene modifying system of claim 28, or DNA encoding the same, thereby modifying the target site in the human HBB gene in a cell.
  • 30. A method for treating a subject having a disease or condition associated with a mutation in the human HBB gene, the method comprising administering to the subject the gene modifying system of claim 28, or DNA encoding the same, thereby treating the subject having a disease or condition associated with a mutation in the human HBB gene.
Provisional Applications (3)
Number Date Country
63241994 Sep 2021 US
63250143 Sep 2021 US
63303900 Jan 2022 US
Continuations (1)
Number Date Country
Parent PCT/US22/76063 Sep 2022 WO
Child 18590275 US