OLIGONUCLEOTIDES TARGETING XBP1

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

A Sequence Listing conforming to the rules of WIPO Standard ST.26 is hereby incorporated by reference. Said Sequence Listing has been filed as an electronic document via PatentCenter encoded as XML in UTF-8 text. The electronic document, created on Jun. 22, 2023, is entitled “105703-1389885-P36582-US_ST26.xml”, and is 8,699,236 bytes in size.

FIELD OF THE INVENTION

The present invention relates to oligonucleotides which induce expression of an XBP1 splice variant. Such oligonucleotides can enhance the level and/or quality of protein expression in cells and have utility in mammalian protein expression systems, such as heterologous protein expression systems. The oligonucleotides also have therapeutic utilities including the treatment or prevention of proteopathological diseases.

BACKGROUND

XBP1, X-box binding protein 1, is a transcription factor which mediates adaptation to ER stress by inducing genes that are involved in protein folding and quality control.

The XBP1 transcript exists in different splice forms, including a splice variant whose expression is regulated by IRE1α (inositol-requiring enzyme 1 alpha). In mammalian cells, IRE1α excises a 26 nucleotide fragment from the XBP1 mRNA under endoplasmic reticulum (ER) stress to generate a splice variant that encodes the functionally active XBP1s protein.

The excision of the 26 nucleotide fragment results in a +2 out of frame event, resulting in the expression of the active XBP1 transcription factor (XBP-1S). The 26 nucleotide fragment is present in exon 4 of XBP1 mature mRNA.
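The frame arithmetic behind this statement can be sketched as follows. This is a minimal illustration only, assuming that the "+2 out of frame event" refers to the remainder left when the length of the removed fragment is divided by the codon length of 3; the function names are hypothetical and not part of the disclosure.

```python
CODON_LENGTH = 3

def frame_shift(removed_nucleotides: int) -> int:
    """Return the reading-frame offset (0, 1 or 2) left by removing a fragment.

    Assumption: the offset is the remainder of the fragment length
    divided by the codon length.
    """
    if removed_nucleotides < 0:
        raise ValueError("fragment length must be non-negative")
    return removed_nucleotides % CODON_LENGTH

def stays_in_frame(removed_nucleotides: int) -> bool:
    """True if removing the fragment leaves the downstream frame unchanged."""
    return frame_shift(removed_nucleotides) == 0

# The 26 nucleotide fragment excised by IRE1-alpha: 26 = 8 * 3 + 2
ire1_fragment_length = 26
shift = frame_shift(ire1_fragment_length)
```

Under this reading, any removed segment whose length leaves a remainder of 2 shifts the downstream reading frame in the same way as the 26 nucleotide excision.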

Cain et al. (Biotechnol Prog 2013; 29(3):697-706) report on Chinese hamster ovary (CHO) cells engineered to express both X-box binding protein (XBP-1S) and endoplasmic reticulum oxidoreductase (ERO1-Lα) (CHOS-XE cells), which provide increased antibody yields (5.3-6.2 fold) in comparison to CHOS cells.

Tong et al. (J. Neurochem. 2012 November; 123(3): 406-416) report on the over-expression of mutant TDP-43 in transgenic rats, which resulted in prominent aggregation of ubiquitin and fragmentation of Golgi complexes, prior to neuronal loss. Notably, the aggregation of ubiquitin and fragmentation of Golgi complexes was further preceded by depletion of XBP1 and inactivation of the unfolded protein response (UPR). This indicates that there is a need for restoring or up-regulating the XBP1 mediated UPR in diseases associated with aberrant protein folding (proteopathological diseases), such as neurodegenerative diseases, including TDP-43 pathologies, e.g. frontotemporal lobar degeneration (FTLD) and ALS.

In WO 2003/89622 novel genes, compositions, and methods for modulating the unfolded protein response are disclosed.

In WO 2019/004939 antisense oligonucleotides for modulating the function of a T cell are disclosed.

In WO 2008/016356 the genemap of the human genes associated with psoriasis is disclosed.

OBJECT OF THE INVENTION

The inventors have surprisingly determined that an active XBP1 splice variant has applications in methods of protein production as well as in therapeutic methods, primarily relating to the treatment of proteopathological diseases.

The inventors have surprisingly determined that an active XBP1 splice variant can be produced using an antisense oligonucleotide which is complementary, such as fully complementary, to a portion of the XBP1 pre-mRNA transcript. This XBP1 splice variant may be an XBP1Δ4 splice variant (XBP1 splice variant with deleted exon 4). XBP1 exon 4 comprises the 26 nucleotide fragment which is excised by IRE1α in vivo, and as with the in vivo IRE1α 26 nucleotide excision event, the skipping of exon 4 introduces a +2 out of frame event.

The current invention is based, at least in part, on the finding that the generation or expression of the XBP1Δ4 variant in recombinant mammalian cells results in an enhanced expression of heterologously expressed proteins, such as monoclonal antibodies, particularly heterologously expressed proteins which are otherwise difficult to express. With the expression of the XBP1Δ4 variant, protein expression with enhanced quality in mammalian cells can be obtained.

The current invention is based, at least in part, on the finding that compounds, such as antisense oligonucleotides, which induce the generation or expression of XBP1Δ4 in mammalian cells, are useful in enhancing the recombinant expression of heterologously expressed proteins in mammalian cells. In particular, compounds, such as antisense oligonucleotides, which induce the expression of XBP1Δ4 in mammalian cells, are useful in enhancing the recombinant expression of correctly folded heterologously expressed proteins in mammalian cells.

The current invention is based, at least in part, on the finding that antisense oligonucleotides which induce the expression of XBP1Δ4 in mammalian cells are useful for the treatment of proteopathological diseases.

SUMMARY OF THE INVENTION

According to one aspect, the invention provides an antisense oligonucleotide for use in the generation or expression of an XBP1 splice variant in a cell which expresses XBP1, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of 8-40 nucleotides in length which is complementary to a mammalian XBP1 pre-mRNA transcript.

The XBP1 splice variant may be an XBP1Δ4 variant.

The contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1), such as at least 10 contiguous nucleotides from nucleotides 2960-3113 of SEQ ID NO 1 or at least 10 contiguous nucleotides from nucleotides 2986-3018 of SEQ ID NO 1.
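The length and complementarity criteria above can be illustrated with a short check. This is a hedged sketch only: the sequences are toy examples and are not taken from the Sequence Listing, the function names are hypothetical, and only standard Watson-Crick pairing is considered (modified nucleosides and mismatches are ignored). Positions are 1-based and inclusive, as in the text.

```python
# Toy illustration; none of these sequences come from the Sequence Listing.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement, written 5'->3' in DNA letters."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def has_valid_length(oligo: str, min_len: int = 8, max_len: int = 40) -> bool:
    """Check the 8-40 nucleotide length window stated in the text."""
    return min_len <= len(oligo) <= max_len

def is_complementary_within(oligo: str, transcript: str, start: int, end: int,
                            min_contiguous: int = 10) -> bool:
    """True if at least `min_contiguous` contiguous nucleotides of the oligo
    are complementary to a stretch inside transcript positions start..end."""
    region = transcript.upper().replace("U", "T")[start - 1:end]
    oligo = oligo.upper().replace("U", "T")
    # Any longer complementary stretch contains a window of exactly
    # `min_contiguous` nucleotides, so checking those windows is sufficient.
    for i in range(len(oligo) - min_contiguous + 1):
        window = oligo[i:i + min_contiguous]
        if reverse_complement(window) in region:
            return True
    return False

# Hypothetical 29 nt "transcript" and an oligo designed against positions 6-17.
toy_transcript = "GGGGGATCGATTACGCATGCAAGTCCCCC"
toy_oligo = reverse_complement(toy_transcript[5:17])
```

With a real transcript, the same call would take the sequence of SEQ ID NO 1 and the region bounds 2960 and 3113 (or 2986 and 3018) as arguments.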

The contiguous nucleotide sequence may be complementary to a sequence selected from the group consisting of SEQ ID NO 299, SEQ ID NO 301, SEQ ID NO 302, SEQ ID NO 304, SEQ ID NO 305, SEQ ID NO 306, SEQ ID NO 307, SEQ ID NO 308, SEQ ID NO 309, SEQ ID NO 310, SEQ ID NO 314, SEQ ID NO 316, SEQ ID NO 317, SEQ ID NO 318, SEQ ID NO 319, SEQ ID NO 323, SEQ ID NO 325, SEQ ID NO 327, SEQ ID NO 328, SEQ ID NO 330, SEQ ID NO 331, SEQ ID NO 332, SEQ ID NO 333, SEQ ID NO 334, SEQ ID NO 336, SEQ ID NO 337, SEQ ID NO 385, SEQ ID NO 386, SEQ ID NO 387, SEQ ID NO 388, SEQ ID NO 390, SEQ ID NO 391, SEQ ID NO 392, SEQ ID NO 393, SEQ ID NO 394, SEQ ID NO 395, SEQ ID NO 396, SEQ ID NO 397, SEQ ID NO 398, SEQ ID NO 399, SEQ ID NO 401, SEQ ID NO 402, SEQ ID NO 419, SEQ ID NO 431, SEQ ID NO 432, SEQ ID NO 433, SEQ ID NO 434, SEQ ID NO 438, SEQ ID NO 439, SEQ ID NO 440, SEQ ID NO 441, SEQ ID NO 442, SEQ ID NO 449, SEQ ID NO 484, SEQ ID NO 485, SEQ ID NO 486, SEQ ID NO 487, SEQ ID NO 488, SEQ ID NO 489, SEQ ID NO 490, SEQ ID NO 491, SEQ ID NO 492, SEQ ID NO 493, SEQ ID NO 494, SEQ ID NO 495, SEQ ID NO 496, SEQ ID NO 497, SEQ ID NO 498, SEQ ID NO 499, SEQ ID NO 500, SEQ ID NO 501, SEQ ID NO 502, SEQ ID NO 503, SEQ ID NO 505, SEQ ID NO 506, SEQ ID NO 507, SEQ ID NO 508, SEQ ID NO 509, SEQ ID NO 510, SEQ ID NO 511, SEQ ID NO 512, SEQ ID NO 513, SEQ ID NO 515, SEQ ID NO 517, SEQ ID NO 520, SEQ ID NO 572, SEQ ID NO 573, SEQ ID NO 576, SEQ ID NO 577, SEQ ID NO 588 and SEQ ID NO 589.

The contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 8, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 32, SEQ ID NO 34, SEQ ID NO 36, SEQ ID NO 37, SEQ ID NO 39, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 94, SEQ ID NO 95, SEQ ID NO 96, SEQ ID NO 97, SEQ ID NO 99, SEQ ID NO 100, SEQ ID NO 101, SEQ ID NO 102, SEQ ID NO 103, SEQ ID NO 104, SEQ ID NO 105, SEQ ID NO 106, SEQ ID NO 107, SEQ ID NO 108, SEQ ID NO 110, SEQ ID NO 111, SEQ ID NO 128, SEQ ID NO 140, SEQ ID NO 141, SEQ ID NO 142, SEQ ID NO 143, SEQ ID NO 147, SEQ ID NO 148, SEQ ID NO 149, SEQ ID NO 150, SEQ ID NO 151, SEQ ID NO 158, SEQ ID NO 193, SEQ ID NO 194, SEQ ID NO 195, SEQ ID NO 196, SEQ ID NO 197, SEQ ID NO 198, SEQ ID NO 199, SEQ ID NO 200, SEQ ID NO 201, SEQ ID NO 202, SEQ ID NO 203, SEQ ID NO 204, SEQ ID NO 205, SEQ ID NO 206, SEQ ID NO 207, SEQ ID NO 208, SEQ ID NO 209, SEQ ID NO 210, SEQ ID NO 211, SEQ ID NO 212, SEQ ID NO 214, SEQ ID NO 215, SEQ ID NO 216, SEQ ID NO 217, SEQ ID NO 218, SEQ ID NO 219, SEQ ID NO 220, SEQ ID NO 221, SEQ ID NO 222, SEQ ID NO 224, SEQ ID NO 226, SEQ ID NO 229, SEQ ID NO 281, SEQ ID NO 282, SEQ ID NO 285, SEQ ID NO 286, SEQ ID NO 297 and SEQ ID NO 298.

The contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides of the mouse XBP1 pre-mRNA transcript (SEQ ID NO 590).

The contiguous nucleotide sequence may be complementary to a sequence selected from the group consisting of SEQ ID NO 699, SEQ ID NO 700, SEQ ID NO 703, SEQ ID NO 710, SEQ ID NO 713, SEQ ID NO 724, SEQ ID NO 729, SEQ ID NO 739, SEQ ID NO 743, SEQ ID NO 744, SEQ ID NO 745, SEQ ID NO 749, SEQ ID NO 750, SEQ ID NO 751, SEQ ID NO 752, SEQ ID NO 753, SEQ ID NO 754, SEQ ID NO 755, SEQ ID NO 756, SEQ ID NO 757, SEQ ID NO 758, SEQ ID NO 759, SEQ ID NO 760, SEQ ID NO 761, SEQ ID NO 762, SEQ ID NO 763, SEQ ID NO 773, SEQ ID NO 776, SEQ ID NO 778, SEQ ID NO 781, SEQ ID NO 783, SEQ ID NO 784, SEQ ID NO 785, SEQ ID NO 787, SEQ ID NO 789, SEQ ID NO 790, SEQ ID NO 791, SEQ ID NO 792, SEQ ID NO 793, SEQ ID NO 794, SEQ ID NO 795, SEQ ID NO 796, SEQ ID NO 797, SEQ ID NO 798, SEQ ID NO 799 and SEQ ID NO 800.

The contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 597, SEQ ID NO 598, SEQ ID NO 601, SEQ ID NO 608, SEQ ID NO 611, SEQ ID NO 622, SEQ ID NO 627, SEQ ID NO 637, SEQ ID NO 641, SEQ ID NO 642, SEQ ID NO 643, SEQ ID NO 647, SEQ ID NO 648, SEQ ID NO 649, SEQ ID NO 650, SEQ ID NO 651, SEQ ID NO 652, SEQ ID NO 653, SEQ ID NO 654, SEQ ID NO 655, SEQ ID NO 656, SEQ ID NO 657, SEQ ID NO 658, SEQ ID NO 659, SEQ ID NO 660, SEQ ID NO 661, SEQ ID NO 671, SEQ ID NO 674, SEQ ID NO 676, SEQ ID NO 679, SEQ ID NO 681, SEQ ID NO 682, SEQ ID NO 683, SEQ ID NO 685, SEQ ID NO 687, SEQ ID NO 688, SEQ ID NO 689, SEQ ID NO 690, SEQ ID NO 691, SEQ ID NO 692, SEQ ID NO 693, SEQ ID NO 694, SEQ ID NO 695, SEQ ID NO 696, SEQ ID NO 697 and SEQ ID NO 698.

The contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides of the human XBP1 pre-mRNA transcript (SEQ ID NO 801).

The contiguous nucleotide sequence may be complementary to a sequence selected from the group consisting of SEQ ID NO 947, SEQ ID NO 948, SEQ ID NO 949, SEQ ID NO 950, SEQ ID NO 951 and SEQ ID NO 988.

The contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 854, SEQ ID NO 855, SEQ ID NO 856, SEQ ID NO 857, SEQ ID NO 858 and SEQ ID NO 895.

The antisense oligonucleotide or contiguous nucleotide sequence thereof may be fully complementary to a mammalian XBP1 pre-mRNA transcript.

The contiguous nucleotide sequence may be the same length as the antisense oligonucleotide.

The antisense oligonucleotide may be isolated, purified or manufactured.

The antisense oligonucleotide or contiguous nucleotide sequence thereof may comprise one or more modified nucleotides or one or more modified nucleosides.

The antisense oligonucleotide or contiguous nucleotide sequence thereof may be or comprise an antisense oligonucleotide mixmer or totalmer.

The invention includes conjugates and pharmaceutically acceptable salts of the antisense oligonucleotides of the invention as well as compositions and pharmaceutical compositions comprising the antisense oligonucleotides of the invention.

In another aspect, the invention provides an isolated XBP1Δ4 protein.

The isolated XBP1Δ4 protein of the invention may comprise the sequence of SEQ ID NO: 7, SEQ ID NO: 596 or SEQ ID NO 807.

In another aspect, the invention provides an isolated mRNA encoding the XBP1Δ4 protein of the invention.

The isolated mRNA of the invention may comprise the sequence of SEQ ID NO: 7, SEQ ID NO: 595 or SEQ ID NO: 806.

In another aspect, the invention provides a method for producing a polypeptide comprising the steps of:

- a) cultivating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide; and
- b) recovering the polypeptide from the cells or the cultivation medium;
- characterized in that the cultivating is in the presence of an antisense oligonucleotide, a composition, a pharmaceutical composition, a protein or an mRNA of the invention.

Within the invention, the method may comprise the steps of:

- a1) propagating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide, in a cultivation medium comprising an antisense oligonucleotide according to the invention, to obtain a first cell population;
- a2) mixing an aliquot of the first cell population with cultivation medium optionally comprising the antisense oligonucleotide to obtain a second cell population;
- a3) cultivating the second cell population to obtain a third cell population; and
- b) recovering the polypeptide from the cells and/or the cultivation medium of the third cell cultivation.

Within the method of the invention, the antisense oligonucleotide may be added to a final concentration of 25 μM or more.

Within the method of the invention the cells resulting in the first cell population may be cultivated at a starting cell density of 0.5*10E6 to 4*10E6 cells/mL.

Within the method of the invention, the second cell population may have a cell density of 0.5*10E6 to 10*10E6 cells/mL.
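The concentration and cell density figures above translate into simple bench arithmetic, sketched below. This is an illustrative calculation only: it assumes that "0.5*10E6" is to be read as 0.5 × 10^6 cells/mL, and the stock concentration, culture volumes and function names are hypothetical values chosen for the example rather than values from the disclosure.

```python
def stock_volume_ml(final_conc_um: float, final_volume_ml: float,
                    stock_conc_um: float) -> float:
    """Volume of oligonucleotide stock needed (C1 * V1 = C2 * V2).

    `final_volume_ml` is the total culture volume after addition.
    """
    if stock_conc_um <= final_conc_um:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc_um * final_volume_ml / stock_conc_um

def aliquot_volume_ml(source_density: float, target_density: float,
                      final_volume_ml: float) -> float:
    """Volume of the first cell population to dilute into fresh medium
    so that the second population starts at `target_density` (cells/mL)."""
    if target_density > source_density:
        raise ValueError("cannot concentrate cells by dilution")
    return target_density * final_volume_ml / source_density

def density_in_range(density: float, low: float = 0.5e6,
                     high: float = 10e6) -> bool:
    """Check a density against the 0.5e6 to 10e6 cells/mL window
    stated for the second cell population (assumed reading of the notation)."""
    return low <= density <= high

# Hypothetical example: 25 uM final in 10 mL from an assumed 1000 uM stock.
oligo_ml = stock_volume_ml(final_conc_um=25, final_volume_ml=10,
                           stock_conc_um=1000)
# Hypothetical example: dilute a culture at 8e6 cells/mL to 2e6 cells/mL in 30 mL.
cells_ml = aliquot_volume_ml(source_density=8e6, target_density=2e6,
                             final_volume_ml=30)
```

For the first cell population, the same range check can be called with `high=4e6` to reflect the narrower starting density window.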

Within the method of the invention, the mammalian cell may be a CHO cell.

Within the method of the invention, the polypeptide may be an antibody.

One aspect of the invention is a method for the recombinant production of a multimeric polypeptide comprising the steps of:

- a) cultivating a mammalian cell, which comprises one or more nucleic acids encoding the multimeric polypeptide and which is expressing XBP1, in the presence of a nucleic acid according to the invention, which is inducing the formation of an XBP1 variant, in one preferred embodiment the XBP1 variant is XBP1Δ4; and
- b) recovering the multimeric polypeptide from the cells or the cultivation medium.

One further aspect of the invention is a method for the recombinant production of a multimeric polypeptide comprising the steps of:

- a) cultivating a mammalian cell, which comprises one or more nucleic acids encoding the multimeric polypeptide and which is expressing XBP1, in the presence of a nucleic acid according to the invention, which is inducing the skipping of exon 4 in XBP1 mRNA, whereby a +2 out of frame event is introduced; and
- b) recovering the multimeric polypeptide from the cells or the cultivation medium.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the method comprises the steps of:

- a1) propagating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide, in a cultivation medium comprising a nucleic acid according to the invention, which is inducing the formation of an XBP1 variant, in one preferred embodiment the XBP1 variant is XBP1Δ4, to obtain a first cell population;
- a2) mixing an aliquot of the first cell population with cultivation medium optionally comprising the same or a different nucleic acid according to the invention, which is inducing the formation of the XBP1 variant XBP1Δ4, to obtain a second cell population;
- a3) cultivating the second cell population to obtain a third cell population; and
- b) recovering the multimeric polypeptide from the cells and/or the cultivation medium of the third cell cultivation.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the method comprises the steps of:

- a1) propagating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide, in a cultivation medium comprising a nucleic acid according to the invention, which is inducing the skipping of exon 4 in XBP1 mRNA, whereby a +2 out of frame event is introduced, to obtain a first cell population;
- a2) mixing an aliquot of the first cell population with cultivation medium optionally comprising the same or a different nucleic acid according to the invention, which is inducing the skipping of exon 4 in XBP1 mRNA, whereby a +2 out of frame event is introduced, to obtain a second cell population;
- a3) cultivating the second cell population to obtain a third cell population; and
- b) recovering the multimeric polypeptide from the cells and/or the cultivation medium of the third cell cultivation.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the nucleic acid according to the invention is an antisense oligonucleotide.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the nucleic acid according to the invention is complementary to at least 10 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1), such as at least 10 contiguous nucleotides from nucleotides 2960-3113 of SEQ ID NO 1 or at least 10 contiguous nucleotides from nucleotides 2986-3018 of SEQ ID NO 1.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the nucleic acid according to the invention is complementary to a sequence selected from the group consisting of SEQ ID NO 947, SEQ ID NO 948, SEQ ID NO 949, SEQ ID NO 950, SEQ ID NO 951 and SEQ ID NO 988.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the nucleic acid according to the invention is selected from the group consisting of SEQ ID NO 854, SEQ ID NO 855, SEQ ID NO 856, SEQ ID NO 857, SEQ ID NO 858 and SEQ ID NO 895.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the XBP1 variant comprises the sequence of SEQ ID NO: 7, SEQ ID NO: 596 or SEQ ID NO 807.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the XBP1 variant is encoded by the sequence of SEQ ID NO: 7, SEQ ID NO: 595 or SEQ ID NO: 806.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the nucleic acid according to the invention is added to a final concentration of 25 μM or more.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the cells resulting in the first cell population are cultivated with a starting cell density of 0.5*10E6 to 4*10E6 cells/mL.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the second cell population has a starting cell density of 0.5*10E6 to 10*10E6 cells/mL.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the mammalian cell is a CHO cell.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the mammalian cell is a HEK cell.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the mammalian cell is a SP2/0 cell.

In certain embodiments of all aspects and embodiments of the method for the recombinant production of a multimeric polypeptide, the multimeric polypeptide is an antibody. In certain embodiments, the antibody is a bispecific antibody. In certain embodiments, the bispecific antibody is a full-length antibody with domain exchange or an antibody-multimer-fusion. In certain embodiments, the bispecific antibody is a trivalent, bispecific antibody. In certain embodiments, the bispecific, trivalent antibody is a full-length antibody with domain exchange and additional heavy chain C-terminal binding site or a full-length antibody with an additional heavy chain C-terminal binding site with domain exchange or a T-cell bispecific antibody. In certain embodiments, the antibody is bi- or trivalent.

One aspect of the invention is the use of the nucleic acid according to the invention to enhance the yield or the quality of multimeric polypeptides produced by recombinant protein expression systems, for example in the manufacture of antibodies, such as monoclonal antibodies.

In certain embodiments of all aspects and embodiments of the use of the nucleic acid according to the invention, the nucleic acid according to the invention is an antisense oligonucleotide.

In certain embodiments of all aspects and embodiments of the use of the nucleic acid according to the invention, the nucleic acid according to the invention is complementary to at least 10 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1), such as at least 10 contiguous nucleotides from nucleotides 2960-3113 of SEQ ID NO 1 or at least 10 contiguous nucleotides from nucleotides 2986-3018 of SEQ ID NO 1.

In certain embodiments of all aspects and embodiments of the use of the nucleic acid according to the invention, the nucleic acid according to the invention is complementary to a sequence selected from the group consisting of SEQ ID NO 947, SEQ ID NO 948, SEQ ID NO 949, SEQ ID NO 950, SEQ ID NO 951 and SEQ ID NO 988.

In certain embodiments of all aspects and embodiments of the use of the nucleic acid according to the invention, the nucleic acid according to the invention is selected from the group consisting of SEQ ID NO 854, SEQ ID NO 855, SEQ ID NO 856, SEQ ID NO 857, SEQ ID NO 858 and SEQ ID NO 895.

One further aspect of the invention is the use of an XBP1 variant obtained from an XBP1 mRNA wherein exon 4 is skipped and a +2 out of frame event is introduced to enhance the yield or the quality of multimeric polypeptides produced by recombinant protein expression systems, for example in the manufacture of antibodies, such as monoclonal antibodies.

One further aspect of the invention is the use of an XBP1 variant comprising the sequence of SEQ ID NO: 7, SEQ ID NO: 596 or SEQ ID NO 807 to enhance the yield or the quality of multimeric polypeptides produced by recombinant protein expression systems, for example in the manufacture of antibodies, such as monoclonal antibodies.

In certain embodiments of all aspects and embodiments of the before outlined uses, the nucleic acid according to the invention is used at a final concentration of 25 μM or more.

In another aspect, the invention provides a therapeutic application for the antisense oligonucleotides, compositions, pharmaceutical compositions, proteins and/or isolated mRNAs of the invention.

In one aspect, the invention provides an antisense oligonucleotide, composition, pharmaceutical composition, protein and/or isolated mRNA of the invention for use in medicine or therapy.

In another aspect, the invention provides the use of an antisense oligonucleotide, composition, pharmaceutical composition, protein and/or isolated mRNA of the invention in the manufacture of a medicament for the treatment of proteopathological disease.

In another aspect, the invention provides a method of treating a proteopathological disease, the method comprising administering an antisense oligonucleotide, composition, pharmaceutical composition, protein and/or isolated mRNA of the invention.

Throughout the therapeutic applications of the invention, the proteopathological disease may be a TDP-43 pathology, such as motor neuron disease or frontotemporal lobar degeneration.

BRIEF DESCRIPTION OF FIGURES

FIG. 1: Illustration of the IRE1 mediated splicing event in the human XBP1 transcript XBP1-207 (SEQ ID Nos:1006-1007).

FIG. 2: Illustration of the proposed mechanism for the alternative IRE1 mediated splicing event.

FIG. 3: Illustration of the consequence of the IRE1 mediated splicing event on XBP1 pre-mRNA, resulting in a mRNA XBP1s that encodes an extended C-terminal domain.

FIG. 4: Alignment of protein sequences of XBP1u (SEQ ID NO:1008), XBP1s (SEQ ID NO:1009) and the XBP1Δ4 variant (SEQ ID NO:1010), illustrating that exon 4 removal results in the retention of the majority of the C-terminal amino acid sequence found in the IRE1 mediated splicing event (XBP1s).

FIG. 5: Screening Assay Design for XBP1 exon 4 skipping; Exon 4-5 probe (SEQ ID NO:1499), Exon 4-6 probe (SEQ ID NO:1502), Primer F (SEQ ID NO:1498), Primer R (SEQ ID NO:1497).

FIG. 6: Initial library screen of antisense oligonucleotides targeting nucleotides 2960-3113 of SEQ ID NO 1, identifying compounds which are effective in mediating the skipping of exon 4.

FIG. 7: Effective exon 4 splice switching compounds, e.g. SEQ ID NOs 23 and 24, increase the titre of CHO cells expressing a difficult-to-express mAb.

FIG. 8: Activity of oligonucleotides is shown relative to their position along exon 4 of SEQ ID NO 2.

FIG. 9: Alignment of XBP1s highlighting conservation in the Exon 4 sequence across key species (SEQ ID NOs 5, 594 & 805).

FIG. 10: Alignment of XBP1Δ4 highlighting conservation in the Exon 4 sequence across key species (SEQ ID NOs 7, 596 & 807).

FIG. 11: Alignment of human XBP1s (SEQ ID NO 805) and XBP1Δ4 (SEQ ID NO 807).

DEFINITIONS

General

Useful methods and techniques for carrying out the current invention are described in e.g. Ausubel, F. M. (ed.), Current Protocols in Molecular Biology, Volumes I to III (1997); Glover, N. D., and Hames, B. D., ed., DNA Cloning: A Practical Approach, Volumes I and II (1985), Oxford University Press; Freshney, R. I. (ed.), Animal Cell Culture—a practical approach, IRL Press Limited (1986); Watson, J. D., et al., Recombinant DNA, Second Edition, CSHL Press (1992); Winnacker, E. L., From Genes to Clones; N.Y., VCH Publishers (1987); Celis, J., ed., Cell Biology, Second Edition, Academic Press (1998); Freshney, R. I., Culture of Animal Cells: A Manual of Basic Technique, second edition, Alan R. Liss, Inc., N.Y. (1987).

The use of recombinant DNA technology enables the generation of derivatives of a nucleic acid. Such derivatives can, for example, be modified in individual or several nucleotide positions by substitution, alteration, exchange, deletion or insertion. The modification or derivatization can, for example, be carried out by means of site directed mutagenesis. Such modifications can easily be carried out by a person skilled in the art (see e.g. Sambrook, J., et al., Molecular Cloning: A laboratory manual (1999) Cold Spring Harbor Laboratory Press, New York, USA; Hames, B. D., and Higgins, S. G., Nucleic acid hybridization—a practical approach (1985) IRL Press, Oxford, England).

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to “a cell” includes a plurality of such cells and equivalents thereof known to those skilled in the art, and so forth. As well, the terms “a” (or “an”), “one or more” and “at least one” can be used interchangeably herein. It is also to be noted that the terms “comprising”, “including”, and “having” can be used interchangeably.

The term “about” denotes a range of +/−20% of the thereafter following numerical value. In one embodiment, the term “about” denotes a range of +/−10% of the thereafter following numerical value. In one embodiment the term “about” denotes a range of +/−5% of the thereafter following numerical value.
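As a worked illustration of this definition, the interval covered by "about" for a given numerical value can be computed as below. This is a minimal sketch; the function name and the default tolerance argument are hypothetical conveniences, with the three tolerance levels (20%, 10% and 5%) taken from the definition itself.

```python
def about_range(value: float, tolerance_percent: float = 20.0):
    """Return the (lower, upper) bounds covered by "about `value`".

    The default of 20% corresponds to the broadest definition; 10 or 5
    can be passed for the narrower embodiments.
    """
    if tolerance_percent < 0:
        raise ValueError("tolerance must be non-negative")
    delta = abs(value) * tolerance_percent / 100
    return (value - delta, value + delta)

# "about 25" under each of the three tolerance levels
broad = about_range(25)          # 20% tolerance
medium = about_range(25, 10)     # 10% tolerance
narrow = about_range(25, 5)      # 5% tolerance
```

For example, under the broadest reading "about 25 μM" would span 20 μM to 30 μM.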

The term “comprising” also encompasses the term “consisting of”.

Compound

Herein, in the context of compounds of the present invention the term “compound” means any molecule capable of modulating the expression or activity of XBP1, particularly any molecule capable of modulating the splicing of the XBP1 pre-mRNA to increase the level of expression of an XBP1 splice variant, such as an mRNA which lacks XBP1 exon 4. Particular compounds of the invention are nucleic acid molecules, such as antisense oligonucleotides, and conjugates comprising such a nucleic acid molecule.

Recombinant Mammalian Cell

The term “recombinant mammalian cell” as used herein denotes a mammalian cell comprising an exogenous nucleotide sequence capable of expressing a polypeptide. Such a polypeptide can be a polypeptide endogenous or heterologous (exogenous) to said mammalian cell. Such recombinant mammalian cells are cells into which one or more exogenous nucleic acid(s) have been introduced, including the progeny of such cells. Thus, the term “a mammalian cell comprising a nucleic acid encoding a heterologous polypeptide” denotes cells comprising an exogenous nucleotide sequence integrated in the genome of the mammalian cell and capable of expressing the heterologous polypeptide. In one embodiment the mammalian cell comprising an exogenous nucleotide sequence is a cell comprising an exogenous nucleotide sequence integrated at a single site within a locus of the genome of the host cell, wherein the exogenous nucleotide sequence comprises a first and a second recombination recognition sequence flanking at least one first selection marker, and a third recombination recognition sequence located between the first and the second recombination recognition sequence, and all the recombination recognition sequences are different.

Such “recombinant mammalian cells” can be used for the production of said homologous or heterologous polypeptide of interest at any scale.

Transformed Cells

A mammalian cell comprising an exogenous nucleotide sequence is a “transformed cell”. This term includes the primary transformed cell as well as progeny derived therefrom without regard to the number of passages. Progeny may, e.g., not be completely identical in nucleic acid content to a parent cell, but may contain mutations. Mutant progeny that has the same function or biological activity as screened or selected for in the originally transformed cell are encompassed.

Isolated

An “isolated” composition is one that has been separated from a component of its natural environment. In some embodiments, a composition is purified to greater than 95% or 99% purity as determined by, for example, electrophoretic (e.g., SDS-PAGE, isoelectric focusing (IEF), capillary electrophoresis, CE-SDS) or chromatographic (e.g., size exclusion chromatography or ion exchange or reverse phase HPLC) methods. For a review of methods for assessment of e.g. antibody purity, see, e.g., Flatman, S. et al., J. Chrom. B 848 (2007) 79-87.

An “isolated” nucleic acid refers to a nucleic acid molecule that has been separated from a component of its natural environment. An isolated nucleic acid includes a nucleic acid molecule contained in cells that ordinarily contain the nucleic acid molecule, but wherein the nucleic acid molecule is present extrachromosomally or at a chromosomal location that is different from its natural chromosomal location.

An “isolated” polypeptide or antibody refers to a polypeptide molecule or antibody molecule that has been separated from a component of its natural environment.

Integration Site

The term “integration site” denotes a nucleic acid sequence within a cell's genome into which an exogenous nucleotide sequence is inserted. In certain embodiments, an integration site is between two adjacent nucleotides in the cell's genome. In certain embodiments, an integration site includes a stretch of nucleotide sequences. In certain embodiments, the integration site is located within a specific locus of the genome of a mammalian cell. In certain embodiments, the integration site is within an endogenous gene of a mammalian cell.

The terms “vector” or “plasmid”, which can be used interchangeably, as used herein, refer to a nucleic acid molecule capable of propagating another nucleic acid to which it is linked. The term includes the vector as a self-replicating nucleic acid structure as well as the vector incorporated into the genome of a host cell into which it has been introduced. Certain vectors are capable of directing the expression of nucleic acids to which they are operatively linked. Such vectors are referred to herein as “expression vectors”.

Selection Marker

As used herein, the term “selection marker” denotes a gene that allows cells carrying the gene to be specifically selected for or against, in the presence of a corresponding selection agent. For example, but not by way of limitation, a selection marker can allow the host cell transformed with the selection marker gene to be positively selected for in the presence of the respective selection agent (selective cultivation conditions); a non-transformed host cell would not be capable of growing or surviving under the selective cultivation conditions. Selection markers can be positive, negative or bi-functional. Positive selection markers can allow selection for cells carrying the marker, whereas negative selection markers can allow cells carrying the marker to be selectively eliminated. A selection marker can confer resistance to a drug or compensate for a metabolic or catabolic defect in the host cell. In prokaryotic cells, amongst others, genes conferring resistance against ampicillin, tetracycline, kanamycin or chloramphenicol can be used. Resistance genes useful as selection markers in eukaryotic cells include, but are not limited to, genes for aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, and mycophenolic acid. Further marker genes are described in WO 92/08796 and WO 94/28143.

Beyond facilitating a selection in the presence of a corresponding selection agent, a selection marker can alternatively be a molecule normally not present in the cell, e.g., green fluorescent protein (GFP), enhanced GFP (eGFP), synthetic GFP, yellow fluorescent protein (YFP), enhanced YFP (eYFP), cyan fluorescent protein (CFP), mPlum, mCherry, tdTomato, mStrawberry, J-red, DsRed-monomer, mOrange, mKO, mCitrine, Venus, YPet, Emerald, CyPet, mCFPm, Cerulean, and T-Sapphire. Cells expressing such a molecule can be distinguished from cells not harbouring this gene, e.g., by the detection or absence, respectively, of the fluorescence emitted by the encoded polypeptide.

Operably Linked

As used herein, the term “operably linked” refers to a juxtaposition of two or more components, wherein the components are in a relationship permitting them to function in their intended manner. For example, a promoter and/or an enhancer is operably linked to a coding sequence if the promoter and/or enhancer acts to modulate the transcription of the coding sequence. In certain embodiments, DNA sequences that are “operably linked” are contiguous and adjacent on a single chromosome. In certain embodiments, e.g., when it is necessary to join two protein encoding regions, such as a secretory leader and a polypeptide, the sequences are contiguous, adjacent, and in the same reading frame. In certain embodiments, an operably linked promoter is located upstream of the coding sequence and can be adjacent to it. In certain embodiments, e.g., with respect to enhancer sequences modulating the expression of a coding sequence, the two components can be operably linked although not adjacent. An enhancer is operably linked to a coding sequence if the enhancer increases transcription of the coding sequence. Operably linked enhancers can be located upstream, within, or downstream of coding sequences and can be located at a considerable distance from the promoter of the coding sequence. Operable linkage can be accomplished by recombinant methods known in the art, e.g., using PCR methodology and/or by ligation at convenient restriction sites. If convenient restriction sites do not exist, then synthetic oligonucleotide adaptors or linkers can be used in accord with conventional practice. An internal ribosomal entry site (IRES) is operably linked to an open reading frame (ORF) if it allows initiation of translation of the ORF at an internal location in a 5′-end-independent manner.

Exogenous

As used herein, the term “exogenous” indicates that a nucleotide sequence does not originate from a specific cell and is introduced into said cell by DNA delivery methods, e.g., by transfection, electroporation, or transformation methods. Thus, an exogenous nucleotide sequence is an artificial sequence wherein the artificiality can originate, e.g., from the combination of subsequences of different origin (e.g. a combination of a recombinase recognition sequence with an SV40 promoter and a coding sequence of green fluorescent protein is an artificial nucleic acid) or from the deletion of parts of a sequence (e.g. a sequence coding only the extracellular domain of a membrane-bound receptor or a cDNA) or the mutation of nucleobases. The term “endogenous” refers to a nucleotide sequence originating from a cell. An “exogenous” nucleotide sequence can partly have an “endogenous” counterpart that is identical in base compositions, but where the “exogenous” sequence is introduced into the cell, e.g., via recombinant DNA technology.

Heterologous

As used herein, the term “heterologous” indicates that a polypeptide does not originate from a specific cell and the respective encoding nucleic acid has been introduced into said cell by DNA delivery methods, e.g., by transfection, electroporation, or transformation methods. Thus, a heterologous polypeptide is a polypeptide that is artificial to the cell expressing it, whereby this is independent of whether the polypeptide is a naturally occurring polypeptide originating from a different cell/organism or is a man-made polypeptide.

Oligonucleotide

The term “oligonucleotide” as used herein is defined as it is generally understood by the skilled person, as a molecule comprising two or more covalently linked nucleosides. Such covalently bound nucleosides can also be referred to as nucleic acid molecules or oligomers. Oligonucleotides are commonly made in the laboratory by solid-phase chemical synthesis followed by purification and isolation. When referring to a sequence of the oligonucleotide, reference is made to the sequence or order of nucleobase moieties, or modifications thereof, of the covalently linked nucleotides or nucleosides. In some embodiments, the oligonucleotides of the invention are man-made, and are chemically synthesized, and are typically purified or isolated. The oligonucleotides of the invention can comprise one or more modified nucleosides, also referred to as nucleoside analogues, such as 2′ sugar modified nucleosides. The oligonucleotides of the invention can comprise one or more modified internucleoside linkages, such as one or more phosphorothioate internucleoside linkages.

Antisense Oligonucleotide

The term “antisense oligonucleotide” or “ASO,” as used herein, is defined as an oligonucleotide capable of modulating expression of a target gene by hybridizing to a target nucleic acid, in particular to a contiguous sequence on a target nucleic acid. Antisense oligonucleotides are not essentially double stranded and are therefore not siRNAs or shRNAs. In some embodiments, the antisense oligonucleotides of the present invention can be single stranded. It is understood that single stranded oligonucleotides of the present invention can form hairpins or intermolecular duplex structures (duplex between two molecules of the same oligonucleotide), as long as the degree of intra or inter self-complementarity is less than approximately 50% across the full length of the oligonucleotide.

In some embodiments, the single stranded antisense oligonucleotides of the invention do not contain RNA nucleosides. As described elsewhere in the present disclosure, in some embodiments, antisense oligonucleotides of the disclosure comprise one or more modified nucleosides or nucleotides, such as 2′ sugar modified nucleosides. In certain embodiments, the non-modified nucleosides of an antisense oligonucleotide disclosed herein are DNA nucleosides.

In certain contexts, the antisense oligonucleotides of the invention may be referred to as oligonucleotides.

Contiguous Nucleotide Sequence

The term “contiguous nucleotide sequence” refers to the region of an antisense oligonucleotide which is complementary to the target nucleic acid. The term is used interchangeably herein with the term “contiguous nucleobase sequence” and the term oligonucleotide “sequence motif.” As used herein, the term “sequence motif” represents the sequence of nucleobases, independent of the nucleoside sugar chemistry and/or design. In some embodiments, the nucleobases A, T, C and G can be modified, for example, a capital C can be a 5-methyl cytosine beta-D-oxy LNA nucleoside, and in RNA sequences, T can be U. In some embodiments, all the nucleosides of an antisense oligonucleotide constitute the contiguous nucleotide sequence. The contiguous nucleotide sequence is the sequence of nucleotides in the antisense oligonucleotide which is complementary to, and in some instances fully complementary to, the target nucleic acid or target sequence.

As described herein, in some embodiments, an antisense oligonucleotide comprises the contiguous nucleotide sequence, and can optionally comprise further nucleotide(s), for example a nucleotide linker region which can be used to attach a functional group (e.g. a conjugate group) to the contiguous nucleotide sequence. In some embodiments, the nucleotide linker region can be complementary to the target nucleic acid. In some embodiments, the nucleotide linker region is not complementary to the target nucleic acid. It is understood that the contiguous nucleotide sequence of an antisense oligonucleotide cannot be longer than the antisense oligonucleotide as such, and that the antisense oligonucleotide cannot be shorter than the contiguous nucleotide sequence.

Nucleic Acids

The term “nucleic acids” or “nucleotides” is intended to encompass plural nucleic acids. In some embodiments, the term “nucleic acids” or “nucleotides” refers to a target sequence, e.g., pre-mRNAs, mRNAs, or DNAs in vivo or in vitro.

When the term refers to the nucleic acids or nucleotides in a target sequence, the nucleic acids or nucleotides can be naturally occurring sequences within a cell. In some embodiments, “nucleic acids” or “nucleotides” refer to a sequence in the antisense oligonucleotide of the invention. When the term refers to a sequence in the antisense oligonucleotide, the nucleic acids or nucleotides are not naturally occurring, i.e., chemically synthesized, enzymatically produced, recombinantly produced, or any combination thereof. In some embodiments, the nucleic acids or nucleotides in the antisense oligonucleotide are produced synthetically or recombinantly, but are not a naturally occurring sequence or a fragment thereof. In some embodiments, the nucleic acids or nucleotides in the antisense oligonucleotide are not naturally occurring because they contain at least one nucleotide analog that is not naturally occurring in nature.

The term “nucleic acid” or “nucleotide” refers to a single nucleic acid segment, e.g., a DNA, an RNA, or an analog thereof, in isolated form or present in a polynucleotide. “Nucleic acid” or “nucleotide” includes naturally occurring nucleic acids or non-naturally occurring nucleic acids. In some embodiments, the terms “nucleotide”, “unit” and “monomer” are used interchangeably. It will be recognized that when referring to a sequence of nucleotides or monomers, what is referred to is the sequence of bases, such as A, T, G, C or U, and analogs thereof.

When the term refers to the nucleic acid or nucleic acids encoding a polypeptide or protein, the nucleic acids or nucleotides can be naturally occurring sequences within a cell or an artificial sequence. In some embodiments, the nucleic acid(s) are produced synthetically or recombinantly.

Nucleotide

The term “nucleotide,” as used herein, refers to a glycoside comprising a sugar moiety, a base moiety and a covalently linked group (linkage group), such as a phosphate or phosphorothioate internucleotide linkage group, and covers both naturally occurring nucleotides, such as DNA or RNA, and non-naturally occurring nucleotides comprising modified sugar and/or base moieties, which are also referred to as “nucleotide analogs” herein. Herein, a single nucleotide (unit) can also be referred to as a monomer or nucleic acid unit. In certain embodiments, the term “nucleotide analogs” refers to nucleotides having modified sugar moieties. Non-limiting examples of the nucleotides having modified sugar moieties (e.g., LNA) are disclosed elsewhere herein. In some embodiments, the term “nucleotide analogs” refers to nucleotides having modified nucleobase moieties. The nucleotides having modified nucleobase moieties include, but are not limited to, 5-methyl-cytosine, isocytosine, 5-thiazolo-cytosine, 5-propynyl-cytosine, pseudoisocytosine, 5-bromouracil, 5-propynyl-uracil, thiazolo-uracil, 2-thio-uracil, 2-thiothymine, 6-aminopurine, 2-aminopurine, inosine, diaminopurine, 2,6-diaminopurine, and 2-chloro-6-aminopurine. As one of ordinary skill in the art would recognize, the 5′ terminal nucleotide of an oligonucleotide (e.g., an antisense oligonucleotide disclosed herein) does not comprise a 5′ internucleotide linkage group, although it can comprise a 5′ terminal group.

Nucleoside

The term “nucleoside,” as used herein, is used to refer to a glycoside comprising a sugar moiety and a base moiety, and can therefore be used when referring to the nucleotide units, which are covalently linked by the internucleotide linkages between the nucleotides of the antisense oligonucleotide. In the field of biotechnology, the term “nucleotide” is often used to refer to a nucleic acid monomer or unit. In the context of an antisense oligonucleotide, the term “nucleotide” can refer to the base alone, i.e., a nucleobase sequence comprising cytosine (DNA and RNA), guanine (DNA and RNA), adenine (DNA and RNA), thymine (DNA) and uracil (RNA), in which the presence of the sugar backbone and internucleotide linkages are implicit. Likewise, particularly in the case of oligonucleotides where one or more of the internucleotide linkage groups are modified, the term “nucleotide” can refer to a “nucleoside.” For example, the term “nucleotide” can be used, even when specifying the presence or nature of the linkages between the nucleosides.

Nucleotide Length

The term “nucleotide length” or the “length” of an antisense oligonucleotide, or contiguous nucleotide sequence thereof, as used herein means the total number of the nucleotides (monomers) in a given sequence. Nucleotides and nucleosides are the building blocks of oligonucleotides and polynucleotides, and for the purposes of the present disclosure include both naturally occurring and non-naturally occurring nucleotides and nucleosides (nucleo(s/t)ide analogs). In nature, nucleotides, such as DNA and RNA nucleotides, comprise a ribose sugar moiety, a nucleobase moiety and one or more phosphate groups (which are absent in nucleosides). Nucleosides and nucleotides can also interchangeably be referred to as “units” or “monomers”.

Modified Nucleoside

The term “modified nucleoside” or “nucleoside modification”, or “nucleoside analog” as used herein, refers to nucleosides modified as compared to the equivalent DNA or RNA nucleoside by the introduction of one or more modifications of the sugar moiety or the (nucleo)base moiety. In some embodiments, one or more of the modified nucleosides of the antisense oligonucleotide of the invention comprise a modified sugar moiety. The term modified nucleoside can also be used herein interchangeably with the term “nucleoside analogue,” modified “units,” or modified “monomers.” Nucleosides with an unmodified DNA or RNA sugar moiety are termed DNA or RNA nucleosides herein. In some embodiments, nucleosides with modifications in the base region of the DNA or RNA nucleoside are still termed DNA or RNA if they allow Watson Crick base pairing. Non-limiting examples of modified nucleosides which can be used in the antisense oligonucleotides of the invention include LNA, 2′-O-MOE and morpholino nucleoside analogues. Examples of other modified nucleosides are provided elsewhere in the present disclosure.

High Affinity Modified Nucleoside

A “high affinity modified nucleoside,” as used herein, is a modified nucleotide which, when incorporated into the oligonucleotide, enhances the affinity of the oligonucleotide for its complementary target, for example, as measured by the melting temperature (Tm). A high affinity modified nucleoside of the present disclosure can result in an increase in melting temperature of between +0.5 and +12° C., in some instances between +1.5 and +10° C. and in others between +3 and +8° C. per modified nucleoside. Numerous high affinity modified nucleosides are known in the art and include, for example, many 2′ substituted nucleosides as well as locked nucleic acids (LNA) (see e.g. Freier & Altmann; Nucl. Acid Res., 1997, 25, 4429-4443 and Uhlmann; Curr. Opinion in Drug Development, 2000, 3(2), 203-213).

Modified Internucleoside Linkage

The term “modified internucleoside linkage” is defined as generally understood by the skilled person as linkages other than phosphodiester (PO) linkages that covalently couple two nucleosides together. In some embodiments, the oligonucleotides of the invention can therefore comprise one or more modified internucleoside linkages, such as one or more phosphorothioate internucleoside linkages.

In some embodiments, at least about 50% of the internucleoside linkages of the antisense oligonucleotide (e.g., disclosed herein), or contiguous nucleotide sequence thereof, are phosphorothioate, such as at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 90% or more of the internucleoside linkages of the antisense oligonucleotide, or contiguous nucleotide sequence thereof, are phosphorothioate. In some embodiments, all of the internucleoside linkages of the antisense oligonucleotide, or contiguous nucleotide sequence thereof, are phosphorothioate.
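The percentage thresholds recited above can be illustrated with a short calculation. The following Python sketch is illustrative only: the linkage labels ("PS" for phosphorothioate, "PO" for phosphodiester), the function name, and the example 16-mer are assumptions made for this example and are not taken from the disclosure.

```python
def phosphorothioate_percent(linkages):
    """Return the percentage of internucleoside linkages that are phosphorothioate.

    `linkages` holds one label per internucleoside linkage; an oligonucleotide
    of n nucleosides has n - 1 such linkages. Labels are hypothetical:
    "PS" = phosphorothioate, "PO" = phosphodiester.
    """
    if not linkages:
        raise ValueError("at least one internucleoside linkage is required")
    ps_count = sum(1 for label in linkages if label == "PS")
    return 100.0 * ps_count / len(linkages)


# Hypothetical 16-mer: 15 linkages, 12 of which are phosphorothioate.
example_linkages = ["PS"] * 12 + ["PO"] * 3
print(phosphorothioate_percent(example_linkages))
```

Under these assumptions, the hypothetical 16-mer would meet an "at least about 80%" threshold but not an "at least about 90%" threshold.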

In some embodiments, all the internucleoside linkages of the contiguous nucleotide sequence of the oligonucleotide are phosphorothioate, or all the internucleoside linkages of the antisense oligonucleotide are phosphorothioate linkages.

Nucleobase

The term “nucleobase” includes the purine (e.g. adenine and guanine) and pyrimidine (e.g. uracil, thymine and cytosine) moiety present in nucleosides and nucleotides which form hydrogen bonds in nucleic acid hybridization. In the context of the present invention, the term nucleobase also encompasses modified nucleobases which can differ from naturally occurring nucleobases, but which are functional during nucleic acid hybridization. In this context, “nucleobase” refers to both naturally occurring nucleobases such as adenine, guanine, cytosine, thymine, uracil, xanthine and hypoxanthine, as well as non-naturally occurring variants. Such variants are, for example, described in Hirao et al (2012) Accounts of Chemical Research vol 45 page 2055 and Bergstrom (2009) Current Protocols in Nucleic Acid Chemistry Suppl. 37 1.4.1.

In some embodiments the nucleobase moiety is modified by changing the purine or pyrimidine into a modified purine or pyrimidine, such as substituted purine or substituted pyrimidine, such as a nucleobase selected from isocytosine, pseudoisocytosine, 5-methyl cytosine, 5-thiazolo-cytosine, 5-propynyl-cytosine, 5-propynyl-uracil, 5-bromouracil, 5-thiazolo-uracil, 2-thio-uracil, 2-thio-thymine, inosine, diaminopurine, 6-aminopurine, 2-aminopurine, 2,6-diaminopurine and 2-chloro-6-aminopurine.

The nucleobase moieties can be indicated by the letter code for each corresponding nucleobase, e.g. A, T, G, C or U, wherein each letter can optionally include modified nucleobases of equivalent function. For example, in certain embodiments, the nucleobase moieties of the antisense oligonucleotides disclosed herein are selected from A, T, G, C, and 5-methyl cytosine. Optionally, for LNA gapmers, 5-methyl cytosine LNA nucleosides can be used.

Modified Oligonucleotide

The term “modified oligonucleotide,” as used herein, describes an oligonucleotide (e.g., an antisense oligonucleotide) comprising one or more modified nucleosides (e.g., sugar modified nucleosides) and/or modified internucleoside linkages. The term “chimeric oligonucleotide” is a term that has been used in the literature to describe oligonucleotides comprising modified nucleosides (e.g., sugar modified nucleosides) and DNA nucleosides. In some embodiments, the ASO of the disclosure is a chimeric oligonucleotide.

Alkyl

As used herein, the term “alkyl”, alone or in combination, signifies a straight-chain or branched-chain alkyl group with 1 to 8 carbon atoms (C1-8), particularly a straight or branched-chain alkyl group with 1 to 6 carbon atoms (C1-6) and more particularly a straight or branched-chain alkyl group with 1 to 4 carbon atoms (C1-4). Examples of straight-chain and branched-chain C1-C8 alkyl groups are methyl, ethyl, propyl, isopropyl, butyl, isobutyl, tert-butyl, the isomeric pentyls, the isomeric hexyls, the isomeric heptyls and the isomeric octyls, particularly methyl, ethyl, propyl, butyl and pentyl. Particular examples of alkyl are methyl. Further examples of alkyl are mono, di or trifluoro methyl, ethyl or propyl, such as cyclopropyl (cPr), or mono, di or tri fluoro cyclopropyl.

Alkoxy

The term “alkoxy”, alone or in combination, signifies a group of the formula alkyl-O— in which the term “alkyl” has the previously given significance, such as methoxy, ethoxy, n-propoxy, isopropoxy, n-butoxy, isobutoxy, sec.butoxy and tert.butoxy. Particular “alkoxy” are methoxy.

Bicyclic Sugar

As used herein, the term “bicyclic sugar” refers to a modified sugar moiety comprising a 4 to 7 membered ring comprising a bridge connecting two atoms of the 4 to 7 membered ring to form a second ring, resulting in a bicyclic structure. In some embodiments, the bridge connects the C2′ and C4′ of the ribose sugar ring of a nucleoside (i.e., 2′-4′ bridge), as observed in LNA nucleosides.

Exons

As used herein, the term “exons” or “exonic regions” or “exonic sequences”, which can be used interchangeably herein, refer to nucleic acid molecules containing a sequence of nucleotides that is transcribed into RNA and is represented in a mature form of RNA, such as mRNA (messenger RNA), after splicing and other RNA processing. An mRNA contains one or more exons operatively linked. In some embodiments, exons can encode polypeptides or a portion of a polypeptide. In some embodiments, exons can contain non-translated sequences, for example, translational regulatory sequences.

Introns

The term “introns” or “intronic regions” or “intronic sequences”, which can be used interchangeably, refer to nucleic acid molecules containing a sequence of nucleotides that is transcribed into RNA and is then typically removed from the RNA by splicing to create a mature form of an RNA, for example, an mRNA. In some embodiments, nucleotide sequences of introns are not incorporated into mature RNAs, nor are intron sequences or portions thereof translated and incorporated into a polypeptide. Splice signal sequences, such as splice donors and acceptors, are used by the splicing machinery of a cell to remove introns from RNA. In some embodiments, an intron in one splice variant can be an exon (i.e., present in the spliced transcript) in another variant. Hence, spliced mRNA encoding an intron fusion protein can include an exon(s) and introns.

Splicing

As used herein, the term “splicing” refers to a process of RNA maturation in which introns in the pre-mRNA are removed and exons are operatively linked to create a messenger RNA (mRNA).

Alternative Splicing

As used herein, the term “alternative splicing” refers to the process of producing multiple mRNAs from a gene. In some embodiments, alternate splicing can include operatively linking less than all the exons of a gene, and/or operatively linking one or more alternate exons that are not present in all transcripts derived from a gene.

Splice Modulation

The term “splice modulation,” as used herein, refers to a process that can be used to correct cryptic splicing, modulate alternative splicing, restore the open reading frame, and induce protein knockdown. In the context of the present invention, a splice modulation can be used to modulate alternative splicing of XBP1 pre-mRNA to generate a splice variant. For example, a splice modulation can be used to modulate alternative splicing of XBP1 pre-mRNA to generate XBP1Δ4 mRNA and thereby enhance expression of XBP1Δ4 protein. Splice modulation can be assayed by RNA sequencing (RNA-Seq), which allows for a quantitative assessment of the different splice products of a pre-mRNA. In some embodiments of the invention, the antisense oligonucleotides modulate the splicing of the XBP1 pre-mRNA so as to reduce the level of mature XBP1 mRNA which comprises an exon 4 (mRNA), and to increase the level of mature XBP1 mRNA which lacks exon 4 (XBP1Δ4 mRNA).
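By way of illustration, a quantitative assessment of splice products from sequencing data can be reduced to a simple proportion. The following Python sketch assumes that reads supporting exon 4 inclusion and reads supporting exon 4 skipping have already been counted by some upstream analysis; the function name, the read counts, and the simplification to two read classes are assumptions made for this example and do not describe any particular assay of the disclosure.

```python
def percent_exon4_skipped(inclusion_reads, skipping_reads):
    """Return the percentage of informative reads that support exon 4 skipping.

    inclusion_reads: reads supporting transcripts that retain exon 4.
    skipping_reads:  reads supporting transcripts that lack exon 4.
    Both counts are hypothetical inputs from an upstream read-counting step.
    """
    if inclusion_reads < 0 or skipping_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no informative reads")
    return 100.0 * skipping_reads / total


# Hypothetical counts for an untreated and a treated sample.
untreated = percent_exon4_skipped(inclusion_reads=950, skipping_reads=50)
treated = percent_exon4_skipped(inclusion_reads=400, skipping_reads=600)
print(untreated, treated)
```

In this simplified model, an increase in the returned percentage between the untreated and treated samples would correspond to a shift toward transcripts lacking exon 4.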

Coding Region

As used herein, a “coding region” or “coding sequence”, which can be used interchangeably, is a portion of polynucleotide which consists of codons translatable into amino acids. Although a “stop codon” (TAG, TGA, or TAA) is typically not translated into an amino acid, it can be considered to be part of a coding region, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, introns, untranslated regions (“UTRs”), and the like, are not part of a coding region. The boundaries of a coding region are typically determined by a start codon at the 5′ terminus, encoding the amino terminus of the resultant polypeptide, and a translation stop codon at the 3′ terminus, encoding the carboxyl terminus of the resulting polypeptide.
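The boundaries described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a DNA-alphabet sequence without introns, takes the first ATG as the start codon, and treats the first in-frame stop codon as the 3′ boundary. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def find_coding_region(sequence):
    """Return the first start-to-stop coding region, or None if none is found.

    The returned string begins at the first ATG and ends with the first
    in-frame stop codon, which is included as part of the coding region.
    Flanking sequence on either side is excluded.
    """
    seq = sequence.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return None


# Hypothetical sequence: 5' flank "GGC", coding region, 3' flank "TTT".
print(find_coding_region("GGCATGGCTTGATTT"))
```

Real transcripts can contain several candidate start codons, so a practical tool would need additional rules for choosing among them; the sketch deliberately ignores this.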

Non-Coding Region

The term “non-coding region” as used herein means a nucleotide sequence that is not a coding region. Examples of non-coding regions include, but are not limited to, promoters, ribosome binding sites, transcriptional terminators, introns, untranslated regions (“UTRs”), non-coding exons and the like. Some of the exons can be wholly or part of the 5′ untranslated region (5′ UTR) or the 3′ untranslated region (3′ UTR) of each transcript. The untranslated regions are important for efficient translation of the transcript and for controlling the rate of translation and half-life of the transcript.

Region

The term “region” when used in the context of a nucleotide sequence refers to a section of that sequence. For example, the phrase “region within a nucleotide sequence” or “region within the complement of a nucleotide sequence” refers to a sequence shorter than the nucleotide sequence, but longer than at least 10 nucleotides located within the particular nucleotide sequence or the complement of the nucleotide sequence, respectively. The term “sub-sequence” or “subsequence” can also refer to a region of a nucleotide sequence.

Downstream and Upstream

The term “downstream,” when referring to a nucleotide sequence, means that a nucleic acid or a nucleotide sequence is located 3′ to a reference nucleotide sequence. In certain embodiments, downstream nucleotide sequences relate to sequences that follow the starting point of transcription. For example, the translation initiation codon of a gene is located downstream of the start site of transcription.

The term “upstream” refers to a nucleotide sequence that is located 5′ to a reference nucleotide sequence. In certain embodiments, upstream nucleotide sequences relate to sequences that precede the starting point of transcription. For example, the promoter sequence of a gene is located upstream of the start site of transcription.

Regulatory Region

As used herein, the term “regulatory region” refers to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding region, and which influence the transcription, RNA processing, stability, or translation of the associated coding region. Regulatory regions can include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites, UTRs, and stem-loop structures. If a coding region is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.

Target Sequence

The term “target sequence,” as used herein, refers to a sequence of nucleotides present in the target nucleic acid which comprises the nucleobase sequence which is complementary to the antisense oligonucleotides of the invention, i.e. in the context of the present invention, a mammalian XBP1 pre-mRNA sequence is a target nucleic acid, and the target sequence is a region of the target nucleic acid which can be effectively targeted to modulate the splicing of exon 4, and includes, for example XBP1 exon 4, and the regions adjacent 5′ and/or 3′ to exon 4, of a XBP1 pre-mRNA transcript.

For example, for the present invention the target nucleic acid may be the hamster XBP1 pre-mRNA (SEQ ID NO 1, and particularly nucleotides 2960-3113 of SEQ ID NO 1), the mouse XBP1 pre-mRNA (SEQ ID NO 590) or the human XBP1 pre-mRNA (SEQ ID NO 801).

In some embodiments, the target sequence consists of a region on the target nucleic acid with a nucleobase sequence that is complementary to the contiguous nucleotide sequence of the antisense oligonucleotide of the invention. This region of the target nucleic acid can interchangeably be referred to as the target nucleotide sequence, target sequence, or target region. In some embodiments, the target sequence is longer than the complementary sequence of a single antisense oligonucleotide, and can, for example, represent a preferred region of the target nucleic acid, which can be targeted by several oligonucleotides of the invention.

The Cell or Target Cell

As used herein, the term “target cell” refers to a cell which expresses the target nucleic acid. In some embodiments, the target cell comprises a mammalian cell, such as a rodent cell, such as a mouse cell or a rat cell, or a hamster cell, such as a CHO cell, or a primate cell such as a monkey cell or a human cell. In some embodiments, the target cell is a transgenic mammalian cell which is expressing a XBP1 target nucleic acid. In some embodiments, the cell is a transgenic animal cell which is expressing a XBP1Δ4 mRNA, for example via heterologous expression.

Due to its general use in heterologous protein expression, a preferred cell for use in protein expression methods is a hamster cell, such as a Chinese hamster ovary cell (CHO cell); especially preferred is a CHO-K1 cell growing in suspension.

Due to the therapeutic applications of the antisense oligonucleotides of the invention in neurodegenerative disorders, the target cell may be a neuronal cell.

Typically, the target cell of the present invention expresses the XBP1 pre-mRNA, which is processed in the cell to the mature XBP1 mRNA, resulting in the expression of both the XBP1-E4 protein (also referred to as XBPu) and the XBP1Δ4 transcript variant. As described herein, in some embodiments, the compounds of the invention modulate the splicing of the XBP1 pre-mRNA to increase the proportion of XBP1 mRNA which lacks XBP1 exon 4. Suitably, the expression of the XBP1Δ4 transcript variant can thereby be increased, as compared to the XBP1-E4 transcript variant.

Complementarity

The term “complementarity” or “nucleobase complementarity”, which can be used interchangeably herein, describes the capacity for Watson-Crick base-pairing of nucleosides/nucleotides. Watson-Crick base pairs are guanine (G)-cytosine (C) and adenine (A)-thymine (T)/uracil (U).

It will be understood that oligonucleotides may comprise nucleosides with modified nucleobases, for example 5-methyl cytosine is often used in place of cytosine, and as such the term complementarity encompasses Watson Crick base-pairing between non-modified and modified nucleobases (see for example Hirao et al (2012) Accounts of Chemical Research vol 45 page 2055 and Bergstrom (2009) Current Protocols in Nucleic Acid Chemistry Suppl. 37 1.4.1).

The term “% complementary” as used herein, refers to the proportion of nucleotides (in percent) of a contiguous nucleotide sequence in a nucleic acid molecule (e.g. oligonucleotide) which across the contiguous nucleotide sequence, are complementary to a reference sequence (e.g. a target sequence or sequence motif). The percentage of complementarity is thus calculated by counting the number of aligned nucleobases that are complementary (form Watson Crick base pairs) between the two sequences (when aligned with the target sequence 5′-3′ and the oligonucleotide sequence from 3′-5′), dividing that number by the total number of nucleotides in the oligonucleotide and multiplying by 100. In such a comparison a nucleobase/nucleotide which does not align (form a base pair) is termed a mismatch. Insertions and deletions are not allowed in the calculation of % complementarity of a contiguous nucleotide sequence. It will be understood that in determining complementarity, chemical modifications of the nucleobases are disregarded as long as the functional capacity of the nucleobase to form Watson Crick base pairing is retained (e.g. 5-methyl cytosine is considered identical to a cytosine for the purpose of calculating % complementarity).
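
By way of illustration only, the calculation described above can be sketched in Python as follows. The function name percent_complementarity and the example sequences are hypothetical and are not taken from the Sequence Listing. The sketch assumes an ungapped comparison of two equal-length sequences, with the target written 5′-3′ and the oligonucleotide written 3′-5′, and assumes that modified nucleobases have already been written as their unmodified equivalents.

```python
# Watson-Crick partners as (oligonucleotide base, target base).
# T and U are both treated as partners of A.
WC_PAIRS = {
    ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(oligo_3to5: str, target_5to3: str) -> float:
    """Percent of oligonucleotide positions that base pair with the target.

    Both strings must have the same length, because insertions and
    deletions are not allowed in the calculation.
    """
    oligo = oligo_3to5.upper()
    target = target_5to3.upper()
    if not oligo or len(oligo) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count aligned positions that form a Watson-Crick base pair.
    paired = sum((o, t) in WC_PAIRS for o, t in zip(oligo, target))
    # Divide by the total number of nucleotides in the oligonucleotide.
    return 100.0 * paired / len(oligo)


# Hypothetical 10-mer carrying a single mismatch at its last position.
example_target = "AUGGCUAGCA"  # written 5'-3'
example_oligo = "TACCGATCGA"   # written 3'-5'
example_result = percent_complementarity(example_oligo, example_target)
```

In this hypothetical example nine of the ten aligned positions form base pairs, so the oligonucleotide is 90% complementary to the target region.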

Within the present invention the term “complementary” requires the antisense oligonucleotide to be at least about 80% complementary, or at least about 90% complementary, to a XBP1 pre-mRNA transcript. In some embodiments the antisense oligonucleotide may be at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98% or at least about 99% complementary to a hamster (SEQ ID NO 1), mouse (SEQ ID NO 590) or human (SEQ ID NO 801) XBP1 pre-mRNA transcript. Put another way, for some embodiments, an antisense oligonucleotide of the invention may include one, two, three or more mis-matches, wherein a mis-match is a nucleotide within the antisense oligonucleotide of the invention which does not base pair with its target.
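
As a worked illustration of how a percentage threshold translates into a number of tolerated mis-matches, the following Python sketch may be helpful. The function name max_mismatches is hypothetical, and the sketch assumes that the stated percentage is applied as a strict lower bound, whereas the text above recites “at least about”.

```python
def max_mismatches(oligo_length: int, min_percent: int) -> int:
    """Largest number of mis-matches that still meets the threshold.

    oligo_length: number of nucleotides in the contiguous sequence.
    min_percent:  required % complementarity as a whole number (0-100).
    """
    if oligo_length <= 0 or not 0 <= min_percent <= 100:
        raise ValueError("length must be positive and percent within 0-100")
    # Smallest number of paired positions that reaches the threshold,
    # computed with integer ceiling division to avoid rounding errors.
    required_pairs = -(-oligo_length * min_percent // 100)
    return oligo_length - required_pairs
```

Under these assumptions, a 20-nucleotide oligonucleotide that is at least 90% complementary can contain up to two mis-matches, and a 16-nucleotide oligonucleotide that is at least 80% complementary can contain up to three.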

The term “fully complementary” refers to 100% complementarity.

Complement

The term “complement,” as used herein, indicates a sequence that is complementary to a reference sequence. It is well known that complementarity is the base principle (Watson-Crick base pairing) of DNA replication and transcription as it is a property shared between two DNA or RNA sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position in the sequences will be complementary, much like looking in the mirror and seeing the reverse of things. Therefore, for example, the complement of a sequence of 5′-ATGC-3′ can be written as 3′-TACG-5′ or 5′-GCAT-3′. The terms “reverse complement”, “reverse complementary”, and “reverse complementarity” as used herein are interchangeable with the terms “complement”, “complementary”, and “complementarity.”
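
The 5′-ATGC-3′ example above can be reproduced with the following minimal Python sketch. The helper names complement and reverse_complement are hypothetical, and the sketch assumes DNA letters only (A, C, G, T).

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq_5to3: str) -> str:
    """Base-by-base complement, in the same position order (read 3'-5')."""
    return seq_5to3.translate(_DNA_COMPLEMENT)


def reverse_complement(seq_5to3: str) -> str:
    """The same complementary strand, rewritten in the 5'-3' direction."""
    return complement(seq_5to3)[::-1]


written_3to5 = complement("ATGC")          # "TACG", i.e. 3'-TACG-5'
written_5to3 = reverse_complement("ATGC")  # "GCAT", i.e. 5'-GCAT-3'
```

Both outputs describe the same strand; only the direction in which it is written differs.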

Identity

The term “identity” as used herein, refers to the proportion of nucleotides (expressed in percent) of a contiguous nucleotide sequence in a nucleic acid molecule (e.g. oligonucleotide) which across the contiguous nucleotide sequence, are identical to a reference sequence (e.g. a sequence motif).

The percentage of identity is thus calculated by counting the number of aligned nucleobases that are identical (a Match) between two sequences (in the contiguous nucleotide sequence of the compound of the invention and in the reference sequence), dividing that number by the total number of nucleotides in the oligonucleotide and multiplying by 100. Therefore, Percentage of Identity=(Matches×100)/Length of aligned region (e.g. the contiguous nucleotide sequence). Insertions and deletions are not allowed in the calculation of the percentage of identity of a contiguous nucleotide sequence. It will be understood that in determining identity, chemical modifications of the nucleobases are disregarded as long as the functional capacity of the nucleobase to form Watson Crick base pairing is retained (e.g. 5-methyl cytosine is considered identical to a cytosine for the purpose of calculating % identity).
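
The formula above can be illustrated with the following Python sketch. The function name percent_identity and the example sequences are hypothetical. The sketch assumes an ungapped comparison of two equal-length sequences written in the same direction, with modified nucleobases already written as their unmodified equivalents.

```python
def percent_identity(oligo: str, reference_region: str) -> float:
    """Percentage of Identity = (Matches x 100) / length of aligned region."""
    a = oligo.upper()
    b = reference_region.upper()
    if not a or len(a) != len(b):
        # Insertions and deletions are not allowed in the calculation.
        raise ValueError("sequences must be non-empty and of equal length")
    # A Match is an aligned position carrying the same nucleobase.
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 8-mer differing from the reference at a single position.
example_identity = percent_identity("GCATTGCA", "GCATAGCA")
```

Here seven of the eight aligned positions match, giving 87.5% identity.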

As used herein, the terms “homologous” and “homology” are interchangeable with the terms “identity” and “identical.”

Naturally Occurring Variant

The term “naturally occurring variant thereof” refers to variants of the XBP1 polypeptide sequence or XBP1 nucleic acid sequence (e.g., transcript) which exist naturally within the defined taxonomic group, such as mammalian, such as mouse, rat, Chinese hamster, monkey, and human. Typically, when referring to “naturally occurring variants” of a polynucleotide the term also can encompass any allelic variant of the XBP1-encoding genomic DNA by chromosomal translocation or duplication, and the RNA, such as mRNA derived therefrom. “Naturally occurring variants” can also include variants derived from alternative splicing of the XBP1 mRNA. When referenced to a specific polypeptide sequence, e.g., XBP1 the term also includes naturally occurring forms of the protein, which can therefore be processed, e.g., by co- or post-translational modifications, such as signal peptide cleavage, proteolytic cleavage, glycosylation, etc. In some embodiments, the naturally occurring variants have at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or more homology to a mammalian XBP1 target nucleic acid, such as that set forth in SEQ ID NO: 1 (hamster), SEQ ID NO 590 (mouse) or SEQ ID NO 801 (human). In some embodiments, the naturally occurring variants have at least 99% homology to the hamster XBP1 target nucleic acid of SEQ ID NO: 1. In some embodiments, the naturally occurring variants have at least 99% homology to the mouse XBP1 target nucleic acid of SEQ ID NO: 590. In some embodiments, the naturally occurring variants have at least 99% homology to the human XBP1 target nucleic acid of SEQ ID NO: 801.

Corresponding

The terms “corresponding to” and “corresponds to,” which can be used interchangeably herein, when referencing two separate nucleic acid or nucleotide sequences can be used to clarify regions of the sequences that correspond or are similar to each other based on homology and/or functionality, although the nucleotides of the specific sequences can be numbered differently. For example, different isoforms of a gene transcript can have similar or conserved portions of nucleotide sequences whose numbering can differ in the respective isoforms based on alternative splicing and/or other modifications. In addition, it is recognized that different numbering systems can be employed when characterizing a nucleic acid or nucleotide sequence (e.g., a gene transcript and whether to begin numbering the sequence from the translation start codon or to include the 5′UTR). Further, it is recognized that the nucleic acid or nucleotide sequence of different variants of a gene or gene transcript can vary. As used herein, however, the regions of the variants that share nucleic acid or nucleotide sequence homology and/or functionality are deemed to “correspond” to one another. For example, a nucleotide sequence of a XBP1 transcript corresponding to nucleotides X to Y of SEQ ID NO: 1 (“reference sequence”) refers to an XBP1 transcript sequence (e.g., XBP1 pre-mRNA or mRNA) that has an identical sequence or a similar sequence to nucleotides X to Y of SEQ ID NO: 1, wherein X is the start site and Y is the end site. A person of ordinary skill in the art can identify the corresponding X and Y residues in the XBP1 transcript sequence by aligning the XBP1 transcript sequence with SEQ ID NO: 1.

Hybridization

The terms “hybridizing” or “hybridizes” as used herein are to be understood as two nucleic acid strands (e.g. an antisense oligonucleotide and a target nucleic acid) forming hydrogen bonds between base pairs on opposite strands thereby forming a duplex. The affinity of the binding between two nucleic acid strands is the strength of the hybridization. It is often described in terms of the melting temperature (Tm) defined as the temperature at which half of the oligonucleotides are duplexed with the target nucleic acid. At physiological conditions, Tm is not strictly proportional to the affinity (Mergny and Lacroix, 2003, Oligonucleotides 13:515-537). The standard state Gibbs free energy ΔG° is a more accurate representation of binding affinity and is related to the dissociation constant (Kd) of the reaction by ΔG°=−RT ln(Kd), where R is the gas constant and T is the absolute temperature. Therefore, a very low ΔG° of the reaction between an oligonucleotide and the target nucleic acid reflects a strong hybridization between the oligonucleotide and target nucleic acid. ΔG° is the energy associated with a reaction where aqueous concentrations are 1M, the pH is 7, and the temperature is 37° C. The hybridization of oligonucleotides to a target nucleic acid is a spontaneous reaction and for spontaneous reactions ΔG° is less than zero. ΔG° can be measured experimentally, for example, by use of the isothermal titration calorimetry (ITC) method as described in Hansen et al., 1965, Chem. Comm. 36-38 and Holdgate et al., 2005, Drug Discov Today. The skilled person will know that commercial equipment is available for ΔG° measurements. ΔG° can also be estimated numerically by using the nearest neighbour model as described by SantaLucia, 1998, Proc Natl Acad Sci USA. 95: 1460-1465 using appropriately derived thermodynamic parameters described by Sugimoto et al., 1995, Biochemistry 34:11211-11216 and McTigue et al., 2004, Biochemistry 43:5388-5405.

In some embodiments, antisense oligonucleotides of the present invention hybridize to a target nucleic acid with estimated ΔG° values below −10 kcal for oligonucleotides that are 10-30 nucleotides in length.

In some embodiments the degree or strength of hybridization is measured by the standard state Gibbs free energy ΔG°. The oligonucleotides may hybridize to a target nucleic acid with estimated ΔG° values below the range of −10 kcal, such as below −15 kcal, such as below −20 kcal and such as below −25 kcal for oligonucleotides that are 8-30 nucleotides in length. In some embodiments the oligonucleotides hybridize to a target nucleic acid with an estimated ΔG° value of −10 to −60 kcal, such as −12 to −40, such as from −15 to −30 kcal, or −16 to −27 kcal such as −18 to −25 kcal.

Transcript

The term “transcript” as used herein can refer to a primary transcript that is synthesized by transcription of DNA and becomes a messenger RNA (mRNA) after processing, i.e., a precursor messenger RNA (pre-mRNA), and the processed mRNA itself. The term “transcript” can be interchangeably used with “pre-mRNA” and “mRNA.” After DNA strands are transcribed to primary transcripts, the newly synthesized primary transcripts are modified in several ways to be converted to their mature, functional forms to produce different proteins and RNAs such as mRNA, tRNA, rRNA, lncRNA, miRNA and others. Thus, the term “transcript” can include exons, introns, 5′-UTRs, and 3′-UTRs.

Expression

The term “expression” as used herein refers to a process by which a polynucleotide produces a gene product, for example, a RNA or a polypeptide. It includes, without limitation, transcription of the polynucleotide into messenger RNA (mRNA) and the translation of an mRNA into a polypeptide. Expression produces a “gene product.” As used herein, a gene product can be either a nucleic acid, e.g., a messenger RNA produced by transcription of a gene, or a polypeptide which is translated from a transcript. Gene products described herein further include nucleic acids with post transcriptional modifications, e.g., polyadenylation or splicing, or polypeptides with post translational modifications, e.g., methylation, glycosylation, the addition of lipids, association with other protein subunits, or proteolytic cleavage.

Compound Number

The term “Compound Number” or “Comp No.” as used herein refers to a unique number given to a nucleotide sequence having the detailed chemical structure of the components, e.g., nucleosides (e.g., DNA), nucleoside analogs (e.g., LNA, e.g., beta-D-oxy-LNA), nucleobase (e.g., A, T, G, C, U, or MC), and backbone structure (e.g., phosphorothioate or phosphorodiester).

A reference to a SEQ ID number includes a particular nucleic acid sequence but does not include any design or full chemical structure. Furthermore, the antisense oligonucleotide sequences disclosed in the examples herein show a representative design but are not limited to the specific design shown unless otherwise indicated.

The Subject

By “subject” or “individual” or “animal” or “patient” or “mammal,” is meant any subject, particularly a mammalian subject, for whom diagnosis, prognosis, or therapy is desired. Mammalian subjects include humans, domestic animals, farm animals, sports animals, and zoo animals including, e.g., humans, non-human primates, dogs, cats, guinea pigs, rabbits, rats, mice, horses, cattle, bears, and so on. In some embodiments, the subject is a human.

In some embodiments, the subject is a human who is suffering from a proteopathological disease, or is at risk of developing a proteopathological disease.

Pharmaceutical Composition

The term “pharmaceutical composition” refers to a preparation which is in such form as to permit the biological activity of the active ingredient to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the composition would be administered. Such compositions can be sterile.

Proteopathological Diseases

Proteopathological diseases (also known as proteopathies, proteinopathies, protein conformational disorders, or protein misfolding diseases) include such diseases as prion diseases, e.g. Creutzfeldt-Jakob disease; tauopathies, such as Alzheimer's disease; synucleinopathies, such as Parkinson's disease; amyloidosis, multiple system atrophy; TDP-43 pathologies, such as amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD); and CAG repeat indications, such as spinocerebellar ataxias, such as spinocerebellar ataxia type 1, spinocerebellar ataxia type 2 (SCA2), and spinocerebellar ataxia type 3 (SCA3, Machado-Joseph disease).

Effective Amount

An “effective amount” of a composition disclosed herein (e.g., a composition comprising a compound, such as an antisense oligonucleotide, or conjugate or salt thereof) refers to an amount sufficient to carry out a specifically stated purpose. An “effective amount” can be determined empirically and in a routine manner, in relation to the stated purpose.

Treatment

Terms such as “treating” or “treatment” or “to treat” or “alleviating” or “to alleviate” refer to both (1) therapeutic measures that cure, slow down, lessen symptoms of, and/or halt progression of a diagnosed pathologic condition or disorder and (2) prophylactic or preventative measures that prevent and/or slow the development of a targeted pathologic condition or disorder, such as a proteopathological disease. Thus, those in need of treatment include those already with the disorder, those prone to have the disorder and those in whom the disorder is to be prevented. In certain embodiments, a subject is successfully “treated” for a disease or condition disclosed elsewhere herein according to the methods provided herein if the patient shows, e.g., total, partial, or transient alleviation or elimination of symptoms associated with the disease or disorder.

Antibodies

General information regarding the nucleotide sequences of human immunoglobulin light and heavy chains is given in: Kabat, E. A., et al., Sequences of Proteins of Immunological Interest, 5th ed., Public Health Service, National Institutes of Health, Bethesda, MD (1991).

As used herein, the amino acid positions of all constant regions and domains of the heavy and light chain are numbered according to the Kabat numbering system described in Kabat, et al., Sequences of Proteins of Immunological Interest, 5th ed., Public Health Service, National Institutes of Health, Bethesda, MD (1991), which is referred to as “numbering according to Kabat” herein. Specifically, the Kabat numbering system (see pages 647-660) of Kabat, et al., Sequences of Proteins of Immunological Interest, 5th ed., Public Health Service, National Institutes of Health, Bethesda, MD (1991) is used for the light chain constant domain CL of kappa and lambda isotype, and the Kabat EU index numbering system (see pages 661-723) of Kabat, et al., Sequences of Proteins of Immunological Interest, 5th ed., Public Health Service, National Institutes of Health, Bethesda, MD (1991) is used for the constant heavy chain domains (CH1, hinge, CH2 and CH3, which is herein further clarified by referring to “numbering according to Kabat EU index” in this case).

The term “antibody” herein is used in the broadest sense and encompasses various antibody structures, including but not limited to full length antibodies, monoclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), and antibody-antibody fragment-fusions as well as combinations thereof.

Native Antibody

The term “native antibody” denotes naturally occurring immunoglobulin molecules with varying structures. For example, native IgG antibodies are heterotetrameric glycoproteins of about 150,000 Daltons, composed of two identical light chains and two identical heavy chains that are disulfide-bonded. From N- to C-terminus, each heavy chain has a heavy chain variable region (VH) followed by three heavy chain constant domains (CH1, CH2, and CH3), whereby between the first and the second heavy chain constant domain a hinge region is located. Similarly, from N- to C-terminus, each light chain has a light chain variable region (VL) followed by a light chain constant domain (CL). The light chain of an antibody may be assigned to one of two types, called kappa (κ) and lambda (λ), based on the amino acid sequence of its constant domain.

Full Length Antibody

The term “full length antibody” denotes an antibody having a structure substantially similar to that of a native antibody. A full length antibody comprises two full length antibody light chains each comprising in N- to C-terminal direction a light chain variable region and a light chain constant domain, as well as two full length antibody heavy chains each comprising in N- to C-terminal direction a heavy chain variable region, a first heavy chain constant domain, a hinge region, a second heavy chain constant domain and a third heavy chain constant domain. In contrast to a native antibody, a full length antibody may comprise further immunoglobulin domains, such as e.g. one or more additional scFvs, or heavy or light chain Fab fragments, or scFabs conjugated to one or more of the termini of the different chains of the full length antibody, but only a single fragment to each terminus. These conjugates are also encompassed by the term full-length antibody.

Antibody Binding Site

The term “antibody binding site” denotes a pair of a heavy chain variable domain and a light chain variable domain. To ensure proper binding to the antigen these variable domains are cognate variable domains, i.e. belong together. An antibody binding site comprises at least three HVRs (e.g. in case of a VHH) or three-six HVRs (e.g. in case of a naturally occurring, i.e. conventional, antibody with a VH/VL pair). Generally, the amino acid residues of an antibody that are responsible for antigen binding form the binding site. These residues are normally contained in a pair of an antibody heavy chain variable domain and a corresponding antibody light chain variable domain. The antigen-binding site of an antibody comprises amino acid residues from the “hypervariable regions” or “HVRs”. “Framework” or “FR” regions are those variable domain regions other than the hypervariable region residues as herein defined. Therefore, the light and heavy chain variable domains of an antibody comprise from N- to C-terminus the regions FR1, HVR1, FR2, HVR2, FR3, HVR3 and FR4. Especially, the HVR3 region of the heavy chain variable domain is the region which contributes most to antigen binding and defines the binding specificity of an antibody. A “functional binding site” is capable of specifically binding to its target. The term “specifically binding to” denotes the binding of a binding site to its target in an in vitro assay, in one embodiment in a binding assay. Such binding assay can be any assay as long as the binding event can be detected. For example, an assay can be used in which the antibody is bound to a surface and binding of the antigen(s) to the antibody is measured by Surface Plasmon Resonance (SPR). Alternatively, a bridging ELISA can be used.

Hypervariable Region

The term “hypervariable region” or “HVR”, as used herein, refers to each of the regions of an antibody variable domain comprising the amino acid residue stretches which are hypervariable in sequence (“complementarity determining regions” or “CDRs”) and/or form structurally defined loops (“hypervariable loops”), and/or contain the antigen-contacting residues (“antigen contacts”). Generally, antibodies comprise six HVRs; three in the heavy chain variable domain VH (H1, H2, H3), and three in the light chain variable domain VL (L1, L2, L3).

HVRs include:

- (a) hypervariable loops occurring at amino acid residues 26-32 (L1), 50-52 (L2), 91-96 (L3), 26-32 (H1), 53-55 (H2), and 96-101 (H3) (Chothia, C. and Lesk, A. M., J. Mol. Biol. 196 (1987) 901-917);
- (b) CDRs occurring at amino acid residues 24-34 (L1), 50-56 (L2), 89-97 (L3), 31-35b (H1), 50-65 (H2), and 95-102 (H3) (Kabat, E. A. et al., Sequences of Proteins of Immunological Interest, 5th ed. Public Health Service, National Institutes of Health, Bethesda, MD (1991), NIH Publication 91-3242.);
- (c) antigen contacts occurring at amino acid residues 27c-36 (L1), 46-55 (L2), 89-96 (L3), 30-35b (H1), 47-58 (H2), and 93-101 (H3) (MacCallum et al. J. Mol. Biol. 262: 732-745 (1996)); and
- (d) combinations of (a), (b), and/or (c), including amino acid residues 46-56 (L2), 47-56 (L2), 48-56 (L2), 49-56 (L2), 26-35 (H1), 26-35b (H1), 49-65 (H2), 93-102 (H3), and 94-102 (H3).

Unless otherwise indicated, HVR residues and other residues in the variable domain (e.g., FR residues) are numbered herein according to Kabat et al., supra.

Antibody Class

The “class” of an antibody refers to the type of constant domains or constant region, preferably the Fc-region, possessed by its heavy chains. There are five major classes of antibodies: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy chain constant domains that correspond to the different classes of immunoglobulins are called α, δ, ε, γ, and μ, respectively.

Heavy Chain Constant Region

The term “heavy chain constant region” denotes the region of an immunoglobulin heavy chain that contains the constant domains, i.e. the CH1 domain, the hinge region, the CH2 domain and the CH3 domain. In one embodiment, a human IgG constant region extends from Ala118 to the carboxyl-terminus of the heavy chain (numbering according to Kabat EU index). However, the C-terminal lysine (Lys447) of the constant region may or may not be present (numbering according to Kabat EU index). The term “constant region” denotes a dimer comprising two heavy chain constant regions, which can be covalently linked to each other via the hinge region cysteine residues forming inter-chain disulfide bonds.

Heavy Chain Fc-Region

The term “heavy chain Fc-region” denotes the C-terminal region of an immunoglobulin heavy chain that contains at least a part of the hinge region (middle and lower hinge region), the CH2 domain and the CH3 domain. In one embodiment, a human IgG heavy chain Fc-region extends from Asp221, or from Cys226, or from Pro230, to the carboxyl-terminus of the heavy chain (numbering according to Kabat EU index). Thus, an Fc-region is smaller than a constant region but in the C-terminal part identical thereto. However, the C-terminal lysine (Lys447) of the heavy chain Fc-region may or may not be present (numbering according to Kabat EU index). The term “Fc-region” denotes a dimer comprising two heavy chain Fc-regions, which can be covalently linked to each other via the hinge region cysteine residues forming inter-chain disulfide bonds.

The constant region, more precisely the Fc-region, of an antibody (and the constant region likewise) is directly involved in complement activation, C1q binding, C3 activation and Fc receptor binding. While the influence of an antibody on the complement system is dependent on certain conditions, binding to C1q is caused by defined binding sites in the Fc-region. Such binding sites are known in the state of the art and described e.g. by Lukas, T. J., et al., J. Immunol. 127 (1981) 2555-2560; Brunhouse, R., and Cebra, J. J., Mol. Immunol. 16 (1979) 907-917; Burton, D. R., et al., Nature 288 (1980) 338-344; Thommesen, J. E., et al., Mol. Immunol. 37 (2000) 995-1004; Idusogie, E. E., et al., J. Immunol. 164 (2000) 4178-4184; Hezareh, M., et al., J. Virol. 75 (2001) 12161-12168; Morgan, A., et al., Immunology 86 (1995) 319-324; and EP 0 307 434. Such binding sites are e.g. L234, L235, D270, N297, E318, K320, K322, P331 and P329 (numbering according to EU index of Kabat). Antibodies of subclass IgG1, IgG2 and IgG3 usually show complement activation, C1q binding and C3 activation, whereas IgG4 do not activate the complement system, do not bind C1q and do not activate C3. An “Fc-region of an antibody” is a term well known to the skilled artisan and defined on the basis of papain cleavage of antibodies.

Monoclonal Antibody

The term “monoclonal antibody” as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical and/or bind the same epitope, except for possible variant antibodies, e.g., containing naturally occurring mutations or arising during production of a monoclonal antibody preparation, such variants generally being present in minor amounts. In contrast to polyclonal antibody preparations, which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody of a monoclonal antibody preparation is directed against a single determinant on an antigen. Thus, the modifier “monoclonal” indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, monoclonal antibodies may be made by a variety of techniques, including but not limited to the hybridoma method, recombinant DNA methods, phage-display methods, and methods utilizing transgenic animals containing all or part of the human immunoglobulin loci.

Valent

The term “valent” as used within the current application denotes the presence of a specified number of binding sites in an antibody. As such, the terms “bivalent”, “tetravalent”, and “hexavalent” denote the presence of two binding sites, four binding sites, and six binding sites, respectively, in an antibody.

Monospecific Antibody

A “monospecific antibody” denotes an antibody that has a single binding specificity, i.e. specifically binds to one antigen. Monospecific antibodies can be prepared as full-length antibodies or antibody fragments (e.g. F(ab′)₂) or combinations thereof (e.g. full length antibody plus additional scFv or Fab fragments). A monospecific antibody does not need to be monovalent, i.e. a monospecific antibody may comprise more than one binding site specifically binding to the one antigen. A native antibody, for example, is monospecific but bivalent.

Multispecific Antibody

A “multispecific antibody” denotes an antibody that has binding specificities for at least two different epitopes on the same antigen or two different antigens. Multispecific antibodies can be prepared as full-length antibodies or antibody fragments (e.g. F(ab′)₂ bispecific antibodies) or combinations thereof (e.g. full length antibody plus additional scFv or Fab fragments). A multispecific antibody is at least bivalent, i.e. comprises two antigen binding sites. In addition, a multispecific antibody is at least bispecific. Thus, a bivalent, bispecific antibody is the simplest form of a multispecific antibody. Engineered antibodies with two, three or more (e.g. four) functional antigen binding sites have also been reported (see, e.g., US 2002/0004587).

In certain embodiments, the antibody is a multispecific antibody, e.g. at least a bispecific antibody. Multispecific antibodies are monoclonal antibodies that have binding specificities for at least two different antigens or epitopes. In certain embodiments, one of the binding specificities is for a first antigen and the other is for a different second antigen. In certain embodiments, multispecific antibodies may bind to two different epitopes of the same antigen. Multispecific antibodies may also be used to localize cytotoxic agents to cells, which express the antigen.

Multispecific antibodies can be prepared as full-length antibodies or antibody-antibody fragment-fusions.

Techniques for making multispecific antibodies include, but are not limited to, recombinant co-expression of two immunoglobulin heavy chain-light chain pairs having different specificities (see Milstein, C. and Cuello, A. C., Nature 305 (1983) 537-540, WO 93/08829, and Traunecker, A., et al., EMBO J. 10 (1991) 3655-3659), and “knob-in-hole” engineering (see, e.g., U.S. Pat. No. 5,731,168). Multi-specific antibodies may also be made by engineering electrostatic steering effects for making antibody Fc-heterodimeric molecules (WO 2009/089004); cross-linking two or more antibodies or fragments (see, e.g., U.S. Pat. No. 4,676,980, and Brennan, M., et al., Science 229 (1985) 81-83); using leucine zippers to produce bi-specific antibodies (see, e.g., Kostelny, S. A., et al., J. Immunol. 148 (1992) 1547-1553); using the common light chain technology for circumventing the light chain mis-pairing problem (see, e.g., WO 98/50431); using specific technology for making bispecific antibody fragments (see, e.g., Holliger, P., et al., Proc. Natl. Acad. Sci. USA 90 (1993) 6444-6448); and preparing trispecific antibodies as described, e.g., in Tutt, A., et al., J. Immunol. 147 (1991) 60-69.

Engineered antibodies with three or more antigen binding sites, including for example, “Octopus antibodies”, or DVD-Ig are also included herein (see, e.g., WO 2001/77342 and WO 2008/024715). Other examples of multispecific antibodies with three or more antigen binding sites can be found in WO 2010/115589, WO 2010/112193, WO 2010/136172, WO 2010/145792, and WO 2013/026831. The bispecific antibody or antigen binding fragment thereof also includes a “Dual Acting Fab” or “DAF” (see, e.g., US 2008/0069820 and WO 2015/095539).

Multi-specific antibodies may also be provided in an asymmetric form with a domain crossover in one or more binding arms of the same antigen specificity, i.e. by exchanging the VH/VL domains (see, e.g., WO 2009/080252 and WO 2015/150447), the CH1/CL domains (see, e.g., WO 2009/080253) or the complete Fab arms (see e.g., WO 2009/080251, WO 2016/016299, also see Schaefer et al., Proc. Natl. Acad. Sci. USA 108 (2011) 1187-1191, and Klein et al., MAbs 8 (2016) 1010-1020). In one aspect, the multispecific antibody comprises a Cross-Fab fragment. The term “Cross-Fab fragment” or “xFab fragment” or “crossover Fab fragment” refers to a Fab fragment, wherein either the variable regions or the constant regions of the heavy and light chain are exchanged. A Cross-Fab fragment comprises a polypeptide chain composed of the light chain variable region (VL) and the heavy chain constant region 1 (CH1), and a polypeptide chain composed of the heavy chain variable region (VH) and the light chain constant region (CL). Asymmetrical Fab arms can also be engineered by introducing charged or non-charged amino acid mutations into domain interfaces to direct correct Fab pairing. See e.g., WO 2016/172485.

The antibody or fragment can also be a multispecific antibody as described in WO 2009/080254, WO 2010/112193, WO 2010/115589, WO 2010/136172, WO 2010/145792, or WO 2010/145793.

The antibody or fragment thereof may also be a multispecific antibody as disclosed in WO 2012/163520.

Various further molecular formats for multispecific antibodies are known in the art and are included herein (see e.g., Spiess et al., Mol. Immunol. 67 (2015) 95-106).

Bispecific antibodies are generally antibody molecules that specifically bind to two different, non-overlapping epitopes on the same antigen or to two epitopes on different antigens.

Complex (multispecific) antibodies are

- a full-length antibody with domain exchange:
- a multispecific IgG antibody comprising a first Fab fragment and a second Fab fragment, wherein in the first Fab fragment
- a) only the CH1 and CL domains are replaced by each other (i.e. the light chain of the first Fab fragment comprises a VL and a CH1 domain and the heavy chain of the first Fab fragment comprises a VH and a CL domain);
- b) only the VH and VL domains are replaced by each other (i.e. the light chain of the first Fab fragment comprises a VH and a CL domain and the heavy chain of the first Fab fragment comprises a VL and a CH1 domain); or
- c) the CH1 and CL domains are replaced by each other and the VH and VL domains are replaced by each other (i.e. the light chain of the first Fab fragment comprises a VH and a CH1 domain and the heavy chain of the first Fab fragment comprises a VL and a CL domain); and
- wherein the second Fab fragment comprises a light chain comprising a VL and a CL domain, and a heavy chain comprising a VH and a CH1 domain;
- the full-length antibody with domain exchange may comprise a first heavy chain including a CH3 domain and a second heavy chain including a CH3 domain, wherein both CH3 domains are engineered in a complementary manner by respective amino acid substitutions, in order to support heterodimerization of the first heavy chain and the modified second heavy chain, e.g. as disclosed in WO 96/27011, WO 98/050431, EP 1870459, WO 2007/110205, WO 2007/147901, WO 2009/089004, WO 2010/129304, WO 2011/90754, WO 2011/143545, WO 2012/058768, WO 2013/157954, or WO 2013/096291 (incorporated herein by reference);
- a full-length antibody with domain exchange and additional heavy chain C-terminal binding site:
- a multispecific IgG antibody comprising
- a) one full length antibody comprising two pairs each of a full length antibody light chain and a full length antibody heavy chain, wherein the binding sites formed by each of the pairs of the full length heavy chain and the full length light chain specifically bind to a first antigen, and
- b) one additional Fab fragment, wherein the additional Fab fragment is fused to the C-terminus of one heavy chain of the full length antibody, wherein the binding site of the additional Fab fragment specifically binds to a second antigen,
- wherein the additional Fab fragment specifically binding to the second antigen i) comprises a domain crossover such that a) the light chain variable domain (VL) and the heavy chain variable domain (VH) are replaced by each other, or b) the light chain constant domain (CL) and the heavy chain constant domain (CH1) are replaced by each other, or ii) is a single chain Fab fragment;
- the one-armed single chain format (=one-armed single chain antibody): antibody comprising a first binding site that specifically binds to a first epitope or antigen and a second binding site that specifically binds to a second epitope or antigen, whereby the individual chains are as follows
  - light chain (variable light chain domain+light chain kappa constant domain)
  - combined light/heavy chain (variable light chain domain+light chain constant domain+peptidic linker+variable heavy chain domain+CH1+Hinge+CH2+CH3 with knob mutation)
  - heavy chain (variable heavy chain domain+CH1+Hinge+CH2+CH3 with hole mutation);
- a two-armed single chain antibody:
- antibody comprising a first binding site that specifically binds to a first epitope or antigen and a second binding site that specifically binds to a second epitope or antigen, whereby the individual chains are as follows
  - combined light/heavy chain 1 (variable light chain domain+light chain constant domain+peptidic linker+variable heavy chain domain+CH1+Hinge+CH2+CH3 with hole mutation)
  - combined light/heavy chain 2 (variable light chain domain+light chain constant domain+peptidic linker+variable heavy chain domain+CH1+Hinge+CH2+CH3 with knob mutation);
- a common light chain bispecific antibody:
- antibody comprising a first binding site that specifically binds to a first epitope or antigen and a second binding site that specifically binds to a second epitope or antigen, whereby the individual chains are as follows
  - light chain (variable light chain domain+light chain constant domain)
  - heavy chain 1 (variable heavy chain domain+CH1+Hinge+CH2+CH3 with hole mutation)
  - heavy chain 2 (variable heavy chain domain+CH1+Hinge+CH2+CH3 with knob mutation);
- a T-cell bispecific antibody:
- a full-length antibody with additional heavy chain N-terminal binding site with domain exchange comprising
  - a first and a second Fab fragment, wherein each binding site of the first and the second Fab fragment specifically bind to a first antigen,
  - a third Fab fragment, wherein the binding site of the third Fab fragment specifically binds to a second antigen, and wherein the third Fab fragment comprises a domain crossover such that the variable light chain domain (VL) and the variable heavy chain domain (VH) are replaced by each other, and
  - an Fc-region comprising a first Fc-region polypeptide and a second Fc-region polypeptide,
  - wherein the first and the second Fab fragment each comprise a heavy chain fragment and a full-length light chain,
  - wherein the C-terminus of the heavy chain fragment of the first Fab fragment is fused to the N-terminus of the first Fc-region polypeptide,
  - wherein the C-terminus of the heavy chain fragment of the second Fab fragment is fused to the N-terminus of the variable light chain domain of the third Fab fragment and the C-terminus of the CH1 domain of the third Fab fragment is fused to the N-terminus of the second Fc-region polypeptide;
- an antibody-multimer-fusion comprising
  - (a) an antibody heavy chain and an antibody light chain, and
  - (b) a first fusion polypeptide comprising in N- to C-terminal direction a first part of a non-antibody multimeric polypeptide, an antibody heavy chain CH1 domain or an antibody light chain constant domain, an antibody hinge region, an antibody heavy chain CH2 domain and an antibody heavy chain CH3 domain, and a second fusion polypeptide comprising in N- to C-terminal direction the second part of the non-antibody multimeric polypeptide and an antibody light chain constant domain if the first polypeptide comprises an antibody heavy chain CH1 domain or an antibody heavy chain CH1 domain if the first polypeptide comprises an antibody light chain constant domain,
  - wherein
    - (i) the antibody heavy chain of (a) and the first fusion polypeptide of (b), (ii) the antibody heavy chain of (a) and the antibody light chain of (a), and (iii) the first fusion polypeptide of (b) and the second fusion polypeptide of (b) are each independently of each other covalently linked to each other by at least one disulfide bond,
  - wherein
    - the variable domains of the antibody heavy chain and the antibody light chain form a binding site specifically binding to an antigen.

The “knobs into holes” dimerization modules and their use in antibody engineering are described in Carter P.; Ridgway J. B. B.; Presta L. G.: Immunotechnology, Volume 2, Number 1, February 1996, pp. 73-73(1).

The CH3 domains in the heavy chains of an antibody can be altered by the “knob-into-holes” technology, which is described in detail with several examples in e.g. WO 96/027011, Ridgway, J. B., et al, Protein Eng. 9 (1996) 617-621; and Merchant, A. M., et al., Nat. Biotechnol. 16 (1998) 677-681. In this method, the interaction surfaces of the two CH3 domains are altered to increase the heterodimerization of these two CH3 domains and thereby of the polypeptide comprising them. Each of the two CH3 domains (of the two heavy chains) can be the “knob”, while the other is the “hole”. The introduction of a disulfide bridge further stabilizes the heterodimers (Merchant, A. M., et al., Nature Biotech. 16 (1998) 677-681; Atwell, S., et al., J. Mol. Biol. 270 (1997) 26-35) and increases the yield.

The mutation T366W in the CH3 domain (of an antibody heavy chain) is denoted as “knob-mutation” or “mutation knob” and the mutations T366S, L368A, Y407V in the CH3 domain (of an antibody heavy chain) are denoted as “hole-mutations” or “mutations hole” (numbering according to Kabat EU index). An additional inter-chain disulfide bridge between the CH3 domains can also be used (Merchant, A. M., et al., Nature Biotech. 16 (1998) 677-681) e.g. by introducing a S354C mutation into the CH3 domain of the heavy chain with the “knob-mutation” (denoted as “knob-cys-mutations” or “mutations knob-cys”) and by introducing a Y349C mutation into the CH3 domain of the heavy chain with the “hole-mutations” (denoted as “hole-cys-mutations” or “mutations hole-cys”) (numbering according to Kabat EU index).
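
By way of non-limiting illustration only, the mutation sets named above can be written down and applied to a table of CH3 residues keyed by EU index position. The following Python sketch is an assumption-laden example: the function names and the dictionary representation are hypothetical, and the wild-type residues listed are only those implied by the mutation names themselves (e.g. T at position 366).

```python
import re

# Mutation sets as named in the text (numbering according to Kabat EU index).
KNOB = ["T366W"]
HOLE = ["T366S", "L368A", "Y407V"]
KNOB_CYS = KNOB + ["S354C"]
HOLE_CYS = HOLE + ["Y349C"]

# Wild-type residues implied by the mutation names above (hypothetical container).
WILD_TYPE_CH3 = {349: "Y", 354: "S", 366: "T", 368: "L", 407: "Y"}


def parse_mutation(mutation):
    """Split a mutation such as 'T366W' into (wild_type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_mutations(residues, mutations):
    """Return a copy of residues (EU position -> amino acid) with mutations applied."""
    result = dict(residues)
    for mutation in mutations:
        wild_type, position, mutant = parse_mutation(mutation)
        # Refuse to mutate if the expected wild-type residue is not present.
        if result.get(position) != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        result[position] = mutant
    return result


knob_cys_chain = apply_mutations(WILD_TYPE_CH3, KNOB_CYS)
hole_cys_chain = apply_mutations(WILD_TYPE_CH3, HOLE_CYS)
```

In this sketch one heavy chain carries the knob-cys set and the other carries the hole-cys set, mirroring the pairing described above; either chain may be chosen as the knob chain.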

Domain Crossover

The term “domain crossover” as used herein denotes that in a pair of an antibody heavy chain VH-CH1 fragment and its corresponding cognate antibody light chain, i.e. in an antibody Fab (fragment antigen binding), the domain sequence deviates from the sequence in a native antibody in that at least one heavy chain domain is substituted by its corresponding light chain domain and vice versa. There are three general types of domain crossovers, (i) the crossover of the CH1 and the CL domains, which leads by the domain crossover in the light chain to a VL-CH1 domain sequence and by the domain crossover in the heavy chain fragment to a VH-CL domain sequence (or a full length antibody heavy chain with a VH-CL-hinge-CH2-CH3 domain sequence), (ii) the domain crossover of the VH and the VL domains, which leads by the domain crossover in the light chain to a VH-CL domain sequence and by the domain crossover in the heavy chain fragment to a VL-CH1 domain sequence, and (iii) the domain crossover of the complete light chain (VL-CL) and the complete VH-CH1 heavy chain fragment (“Fab crossover”), which leads by domain crossover to a light chain with a VH-CH1 domain sequence and by domain crossover to a heavy chain fragment with a VL-CL domain sequence (all aforementioned domain sequences are indicated in N-terminal to C-terminal direction).
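
The three crossover types can be summarized, purely for illustration, as a mapping from crossover type to the resulting domain sequences of the light chain and the heavy chain fragment. The Python sketch below is not part of the disclosure; the function name and the string labels used for the crossover types are hypothetical.

```python
def fab_domain_sequences(crossover=None):
    """Return (light_chain, heavy_chain_fragment) domain order, N- to C-terminal.

    crossover is None for a native Fab, or one of the hypothetical labels
    "CH1-CL" (type i), "VH-VL" (type ii) or "Fab" (type iii).
    """
    if crossover is None:
        return ("VL", "CL"), ("VH", "CH1")
    if crossover == "CH1-CL":
        # (i) constant domains exchanged
        return ("VL", "CH1"), ("VH", "CL")
    if crossover == "VH-VL":
        # (ii) variable domains exchanged
        return ("VH", "CL"), ("VL", "CH1")
    if crossover == "Fab":
        # (iii) complete light chain and VH-CH1 fragment exchanged
        return ("VH", "CH1"), ("VL", "CL")
    raise ValueError(f"unknown crossover type: {crossover!r}")


for kind in (None, "CH1-CL", "VH-VL", "Fab"):
    light, heavy = fab_domain_sequences(kind)
    print(kind, "-".join(light), "-".join(heavy))
```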

Replaced by Each Other

As used herein the term “replaced by each other” with respect to corresponding heavy and light chain domains refers to the aforementioned domain crossovers. As such, when CH1 and CL domains are “replaced by each other”, this refers to the domain crossover mentioned under item (i) and the resulting heavy and light chain domain sequence. Accordingly, when VH and VL are “replaced by each other”, this refers to the domain crossover mentioned under item (ii); and when the CH1 and CL domains are “replaced by each other” and the VH and VL domains are “replaced by each other”, this refers to the domain crossover mentioned under item (iii). Bispecific antibodies including domain crossovers are reported, e.g. in WO 2009/080251, WO 2009/080252, WO 2009/080253, WO 2009/080254 and Schaefer, W., et al, Proc. Natl. Acad. Sci. USA 108 (2011) 11187-11192. Such antibodies are generally termed CrossMab.

Multispecific antibodies also comprise in one embodiment at least one Fab fragment including either a domain crossover of the CH1 and the CL domains as mentioned under item (i) above, or a domain crossover of the VH and the VL domains as mentioned under item (ii) above, or a domain crossover of the VH-CH1 and the VL-CL domains as mentioned under item (iii) above. In case of multispecific antibodies with domain crossover, the Fabs specifically binding to the same antigen(s) are constructed to be of the same domain sequence. Hence, in case more than one Fab with a domain crossover is contained in the multispecific antibody, said Fab(s) specifically bind to the same antigen.

Humanized

A “humanized” antibody refers to an antibody comprising amino acid residues from non-human HVRs and amino acid residues from human FRs. In certain embodiments, a humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the HVRs (e.g., the CDRs) correspond to those of a non-human antibody, and all or substantially all of the FRs correspond to those of a human antibody. A humanized antibody optionally may comprise at least a portion of an antibody constant region derived from a human antibody. A “humanized form” of an antibody, e.g., a non-human antibody, refers to an antibody that has undergone humanization.

Recombinant Antibody

The term “recombinant antibody”, as used herein, denotes all antibodies (chimeric, humanized and human) that are prepared, expressed, created or isolated by recombinant means, such as recombinant cells. This includes antibodies isolated from recombinant cells such as NS0, HEK, BHK, amniocyte or CHO cells.

Antibody Fragment

As used herein, the term ‘antibody fragment’ refers to a molecule other than an intact antibody that comprises a portion of an intact antibody that binds the antigen to which the intact antibody binds, i.e. it is a functional fragment. Examples of antibody fragments include but are not limited to Fv; Fab; Fab′; Fab′-SH; F(ab′)2; bispecific Fab; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv or scFab).

Recombinant Methods

Antibodies may be produced using recombinant methods and compositions, e.g., as described in U.S. Pat. No. 4,816,567. For these methods, one or more isolated nucleic acid(s) encoding an antibody are provided.

In one aspect, a method of making an antibody is provided, wherein the method comprises culturing a host cell comprising nucleic acid(s) encoding the antibody, as provided above, under conditions suitable for expression of the antibody, and optionally recovering the antibody from the host cell (or host cell culture medium), wherein at least one cultivation step is in the presence of a compound according to the invention.

For recombinant production of an antibody, nucleic acids encoding the antibody, e.g., as described above, are isolated and inserted into one or more vectors for further cloning and/or expression in a host cell. Such nucleic acids may be readily isolated and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and light chains of the antibody) or produced by recombinant methods or obtained by chemical synthesis.

Recombinant Mammalian Cell

Generally, for the recombinant large-scale production of a polypeptide of interest, such as e.g. a therapeutic antibody, a cell stably expressing and secreting said polypeptide is required.

This cell is a “recombinant mammalian cell” or “recombinant production cell” and the process used for generating such a cell is termed “cell line development”. In the first step of the cell line development process, a suitable host cell, such as e.g. a CHO cell, is transfected with a nucleic acid sequence suitable for expression of said polypeptide of interest. In a second step, a cell stably expressing the polypeptide of interest is selected based on the co-expression of a selection marker, which had been co-transfected with the nucleic acid encoding the polypeptide of interest.

A nucleic acid encoding a polypeptide, i.e. the coding sequence, is denoted as a structural gene. Such a structural gene is pure coding information. Thus, additional regulatory elements are required for expression thereof. Therefore, normally a structural gene is integrated in a so-called expression cassette. The minimal regulatory elements needed for an expression cassette to be functional in a mammalian cell are a promoter functional in said mammalian cell, which is located upstream, i.e. 5′, to the structural gene, and a polyadenylation signal sequence functional in said mammalian cell, which is located downstream, i.e. 3′, to the structural gene. The promoter, the structural gene and the polyadenylation signal sequence are arranged in an operably linked form.

In case the polypeptide of interest is a heteromultimeric polypeptide that is composed of different (monomeric) polypeptides, such as e.g. an antibody or a complex antibody format, not only a single expression cassette is required but a multitude of expression cassettes differing in the contained structural gene, i.e. at least one expression cassette for each of the different (monomeric) polypeptides of the heteromultimeric polypeptide. For example, a full-length antibody is a heteromultimeric polypeptide comprising two copies of a light chain as well as two copies of a heavy chain. Thus, a full-length antibody is composed of two different polypeptides. Therefore, two expression cassettes are required for the expression of a full-length antibody, one for the light chain and one for the heavy chain. If, for example, the full-length antibody is a bispecific antibody, i.e. the antibody comprises two different binding sites specifically binding to two different antigens, the two light chains as well as the two heavy chains are also different from each other. Thus, such a bispecific, full-length antibody is composed of four different polypeptides and therefore, four expression cassettes are required.
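
The counting rule in the preceding paragraph, i.e. at least one expression cassette per different polypeptide chain, can be illustrated with a short Python sketch. The function name and the chain labels are hypothetical and serve only as an example.

```python
def required_expression_cassettes(chains):
    """Minimum number of expression cassettes: one per different polypeptide chain.

    chains lists every chain in the assembled molecule; identical chains share a label.
    """
    return len(set(chains))


# A full-length monospecific antibody: two identical light and two identical heavy chains.
monospecific = ["LC", "LC", "HC", "HC"]

# A bispecific full-length antibody in which all four chains differ.
bispecific = ["LC1", "LC2", "HC1", "HC2"]

print(required_expression_cassettes(monospecific))
print(required_expression_cassettes(bispecific))
```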

Expression Vector

The expression cassette(s) for the polypeptide of interest is(are) generally integrated into one or more so called “expression vector(s)”. An “expression vector” is a nucleic acid providing all required elements for the amplification of said vector in bacterial cells as well as the expression of the comprised structural gene(s) in a mammalian cell. Typically, an expression vector comprises a prokaryotic plasmid propagation unit, e.g. for E. coli, comprising an origin of replication, and a prokaryotic selection marker, as well as a eukaryotic selection marker, and the expression cassettes required for the expression of the structural gene(s) of interest. An “expression vector” is a transport vehicle for the introduction of expression cassettes into a mammalian cell.

The more complex the polypeptide to be expressed, the higher the number of different expression cassettes required. As the number of expression cassettes grows, so does the size of the nucleic acid to be integrated into the genome of the host cell and, concomitantly, the size of the expression vector. However, there is a practical upper limit to the size of a vector, in the range of about 15 kbp, above which handling and processing efficiency drops profoundly. This issue can be addressed by using two or more expression vectors. Thereby the expression cassettes can be split between different expression vectors, each comprising only some of the expression cassettes, resulting in a size reduction.
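
One conceivable way to distribute expression cassettes over several vectors while staying below the practical size limit is a simple greedy assignment. The following Python sketch is illustrative only: the cassette sizes, the backbone size and the first-fit strategy are assumptions made for the example and are not taken from the disclosure; only the limit of about 15 kbp stems from the text above.

```python
def split_cassettes(cassette_sizes_kbp, backbone_kbp, max_vector_kbp=15.0):
    """Assign cassettes to vectors (first-fit, largest first) under a size limit.

    cassette_sizes_kbp: dict mapping a cassette name to its size in kbp (hypothetical values).
    backbone_kbp: size of the vector backbone shared by every vector (hypothetical value).
    Returns a list of vectors, each a list of (name, size) tuples.
    """
    capacity = max_vector_kbp - backbone_kbp
    vectors = []
    ordered = sorted(cassette_sizes_kbp.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        if size > capacity:
            raise ValueError(f"cassette {name!r} does not fit on any vector")
        for vector in vectors:
            # Place the cassette on the first vector that still has room.
            if sum(s for _, s in vector) + size <= capacity:
                vector.append((name, size))
                break
        else:
            # No existing vector has room: start a new one.
            vectors.append([(name, size)])
    return vectors


# Hypothetical sizes for the four cassettes of a bispecific antibody.
example_cassettes = {"HC1": 3.5, "HC2": 3.5, "LC1": 2.5, "LC2": 2.5}
example_vectors = split_cassettes(example_cassettes, backbone_kbp=5.0)
for vector in example_vectors:
    print([name for name, _ in vector], 5.0 + sum(size for _, size in vector), "kbp")
```

A greedy assignment of this kind does not necessarily minimize the number of vectors and ignores any preferred grouping of cassettes; it merely shows how a size constraint leads to a split across two or more vectors.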

Cell Line Development

Cell line development (CLD) for the generation of a recombinant cell expressing a heterologous polypeptide, such as e.g. a multispecific antibody, employs either random integration (RI) or targeted integration (TI) of the nucleic acid(s) comprising the respective expression cassettes required for the expression and production of the heterologous polypeptide of interest.

Using RI, in general, several vectors or fragments thereof integrate into the cell's genome at the same or different loci.

Using TI, in general, a single copy of the transgene comprising the different expression cassettes is integrated at a predetermined “hot-spot” in the host cell's genome.

Unlike RI CLD, targeted integration (TI) CLD introduces the transgene comprising the different expression cassettes at a predetermined “hot-spot” in a cell's genome. Also the introduction is with a defined ratio of the expression cassettes. Thereby, without being bound by this theory, all the different polypeptides of the heteromultimeric polypeptide are expressed at the same (or at least a comparable and only slightly differing) rate and at an appropriate ratio.

Also, given the defined copy number and the defined integration site, recombinant cells obtained by TI should have better stability compared to cells obtained by RI. Moreover, since the selection marker is only used for selecting cells with proper TI and not for selecting cells with a high level of transgene expression, a less mutagenic marker may be applied to minimize the chance of sequence variants (SVs), which is in part due to the mutagenicity of the selective agents like methotrexate (MTX) or methionine sulfoximine (MSX).

Suitable host cells for the expression of a (glycosylated) antibody are generally derived from multicellular organisms such as e.g. vertebrates.

Host Cells

Any mammalian cell line that is adapted to grow in suspension can be used in the method according to the current invention. In addition, independent from the integration method, i.e. for RI as well as TI, any mammalian host cell can be used.

Examples of useful mammalian host cell lines are human amniocyte cells (e.g. CAP-T cells as described in Woelfel, J. et al., BMC Proc. 5 (2011) P133); monkey kidney CV1 line transformed by SV40 (COS-7); human embryonic kidney line (HEK293 or HEK293T cells as described, e.g., in Graham, F. L. et al., J. Gen Virol. 36 (1977) 59-74); baby hamster kidney cells (BHK); mouse sertoli cells (TM4 cells as described, e.g., in Mather, J. P., Biol. Reprod. 23 (1980) 243-252); monkey kidney cells (CV1); African green monkey kidney cells (VERO-76); human cervical carcinoma cells (HELA); canine kidney cells (MDCK); buffalo rat liver cells (BRL 3A); human lung cells (WI38); human liver cells (Hep G2); mouse mammary tumor (MMT 060562); TRI cells (as described, e.g., in Mather, J. P. et al., Annals N.Y. Acad. Sci. 383 (1982) 44-68); MRC 5 cells; and FS4 cells. Other useful mammalian host cell lines include Chinese hamster ovary (CHO) cells, including DHFR-CHO cells (Urlaub, G. et al., Proc. Natl. Acad. Sci. USA 77 (1980) 4216-4220); and myeloma cell lines such as Y0, NS0 and Sp2/0.

For a review of certain mammalian host cell lines suitable for antibody production, see, e.g., Yazaki, P. and Wu, A. M., Methods in Molecular Biology, Vol. 248, Lo, B. K. C. (ed.), Humana Press, Totowa, NJ (2004), pp. 255-268.

In one embodiment, the mammalian host cell is, e.g., a Chinese Hamster Ovary (CHO) cell (e.g. CHO K1, CHO DG44, etc.), a Human Embryonic Kidney (HEK) cell, a lymphoid cell (e.g., Y0, NS0, Sp2/0 cell), or a human amniocyte cell (e.g. CAP-T, etc.). In one preferred embodiment, the mammalian (host) cell is a CHO cell.

Targeted integration allows exogenous nucleotide sequences to be integrated into a pre-determined site of a mammalian cell's genome. In certain embodiments, the targeted integration is mediated by a recombinase that recognizes one or more recombination recognition sequences (RRSs), which are present in the genome and in the exogenous nucleotide sequence to be integrated. In certain embodiments, the targeted integration is mediated by homologous recombination.

Recombination Recognition Sequence

A “recombination recognition sequence” (RRS) is a nucleotide sequence recognized by a recombinase and is necessary and sufficient for recombinase-mediated recombination events. A RRS can be used to define the position where a recombination event will occur in a nucleotide sequence.

In certain embodiments, a RRS can be recognized by a Cre recombinase. In certain embodiments, a RRS can be recognized by a FLP recombinase. In certain embodiments, a RRS can be recognized by a Bxb1 integrase. In certain embodiments, a RRS can be recognized by a φC31 integrase.

In certain embodiments when the RRS is a LoxP site, the cell requires the Cre recombinase to perform the recombination. In certain embodiments when the RRS is a FRT site, the cell requires the FLP recombinase to perform the recombination. In certain embodiments when the RRS is a Bxb1 attP or a Bxb1 attB site, the cell requires the Bxb1 integrase to perform the recombination. In certain embodiments when the RRS is a φC31 attP or a φC31 attB site, the cell requires the φC31 integrase to perform the recombination. The recombinases can be introduced into a cell using an expression vector comprising coding sequences of the enzymes or as protein or a mRNA.
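
The pairing of each recombination recognition sequence with the enzyme required for recombination, as set out in the preceding paragraph, can be captured in a simple lookup table. The Python sketch below is merely illustrative; the dictionary and function names are hypothetical.

```python
# Pairings as stated in the text: RRS type -> enzyme the cell requires.
REQUIRED_RECOMBINASE = {
    "LoxP": "Cre recombinase",
    "FRT": "FLP recombinase",
    "Bxb1 attP": "Bxb1 integrase",
    "Bxb1 attB": "Bxb1 integrase",
    "φC31 attP": "φC31 integrase",
    "φC31 attB": "φC31 integrase",
}


def recombinase_for(rrs):
    """Return the enzyme required for recombination at the given RRS type."""
    try:
        return REQUIRED_RECOMBINASE[rrs]
    except KeyError:
        raise ValueError(f"no recombinase listed for RRS {rrs!r}") from None


def recombinases_needed(rrs_list):
    """Return the set of enzymes that must be supplied for a list of RRS types."""
    return {recombinase_for(rrs) for rrs in rrs_list}


print(recombinases_needed(["Bxb1 attP", "Bxb1 attB"]))
```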

With respect to TI, any known or future mammalian host cell suitable for TI comprising a landing site as described herein integrated at a single site within a locus of the genome can be used in the current invention. Such a cell is denoted as mammalian TI host cell. In certain embodiments, the mammalian TI host cell is a hamster cell, a human cell, a rat cell, or a mouse cell comprising a landing site as described herein. In one preferred embodiment, the mammalian TI host cell is a CHO cell. In certain embodiments, the mammalian TI host cell is a Chinese hamster ovary (CHO) cell, a CHO K1 cell, a CHO K1SV cell, a CHO DG44 cell, a CHO DUKXB-11 cell, a CHO K1S cell, or a CHO K1M cell comprising a landing site as described herein integrated at a single site within a locus of the genome.

In certain embodiments, a mammalian TI host cell comprises an integrated landing site, wherein the landing site comprises one or more recombination recognition sequences (RRSs). The RRS can be recognized by a recombinase, for example, a Cre recombinase, an FLP recombinase, a Bxb1 integrase, or a φC31 integrase. The RRSs can be selected independently of each other from the group consisting of a LoxP sequence, a LoxP L3 sequence, a LoxP 2L sequence, a LoxFas sequence, a Lox511 sequence, a Lox2272 sequence, a Lox2372 sequence, a Lox5171 sequence, a Loxm2 sequence, a Lox71 sequence, a Lox66 sequence, a FRT sequence, a Bxb1 attP sequence, a Bxb1 attB sequence, a φC31 attP sequence, and a φC31 attB sequence. If multiple RRSs have to be present, the selection of each of the sequences is dependent on the other insofar as non-identical RRSs are chosen.

In certain embodiments, the landing site comprises one or more recombination recognition sequences (RRSs), wherein each RRS can be recognized by a recombinase. In certain embodiments, the integrated landing site comprises at least two RRSs. In certain embodiments, an integrated landing site comprises three RRSs, wherein the third RRS is located between the first and the second RRS. In certain preferred embodiments, all three RRSs are different. In certain embodiments, the landing site comprises a first, a second and a third RRS, and at least one selection marker located between the first and the second RRS, and the third RRS is different from the first and/or the second RRS. In certain embodiments, the landing site further comprises a second selection marker, and the first and the second selection markers are different. In certain embodiments, the landing site further comprises a third selection marker and an internal ribosome entry site (IRES), wherein the IRES is operably linked to the third selection marker. The third selection marker can be different from the first or the second selection marker.

Although the invention is exemplified with a CHO cell hereafter, this is presented solely to exemplify the invention but shall not be construed in any way as limitation. The true scope of the invention is set forth in the claims.

An exemplary mammalian TI host cell that is suitable for use in a method according to the current invention is a CHO cell harboring a landing site integrated at a single site within a locus of its genome wherein the landing site comprises three heterospecific loxP sites for Cre recombinase mediated DNA recombination.

In this example, the heterospecific loxP sites are L3, LoxFas and 2L (see e.g. Lanza et al., Biotechnol. J. 7 (2012) 898-908; Wong et al., Nucleic Acids Res. 33 (2005) e147), whereby L3 and 2L flank the landing site at the 5′-end and 3′-end, respectively, and LoxFas is located between the L3 and 2L sites. The landing site further contains a bicistronic unit linking the expression of a selection marker via an IRES to the expression of the fluorescent GFP protein, allowing to stabilize the landing site by positive selection as well as to select for the absence of the site after transfection and Cre-recombination (negative selection). Green fluorescent protein (GFP) serves for monitoring the RMCE reaction.

Such a configuration of the landing site as outlined in the previous paragraph allows for the simultaneous integration of two vectors, e.g. of a so called front vector harboring an L3 and a LoxFas site and a back vector harboring a LoxFas and a 2L site. The functional elements of a selection marker gene different from that present in the landing site can be distributed between both vectors: promoter and start codon can be located on the front vector whereas coding region and poly A signal are located on the back vector. Only correct recombinase-mediated integration of said nucleic acids from both vectors induces resistance against the respective selection agent.

Generally, a mammalian TI host cell is a mammalian cell comprising a landing site integrated at a single site within a locus of the genome of the mammalian cell, wherein the landing site comprises a first and a second recombination recognition sequence flanking at least one first selection marker, and a third recombination recognition sequence located between the first and the second recombination recognition sequence, and all the recombination recognition sequences are different.
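
The general definition above can be expressed as a simple consistency check on an ordered list of landing site elements. The following Python sketch is an illustration under stated assumptions: the list-of-tuples representation and the function name are hypothetical, the two outermost RRSs are taken to be the first and the second RRS, and the position of the selection marker relative to the LoxFas site in the example is chosen arbitrarily.

```python
def is_valid_landing_site(elements):
    """Check a landing site given as (kind, name) tuples in 5' to 3' order.

    kind is "RRS" or "marker". The site is accepted if it contains exactly
    three RRSs, all different, with at least one selection marker located
    between the two outermost RRSs (taken as the first and the second RRS).
    The middle RRS is then, by position, the third RRS.
    """
    rrs_indices = [i for i, (kind, _) in enumerate(elements) if kind == "RRS"]
    if len(rrs_indices) != 3:
        return False
    names = {elements[i][1] for i in rrs_indices}
    if len(names) != 3:
        return False
    first, _third, second = rrs_indices
    return any(kind == "marker" for kind, _ in elements[first + 1:second])


# Example modelled on the exemplary CHO landing site (element order is an assumption).
exemplary_site = [
    ("RRS", "L3"),
    ("marker", "selection marker"),
    ("RRS", "LoxFas"),
    ("RRS", "2L"),
]
print(is_valid_landing_site(exemplary_site))
```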

The selection marker(s) can be selected from the group consisting of an aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, and mycophenolic acid. The selection marker(s) can also be a fluorescent protein selected from the group consisting of green fluorescent protein (GFP), enhanced GFP (eGFP), a synthetic GFP, yellow fluorescent protein (YFP), enhanced YFP (eYFP), cyan fluorescent protein (CFP), mPlum, mCherry, tdTomato, mStrawberry, J-red, DsRed-monomer, mOrange, mKO, mCitrine, Venus, YPet, Emerald, CyPet, mCFPm, Cerulean, and T-Sapphire.

An exogenous nucleotide sequence is a nucleotide sequence that does not originate from a specific cell but can be introduced into said cell by DNA delivery methods, such as, e.g., by transfection, electroporation, or transformation methods. In certain embodiments, a mammalian TI host cell comprises at least one landing site integrated at one or more integration sites in the mammalian cell's genome. In certain embodiments, the landing site is integrated at one or more integration sites within a specific locus of the genome of the mammalian cell.

In certain embodiments, the integrated landing site comprises at least one selection marker. In certain embodiments, the integrated landing site comprises a first, a second and a third RRS, and at least one selection marker. In certain embodiments, a selection marker is located between the first and the second RRS. In certain embodiments, two RRSs flank at least one selection marker, i.e., a first RRS is located 5′ (upstream) and a second RRS is located 3′ (downstream) of the selection marker. In certain embodiments, a first RRS is adjacent to the 5′-end of the selection marker and a second RRS is adjacent to the 3′-end of the selection marker. In certain embodiments, the landing site comprises a first, second, and third RRS, and at least one selection marker located between the first and the third RRS.

In certain embodiments, a selection marker is located between a first and a second RRS and the two flanking RRSs are different. In certain preferred embodiments, the first flanking RRS is a LoxP L3 sequence and the second flanking RRS is a LoxP 2L sequence. In certain embodiments, a LoxP L3 sequence is located 5′ of the selection marker and a LoxP 2L sequence is located 3′ of the selection marker. In certain embodiments, the first flanking RRS is a wild-type FRT sequence and the second flanking RRS is a mutant FRT sequence. In certain embodiments, the first flanking RRS is a Bxb1 attP sequence and the second flanking RRS is a Bxb1 attB sequence. In certain embodiments, the first flanking RRS is a φC31 attP sequence and the second flanking RRS is a φC31 attB sequence. In certain embodiments, the two RRSs are positioned in the same orientation. In certain embodiments, the two RRSs are both in the forward or reverse orientation. In certain embodiments, the two RRSs are positioned in opposite orientation.

In certain embodiments, the integrated landing site comprises a first and a second selection marker, which are flanked by two RRSs, wherein the first selection marker is different from the second selection marker. In certain embodiments, the two selection markers are both independently of each other selected from the group consisting of a glutamine synthetase selection marker, a thymidine kinase selection marker, a HYG selection marker, and a puromycin resistance selection marker. In certain embodiments, the integrated landing site comprises a thymidine kinase selection marker and a HYG selection marker. In certain embodiments, the first selection marker is selected from the group consisting of an aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, and mycophenolic acid, and the second selection marker is selected from the group consisting of a GFP, an eGFP, a synthetic GFP, a YFP, an eYFP, a CFP, an mPlum, an mCherry, a tdTomato, an mStrawberry, a J-red, a DsRed-monomer, an mOrange, an mKO, an mCitrine, a Venus, a YPet, an Emerald, a CyPet, an mCFPm, a Cerulean, and a T-Sapphire fluorescent protein. In certain embodiments, the first selection marker is a glutamine synthetase selection marker and the second selection marker is a GFP fluorescent protein. In certain embodiments, the two RRSs flanking both selection markers are different.

In certain embodiments, the selection marker is operably linked to a promoter sequence. In certain embodiments, the selection marker is operably linked to an SV40 promoter. In certain embodiments, the selection marker is operably linked to a human Cytomegalovirus (CMV) promoter.

Targeted Integration

One method for the generation of a recombinant mammalian cell according to the current invention is targeted integration (TI).

In targeted integration, site-specific recombination is employed for the introduction of an exogenous nucleic acid into a specific locus in the genome of a mammalian TI host cell. This is an enzymatic process wherein a sequence at the site of integration in the genome is exchanged for the exogenous nucleic acid. One system used to effect such nucleic acid exchanges is the Cre-lox system. The enzyme catalyzing the exchange is the Cre recombinase. The sequence to be exchanged is defined by the position of two lox(P)-sites in the genome as well as in the exogenous nucleic acid. These lox(P)-sites are recognized by the Cre recombinase. Nothing more is required, i.e., no ATP or other cofactor. The Cre-lox system was originally found in bacteriophage P1.

The Cre-lox system operates in cells of different organisms, such as mammals, plants, bacteria and yeast.

In one embodiment, the exogenous nucleic acid encoding the heterologous polypeptide has been integrated into the mammalian TI host cell by single or double recombinase mediated cassette exchange (RMCE). Thereby a recombinant mammalian cell, such as a recombinant CHO cell, is obtained, in which a defined and specific expression cassette sequence has been integrated into the genome at a single locus, which in turn results in the efficient expression and production of the heterologous polypeptide.

The Cre-LoxP site-specific recombination system has been widely used in many biological experimental systems. Cre recombinase is a 38-kDa site-specific DNA recombinase that recognizes 34 bp LoxP sequences. Cre recombinase is derived from bacteriophage P1 and belongs to the tyrosine family of site-specific recombinases. Cre recombinase can mediate both intra- and intermolecular recombination between LoxP sequences. The LoxP sequence is composed of an 8 bp non-palindromic core region flanked by two 13 bp inverted repeats. Cre recombinase binds to the 13 bp repeats, thereby mediating recombination within the 8 bp core region. Cre-LoxP-mediated recombination occurs at a high efficiency and does not require any other host factors. If two LoxP sequences are placed in the same orientation on the same nucleotide sequence, Cre recombinase-mediated recombination will excise DNA sequences located between the two LoxP sequences as a covalently closed circle. If two LoxP sequences are placed in an inverted position on the same nucleotide sequence, Cre recombinase-mediated recombination will invert the orientation of the DNA sequences located between the two sequences. If two LoxP sequences are on two different DNA molecules and if one DNA molecule is circular, Cre recombinase-mediated recombination will result in integration of the circular DNA sequence.
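By way of illustration only, the 34 bp LoxP architecture described above (two 13 bp inverted repeats flanking an 8 bp non-palindromic core) can be checked with the following minimal Python sketch. The sequence used is the commonly cited wild-type LoxP sequence and is an assumption of this example, not a sequence taken from the Sequence Listing; the function names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical.
# Assumption: commonly cited wild-type LoxP sequence, not taken from the Sequence Listing.
LOXP_WT = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def split_loxp(site: str) -> tuple[str, str, str]:
    """Split a 34 bp site into (left 13 bp arm, 8 bp core, right 13 bp arm)."""
    if len(site) != 34:
        raise ValueError("a LoxP-type site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def describe_loxp(site: str) -> dict[str, bool]:
    """Report whether the arms are inverted repeats and whether the core is palindromic."""
    left, core, right = split_loxp(site)
    return {
        "arms_are_inverted_repeats": right == reverse_complement(left),
        "core_is_palindromic": core == reverse_complement(core),
    }
```

Because the core is non-palindromic, it gives the site a direction, which is why the relative orientation of two sites determines whether the intervening DNA is excised or inverted.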

Matching RRSs

The term “matching RRSs” indicates that a recombination occurs between two RRSs. In certain embodiments, the two matching RRSs are the same. In certain embodiments, both RRSs are wild-type LoxP sequences. In certain embodiments, both RRSs are mutant LoxP sequences. In certain embodiments, both RRSs are wild-type FRT sequences. In certain embodiments, both RRSs are mutant FRT sequences. In certain embodiments, the two matching RRSs are different sequences but can be recognized by the same recombinase. In certain embodiments, the first matching RRS is a Bxb1 attP sequence and the second matching RRS is a Bxb1 attB sequence. In certain embodiments, the first matching RRS is a φC31 attP sequence and the second matching RRS is a φC31 attB sequence.

Two-Plasmid RMCE

A “two-plasmid RMCE” strategy or “double RMCE” is employed in the method according to the current invention when using a two-vector combination. For example, but not by way of limitation, an integrated landing site could comprise three RRSs, e.g., an arrangement where the third RRS (“RRS3”) is present between the first RRS (“RRS1”) and the second RRS (“RRS2”), while a first vector comprises two RRSs matching the first and the third RRS on the integrated exogenous nucleotide sequence, and a second vector comprises two RRSs matching the third and the second RRS on the integrated exogenous nucleotide sequence. The two-plasmid RMCE strategy involves using three RRS sites to carry out two independent RMCEs simultaneously. Therefore, a landing site in the mammalian TI host cell using the two-plasmid RMCE strategy includes a third RRS site (RRS3) that has no cross activity with either the first RRS site (RRS1) or the second RRS site (RRS2). The two plasmids to be targeted require the same flanking RRS sites for efficient targeting, one plasmid (front) flanked by RRS1 and RRS3 and the other (back) by RRS3 and RRS2. In addition, two selection markers are needed in the two-plasmid RMCE. One selection marker expression cassette is split into two parts. The front plasmid contains the promoter followed by a start codon and the RRS3 sequence. The back plasmid has the RRS3 sequence fused to the N-terminus of the selection marker coding region, minus the start codon (ATG). Additional nucleotides may need to be inserted between the RRS3 site and the selection marker sequence to ensure in-frame translation of the fusion protein, i.e. operable linkage. Only when both plasmids are correctly inserted is the full expression cassette of the selection marker assembled, thus rendering the cells resistant to the respective selection agent.
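The logic of the split selection marker can be illustrated with the following simplified Python model, provided as a sketch under stated assumptions rather than as a description of any actual vector. Genetic elements are represented as labelled list entries; the element names (e.g., "landing_marker_1", "front_cargo") and the function names are hypothetical, and the padding helper assumes, for illustration, that RRS3 is a 34 bp site.

```python
# Simplified, hypothetical model of two-plasmid RMCE; all element names are illustrative.
LANDING_SITE = ["RRS1", "landing_marker_1", "RRS3", "landing_marker_2", "RRS2"]
FRONT = ["RRS1", "front_cargo", "promoter", "ATG", "RRS3"]
BACK = ["RRS3", "marker_cds_no_ATG", "back_cargo", "RRS2"]
SPLIT_MARKER = ["promoter", "ATG", "RRS3", "marker_cds_no_ATG"]


def exchange(locus: list[str], donor: list[str]) -> list[str]:
    """Swap the locus segment between the donor's two flanking RRSs for the donor cargo."""
    first, *cargo, last = donor
    i, j = locus.index(first), locus.index(last)
    if i >= j:
        raise ValueError("flanking RRSs are not in the expected order in the locus")
    return locus[: i + 1] + cargo + locus[j:]


def marker_assembled(locus: list[str]) -> bool:
    """True only if promoter, ATG, RRS3 and the marker coding region are contiguous."""
    n = len(SPLIT_MARKER)
    return any(locus[k : k + n] == SPLIT_MARKER for k in range(len(locus) - n + 1))


def padding_for_frame(linker_len: int) -> int:
    """Nucleotides to add after a linker so the downstream codons stay in frame."""
    return (-linker_len) % 3


front_only = exchange(LANDING_SITE, FRONT)
back_only = exchange(LANDING_SITE, BACK)
both = exchange(front_only, BACK)
```

In this model, `marker_assembled` is false after either single exchange and true only after both, mirroring the selection principle described above. For an assumed 34 bp RRS3 placed directly after the start codon, `padding_for_frame(34)` indicates that two additional nucleotides would restore the reading frame.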

Two-plasmid RMCE involves double recombination cross-over events, catalyzed by a recombinase, between the two heterospecific RRSs within the target genomic locus and the donor DNA molecule. Two-plasmid RMCE is designed to introduce a copy of the DNA sequences from the front- and back-vector in combination into the pre-determined locus of a mammalian TI host cell's genome. RMCE can be implemented such that prokaryotic vector sequences are not introduced into the mammalian TI host cell's genome, thus, reducing and/or preventing unwanted triggering of host immune or defense mechanisms. The RMCE procedure can be repeated with multiple DNA sequences.

In certain embodiments, targeted integration is achieved by two RMCEs, wherein two different DNA sequences, each comprising at least one expression cassette encoding a part of a heteromultimeric polypeptide and/or at least one selection marker or part thereof flanked by two heterospecific RRSs, are both integrated into a pre-determined site of the genome of a mammalian TI host cell comprising matching RRSs. In certain embodiments, targeted integration is achieved by multiple RMCEs, wherein DNA sequences from multiple vectors, each comprising at least one expression cassette encoding a part of a heteromultimeric polypeptide and/or at least one selection marker or part thereof flanked by two heterospecific RRSs, are all integrated into a predetermined site of the genome of a mammalian TI host cell. In certain embodiments the selection marker can be partially encoded on the first vector and partially encoded on the second vector such that only the correct integration of both by double RMCE allows for the expression of the selection marker.

In certain embodiments, targeted integration via recombinase-mediated recombination leads to selection marker and/or the different expression cassettes for the multimeric polypeptide integrated into one or more pre-determined integration sites of a host cell genome free of sequences from a prokaryotic vector.

It should be noted that, in one embodiment, the knockout can be performed either before introduction of the exogenous nucleic acid encoding the heterologous polypeptide or thereafter.

DETAILED DESCRIPTION OF THE INVENTION

XBP1 exon 4 comprises a 26 nucleotide fragment which is excised by IRE1α in vivo to introduce a +2 out of frame event and produce XBP1s. The present inventors have determined that skipping of exon 4 also introduces a +2 out of frame event and produces a functional protein. Skipping of exon 4 can be accomplished using antisense oligonucleotides of the invention. By skipping exon 4 in accordance with the invention, a much larger nucleotide fragment, of 146 bp, is removed from the pre-mRNA as compared to the 26 nucleotide fragment excised by IRE1α. Thus, XBP1Δ4 according to the invention is not equal to in vivo spliced XBP1.
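The reading-frame arithmetic behind the preceding paragraph can be sketched as follows. This is a minimal illustration whose function name is hypothetical; it only shows that removing 26 nucleotides and removing 146 nucleotides leave the same remainder on division by three, consistent with both events producing the same +2 out of frame event.

```python
def frame_offset(removed_nt: int) -> int:
    """Reading-frame offset (0, 1 or 2) left after removing `removed_nt` nucleotides."""
    if removed_nt < 0:
        raise ValueError("number of removed nucleotides must be non-negative")
    return removed_nt % 3


ire1_fragment = 26   # nucleotides excised by IRE1α, as stated in the text
exon4_skip = 146     # nucleotides removed by skipping exon 4, as stated in the text

same_frame = frame_offset(ire1_fragment) == frame_offset(exon4_skip)
```

Since 26 = 8 × 3 + 2 and 146 = 48 × 3 + 2, both removals shift the downstream reading frame by the same amount, although the resulting transcripts differ in the sequence that is removed.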

The present inventors have also identified that the generation or expression of the XBP1Δ4 variant in mammalian cells results in an enhanced recombinant expression of heterologously expressed proteins, such as monoclonal antibodies, particularly of heterologously expressed proteins which are otherwise difficult to express. This indicates that the generation or expression of the XBP1Δ4 variant results in an enhanced quality of protein expression in mammalian cells.

The present invention discloses and utilizes specific antisense oligonucleotides, which are complementary, such as fully complementary, to a portion of the XBP1 pre-mRNA transcript. The antisense oligonucleotides of the invention are capable of reducing the inclusion (enhancing the excision) of XBP1 exon 4 in XBP1 transcripts. The antisense oligonucleotides of the invention thereby result in the expression of, or enhanced expression of, an XBP1Δ4 variant.

The inventors have identified that the generation or expression of the XBP1Δ4 variant in mammalian cells results in enhanced protein expression. The antisense oligonucleotides of the invention may therefore be used to enhance the yield or the quality of proteins produced from heterologous protein expression systems, for example in the manufacture of antibodies, such as monoclonal antibodies.

The antisense oligonucleotides of the invention also have therapeutic utilities in the treatment and prevention of proteopathological disease.

Antisense Oligonucleotides

In one aspect, the present invention relates to an antisense oligonucleotide for use in the expression of a XBP1 splice variant in a cell which expresses XBP1, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of 8-40 nucleotides in length which is complementary to a mammalian XBP1 pre-mRNA transcript.

In certain embodiments of the present invention, the XBP1 splice variant has a +2 out of frame event.

In certain embodiments, the XBP1 splice variant is XBP1Δ4.

The invention provides an antisense oligonucleotide, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of at least 12 nucleotides in length which is complementary, such as fully complementary, to a mammalian XBP1 pre-mRNA transcript.

The invention provides an antisense oligonucleotide, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of 12-16 nucleotides in length which is complementary, such as fully complementary, to a mammalian XBP1 pre-mRNA transcript.

The invention provides an antisense oligonucleotide, wherein the antisense oligonucleotide is 12-16 nucleotides in length and comprises a contiguous nucleotide sequence of 12-16 nucleotides in length which is complementary, such as fully complementary, to a mammalian XBP1 pre-mRNA transcript.

The invention provides an antisense oligonucleotide, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of 12-18 nucleotides in length which is complementary, such as fully complementary, to a mammalian XBP1 pre-mRNA transcript.

The antisense oligonucleotide may be 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides in length.

In some embodiments, the antisense oligonucleotide is 8-40, 12-40, 12-20, 10-20, 14-18, 12-18 or 16-18 nucleotides in length.

The contiguous nucleotide sequence may be 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides in length. In some embodiments, the contiguous nucleotide sequence is at least 12 nucleotides in length, such as 12-16 or 12-18 nucleotides in length.

In some embodiments, the contiguous nucleotide sequence is the same length as the antisense oligonucleotide.

In some embodiments, the antisense oligonucleotide consists of the contiguous nucleotide sequence.

In some embodiments, the antisense oligonucleotide is the contiguous nucleotide sequence.

In some embodiments, the antisense oligonucleotide comprises a contiguous sequence of 8 to 40 nucleotides in length, which is at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or more complementary with a region of the target nucleic acid or a target sequence. Put another way, in some embodiments, an antisense oligonucleotide of the invention may include one, two, three or more mis-matches, wherein a mis-match is a nucleotide within the antisense oligonucleotide of the invention which does not base pair with its target.

It is advantageous if the oligonucleotide, or contiguous nucleotide sequence thereof, is fully complementary (100% complementary) to a region of the target sequence.
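As a hedged illustration of how a percentage complementarity and a mismatch count relate, the following Python sketch compares an oligonucleotide (written 5′ to 3′) against an equally long target region (also written 5′ to 3′), pairing them in antiparallel orientation. It is a simplification: only Watson-Crick pairs are counted, U is treated as T, and the sequences shown are invented toy sequences, not sequences from the Sequence Listing. The function names are hypothetical.

```python
# Illustrative sketch only; sequences and names are hypothetical.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat RNA U as T."""
    return seq.upper().replace("U", "T")


def count_mismatches(oligo: str, target_region: str) -> int:
    """Count oligo positions that do not Watson-Crick pair with the target region."""
    oligo, target = _normalise(oligo), _normalise(target_region)
    if len(oligo) != len(target):
        raise ValueError("oligo and target region must have the same length")
    # Antiparallel pairing: the 5' end of the oligo faces the 3' end of the target.
    return sum(_PAIR.get(o) != t for o, t in zip(oligo, reversed(target)))


def percent_complementarity(oligo: str, target_region: str) -> float:
    """Percentage of oligo positions that pair with the target region."""
    return 100.0 * (len(oligo) - count_mismatches(oligo, target_region)) / len(oligo)


toy_target = "GGAUCCAUGCUAGCAA"          # hypothetical 16 nt target region
fully_matched = "TTGCTAGCATGGATCC"       # reverse complement of the toy target
one_mismatch = "TTGCTAGCATGGATCA"        # same oligo with its last base changed
```

With these toy sequences, a 16-mer carrying a single mismatch is 93.75% complementary, which illustrates that a given percentage threshold permits a different number of mismatches depending on oligonucleotide length.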

In some embodiments, the antisense oligonucleotide is isolated, purified, or manufactured.

In some embodiments, the antisense oligonucleotide comprises one or more modified nucleotides or one or more modified nucleosides.

In some embodiments, the antisense oligonucleotide is a morpholino modified antisense oligonucleotide.

In some embodiments, the antisense oligonucleotide comprises one or more modified nucleosides, such as one or more modified nucleotides independently selected from the group consisting of 2′-O-alkyl-RNA; 2′-O-methyl RNA (2′-OMe); 2′-alkoxy-RNA; 2′-O-methoxyethyl-RNA (2′-MOE); 2′-amino-DNA; 2′-fluoro-RNA; 2′-fluoro-DNA; arabino nucleic acid (ANA); 2′-fluoro-ANA; bicyclic nucleoside analog (LNA); or any combination thereof.

In some embodiments, one or more of the modified nucleosides is a sugar modified nucleoside.

In some embodiments, one or more of the modified nucleosides comprises a bicyclic sugar.

In some embodiments, one or more of the modified nucleosides is an affinity enhancing 2′ sugar modified nucleoside.

In some embodiments, one or more of the modified nucleosides is an LNA nucleoside.

In some embodiments, the antisense oligonucleotide, or contiguous nucleotide sequence thereof, comprises one or more 5-methyl-cytosine nucleobases.

In some embodiments, one or more of the internucleoside linkages within the contiguous nucleotide sequence of the antisense oligonucleotide is modified.

In some embodiments, the one or more modified internucleoside linkages comprises a phosphorothioate linkage.

In some embodiments, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or about 100% of the internucleoside linkages of the antisense oligonucleotide or contiguous nucleotide sequence thereof are modified.

In some embodiments, the antisense oligonucleotides of the invention are in solid powdered form, such as in the form of a lyophilized powder.

Additional disclosures regarding the above antisense oligonucleotides are provided throughout the present disclosure.

The Target

As described herein, the antisense oligonucleotides of the invention target the XBP1 mRNA sequence in order to cause expression of an XBP1 splice variant, such as a XBP1Δ4 variant.

As used herein, the term “XBP1Δ4” refers to a XBP1 transcript which lacks exon 4 (a XBP1Δ4 variant), or a XBP1 protein which lacks the amino acids encoded by XBP1 exon 4. A key feature of the XBP1Δ4 variant is that the deletion of exon 4 and the introduction of a +2 frame shift in the XBP1 coding sequence has occurred, which results in the expression of a XBP1Δ4 variant with a C-terminal region which is homologous to the C-terminal region of the XBP1s variant of XBP1 (induced by IRE1).

In certain embodiments, a XBP1Δ4 protein lacks all or essentially all of the peptide sequence encoded by XBP1 exon 4.

The term “target”, as used herein, is used to refer to the transcript of the gene that the antisense oligonucleotides of the present invention specifically hybridize/bind to (i.e., “XBP1”).

XBP1 is also known as X-box binding protein 1, TREB-5, TREB5, XBP-1, and XBP2.

The target for oligonucleotides of the present invention is an XBP1 pre-mRNA transcript. The XBP1 pre-mRNA transcript is preferably a mammalian XBP1 pre-mRNA transcript.

In some embodiments, the mammalian XBP1 pre-mRNA transcript is a hamster XBP1 pre-mRNA transcript.

The hamster XBP1 pre-mRNA sequence is recited in SEQ ID NO 1.

In certain embodiments, the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1).

In certain embodiments, the contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides from nucleotides 2960-3113 of SEQ ID NO 1.

In other embodiments, the contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides from nucleotides 2986-3018 of SEQ ID NO 1.

In some embodiments the contiguous nucleotide sequence is complementary to at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16 or at least 17 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1).
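One way to test whether an oligonucleotide is complementary to at least a given number of contiguous nucleotides within a defined coordinate range of a transcript is sketched below in Python. This is an illustration under stated assumptions: coordinates are taken to be 1-based and inclusive, only Watson-Crick pairing is considered, U is treated as T, and the transcript is an invented toy sequence rather than SEQ ID NO 1. All function names are hypothetical.

```python
# Illustrative sketch only; toy sequences and helper names are hypothetical.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat RNA U as T."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Sense-strand sequence that the given oligo would base pair with."""
    return "".join(_COMPLEMENT[b] for b in reversed(_normalise(seq)))


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch common to both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def targets_region(oligo: str, transcript: str, start: int, end: int,
                   min_run: int = 10) -> bool:
    """True if the oligo pairs with >= min_run contiguous nucleotides in [start, end]."""
    region = _normalise(transcript)[start - 1:end]  # 1-based, inclusive coordinates
    return longest_shared_run(reverse_complement(oligo), region) >= min_run


toy_transcript = "A" * 20 + "GGAUCCAUGCUAGCAA" + "C" * 20  # hypothetical pre-mRNA
toy_oligo = "TTGCTAGCATGGATCC"                             # pairs with positions 21-36
```

In this toy example, the oligonucleotide satisfies the criterion for the range 21-36 but not for the range 1-20 or for the nine-nucleotide range 21-29, where fewer than 10 contiguous complementary nucleotides are available.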

In other embodiments the contiguous nucleotide sequence may be complementary to a nucleotide sequence selected from the group consisting of SEQ ID NO 299, SEQ ID NO 301, SEQ ID NO 302, SEQ ID NO 304, SEQ ID NO 305, SEQ ID NO 306, SEQ ID NO 307, SEQ ID NO 308, SEQ ID NO 309, SEQ ID NO 310, SEQ ID NO 314, SEQ ID NO 316, SEQ ID NO 317, SEQ ID NO 318, SEQ ID NO 319, SEQ ID NO 323, SEQ ID NO 325, SEQ ID NO 327, SEQ ID NO 328, SEQ ID NO 330, SEQ ID NO 331, SEQ ID NO 332, SEQ ID NO 333, SEQ ID NO 334, SEQ ID NO 336, SEQ ID NO 337, SEQ ID NO 385, SEQ ID NO 386, SEQ ID NO 387, SEQ ID NO 388, SEQ ID NO 390, SEQ ID NO 391, SEQ ID NO 392, SEQ ID NO 393, SEQ ID NO 394, SEQ ID NO 395, SEQ ID NO 396, SEQ ID NO 397, SEQ ID NO 398, SEQ ID NO 399, SEQ ID NO 401, SEQ ID NO 402, SEQ ID NO 419, SEQ ID NO 431, SEQ ID NO 432, SEQ ID NO 433, SEQ ID NO 434, SEQ ID NO 438, SEQ ID NO 439, SEQ ID NO 440, SEQ ID NO 441, SEQ ID NO 442, SEQ ID NO 449, SEQ ID NO 484, SEQ ID NO 485, SEQ ID NO 486, SEQ ID NO 487, SEQ ID NO 488, SEQ ID NO 489, SEQ ID NO 490, SEQ ID NO 491, SEQ ID NO 492, SEQ ID NO 493, SEQ ID NO 494, SEQ ID NO 495, SEQ ID NO 496, SEQ ID NO 497, SEQ ID NO 498, SEQ ID NO 499, SEQ ID NO 500, SEQ ID NO 501, SEQ ID NO 502, SEQ ID NO 503, SEQ ID NO 505, SEQ ID NO 506, SEQ ID NO 507, SEQ ID NO 508, SEQ ID NO 509, SEQ ID NO 510, SEQ ID NO 511, SEQ ID NO 512, SEQ ID NO 513, SEQ ID NO 515, SEQ ID NO 517, SEQ ID NO 520, SEQ ID NO 572, SEQ ID NO 573, SEQ ID NO 576, SEQ ID NO 577, SEQ ID NO 588 and SEQ ID NO 589.

In other embodiments the contiguous nucleotide sequence may be complementary to a nucleotide sequence selected from the group consisting of SEQ ID NO 305, SEQ ID NO 307, SEQ ID NO 314, SEQ ID NO 315, SEQ ID NO 316, SEQ ID NO 317, SEQ ID NO 319, SEQ ID NO 331, SEQ ID NO 332, SEQ ID NO 392, SEQ ID NO 394, SEQ ID NO 395, SEQ ID NO 440, SEQ ID NO 492, SEQ ID NO 497, SEQ ID NO 498, SEQ ID NO 499, SEQ ID NO 500, SEQ ID NO 501, SEQ ID NO 502, SEQ ID NO 513 and SEQ ID NO 576.

In other embodiments the contiguous nucleotide sequence may be complementary to SEQ ID NO 314 or SEQ ID NO 315.

In some embodiments the mammalian XBP1 pre-mRNA transcript is a mouse XBP1 pre-mRNA transcript.

The mouse XBP1 pre-mRNA is recited in SEQ ID NO 590.

In certain embodiments the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides of the mouse XBP1 pre-mRNA transcript (SEQ ID NO 590).

In certain embodiments the contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides from nucleotides 3560-3783 of SEQ ID NO 590.

In other embodiments the contiguous nucleotide sequence may be complementary to a nucleotide sequence selected from the group consisting of SEQ ID NO 699, SEQ ID NO 700, SEQ ID NO 703, SEQ ID NO 710, SEQ ID NO 713, SEQ ID NO 724, SEQ ID NO 729, SEQ ID NO 739, SEQ ID NO 743, SEQ ID NO 744, SEQ ID NO 745, SEQ ID NO 749, SEQ ID NO 750, SEQ ID NO 751, SEQ ID NO 752, SEQ ID NO 753, SEQ ID NO 754, SEQ ID NO 755, SEQ ID NO 756, SEQ ID NO 757, SEQ ID NO 758, SEQ ID NO 759, SEQ ID NO 760, SEQ ID NO 761, SEQ ID NO 762, SEQ ID NO 763, SEQ ID NO 773, SEQ ID NO 776, SEQ ID NO 778, SEQ ID NO 781, SEQ ID NO 783, SEQ ID NO 784, SEQ ID NO 785, SEQ ID NO 787, SEQ ID NO 789, SEQ ID NO 790, SEQ ID NO 791, SEQ ID NO 792, SEQ ID NO 793, SEQ ID NO 794, SEQ ID NO 795, SEQ ID NO 796, SEQ ID NO 797, SEQ ID NO 798, SEQ ID NO 799 and SEQ ID NO 800.

In other embodiments the contiguous nucleotide sequence may be complementary to a nucleotide sequence selected from the group consisting of SEQ ID NO 710, SEQ ID NO 754, SEQ ID NO 756, SEQ ID NO 757, SEQ ID NO 758, SEQ ID NO 759, SEQ ID NO 760, SEQ ID NO 791, SEQ ID NO 792, SEQ ID NO 794, SEQ ID NO 795 and SEQ ID NO 797.

In some embodiments, the mammalian XBP1 pre-mRNA transcript is a human XBP1 pre-mRNA transcript.

The human XBP1 pre-mRNA is recited in SEQ ID NO 801.

In certain embodiments the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides of the human XBP1 pre-mRNA transcript (SEQ ID NO 801).

In certain embodiments, the contiguous nucleotide sequence may be complementary to at least 10 contiguous nucleotides from nucleotides 4338-4563 of SEQ ID NO 801.

In some embodiments the contiguous nucleotide sequence is complementary to at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, or at least 17 contiguous nucleotides of the human XBP1 pre-mRNA transcript (SEQ ID NO 801).

In other embodiments, the contiguous nucleotide sequence may be complementary to a nucleotide sequence selected from the group consisting of SEQ ID NO 947, SEQ ID NO 948, SEQ ID NO 949, SEQ ID NO 950, SEQ ID NO 951 and SEQ ID NO 988.

In other embodiments, the contiguous nucleotide sequence may be complementary to SEQ ID NO 951.

Antisense Oligonucleotide Sequence

The contiguous nucleotide sequence may be complementary to a portion of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1).

In certain embodiments, the contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 8, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 32, SEQ ID NO 34, SEQ ID NO 36, SEQ ID NO 37, SEQ ID NO 39, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 94, SEQ ID NO 95, SEQ ID NO 96, SEQ ID NO 97, SEQ ID NO 99, SEQ ID NO 100, SEQ ID NO 101, SEQ ID NO 102, SEQ ID NO 103, SEQ ID NO 104, SEQ ID NO 105, SEQ ID NO 106, SEQ ID NO 107, SEQ ID NO 108, SEQ ID NO 110, SEQ ID NO 111, SEQ ID NO 128, SEQ ID NO 140, SEQ ID NO 141, SEQ ID NO 142, SEQ ID NO 143, SEQ ID NO 147, SEQ ID NO 148, SEQ ID NO 149, SEQ ID NO 150, SEQ ID NO 151, SEQ ID NO 158, SEQ ID NO 193, SEQ ID NO 194, SEQ ID NO 195, SEQ ID NO 196, SEQ ID NO 197, SEQ ID NO 198, SEQ ID NO 199, SEQ ID NO 200, SEQ ID NO 201, SEQ ID NO 202, SEQ ID NO 203, SEQ ID NO 204, SEQ ID NO 205, SEQ ID NO 206, SEQ ID NO 207, SEQ ID NO 208, SEQ ID NO 209, SEQ ID NO 210, SEQ ID NO 211, SEQ ID NO 212, SEQ ID NO 214, SEQ ID NO 215, SEQ ID NO 216, SEQ ID NO 217, SEQ ID NO 218, SEQ ID NO 219, SEQ ID NO 220, SEQ ID NO 221, SEQ ID NO 222, SEQ ID NO 224, SEQ ID NO 226, SEQ ID NO 229, SEQ ID NO 281, SEQ ID NO 282, SEQ ID NO 285, SEQ ID NO 286, SEQ ID NO 297 and SEQ ID NO 298.

In certain embodiments, the contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 14, SEQ ID NO 16, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 28, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 101, SEQ ID NO 103, SEQ ID NO 104, SEQ ID NO 149, SEQ ID NO 201, SEQ ID NO 206, SEQ ID NO 207, SEQ ID NO 208, SEQ ID NO 209, SEQ ID NO 210, SEQ ID NO 211, SEQ ID NO 222 and SEQ ID NO 285.

In certain embodiments, the contiguous nucleotide sequence may be SEQ ID NO 23 or SEQ ID NO 24.

The contiguous nucleotide sequence may be complementary to a portion of the mouse XBP1 pre-mRNA transcript (SEQ ID NO 590).

In certain embodiments, the contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 597, SEQ ID NO 598, SEQ ID NO 601, SEQ ID NO 608, SEQ ID NO 611, SEQ ID NO 622, SEQ ID NO 627, SEQ ID NO 637, SEQ ID NO 641, SEQ ID NO 642, SEQ ID NO 643, SEQ ID NO 647, SEQ ID NO 648, SEQ ID NO 649, SEQ ID NO 650, SEQ ID NO 651, SEQ ID NO 652, SEQ ID NO 653, SEQ ID NO 654, SEQ ID NO 655, SEQ ID NO 656, SEQ ID NO 657, SEQ ID NO 658, SEQ ID NO 659, SEQ ID NO 660, SEQ ID NO 661, SEQ ID NO 671, SEQ ID NO 674, SEQ ID NO 676, SEQ ID NO 679, SEQ ID NO 681, SEQ ID NO 682, SEQ ID NO 683, SEQ ID NO 685, SEQ ID NO 687, SEQ ID NO 688, SEQ ID NO 689, SEQ ID NO 690, SEQ ID NO 691, SEQ ID NO 692, SEQ ID NO 693, SEQ ID NO 694, SEQ ID NO 695, SEQ ID NO 696, SEQ ID NO 697 and SEQ ID NO 698.

In certain embodiments, the contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 608, SEQ ID NO 652, SEQ ID NO 654, SEQ ID NO 655, SEQ ID NO 656, SEQ ID NO 657, SEQ ID NO 658, SEQ ID NO 689, SEQ ID NO 690, SEQ ID NO 692, SEQ ID NO 693 and SEQ ID NO 695.

The contiguous nucleotide sequence may be complementary to a portion of the human XBP1 pre-mRNA transcript (SEQ ID NO 801).

In certain embodiments, the contiguous nucleotide sequence may be selected from the group consisting of SEQ ID NO 854, SEQ ID NO 855, SEQ ID NO 856, SEQ ID NO 857, SEQ ID NO 858 and SEQ ID NO 895.

In certain embodiments, the contiguous nucleotide sequence may be SEQ ID NO 858.

In some embodiments, the contiguous nucleotide sequence is the same length as the antisense oligonucleotide.

In some embodiments, the antisense oligonucleotide consists of the contiguous nucleotide sequence.

In some embodiments, the antisense oligonucleotide is the contiguous nucleotide sequence.

The invention also contemplates fragments of the contiguous nucleotide sequence, including fragments of at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16 or at least 17 contiguous nucleotides thereof.

Antisense Oligonucleotide Activity

In some embodiments, the antisense oligonucleotides of the present invention modulate the splicing of a mammalian XBP1 pre-mRNA transcript, such as that described herein. In some embodiments, modulating the splicing of a mammalian XBP1 pre-mRNA transcript can regulate the expression and/or activity of certain XBP1 variants.

Without wishing to be bound by theory, splice modulating oligonucleotides typically operate via an occupation-based mechanism rather than via a degradation mechanism (such as RNaseH or RISC mediated inhibition).

In some embodiments, the antisense oligonucleotides of the invention are capable of reducing or inhibiting the expression (e.g., number) of a XBP1 mRNA transcript comprising exon 4 in a cell. Herein a XBP1 mRNA transcript comprising exon 4 is referred to as XBP1-E4.

The term “reducing” or “inhibiting” the expression of a transcript as used herein is to be understood as an overall term for an antisense oligonucleotide's ability to inhibit or reduce the amount or the activity of XBP1-E4 protein in a target cell (e.g., by reducing or inhibiting the expression of XBP1-E4 mRNA and thereby reducing the expression of a XBP1-E4 protein).

Inhibition of activity can be determined by measuring the level (e.g., number) of XBP1-E4 mRNA, or by measuring the level (e.g., number) or activity of XBP1-E4 protein in a cell. Inhibition of expression can therefore be determined in vitro or in vivo. It will be understood that splice modulation can result in an inhibition of expression (e.g., number) of XBP1-E4 transcript (e.g., mRNA), or the protein encoded thereby, in the cell. In certain embodiments, the expression (e.g., number) of XBP1-E4 transcript (e.g., mRNA) is reduced by at least about 1%, at least about 2%, at least about 3%, at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50% or more compared to a corresponding cell that is not exposed to the antisense oligonucleotide.

As used herein, the term “corresponding cell that is not exposed to the antisense oligonucleotide” can refer to the same cell prior to the treatment with an antisense oligonucleotide of the invention, or to the same cell type (but not the same cell).

Accordingly, in some embodiments treating a cell with an antisense oligonucleotide of the present invention reduces (e.g., by at least about 10% or by at least about 20%) the expression of XBP1-E4 transcript (e.g., mRNA) in the cell compared to the expression of XBP1-E4 transcript (e.g., mRNA) in the same cell prior to the antisense oligonucleotide treatment.

In other embodiments treating a cell with an antisense oligonucleotide of the present invention reduces (e.g., by at least about 10% or by at least about 20%) the expression of XBP1-E4 transcript (e.g., mRNA) in the cell compared to the expression of XBP1-E4 transcript (e.g., mRNA) in the same cell type which has not undergone antisense oligonucleotide treatment.
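By way of illustration only, the percentage reduction referred to above can be computed from a transcript level measured in treated cells and the level measured in a corresponding cell that is not exposed to the antisense oligonucleotide. The following is a minimal sketch in Python; the function names and the example values are hypothetical and do not form part of the disclosed methods.

```python
def percent_reduction(control_level: float, treated_level: float) -> float:
    """Return the reduction of a transcript level, in percent of the control level.

    control_level: level (e.g., copy number) in the corresponding untreated cell.
    treated_level: level in the cell treated with the antisense oligonucleotide.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100.0 * (control_level - treated_level) / control_level


def meets_reduction_threshold(control_level: float,
                              treated_level: float,
                              threshold_percent: float) -> bool:
    """True if the observed reduction is at least threshold_percent."""
    return percent_reduction(control_level, treated_level) >= threshold_percent


# Hypothetical example: 200 copies in control cells, 150 copies after treatment.
reduction = percent_reduction(200.0, 150.0)
print(f"reduction: {reduction:.1f}%")
print("at least 20%:", meets_reduction_threshold(200.0, 150.0, 20.0))
```

The same calculation, with the sign reversed, applies to the percentage increase of a transcript whose expression is enhanced.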

In some embodiments, the antisense oligonucleotides of the invention are capable of increasing or enhancing the expression (e.g., number) of a XBP1 mRNA transcript lacking exon 4 in a cell. Herein a XBP1 mRNA transcript lacking exon 4 is referred to as XBP1Δ4.

The term “increasing” the expression of a transcript as used herein is to be understood as an overall term for an antisense oligonucleotide's ability to increase or enhance the amount or the activity of XBP1Δ4 protein in a target cell (e.g., by increasing the expression of XBP1Δ4 mRNA and thereby increasing the expression of a XBP1Δ4 protein).

Increases in activity can be determined by measuring the level (e.g., number) of XBP1Δ4 mRNA, or by measuring the level (e.g., number) or activity of XBP1Δ4 protein in a cell. Increases in expression can therefore be determined in vitro or in vivo. It will be understood that splice modulation can result in an increase in expression (e.g., number) of XBP1Δ4 transcript (e.g., mRNA), or the protein encoded thereby, in the cell. In certain embodiments, the expression (e.g., number) of XBP1Δ4 transcript (e.g., mRNA) is increased or enhanced by at least about 1%, at least about 2%, at least about 3%, at least about 5%, at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50% or more compared to a corresponding cell that is not exposed to the antisense oligonucleotide. It is preferred that the expression (e.g., number) of XBP1Δ4 transcript (e.g., mRNA) is increased or enhanced by at least about 1% or at least about 5% compared to a corresponding cell that is not exposed to the antisense oligonucleotide.

Accordingly, in some embodiments treating a cell with an antisense oligonucleotide of the present invention increases or enhances (e.g., by at least about 10% or by at least about 20%) the expression of XBP1Δ4 transcript (e.g., mRNA) in the cell compared to the expression of XBP1Δ4 transcript (e.g., mRNA) in the same cell prior to the antisense oligonucleotide treatment.

In other embodiments treating a cell with an antisense oligonucleotide of the present invention increases or enhances (e.g., by at least about 10% or by at least about 20%) the expression of XBP1Δ4 transcript (e.g., mRNA) in the cell compared to the expression of XBP1Δ4 transcript (e.g., mRNA) in the same cell type which has not undergone antisense oligonucleotide treatment.

In some embodiments, the antisense oligonucleotides of the invention can change the ratio of alternative XBP1 splice variants expressed in a cell. For instance, increased or enhanced expression of XBP1Δ4 will result in an increase in the ratio of expression of XBP1Δ4/XBP1-E4 transcripts.

Accordingly, in some embodiments, the antisense oligonucleotides disclosed herein can increase the ratio of expression of XBP1Δ4/XBP1-E4 mRNA transcripts compared to a corresponding ratio of a cell that is not exposed to the antisense oligonucleotides of the present invention. In certain embodiments, the ratio of the expression of XBP1Δ4 mRNA transcript to the expression of XBP1-E4 mRNA transcript is increased by at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 25-fold, at least about 30-fold, at least about 35-fold, at least about 40-fold, at least about 50-fold or more compared to a corresponding ratio of a cell that is not exposed to the antisense oligonucleotides of the present invention.

In some embodiments, the antisense oligonucleotides disclosed herein can increase the ratio of expression of XBP1Δ4/XBP1-E4 protein compared to a corresponding ratio of a cell that is not exposed to the antisense oligonucleotides of the present invention. In certain embodiments, the ratio of the expression of XBP1Δ4 protein to the expression of XBP1-E4 protein is increased by at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 25-fold or more compared to a corresponding ratio of a cell that is not exposed to the antisense oligonucleotides of the present invention.

In some embodiments, the antisense oligonucleotides of the invention are capable of both i) increasing the amount of XBP1Δ4 mRNA or XBP1Δ4 protein in the target cell and ii) decreasing the amount of XBP1-E4 mRNA and XBP1-E4 protein in a target cell.

The change in ratio of different transcript products (e.g., XBP1-E4 vs. XBP1Δ4) can be measured by comparing mRNA levels, or levels of the corresponding protein products. Anti-XBP1 antibodies which can be used for assaying the protein levels of XBP1-E4 and XBP1Δ4 include monoclonal or polyclonal antibodies raised against XBP1.
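By way of illustration only, the fold change in the XBP1Δ4/XBP1-E4 ratio can be derived from four measured levels: XBP1Δ4 and XBP1-E4 in the treated cell, and the same two species in the corresponding cell that is not exposed to the antisense oligonucleotide. The following Python sketch assumes that the levels are expressed in the same arbitrary units for both species; the function names and example values are hypothetical.

```python
def splice_ratio(delta4_level: float, e4_level: float) -> float:
    """Ratio of XBP1-delta-4 to XBP1-E4 (mRNA or protein) in one sample."""
    if e4_level <= 0:
        raise ValueError("e4_level must be positive")
    return delta4_level / e4_level


def ratio_fold_change(treated_delta4: float, treated_e4: float,
                      control_delta4: float, control_e4: float) -> float:
    """Fold change of the delta-4/E4 ratio in treated versus control cells."""
    control_ratio = splice_ratio(control_delta4, control_e4)
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return splice_ratio(treated_delta4, treated_e4) / control_ratio


# Hypothetical example: the ratio shifts from 10:40 in control cells
# to 40:10 in treated cells.
fold = ratio_fold_change(40.0, 10.0, 10.0, 40.0)
print(f"ratio increased {fold:.0f}-fold")
```

Because the fold change is a ratio of ratios, it is unaffected by a normalization factor that scales both species equally within a sample.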

Oligonucleotide Design

The antisense oligonucleotides of the invention can comprise a nucleotide sequence which comprises both nucleosides and nucleoside analogs, and can be in the form of a gapmer, blockmer, mixmer, headmer, tailmer, or totalmer.

In one embodiment, the antisense oligonucleotide comprises at least 1 modified nucleoside, such as at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16 or at least 17 modified nucleosides.

The term “gapmer” as used herein refers to an antisense oligonucleotide which comprises a region of RNase H recruiting oligonucleotides (gap) which is flanked 5′ and 3′ by one or more affinity enhancing modified nucleosides (flanks). The terms “headmers” and “tailmers” are oligonucleotides capable of recruiting RNase H where one of the flanks is missing, i.e., only one of the ends of the oligonucleotide comprises affinity enhancing modified nucleosides. For headmers, the 3′ flank is missing (i.e., the 5′ flank comprises affinity enhancing modified nucleosides) and for tailmers, the 5′ flank is missing (i.e., the 3′ flank comprises affinity enhancing modified nucleosides). The term “LNA gapmer” is a gapmer oligonucleotide wherein at least one of the affinity enhancing modified nucleosides is an LNA nucleoside. The term “mixed wing gapmer” refers to an LNA gapmer wherein the flank regions comprise at least one LNA nucleoside and at least one DNA nucleoside or non-LNA modified nucleoside, such as at least one 2′ substituted modified nucleoside, such as, for example, 2′-O-alkyl-RNA, 2′-O-methyl-RNA, 2′-alkoxy-RNA, 2′-O-methoxyethyl-RNA (MOE), 2′-amino-DNA, 2′-Fluoro-RNA, 2′-Fluoro-DNA, arabino nucleic acid (ANA), and 2′-Fluoro-ANA nucleoside(s).

Other “chimeric” antisense oligonucleotides, called “mixmers”, consist of an alternating composition of (i) DNA monomers or nucleoside analog monomers recognizable and cleavable by RNase, and (ii) non-RNase recruiting nucleoside analog monomers.

A “totalmer” is a single stranded ASO which only comprises non-naturally occurring nucleotides or nucleotide analogs.

High Affinity Modified Nucleosides

A high affinity modified nucleoside is a modified nucleotide which, when incorporated into the oligonucleotide, enhances the affinity of the oligonucleotide for its complementary target, for example as measured by the melting temperature (Tm). A high affinity modified nucleoside of the present invention preferably results in an increase in melting temperature of between +0.5 and +12° C., more preferably between +1.5 and +10° C. and most preferably between +3 and +8° C. per modified nucleoside. Numerous high affinity modified nucleosides are known in the art and include, for example, many 2′ substituted nucleosides as well as locked nucleic acids (LNA) (see e.g. Freier & Altmann; Nucl. Acid Res., 1997, 25, 4429-4443 and Uhlmann; Curr. Opinion in Drug Development, 2000, 3(2), 203-213).
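By way of illustration only, the per-nucleoside increase in melting temperature can be estimated by dividing the difference between the Tm of the modified duplex and the Tm of the corresponding unmodified duplex by the number of modified nucleosides. The Python sketch below makes the simplifying assumption that each modified nucleoside contributes equally; the function names and example values are hypothetical.

```python
def tm_gain_per_modification(tm_modified_c: float,
                             tm_unmodified_c: float,
                             n_modified: int) -> float:
    """Average Tm increase (degrees C) per modified nucleoside."""
    if n_modified < 1:
        raise ValueError("n_modified must be at least 1")
    return (tm_modified_c - tm_unmodified_c) / n_modified


def preference_tier(gain_c: float) -> str:
    """Map a per-nucleoside Tm gain onto the ranges stated in the text."""
    if 3.0 <= gain_c <= 8.0:
        return "most preferred"      # +3 to +8 degrees C
    if 1.5 <= gain_c <= 10.0:
        return "more preferred"      # +1.5 to +10 degrees C
    if 0.5 <= gain_c <= 12.0:
        return "preferred"           # +0.5 to +12 degrees C
    return "outside stated ranges"


# Hypothetical example: four modified nucleosides raise Tm from 50 to 62 degrees C.
gain = tm_gain_per_modification(62.0, 50.0, 4)
print(gain, preference_tier(gain))
```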

Sugar Modifications

The antisense oligonucleotides of the invention may comprise one or more nucleosides which have a modified sugar moiety, i.e. a modification of the sugar moiety when compared to the ribose sugar moiety found in DNA and RNA.

Numerous nucleosides with modification of the ribose sugar moiety have been made, primarily with the aim of improving certain properties of oligonucleotides, such as affinity and/or nuclease resistance.

Such modifications include those where the ribose ring structure is modified, e.g. by replacement with a hexose ring (HNA), or a bicyclic ring, which typically has a biradical bridge between the C2 and C4 carbons on the ribose ring (LNA), or an unlinked ribose ring which typically lacks a bond between the C2 and C3 carbons (e.g. UNA). Other sugar modified nucleosides include, for example, bicyclohexose nucleic acids (WO2011/017521) or tricyclic nucleic acids (WO2013/154798). Modified nucleosides also include nucleosides where the sugar moiety is replaced with a non-sugar moiety, for example in the case of peptide nucleic acids (PNA), or morpholino nucleic acids.

Sugar modifications also include modifications made via altering the substituent groups on the ribose ring to groups other than hydrogen, or the 2′-OH group naturally found in DNA and RNA nucleosides. Substituents may, for example be introduced at the 2′, 3′, 4′ or 5′ positions.

2′ Sugar Modified Nucleosides

A 2′ sugar modified nucleoside is a nucleoside which has a substituent other than H or —OH at the 2′ position (2′ substituted nucleoside) or comprises a 2′ linked biradical capable of forming a bridge between the 2′ carbon and a second carbon in the ribose ring, such as LNA (2′-4′ biradical bridged) nucleosides.

Indeed, much focus has been given to developing 2′ sugar substituted nucleosides, and numerous 2′ substituted nucleosides have been found to have beneficial properties when incorporated into oligonucleotides. For example, the 2′ modified sugar may provide enhanced binding affinity and/or increased nuclease resistance to the oligonucleotide. Examples of 2′ substituted modified nucleosides are 2′-O-alkyl-RNA, 2′-O-methyl-RNA, 2′-alkoxy-RNA, 2′-O-methoxyethyl-RNA (MOE), 2′-amino-DNA, 2′-Fluoro-RNA, and 2′-F-ANA nucleoside. For further examples, please see e.g. Freier & Altmann; Nucl. Acid Res., 1997, 25, 4429-4443 and Uhlmann; Curr. Opinion in Drug Development, 2000, 3(2), 203-213, and Deleavey and Damha, Chemistry and Biology 2012, 19, 937. Below in Scheme 1 are illustrations of some 2′ substituted modified nucleosides.

embedded image

In relation to the present invention, 2′ substituted sugar modified nucleosides do not include 2′ bridged nucleosides like LNA.

Locked Nucleic Acid Nucleosides (LNA Nucleoside)

A “LNA nucleoside” is a 2′-modified nucleoside which comprises a biradical linking the C2′ and C4′ of the ribose sugar ring of said nucleoside (also referred to as a “2′-4′ bridge”), which restricts or locks the conformation of the ribose ring. These nucleosides are also termed bridged nucleic acid or bicyclic nucleic acid (BNA) in the literature. The locking of the conformation of the ribose is associated with an enhanced affinity of hybridization (duplex stabilization) when the LNA is incorporated into an oligonucleotide for a complementary RNA or DNA molecule. This can be routinely determined by measuring the melting temperature of the oligonucleotide/complement duplex.

Non limiting, exemplary LNA nucleosides are disclosed in WO 99/014226, WO 00/66604, WO 98/039352, WO 2004/046160, WO 00/047599, WO 2007/134181, WO 2010/077578, WO 2010/036698, WO 2007/090071, WO 2009/006478, WO 2011/156202, WO 2008/154401, WO 2009/067647, WO 2008/150729, Morita et al., Bioorganic & Med. Chem. Lett. 12, 73-76, Seth et al. J. Org. Chem. 2010, Vol 75(5) pp. 1569-81, and Mitsuoka et al., Nucleic Acids Research 2009, 37(4), 1225-1238, and Wan and Seth, J. Medical Chemistry 2016, 59, 9645-9667.

Further non limiting, exemplary LNA nucleosides are disclosed in Scheme 2.

embedded image

Particular LNA nucleosides are beta-D-oxy-LNA, 6′-methyl-beta-D-oxy LNA such as (S)-6′-methyl-beta-D-oxy-LNA (ScET) and ENA.

A particularly advantageous LNA is beta-D-oxy-LNA.

Morpholino Oligonucleotides

In some embodiments, the antisense oligonucleotide of the invention comprises or consists of Morpholino nucleosides (i.e. is a Morpholino oligomer, such as a phosphorodiamidate Morpholino oligomer (PMO)). Splice modulating morpholino oligonucleotides have been approved for clinical use; see for example eteplirsen, a 30 nt morpholino oligonucleotide targeting a frame shift mutation in DMD, used to treat Duchenne muscular dystrophy. Morpholino oligonucleotides have nucleobases attached to six membered morpholine rings rather than ribose, such as methylenemorpholine rings linked through phosphorodiamidate groups, for example as illustrated by the following illustration of 4 consecutive morpholino nucleotides (Scheme 3):

embedded image

In some embodiments, morpholino oligonucleotides of the invention may be, for example, 20-40 morpholino nucleotides in length, such as 25-35 morpholino nucleotides in length.

RNase H Activity and Recruitment

The RNase H activity of an antisense oligonucleotide refers to its ability to recruit RNase H when in a duplex with a complementary RNA molecule. WO01/23613 provides in vitro methods for determining RNaseH activity, which may be used to determine the ability to recruit RNaseH. Typically an oligonucleotide is deemed capable of recruiting RNase H if it, when provided with a complementary target nucleic acid sequence, has an initial rate, as measured in pmol/l/min, of at least 5%, such as at least 10%, at least 20% or more than 20%, of the initial rate determined when using an oligonucleotide having the same base sequence as the modified oligonucleotide being tested, but containing only DNA monomers with phosphorothioate linkages between all monomers in the oligonucleotide, and using the methodology provided by Examples 91-95 of WO01/23613 (hereby incorporated by reference). For use in determining RNase H activity, recombinant RNase H1 is available from Lubio Science GmbH, Lucerne, Switzerland.

DNA oligonucleotides are known to effectively recruit RNaseH, as are gapmer oligonucleotides which comprise a region of DNA nucleosides (typically at least 5 or 6 contiguous DNA nucleosides), flanked 5′ and 3′ by regions comprising 2′ sugar modified nucleosides, typically high affinity 2′ sugar modified nucleosides, such as 2′-O-MOE and/or LNA. For effective modulation of splicing, degradation of the pre-mRNA is not desirable, and as such it is preferable to avoid the RNaseH degradation of the target. Therefore, the antisense oligonucleotides of the invention are not RNaseH recruiting gapmer oligonucleotides.
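By way of illustration only, the recruitment criterion described above reduces to comparing the initial rate obtained with the test oligonucleotide against the initial rate obtained with the all-DNA phosphorothioate reference of the same base sequence. The Python sketch below assumes both rates have already been measured in the same units (e.g., pmol/l/min) under the same assay conditions; the function names and example values are hypothetical.

```python
def relative_initial_rate(test_rate: float, reference_rate: float) -> float:
    """Initial rate of the test oligonucleotide as a fraction of the reference rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return test_rate / reference_rate


def recruits_rnase_h(test_rate: float, reference_rate: float,
                     threshold: float = 0.05) -> bool:
    """True if the test rate is at least `threshold` (default 5%) of the reference."""
    return relative_initial_rate(test_rate, reference_rate) >= threshold


# Hypothetical example: reference rate 100, test rates 2 and 10 (same units).
print(recruits_rnase_h(2.0, 100.0))   # 2% of reference
print(recruits_rnase_h(10.0, 100.0))  # 10% of reference
```

For a splice modulating oligonucleotide, a result below the chosen threshold is the desired outcome.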

RNaseH recruitment may be avoided by limiting the number of contiguous DNA nucleotides in the oligonucleotide—therefore mixmers and totalmer designs may be used. Advantageously the antisense oligonucleotides of the invention, or the contiguous nucleotide sequence thereof, do not comprise more than 3 contiguous DNA nucleosides. Further, advantageously the antisense oligonucleotides of the invention, or the contiguous nucleotide sequence thereof, do not comprise more than 4 contiguous DNA nucleosides. Further advantageously, the antisense oligonucleotides of the invention, or contiguous nucleotide sequence thereof, do not comprise more than 2 contiguous DNA nucleosides.

Mixmers and Totalmers

For splice modulation it is often advantageous to use antisense oligonucleotides which do not recruit RNaseH. As RNaseH activity requires a contiguous sequence of DNA nucleotides, RNaseH activity of antisense oligonucleotides may be avoided by designing antisense oligonucleotides which do not comprise a region of more than 3 or more than 4 contiguous DNA nucleosides. This may be achieved by using antisense oligonucleotides or contiguous nucleoside regions thereof with a mixmer design, which comprise sugar modified nucleosides, such as 2′ sugar modified nucleosides, and short regions of DNA nucleosides, such as 1, 2 or 3 DNA nucleosides. Mixmers are exemplified herein by every second design, wherein the nucleosides alternate between 1 LNA and 1 DNA nucleoside, e.g. LDLDLDLDLDLDLDLL, with 5′ and 3′ terminal LNA nucleosides, and every third design, such as LDDLDDLDDLDDLDDL, where every third nucleoside is a LNA nucleoside.

A totalmer is an antisense oligonucleotide or a contiguous nucleotide sequence thereof which does not comprise DNA or RNA nucleosides, and may for example comprise only 2′-O-MOE nucleosides, such as a fully MOE phosphorothioate, e.g. MMMMMMMMMMMMMMMMMMMM, where M=2′-O-MOE, which are reported to be effective splice modulators for therapeutic use.

Alternatively, a mixmer may comprise a mixture of modified nucleosides, such as MLMLMLMLMLMLMLMLMLML, wherein L=LNA and M=2′-O-MOE nucleosides.
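By way of illustration only, the design notation used above (L = LNA, D = DNA, M = 2′-O-MOE) allows the longest stretch of contiguous DNA nucleosides in a design to be checked programmatically. The following Python sketch uses hypothetical function names; the default limit of 3 contiguous DNA nucleosides is one of the limits mentioned herein and can be changed.

```python
import re


def longest_dna_run(design: str) -> int:
    """Length of the longest run of contiguous DNA nucleosides ('D') in a design."""
    runs = re.findall(r"D+", design)
    return max((len(run) for run in runs), default=0)


def is_totalmer(design: str) -> bool:
    """True if the design contains no DNA ('D') positions.

    Assumes the design string only uses the letters L, D and M as defined above.
    """
    return "D" not in design


def avoids_long_dna_stretch(design: str, max_contiguous_dna: int = 3) -> bool:
    """True if no DNA run exceeds max_contiguous_dna."""
    return longest_dna_run(design) <= max_contiguous_dna


for design in ("LDLDLDLDLDLDLDLL",       # every second design
               "LDDLDDLDDLDDLDDL",       # every third design
               "MMMMMMMMMMMMMMMMMMMM",   # fully MOE totalmer
               "LLLDDDDDDDDLLL"):        # gapmer-like design, for contrast
    print(design, longest_dna_run(design), avoids_long_dna_stretch(design))
```

The last design is included only for contrast: its eight contiguous DNA nucleosides exceed the limit.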

Advantageously, the internucleoside linkages in mixmers and totalmers may be phosphorothioate, or a majority of internucleoside linkages in mixmers may be phosphorothioate. Mixmers and totalmers may comprise other internucleoside linkages, such as phosphodiester or phosphorodithioate, by way of example.

Region D′ or D″ in an Oligonucleotide

The antisense oligonucleotide of the invention may in some embodiments comprise or consist of the contiguous nucleotide sequence of the oligonucleotide which is complementary to the target nucleic acid, such as a mixmer or totalmer region, and further 5′ and/or 3′ nucleosides. The further 5′ and/or 3′ nucleosides may or may not be complementary, such as fully complementary, to the target nucleic acid. Such further 5′ and/or 3′ nucleosides may be referred to as region D′ and D″ herein.

The addition of region D′ or D″ may be used for the purpose of joining the contiguous nucleotide sequence, such as the mixmer or totalmer, to a conjugate moiety or another functional group. When used for joining the contiguous nucleotide sequence with a conjugate moiety it can serve as a biocleavable linker. Alternatively, it may be used to provide exonuclease protection or for ease of synthesis or manufacture.

Region D′ or D″ may independently comprise or consist of 1, 2, 3, 4 or 5 additional nucleotides, which may be complementary or non-complementary to the target nucleic acid. The nucleotide adjacent to the F or F′ region is not a sugar-modified nucleotide, such as a DNA or RNA or base modified versions of these. The D′ or D″ region may serve as a nuclease susceptible biocleavable linker (see definition of linkers). In some embodiments the additional 5′ and/or 3′ end nucleotides are linked with phosphodiester linkages, and are DNA or RNA. Nucleotide based biocleavable linkers suitable for use as region D′ or D″ are disclosed in WO2014/076195, which include by way of example a phosphodiester linked DNA dinucleotide. The use of biocleavable linkers in poly-oligonucleotide constructs is disclosed in WO2015/113922, where they are used to link multiple antisense constructs within a single oligonucleotide.

In one embodiment the antisense oligonucleotide of the invention comprises a region D′ and/or D″ in addition to the contiguous nucleotide sequence which constitutes a mixmer or a totalmer.

In some embodiments the internucleoside linkage positioned between region D′ or D″ and the mixmer or totalmer region is a phosphodiester linkage.

Conjugates

The invention encompasses an antisense oligonucleotide covalently attached to at least one conjugate moiety. In some embodiments this may be referred to as a conjugate of the invention.

The term “conjugate” as used herein refers to an antisense oligonucleotide which is covalently linked to a non-nucleotide moiety (conjugate moiety or region C or third region). The conjugate moiety may be covalently linked to the antisense oligonucleotide, optionally via a linker group, such as region D′ or D″.

Oligonucleotide conjugates and their synthesis have also been reported in comprehensive reviews by Manoharan in Antisense Drug Technology, Principles, Strategies, and Applications, S. T. Crooke, ed., Ch. 16, Marcel Dekker, Inc., 2001 and Manoharan, Antisense and Nucleic Acid Drug Development, 2002, 12, 103.

In some embodiments, the conjugate moiety may comprise a protein, a fatty acid chain, a sugar residue, a glycoprotein, a polymer or any combination thereof.

In some embodiments, the non-nucleotide moiety (conjugate moiety) is selected from the group consisting of carbohydrates (e.g. GalNAc), cell surface receptor ligands, drug substances, hormones, lipophilic substances, polymers, proteins, peptides, toxins (e.g. bacterial toxins), vitamins, viral proteins (e.g. capsids) or combinations thereof.

In some embodiments, the antisense oligonucleotide conjugate of the invention is a prodrug. Here the conjugate moiety may be cleaved off the nucleic acid molecule once the prodrug is delivered to the site of action, e.g. the target cell.

Linkers

A linkage or linker is a connection between two atoms that links one chemical group or segment of interest to another chemical group or segment of interest via one or more covalent bonds. Conjugate moieties can be attached to the antisense oligonucleotide directly or through a linking moiety (e.g. linker or tether). Linkers serve to covalently connect a third region, e.g. a conjugate moiety (Region C), to a first region, e.g. an oligonucleotide or contiguous nucleotide sequence complementary to the target nucleic acid (region A).

In some embodiments of the invention the conjugate or antisense oligonucleotide conjugate of the invention may optionally comprise a linker region (second region or region B and/or region Y) which is positioned between the oligonucleotide or contiguous nucleotide sequence complementary to the target nucleic acid (region A or first region) and the conjugate moiety (region C or third region).

Region B refers to biocleavable linkers comprising or consisting of a physiologically labile bond that is cleavable under conditions normally encountered or analogous to those encountered within a mammalian body. Conditions under which physiologically labile linkers undergo chemical transformation (e.g., cleavage) include chemical conditions such as pH, temperature, oxidative or reductive conditions or agents, and salt concentration found in or analogous to those encountered in mammalian cells. Mammalian intracellular conditions also include the presence of enzymatic activity normally present in a mammalian cell such as from proteolytic enzymes or hydrolytic enzymes or nucleases. In one embodiment the biocleavable linker is susceptible to S1 nuclease cleavage. In some embodiments the nuclease susceptible linker comprises between 1 and 5 nucleosides, such as DNA nucleoside(s) comprising at least two consecutive phosphodiester linkages. Phosphodiester containing biocleavable linkers are described in more detail in WO 2014/076195.

Region Y refers to linkers that are not necessarily biocleavable but primarily serve to covalently connect a conjugate moiety (region C or third region), to an oligonucleotide (region A or first region). The region Y linkers may comprise a chain structure or an oligomer of repeating units such as ethylene glycol, amino acid units or amino alkyl groups. The antisense oligonucleotide conjugates of the present invention can be constructed of the following regional elements A-C, A-B-C, A-B-Y-C, A-Y-B-C or A-Y-C. In some embodiments the linker (region Y) is an amino alkyl, such as a C2-C36 amino alkyl group, including, for example C6 to C12 amino alkyl groups. In some embodiments the linker (region Y) is a C6 amino alkyl group.

Pharmaceutical Salt

The invention provides for an antisense oligonucleotide according to the invention wherein the antisense oligonucleotide is in the form of a pharmaceutically acceptable salt. The term “pharmaceutically acceptable salt” refers to conventional acid-addition salts or base-addition salts that retain the biological effectiveness and properties of the antisense oligonucleotides of the present invention.

In some embodiments, the pharmaceutically acceptable salt may be a sodium salt, a potassium salt or an ammonium salt.

The invention provides for a pharmaceutically acceptable sodium salt of the antisense oligonucleotide according to the invention, or the conjugate according to the invention.

The invention provides for a pharmaceutically acceptable potassium salt of the antisense oligonucleotide according to the invention, or the conjugate according to the invention.

The invention provides for a pharmaceutically acceptable ammonium salt of the antisense oligonucleotide according to the invention, or the conjugate according to the invention.

Pharmaceutical Composition

The invention provides for a pharmaceutical composition comprising the antisense oligonucleotide of the invention, or the conjugate of the invention, or the salt of the invention, and a pharmaceutically acceptable diluent, solvent, carrier, salt and/or adjuvant.

A pharmaceutically acceptable diluent includes phosphate-buffered saline (PBS) and pharmaceutically acceptable salts include, but are not limited to, sodium and potassium salts. In some embodiments the pharmaceutically acceptable diluent is sterile phosphate buffered saline. In some embodiments, the nucleic acid molecule is used in the pharmaceutically acceptable diluent at a concentration of 50 to 300 μM.

Suitable formulations for use in the present invention are found in Remington's Pharmaceutical Sciences, Mack Publishing Company, Philadelphia, Pa., 17th ed., 1985. For a brief review of methods for drug delivery, see, e.g., Langer (Science 249:1527-1533, 1990). WO 2007/031091 provides further suitable and preferred examples of pharmaceutically acceptable diluents, carriers and adjuvants (hereby incorporated by reference). Suitable dosages, formulations, administration routes, compositions, dosage forms, combinations with other therapeutic agents, pro-drug formulations are also provided in WO2007/031091.

The invention provides for a pharmaceutical composition comprising the antisense oligonucleotide of the invention, or the conjugate of the invention, and a pharmaceutically acceptable salt. For example, the salt may comprise a metal cation, such as a sodium salt, a potassium salt or an ammonium salt.

The invention provides for a pharmaceutical composition according to the invention, wherein the pharmaceutical composition comprises the antisense oligonucleotide of the invention or the conjugate of the invention, or the pharmaceutically acceptable salt of the invention; and an aqueous diluent or solvent.

In some embodiments, the antisense oligonucleotide of the invention, the conjugate of the invention, or pharmaceutically acceptable salt thereof is in a solid form, such as a powder, such as a lyophilized powder.

The antisense oligonucleotide of the invention, conjugate of the invention or salt of the invention may be mixed with pharmaceutically acceptable active or inert substances for the preparation of pharmaceutical compositions or formulations. Compositions and methods for the formulation of pharmaceutical compositions are dependent upon a number of criteria, including, but not limited to, route of administration, extent of disease, or dose to be administered.

These compositions may be sterilized by conventional sterilization techniques, or may be sterile filtered. The resulting aqueous solutions may be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile aqueous carrier prior to administration. The pH of the preparations typically will be between 3 and 11, more preferably between 5 and 9 or between 6 and 8, and most preferably between 7 and 8, such as 7 to 7.5. The resulting compositions in solid form may be packaged in multiple single dose units, each containing a fixed amount of the above-mentioned agent or agents, such as in a sealed package of tablets or capsules. The composition in solid form can also be packaged in a container for a flexible quantity, such as in a squeezable tube designed for a topically applicable cream or ointment.

Composition

In one aspect the present invention provides a composition comprising an antisense oligonucleotide according to the invention or the conjugate according to the invention, or the salt according to the invention; and a diluent, solvent, carrier, salt and/or adjuvant.

The composition may be a pharmaceutical composition.

Method of Manufacture of the Oligonucleotide According to the Invention

In a further aspect, the invention provides methods for manufacturing the oligonucleotides of the invention comprising reacting nucleotide units and thereby forming covalently linked contiguous nucleotide units comprised in the oligonucleotide. Preferably, the method uses phosphoramidite chemistry (see for example Caruthers et al, 1987, Methods in Enzymology vol. 154, pages 287-313).

In a further embodiment, the method further comprises reacting the contiguous nucleotide sequence with a conjugating moiety (ligand) to covalently attach a conjugate moiety to the oligonucleotide.

In a further embodiment, a method is provided for manufacturing the composition of the invention, comprising mixing the oligonucleotide or conjugated oligonucleotide of the invention with a pharmaceutically acceptable diluent, solvent, carrier, salt and/or adjuvant.

XBP1Δ4 Protein

In one aspect, the invention includes an isolated XBP1Δ4 protein.

The isolated XBP1Δ4 protein may be a mammalian protein. In some embodiments the XBP1Δ4 protein may be a hamster, mouse or human protein.

In certain embodiments, the isolated XBP1Δ4 protein is a hamster protein and is encoded by SEQ ID NO 7.

In certain embodiments, the isolated XBP1Δ4 protein is a mouse protein and is encoded by SEQ ID NO 596.

In certain embodiments, the isolated XBP1Δ4 protein is a human protein and is encoded by SEQ ID NO 807.

The invention also contemplates fragments of the isolated XBP1Δ4 protein.

XBP1Δ4 mRNA

In one aspect, the invention includes an isolated mRNA encoding the isolated XBP1Δ4 protein of the invention.

The isolated XBP1Δ4 mRNA may be a mammalian mRNA. In some embodiments, the XBP1Δ4 mRNA may be a hamster, mouse or human mRNA.

In certain embodiments, the isolated XBP1Δ4 mRNA is a hamster mRNA and is encoded by SEQ ID NO 6.

In certain embodiments, the isolated XBP1Δ4 mRNA is a mouse mRNA and is encoded by SEQ ID NO 595.

In certain embodiments, the isolated XBP1Δ4 mRNA is a human mRNA and is encoded by SEQ ID NO 806.

The invention also contemplates fragments of the isolated XBP1Δ4 mRNA.

Methods of Producing Polypeptides Using the Compound According to the Invention

The present inventors have identified that compounds, which induce the expression of XBP1Δ4 in mammalian cells, are useful in enhancing the recombinant expression of heterologously expressed proteins in mammalian cells, especially of multimeric polypeptides, such as antibodies.

As explained above, XBP1s is a functionally active protein which functions to enhance correct protein folding. The inventors have surprisingly determined that an XBP1 splice variant, such as XBP1Δ4, can enhance the production of correctly folded proteins in recombinant polypeptide production methods.

In one aspect the invention provides a method for (recombinantly) producing a polypeptide comprising the steps of:

- a) cultivating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide; and
- b) recovering the polypeptide from the cells or the cultivation medium;
- characterized in that the cultivating is at least in part in the presence of an antisense oligonucleotide, a composition, a pharmaceutical composition, a protein or an mRNA of the invention.

In one preferred embodiment, the cultivating comprises a pre- and a main-cultivating step, wherein at least the pre-cultivating step is performed in the presence of an oligonucleotide of the invention.

In certain embodiments, the method comprises the steps of:

- a1) propagating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide, in a cultivation medium comprising an antisense oligonucleotide according to the invention, to obtain a first cell population;
- a2) mixing an aliquot of the first cell population with cultivation medium to obtain a second cell population, optionally wherein the cultivation medium comprises the antisense oligonucleotide according to the invention;
- a3) cultivating the second cell population to obtain a third cell population; and
- b) recovering the polypeptide from the cells and/or the cultivation medium of the third cell cultivation.

In certain embodiments, the antisense oligonucleotide is added to a final concentration of at least about 5 μM, at least about 10 μM, at least about 15 μM, at least about 20 μM, at least about 25 μM, at least about 30 μM, at least about 35 μM, at least about 40 μM, at least about 45 μM, at least about 50 μM or more. In one preferred embodiment, the antisense oligonucleotide is added to a final concentration of about 25 μM.
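As an illustration of how such a final concentration could be reached in practice, the following minimal sketch computes the volume of a concentrated oligonucleotide stock to add to a culture. The function name, the 1 mM stock concentration and the 30 mL culture volume are assumptions made for the example only; they are not taken from the description.

```python
def stock_volume_ul(final_conc_um: float, culture_volume_ml: float,
                    stock_conc_um: float) -> float:
    """Return the stock volume (in microlitres) to add to a culture.

    Solves C_stock * V_stock = C_final * (V_culture + V_stock) for V_stock,
    i.e. the dilution caused by the added stock itself is accounted for.
    """
    if stock_conc_um <= final_conc_um:
        raise ValueError("stock must be more concentrated than the target")
    v_stock_ml = final_conc_um * culture_volume_ml / (stock_conc_um - final_conc_um)
    return v_stock_ml * 1000.0


# Hypothetical example: 25 uM final concentration in 30 mL of culture,
# using an assumed 1 mM (1000 uM) stock solution: about 769 uL of stock.
volume = stock_volume_ul(final_conc_um=25.0, culture_volume_ml=30.0,
                         stock_conc_um=1000.0)
print(f"Add {volume:.0f} uL of stock")
```

If the added volume is negligible relative to the culture volume, the simpler relation C_stock * V_stock = C_final * V_culture gives nearly the same result.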

In certain embodiments, the propagating of the mammalian cell is performed at a starting cell density of at least about 0.5*10E6 cells/mL, at least about 1*10E6 cells/mL, at least about 2*10E6 cells/mL, at least about 3*10E6 cells/mL, at least about 4*10E6 cells/mL, at least about 5*10E6 cells/mL or more. In certain embodiments, the cultivation is performed at a starting cell density of 1*10E6 to 2*10E6 cells/mL.

In certain embodiments, the cultivation of the second cell population is performed at a starting cell density of at least about 0.5*10E6 cells/mL, at least about 1*10E6 cells/mL, at least about 2*10E6 cells/mL, at least about 3*10E6 cells/mL, at least about 4*10E6 cells/mL, at least about 5*10E6 cells/mL, at least about 10*10E6 cells/mL or more. In certain embodiments, the cultivation is performed at a starting cell density of 1*10E6 to 2*10E6 cells/mL.

In certain embodiments, the cell is a mammalian cell.

In certain embodiments, the cell is a hamster cell.

In certain embodiments, the cell is a CHO cell, such as a CHO-K1 cell. Chinese hamster ovary (CHO) cells are an epithelial cell line derived from the ovary of the Chinese hamster, often used in biological and medical research and commercially in the production of therapeutic proteins, such as monoclonal antibodies.

In some embodiments, the cell may be a human cell.

In some embodiments, the cell may be a neuronal cell or a brain cell.

In some embodiments, the cell may be in vitro. The in vitro cell may for example be an iPSC.

In certain embodiments, the polypeptide is a Fab, preferably a bispecific Fab, an Fc-region comprising fusion polypeptide, a human therapeutic polypeptide, or a cytokine.

In certain embodiments, the polypeptide is an antibody. Here the antibody may take any form, as discussed in the definition of “antibody” provided herein.

In certain embodiments, the method of the invention provides for an increase in protein yield by at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100%, at least about 200%, at least about 300%, at least about 400%, at least about 500% or more, relative to the protein yield obtained in the absence of an antisense oligonucleotide of the invention.

In certain embodiments, the increase in yield represents an increase in the absolute amount of polypeptide. In other embodiments, the increase in yield represents an increase in the amount of correctly folded polypeptide. Herein a polypeptide can be defined as correctly folded either by viewing the structure of the polypeptide or by determining the polypeptide's activity.
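The relative increase in yield referred to above can be expressed as a simple percentage. The sketch below is illustrative only: the function name and the titer values are hypothetical, and the same calculation applies whether the measured quantity is the absolute amount of polypeptide or the amount of correctly folded polypeptide.

```python
def yield_increase_percent(yield_with_aso: float, yield_without_aso: float) -> float:
    """Percentage increase in yield relative to the untreated control."""
    if yield_without_aso <= 0:
        raise ValueError("control yield must be positive")
    return (yield_with_aso - yield_without_aso) / yield_without_aso * 100.0


# Hypothetical titers (e.g. in g/L): 1.0 without and 1.5 with the
# antisense oligonucleotide corresponds to a 50% increase.
print(yield_increase_percent(1.5, 1.0))
```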

Treatment

The term ‘treatment’ as used herein refers to both treatment of an existing disease (e.g. a disease or disorder as herein referred to), or prevention of a disease, i.e. prophylaxis. It will therefore be recognized that treatment as referred to herein may, in some embodiments, be prophylactic.

In one aspect, the invention relates to an antisense oligonucleotide, composition or pharmaceutical composition of the invention for use in medicine or therapy.

In some embodiments the therapy relates to the treatment or prevention of proteopathological disease.

In another aspect, the invention relates to use of an antisense oligonucleotide, composition or pharmaceutical composition of the invention in the manufacture of a medicament for the treatment of proteopathological disease.

In another aspect, the invention relates to a method for treating a proteopathological disease in a patient, the method comprising administering to the patient an antisense oligonucleotide, composition or pharmaceutical composition of the invention.

Proteopathological Diseases

In certain embodiments, the invention relates to the treatment or prevention of proteopathological diseases. Proteopathological diseases are also known as proteopathies, proteinopathies, protein conformational disorders, or protein mis-folding diseases.

In certain embodiments, the proteopathological disease may be selected from prion diseases, tauopathies, synucleinopathies, amyloidosis, multiple system atrophy, TDP-43 pathologies and CAG repeat indications.

In certain embodiments, the proteopathological disease may be selected from amyotrophic lateral sclerosis (ALS), frontotemporal lobar degeneration (FTLD), Alzheimer's disease, Parkinson's disease, Autism, Hippocampal sclerosis dementia, Down syndrome, Huntington's disease, polyglutamine diseases, such as spinocerebellar ataxia 3, myopathies and Chronic Traumatic Encephalopathy.

In certain embodiments the prion disease may be Creutzfeldt-Jakob disease.

In certain embodiments the tauopathy may be Alzheimer's disease.

In certain embodiments the synucleinopathy may be Parkinson's disease.

In certain embodiments the TDP-43 pathology may be amyotrophic lateral sclerosis (ALS) or frontotemporal lobar degeneration (FTLD).

In certain embodiments the CAG repeat indication may be a spinocerebellar ataxia, including spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), and spinocerebellar ataxia type 3 (SCA3, Machado-Joseph disease).

Administration

The compounds, antisense oligonucleotides, compositions, pharmaceutical compositions, proteins or nucleic acids of the present invention may be administered topically or enterally or parenterally (such as, intravenous, subcutaneous, or intra-muscular).

In certain embodiments it is the antisense nucleic acid or pharmaceutical composition which is administered for therapy.

In a preferred embodiment, the antisense oligonucleotide or pharmaceutical compositions of the present invention are administered by a parenteral route including intravenous, intra-arterial, subcutaneous, intraperitoneal or intramuscular injection or infusion.

In one embodiment, the antisense nucleic acid or pharmaceutical composition is administered intravenously.

In another embodiment, the antisense nucleic acid or pharmaceutical composition is administered subcutaneously.

In some embodiments, the antisense nucleic acid or pharmaceutical composition of the invention is administered at a dose of 0.1-15 mg/kg, such as from 0.2-10 mg/kg, such as from 0.25-5 mg/kg. The administration can be once a week, every second week, every third week or even once a month.
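Purely as an arithmetic illustration of a weight-based dose, the sketch below converts a dose in mg/kg into a total amount per administration. The function name and the 70 kg body weight are hypothetical; only the 0.1-15 mg/kg range comes from the paragraph above, and the sketch is not dosing guidance.

```python
def amount_per_administration_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Total amount (mg) per administration for a weight-based dose."""
    if not 0.1 <= dose_mg_per_kg <= 15.0:
        raise ValueError("dose outside the 0.1-15 mg/kg range given in the text")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


# Hypothetical example: 5 mg/kg for an assumed 70 kg body weight gives 350 mg.
print(amount_per_administration_mg(5.0, 70.0))
```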

NUMBERED EMBODIMENTS OF THE INVENTION

1. An antisense oligonucleotide for use in the expression of a XBP1 splice variant in a cell which expresses XBP1, wherein the antisense oligonucleotide is 8-40 nucleotides in length and comprises a contiguous nucleotide sequence of 8-40 nucleotides in length which is complementary to a mammalian XBP1 pre-mRNA transcript.

2. The antisense oligonucleotide according to embodiment 1, wherein the XBP1 splice variant is a XBP1Δ4 variant.

3. The antisense oligonucleotide according to embodiment 1 or embodiment 2, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides of the hamster XBP1 pre-mRNA transcript (SEQ ID NO 1).

4. The antisense oligonucleotide according to embodiment 3, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides from nucleotides 2960-3113 of SEQ ID NO 1.

5. The antisense oligonucleotide according to embodiment 4, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides from nucleotides 2986-3018 of SEQ ID NO 1.

6. The antisense oligonucleotide according to embodiment 3, wherein the contiguous nucleotide sequence is complementary to a sequence selected from the group consisting of SEQ ID NO 299, SEQ ID NO 301, SEQ ID NO 302, SEQ ID NO 304, SEQ ID NO 305, SEQ ID NO 306, SEQ ID NO 307, SEQ ID NO 308, SEQ ID NO 309, SEQ ID NO 310, SEQ ID NO 314, SEQ ID NO 316, SEQ ID NO 317, SEQ ID NO 318, SEQ ID NO 319, SEQ ID NO 323, SEQ ID NO 325, SEQ ID NO 327, SEQ ID NO 328, SEQ ID NO 330, SEQ ID NO 331, SEQ ID NO 332, SEQ ID NO 333, SEQ ID NO 334, SEQ ID NO 336, SEQ ID NO 337, SEQ ID NO 385, SEQ ID NO 386, SEQ ID NO 387, SEQ ID NO 388, SEQ ID NO 390, SEQ ID NO 391, SEQ ID NO 392, SEQ ID NO 393, SEQ ID NO 394, SEQ ID NO 395, SEQ ID NO 396, SEQ ID NO 397, SEQ ID NO 398, SEQ ID NO 399, SEQ ID NO 401, SEQ ID NO 402, SEQ ID NO 419, SEQ ID NO 431, SEQ ID NO 432, SEQ ID NO 433, SEQ ID NO 434, SEQ ID NO 438, SEQ ID NO 439, SEQ ID NO 440, SEQ ID NO 441, SEQ ID NO 442, SEQ ID NO 449, SEQ ID NO 484, SEQ ID NO 485, SEQ ID NO 486, SEQ ID NO 487, SEQ ID NO 488, SEQ ID NO 489, SEQ ID NO 490, SEQ ID NO 491, SEQ ID NO 492, SEQ ID NO 493, SEQ ID NO 494, SEQ ID NO 495, SEQ ID NO 496, SEQ ID NO 497, SEQ ID NO 498, SEQ ID NO 499, SEQ ID NO 500, SEQ ID NO 501, SEQ ID NO 502, SEQ ID NO 503, SEQ ID NO 505, SEQ ID NO 506, SEQ ID NO 507, SEQ ID NO 508, SEQ ID NO 509, SEQ ID NO 510, SEQ ID NO 511, SEQ ID NO 512, SEQ ID NO 513, SEQ ID NO 515, SEQ ID NO 517, SEQ ID NO 520, SEQ ID NO 572, SEQ ID NO 573, SEQ ID NO 576, SEQ ID NO 577, SEQ ID NO 588 and SEQ ID NO 589.

7. The antisense oligonucleotide according to embodiment 6, wherein the contiguous nucleotide sequence is selected from the group consisting of SEQ ID NO 8, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 13, SEQ ID NO 14, SEQ ID NO 15, SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, SEQ ID NO 19, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 27, SEQ ID NO 28, SEQ ID NO 32, SEQ ID NO 34, SEQ ID NO 36, SEQ ID NO 37, SEQ ID NO 39, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 94, SEQ ID NO 95, SEQ ID NO 96, SEQ ID NO 97, SEQ ID NO 99, SEQ ID NO 100, SEQ ID NO 101, SEQ ID NO 102, SEQ ID NO 103, SEQ ID NO 104, SEQ ID NO 105, SEQ ID NO 106, SEQ ID NO 107, SEQ ID NO 108, SEQ ID NO 110, SEQ ID NO 111, SEQ ID NO 128, SEQ ID NO 140, SEQ ID NO 141, SEQ ID NO 142, SEQ ID NO 143, SEQ ID NO 147, SEQ ID NO 148, SEQ ID NO 149, SEQ ID NO 150, SEQ ID NO 151, SEQ ID NO 158, SEQ ID NO 193, SEQ ID NO 194, SEQ ID NO 195, SEQ ID NO 196, SEQ ID NO 197, SEQ ID NO 198, SEQ ID NO 199, SEQ ID NO 200, SEQ ID NO 201, SEQ ID NO 202, SEQ ID NO 203, SEQ ID NO 204, SEQ ID NO 205, SEQ ID NO 206, SEQ ID NO 207, SEQ ID NO 208, SEQ ID NO 209, SEQ ID NO 210, SEQ ID NO 211, SEQ ID NO 212, SEQ ID NO 214, SEQ ID NO 215, SEQ ID NO 216, SEQ ID NO 217, SEQ ID NO 218, SEQ ID NO 219, SEQ ID NO 220, SEQ ID NO 221, SEQ ID NO 222, SEQ ID NO 224, SEQ ID NO 226, SEQ ID NO 229, SEQ ID NO 281, SEQ ID NO 282, SEQ ID NO 285, SEQ ID NO 286, SEQ ID NO 297 and SEQ ID NO 298.

8. The antisense oligonucleotide according to embodiment 3, wherein the contiguous nucleotide sequence is complementary to a sequence selected from the group consisting of SEQ ID NO 305, SEQ ID NO 307, SEQ ID NO 314, SEQ ID NO 315, SEQ ID NO 316, SEQ ID NO 317, SEQ ID NO 319, SEQ ID NO 331, SEQ ID NO 332, SEQ ID NO 392, SEQ ID NO 394, SEQ ID NO 395, SEQ ID NO 440, SEQ ID NO 492, SEQ ID NO 497, SEQ ID NO 498, SEQ ID NO 499, SEQ ID NO 500, SEQ ID NO 501, SEQ ID NO 502, SEQ ID NO 513 and SEQ ID NO 576.

9. The antisense oligonucleotide according to embodiment 8, wherein the contiguous nucleotide sequence is selected from the group consisting of SEQ ID NO 14, SEQ ID NO 16, SEQ ID NO 23, SEQ ID NO 24, SEQ ID NO 25, SEQ ID NO 26, SEQ ID NO 28, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 101, SEQ ID NO 103, SEQ ID NO 104, SEQ ID NO 149, SEQ ID NO 201, SEQ ID NO 206, SEQ ID NO 207, SEQ ID NO 208, SEQ ID NO 209, SEQ ID NO 210, SEQ ID NO 211, SEQ ID NO 222 and SEQ ID NO 285.

10. The antisense oligonucleotide according to embodiment 3, wherein the contiguous nucleotide sequence is complementary to SEQ ID NO 314 or SEQ ID NO 315.

11. The antisense oligonucleotide according to embodiment 10, wherein the contiguous nucleotide sequence is SEQ ID NO 23 or SEQ ID NO 24.

12. The antisense oligonucleotide according to embodiment 1 or embodiment 2, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides from the mouse XBP1 pre-mRNA transcript (SEQ ID NO 590).

13. The antisense oligonucleotide according to embodiment 12, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides from nucleotides 3560-3783 of SEQ ID NO 590.

14. The antisense oligonucleotide according to embodiment 12, wherein the contiguous nucleotide sequence is complementary to a sequence selected from the group consisting of SEQ ID NO 699, SEQ ID NO 700, SEQ ID NO 703, SEQ ID NO 710, SEQ ID NO 713, SEQ ID NO 724, SEQ ID NO 729, SEQ ID NO 739, SEQ ID NO 743, SEQ ID NO 744, SEQ ID NO 745, SEQ ID NO 749, SEQ ID NO 750, SEQ ID NO 751, SEQ ID NO 752, SEQ ID NO 753, SEQ ID NO 754, SEQ ID NO 755, SEQ ID NO 756, SEQ ID NO 757, SEQ ID NO 758, SEQ ID NO 759, SEQ ID NO 760, SEQ ID NO 761, SEQ ID NO 762, SEQ ID NO 763, SEQ ID NO 773, SEQ ID NO 776, SEQ ID NO 778, SEQ ID NO 781, SEQ ID NO 783, SEQ ID NO 784, SEQ ID NO 785, SEQ ID NO 787, SEQ ID NO 789, SEQ ID NO 790, SEQ ID NO 791, SEQ ID NO 792, SEQ ID NO 793, SEQ ID NO 794, SEQ ID NO 795, SEQ ID NO 796, SEQ ID NO 797, SEQ ID NO 798, SEQ ID NO 799 and SEQ ID NO 800.

15. The antisense oligonucleotide according to embodiment 14, wherein the contiguous nucleotide sequence is selected from the group consisting of SEQ ID NO 597, SEQ ID NO 598, SEQ ID NO 601, SEQ ID NO 608, SEQ ID NO 611, SEQ ID NO 622, SEQ ID NO 627, SEQ ID NO 637, SEQ ID NO 641, SEQ ID NO 642, SEQ ID NO 643, SEQ ID NO 647, SEQ ID NO 648, SEQ ID NO 649, SEQ ID NO 650, SEQ ID NO 651, SEQ ID NO 652, SEQ ID NO 653, SEQ ID NO 654, SEQ ID NO 655, SEQ ID NO 656, SEQ ID NO 657, SEQ ID NO 658, SEQ ID NO 659, SEQ ID NO 660, SEQ ID NO 661, SEQ ID NO 671, SEQ ID NO 674, SEQ ID NO 676, SEQ ID NO 679, SEQ ID NO 681, SEQ ID NO 682, SEQ ID NO 683, SEQ ID NO 685, SEQ ID NO 687, SEQ ID NO 688, SEQ ID NO 689, SEQ ID NO 690, SEQ ID NO 691, SEQ ID NO 692, SEQ ID NO 693, SEQ ID NO 694, SEQ ID NO 695, SEQ ID NO 696, SEQ ID NO 697 and SEQ ID NO 698.

16. The antisense oligonucleotide according to embodiment 12, wherein the contiguous nucleotide sequence is complementary to a sequence selected from the group consisting of SEQ ID NO 710, SEQ ID NO 754, SEQ ID NO 756, SEQ ID NO 757, SEQ ID NO 758, SEQ ID NO 759, SEQ ID NO 760, SEQ ID NO 791, SEQ ID NO 792, SEQ ID NO 794, SEQ ID NO 795 and SEQ ID NO 797.

17. The antisense oligonucleotide according to embodiment 16, wherein the contiguous nucleotide sequence is selected from the group consisting of SEQ ID NO 608, SEQ ID NO 652, SEQ ID NO 654, SEQ ID NO 655, SEQ ID NO 656, SEQ ID NO 657, SEQ ID NO 658, SEQ ID NO 689, SEQ ID NO 690, SEQ ID NO 692, SEQ ID NO 693 and SEQ ID NO 695.

18. The antisense oligonucleotide according to embodiment 1 or embodiment 2, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides of the human XBP1 pre-mRNA transcript (SEQ ID NO 801).

19. The antisense oligonucleotide according to embodiment 18, wherein the contiguous nucleotide sequence is complementary to at least 10 contiguous nucleotides from nucleotides 4338-4563 of SEQ ID NO 801.

20. The antisense oligonucleotide according to embodiment 18, wherein the contiguous nucleotide sequence is complementary to a sequence selected from the group consisting of SEQ ID NO 947, SEQ ID NO 948, SEQ ID NO 949, SEQ ID NO 950, SEQ ID NO 951 and SEQ ID NO 988.

21. The antisense oligonucleotide according to embodiment 20, wherein the contiguous nucleotide sequence is selected from the group consisting of SEQ ID NO 854, SEQ ID NO 855, SEQ ID NO 856, SEQ ID NO 857, SEQ ID NO 858 and SEQ ID NO 895.

22. The antisense oligonucleotide according to embodiment 18, wherein the contiguous nucleotide sequence is complementary to SEQ ID NO 951.

23. The antisense oligonucleotide according to embodiment 22, wherein the contiguous nucleotide sequence is SEQ ID NO 858.

24. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide or contiguous nucleotide sequence thereof is fully complementary to a mammalian XBP1 pre-mRNA transcript.

25. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the contiguous nucleotide sequence is at least 12 nucleotides in length.

26. The antisense oligonucleotide according to embodiment 25, wherein the contiguous nucleotide sequence is 12-16 or 12-18 nucleotides in length.

27. The antisense oligonucleotide according to embodiment 25, wherein the contiguous nucleotide sequence is 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides in length.

28. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the contiguous nucleotide sequence is the same length as the antisense oligonucleotide.

29. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide is isolated, purified or manufactured.

30. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide or contiguous nucleotide sequence thereof comprises one or more modified nucleotides or one or more modified nucleosides.

31. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide or contiguous nucleotide sequence thereof comprises one or more modified nucleosides, such as one or more modified nucleotides independently selected from the group consisting of 2′-O-alkyl-RNA; 2′-O-methyl RNA (2′-OMe); 2′-alkoxy-RNA; 2′-O-methoxyethyl-RNA (2′-MOE); 2′-amino-DNA; 2′-fluoro-RNA; 2′-fluoro-DNA; arabino nucleic acid (ANA); 2′-fluoro-ANA; bicyclic nucleoside analog (LNA); or any combination thereof.

32. The antisense oligonucleotide according to embodiment 30 or embodiment 31, wherein one or more of the modified nucleosides is a sugar modified nucleoside.

33. The antisense oligonucleotide according to any one of embodiments 30 to 32, wherein one or more of the modified nucleosides comprises a bicyclic sugar.

34. The antisense oligonucleotide according to any one of embodiments 30 to 32, wherein one or more of the modified nucleosides is an affinity enhancing 2′ sugar modified nucleoside.

35. The antisense oligonucleotide according to any one of embodiments 30 to 34, wherein one or more of the modified nucleosides is an LNA nucleoside, such as one or more beta-D-oxy LNA nucleosides.

36. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide or contiguous nucleotide sequence thereof comprises one or more 5′-methyl-cytosine nucleobases.

37. The antisense oligonucleotide according to any one of the preceding embodiments, wherein one or more of the internucleoside linkages within the contiguous nucleotide sequence of the antisense oligonucleotide are modified.

38. The antisense oligonucleotide according to embodiment 37, wherein at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95% or about 100% of the internucleoside linkages are modified.

39. The antisense oligonucleotide according to embodiment 37 or embodiment 38, wherein the one or more modified internucleoside linkages comprise a phosphorothioate linkage.

40. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide is a morpholino modified antisense oligonucleotide.

41. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide or contiguous nucleotide sequence thereof is or comprises an antisense oligonucleotide mixmer or totalmer.

42. An antisense oligonucleotide according to any one of the preceding embodiments covalently attached to at least one conjugate moiety.

43. The antisense oligonucleotide according to embodiment 42, wherein the conjugate moiety comprises a protein, a fatty acid chain, a sugar residue, a glycoprotein, a polymer or any combination thereof.

44. The antisense oligonucleotide according to any one of the preceding embodiments, wherein the antisense oligonucleotide is in the form of a pharmaceutically acceptable salt.

45. The antisense oligonucleotide according to embodiment 44, wherein the salt is a sodium salt, a potassium salt or an ammonium salt.

46. A composition comprising the antisense oligonucleotide according to any one of the preceding embodiments.

47. A pharmaceutical composition comprising the antisense oligonucleotide according to any one of embodiments 1 to 45 and a pharmaceutically acceptable diluent, solvent, carrier, salt and/or adjuvant.

48. The pharmaceutical composition according to embodiment 47, wherein the pharmaceutical composition comprises an aqueous diluent or solvent, such as phosphate buffered saline.

49. An isolated XBP1Δ4 protein.

50. The isolated XBP1Δ4 protein according to embodiment 49, wherein the protein comprises the sequence of SEQ ID NO: 7, SEQ ID NO: 596 or SEQ ID NO: 807.

51. An isolated mRNA encoding the XBP1Δ4 protein according to embodiment 49 or embodiment 50.

52. The isolated mRNA according to embodiment 51, comprising the sequence of SEQ ID NO: 6, SEQ ID NO: 595 or SEQ ID NO: 806.

53. A method for producing a polypeptide comprising the steps of:

- a) cultivating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide; and
- b) recovering the polypeptide from the cells or the cultivation medium,
- characterized in that the cultivating is in the presence of an antisense oligonucleotide according to any one of embodiments 1 to 45, a composition according to embodiment 46, a pharmaceutical composition according to embodiment 47 or embodiment 48, a protein according to embodiment 49 or 50 or an mRNA according to embodiment 51 or 52.

54. The method according to embodiment 53, comprising the steps of:

- a1) propagating a mammalian cell, which is expressing XBP1 and which comprises one or more nucleic acids encoding the polypeptide, in a cultivation medium comprising an antisense oligonucleotide according to any one of embodiments 1 to 45, to obtain a first cell population;
- a2) mixing an aliquot of the first cell population with cultivation medium to obtain a second cell population, wherein the cultivation medium optionally comprises the antisense oligonucleotide according to any one of embodiments 1 to 45;
- a3) cultivating the second cell population to obtain a third cell population; and
- b) recovering the polypeptide from the cells and/or the cultivation medium of the third cell cultivation.

55. The method according to embodiment 53 or embodiment 54, wherein the antisense oligonucleotide is added to a final concentration of 25 μM or more.

56. The method according to any one of embodiments 53 to 55, wherein the propagating and/or the cultivating is with a starting cell density of 1*10E6 to 2*10E6 cells/mL.

57. The method according to embodiment 56, wherein the starting cell density is about 2*10E6 cells/mL.

58. The method according to any one of embodiments 53 to 57, wherein the mammalian cell is a CHO cell.

59. The method according to any one of embodiments 53 to 58, wherein the polypeptide is an antibody.

60. An antisense oligonucleotide according to any one of embodiments 1 to 45, a composition according to embodiment 46 or a pharmaceutical composition according to embodiment 47 or embodiment 48 for use in medicine.

61. An antisense oligonucleotide according to any one of embodiments 1 to 45, a composition according to embodiment 46 or a pharmaceutical composition according to embodiment 47 or embodiment 48 for use in the treatment of a patient with a proteopathological disease.

62. The antisense oligonucleotide for use according to embodiment 61, wherein the proteopathological disease has TDP-43 pathology.

63. The antisense oligonucleotide for use according to embodiment 61 or embodiment 62, wherein the proteopathological disease is motor neuron disease or frontotemporal lobar degeneration.

64. The use of an antisense oligonucleotide according to any one of embodiments 1 to 45, a composition according to embodiment 46 or a pharmaceutical composition according to embodiment 47 or embodiment 48 in the manufacture of a medicament for the treatment of proteopathological disease.

65. The use according to embodiment 64, wherein the disease has TDP-43 pathology.

66. The use according to embodiment 64 or embodiment 65, wherein the disease is motor neuron disease or frontotemporal lobar degeneration.

67. A method for treating a proteopathological disease in a patient, the method comprising administering to the patient an antisense oligonucleotide according to any one of embodiments 1 to 45, a composition according to embodiment 46 or a pharmaceutical composition according to embodiment 47 or embodiment 48.

68. The method according to embodiment 67, wherein the disease has TDP-43 pathology.

69. The method according to embodiment 67 or embodiment 68, wherein the disease is motor neuron disease or frontotemporal lobar degeneration.

EXAMPLES
General Techniques
Recombinant DNA Techniques

Standard methods were used to manipulate DNA as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). The molecular biological reagents were used according to the manufacturer's instructions.

Gene Synthesis

Desired gene segments were prepared by chemical synthesis at Geneart GmbH (Regensburg, Germany). The synthesized gene fragments were cloned into an E. coli plasmid for propagation/amplification. The DNA sequences of subcloned gene fragments were verified by DNA sequencing. Alternatively, short synthetic DNA fragments were assembled by annealing chemically synthesized oligonucleotides or via PCR. The respective oligonucleotides were prepared by metabion GmbH (Planegg-Martinsried, Germany).

DNA Sequence Determination

DNA sequences were determined by double strand sequencing performed at MediGenomix GmbH (Martinsried, Germany) or SequiServe GmbH (Vaterstetten, Germany).

DNA and Protein Sequence Analysis and Sequence Data Management

The EMBOSS (European Molecular Biology Open Software Suite) software package and Invitrogen's Vector NTI version 11.5 or Geneious Prime were used for sequence creation, mapping, analysis, annotation and illustration.

Reagents

All commercial chemicals, antibodies and kits were used as provided according to the manufacturer's protocol if not stated otherwise.

Protein Determination

The protein concentration of purified antibodies and derivatives was determined by measuring the optical density (OD) at 280 nm, using the molar extinction coefficient calculated on the basis of the amino acid sequence according to Pace, et al. Protein Science 4 (1995) 2411-2423.
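The calculation can be sketched as follows. The coefficients (5500 M⁻¹cm⁻¹ per tryptophan, 1490 M⁻¹cm⁻¹ per tyrosine and 125 M⁻¹cm⁻¹ per cystine) are those reported by Pace et al.; the function names, the 1 cm path length and the numerical inputs of the example are assumptions for illustration and do not describe a particular antibody.

```python
def molar_extinction_280(sequence: str, n_cystines: int = 0) -> int:
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1) from the sequence.

    Uses the Pace et al. (1995) values: 5500 per Trp, 1490 per Tyr and
    125 per cystine (disulfide bond). The number of cystines must be
    supplied by the caller because it cannot be read from the sequence.
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_cystines


def concentration_mg_per_ml(a280: float, epsilon: float, molar_mass_da: float,
                            path_cm: float = 1.0) -> float:
    """Protein concentration from A280 via the Beer-Lambert law (A = epsilon * c * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    molar = a280 / (epsilon * path_cm)  # mol/L
    return molar * molar_mass_da        # g/L, numerically equal to mg/mL


# Hypothetical inputs: A280 = 1.4, epsilon = 210000 M^-1 cm^-1 and a
# molar mass of 150000 Da give approximately 1 mg/mL.
print(round(concentration_mg_per_ml(1.4, 210000, 150000), 3))
```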

Antibody Concentration Determination in Supernatants

The concentration of antibodies in cell culture supernatants was estimated by immunoprecipitation with protein A agarose beads (Roche Diagnostics GmbH, Mannheim, Germany). For this purpose, 60 μL protein A agarose beads were washed three times in TBS-NP40 (50 mM Tris buffer, pH 7.5, supplemented with 150 mM NaCl and 1% Nonidet-P40). Subsequently, 1-15 mL cell culture supernatant was applied to the protein A agarose beads pre-equilibrated in TBS-NP40. After incubation for 1 hour at room temperature the beads were washed on an Ultrafree-MC-filter column (Amicon) once with 0.5 mL TBS-NP40, twice with 0.5 mL 2× phosphate buffered saline (2×PBS, Roche Diagnostics GmbH, Mannheim, Germany) and briefly four times with 0.5 mL 100 mM Na-citrate buffer (pH 5.0). Bound antibody was eluted by addition of 35 μl NuPAGE® LDS sample buffer (Invitrogen). One half of the sample was combined with NuPAGE® sample reducing agent and the other half was left unreduced; both were heated for 10 min at 70° C. Subsequently, 5-30 μl were applied to a 4-12% NuPAGE® Bis-Tris SDS-PAGE gel (Invitrogen) (with MOPS buffer for non-reduced SDS-PAGE and MES buffer with NuPAGE® antioxidant running buffer additive (Invitrogen) for reduced SDS-PAGE) and stained with Coomassie Blue.

The concentration of the antibodies in cell culture supernatants was quantitatively measured by affinity HPLC chromatography. Briefly, cell culture supernatants containing antibodies that bind to protein A were applied to an Applied Biosystems Poros A/20 column in 200 mM KH2PO4, 100 mM sodium citrate, pH 7.4 and eluted with 200 mM NaCl, 100 mM citric acid, pH 2.5 on an Agilent HPLC 1100 system. The eluted antibody was quantified by UV absorbance and integration of peak areas. A purified standard IgG1 antibody served as a standard.
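Quantification against a standard of this kind can be sketched as a calibration of peak area versus known standard concentration. The description does not state how the calibration was performed; the linear multi-point least-squares fit, the function names and all numerical values below are assumptions for illustration.

```python
def fit_calibration(concentrations, peak_areas):
    """Least-squares line (slope, intercept) of peak area versus concentration."""
    n = len(concentrations)
    if n < 2 or n != len(peak_areas):
        raise ValueError("need at least two matching standard points")
    mean_c = sum(concentrations) / n
    mean_a = sum(peak_areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    if sxx == 0:
        raise ValueError("standard concentrations must not all be equal")
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def concentration_from_area(peak_area, slope, intercept):
    """Invert the calibration line to obtain the sample concentration."""
    if slope == 0:
        raise ValueError("calibration slope must be non-zero")
    return (peak_area - intercept) / slope


# Hypothetical standard series of a purified IgG1 (mg/mL) and peak areas.
slope, intercept = fit_calibration([0.1, 0.5, 1.0], [10.0, 50.0, 100.0])
print(round(concentration_from_area(25.0, slope, intercept), 3))
```

Samples whose peak area falls outside the range covered by the standards would normally be diluted and re-measured rather than extrapolated.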

Alternatively, the concentration of antibodies and derivatives in cell culture supernatants was measured by Sandwich-IgG-ELISA. Briefly, StreptaWell™ High Bind Streptavidin A-96 well microtiter plates (Roche Diagnostics GmbH, Mannheim, Germany) were coated with 100 μL/well biotinylated anti-human IgG capture molecule F(ab′)2<h-Fcγ> BI (Dianova) at 0.1 μg/mL for 1 hour at room temperature or alternatively overnight at 4° C. and subsequently washed three times with 200 μL/well PBS, 0.05% Tween (PBST, Sigma). Thereafter, 100 μL/well of a dilution series in PBS (Sigma) of the respective antibody-containing cell culture supernatants was added to the wells and incubated for 1-2 hours on a shaker at room temperature. The wells were washed three times with 200 μL/well PBST and bound antibody was detected with 100 μl F(ab′)2<hFcγ>POD (Dianova) at 0.1 μg/mL as the detection antibody by incubation for 1-2 hours on a shaker at room temperature. Unbound detection antibody was removed by washing three times with 200 μL/well PBST. The bound detection antibody was detected by addition of 100 μL ABTS/well followed by incubation. Determination of absorbance was performed on a Tecan Fluor Spectrometer at a measurement wavelength of 405 nm (reference wavelength 492 nm).

Cultivation of CHO Host Cell Line

CHO host cells were cultivated at 37° C. in a humidified incubator with 85% humidity and 5% CO₂. They were cultivated in a proprietary DMEM/F12-based medium containing 300 μg/ml Hygromycin B and 4 μg/ml of a second selection marker. The cells were split every 3 or 4 days at a concentration of 0.3×10E6 cells/ml in a total volume of 30 ml. For the cultivation, 125 ml non-baffled Erlenmeyer shake flasks were used. Cells were shaken at 150 rpm with a shaking amplitude of 5 cm. The cell count was determined with a Cedex HiRes Cell Counter (Roche). Cells were kept in culture until they reached an age of 60 days.
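The split described above amounts to a simple dilution calculation, sketched below. The function name and the measured density of 3.0×10E6 cells/ml are hypothetical; the target density of 0.3×10E6 cells/ml and the total volume of 30 ml are taken from the paragraph above.

```python
def split_volumes_ml(measured_density: float, target_density: float,
                     total_volume_ml: float):
    """Return (cell suspension volume, fresh medium volume) in ml.

    The cell suspension volume is chosen so that the new culture starts
    at target_density (cells/ml) in total_volume_ml.
    """
    if measured_density <= 0 or measured_density < target_density:
        raise ValueError("measured density must be positive and at least the target")
    suspension_ml = target_density * total_volume_ml / measured_density
    return suspension_ml, total_volume_ml - suspension_ml


# Hypothetical cell count of 3.0e6 cells/ml: carry over 3 ml of culture
# and add 27 ml of fresh medium to start at 0.3e6 cells/ml in 30 ml.
cells_ml, medium_ml = split_volumes_ml(3.0e6, 0.3e6, 30.0)
print(cells_ml, medium_ml)
```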

Transformation of 10-Beta Competent E. coli Cells

For transformation, the 10-beta competent E. coli cells were thawed on ice. After that, 2 μl of plasmid DNA were pipetted directly into the cell suspension. The tube was flicked and put on ice for 30 minutes. Thereafter, the cells were placed into the 42° C.-warm thermal block and heat-shocked for exactly 30 seconds. Directly afterwards, the cells were chilled on ice for 2 minutes. 950 μl of NEB 10-beta outgrowth medium were added to the cell suspension. The cells were incubated under shaking at 37° C. for one hour. Then, 50-100 μl were pipetted onto a pre-warmed (37° C.) LB-Amp agar plate and spread with a disposable spatula. The plate was incubated overnight at 37° C. Only bacteria, which have successfully incorporated the plasmid, carrying the resistance gene against ampicillin, can grow on these plates. Single colonies were picked the next day and cultured in LB-Amp medium for subsequent plasmid preparation.

Bacterial Culture

E. coli was cultivated in LB (Luria-Bertani) medium supplemented with 1 ml/L of a 100 mg/ml ampicillin stock, resulting in an ampicillin concentration of 0.1 mg/ml. For the different plasmid preparation scales, the following volumes were inoculated with a single bacterial colony.

TABLE 1

E. coli cultivation volumes

Quantity plasmid preparation
Volume LB-Amp medium [ml]
Incubation time [h]

Mini-Prep 96-well (EpMotion)
1.5
23

Mini-Prep 15 ml-tube
3.6
23

Maxi-Prep
200
16

For Mini-Prep, a 96-well 2 ml deep-well plate was filled with 1.5 ml LB-Amp medium per well. The colonies were picked and the toothpick was stuck into the medium. When all colonies had been picked, the plate was closed with an adhesive air-porous membrane. The plate was incubated in a 37° C. incubator at a shaking rate of 200 rpm for 23 hours.

For Mini-Preps in tubes, a 15 ml tube (with a ventilated lid) was filled with 3.6 ml LB-Amp medium and likewise inoculated with a bacterial colony. The toothpick was not removed but left in the tube during incubation. Like the 96-well plate, the tubes were incubated at 37° C., 200 rpm for 23 hours.

For Maxi-Prep, 200 ml of LB-Amp medium were filled into an autoclaved 1 L glass Erlenmeyer flask and inoculated with 1 ml of a bacterial day-culture that was approximately 5 hours old. The Erlenmeyer flask was closed with a paper plug and incubated at 37° C., 200 rpm for 16 hours.

Plasmid Preparation

For Mini-Prep, 50 μl of bacterial suspension were transferred into a 1 ml deep-well plate. After that, the bacterial cells were centrifuged down in the plate at 3000 rpm, 4° C. for 5 min. The supernatant was removed and the plate with the bacterial pellets was placed into an EpMotion. After approx. 90 minutes, the run was complete and the eluted plasmid DNA could be removed from the EpMotion for further use.

For Mini-Prep, the 15 ml tubes were taken out of the incubator and the 3.6 ml bacterial culture split into two 2 ml Eppendorf tubes. The tubes were centrifuged at 6,800×g in a table-top microcentrifuge for 3 minutes at room temperature. After that, Mini-Prep was performed with the Qiagen QIAprep Spin Miniprep Kit according to the manufacturer's instructions. The plasmid DNA concentration was measured with Nanodrop.

Maxi-Prep was performed using the Macherey-Nagel NucleoBond® Xtra Maxi EF Kit according to the manufacturer's instructions. The DNA concentration was measured with Nanodrop.

Ethanol Precipitation

The DNA solution was mixed with a 2.5-fold volume of 100% ethanol. The mixture was incubated at −20° C. for 10 min. Then the DNA was centrifuged for 30 min at 14,000 rpm, 4° C. The supernatant was carefully removed and the pellet washed with 70% ethanol. Again, the tube was centrifuged for 5 min at 14,000 rpm, 4° C. The supernatant was carefully removed by pipetting and the pellet dried. When the ethanol had evaporated, an appropriate amount of endotoxin-free water was added. The DNA was given time to re-dissolve in the water overnight at 4° C. A small aliquot was taken and the DNA concentration was measured with a Nanodrop device.

Preparative Antibody Purification

Antibodies were purified from filtered cell culture supernatants according to standard protocols. In brief, antibodies were applied to a protein A Sepharose column (GE Healthcare) and washed with PBS. Elution of antibodies was achieved at pH 2.8 followed by immediate neutralization. Aggregated protein was separated from monomeric antibodies by size exclusion chromatography (Superdex 200, GE Healthcare) in PBS or in 20 mM Histidine buffer comprising 150 mM NaCl (pH 6.0). Monomeric antibody fractions were pooled, concentrated (if required) using e.g. a MILLIPORE Amicon Ultra (30 MWCO) centrifugal concentrator, frozen and stored at −20° C. or −80° C. Part of the samples was provided for subsequent protein analytics and analytical characterization, e.g. by SDS-PAGE, size exclusion chromatography (SEC) or mass spectrometry.

SDS-PAGE

The NuPAGE® Pre-Cast gel system (Invitrogen) was used according to the manufacturer's instructions. In particular, 10% or 4-12% NuPAGE® Novex® Bis-TRIS Pre-Cast gels (pH 6.4) and a NuPAGE® MES (reduced gels, with NuPAGE® antioxidant running buffer additive) or MOPS (non-reduced gels) running buffer were used.

CE-SDS

Purity and antibody integrity were analyzed by CE-SDS using microfluidic Labchip technology (PerkinElmer, USA). For this purpose, 5 μl of antibody solution was prepared for CE-SDS analysis using the HT Protein Express Reagent Kit according to the manufacturer's instructions and analyzed on a Labchip GXII system using an HT Protein Express Chip. Data were analyzed using Labchip GX Software.

Analytical Size Exclusion Chromatography

Size exclusion chromatography (SEC) for the determination of the aggregation and oligomeric state of antibodies was performed by HPLC chromatography. Briefly, protein A purified antibodies were applied to a Tosoh TSKgel G3000SW column in 300 mM NaCl, 50 mM KH2PO4/K2HPO4 buffer (pH 7.5) on a Dionex Ultimate® system (Thermo Fisher Scientific), or to a Superdex 200 column (GE Healthcare) in 2×PBS on a Dionex HPLC-System. The eluted antibody was quantified by UV absorbance and integration of peak areas. BioRad Gel Filtration Standard 151-1901 served as a standard.

Mass Spectrometry

This section describes the characterization of the bispecific antibodies with emphasis on their correct assembly. The expected primary structures were analyzed by electrospray ionization mass spectrometry (ESI-MS) of the deglycosylated intact antibody and in special cases of the deglycosylated/limited LysC digested antibody.

The antibodies were deglycosylated with N-Glycosidase F in a phosphate or Tris buffer at 37° C. for up to 17 h at a protein concentration of 1 mg/ml. The limited LysC (Roche Diagnostics GmbH, Mannheim, Germany) digestions were performed with 100 μg deglycosylated antibody in a Tris buffer (pH 8) at room temperature for 120 hours, or at 37° C. for 40 min, respectively. Prior to mass spectrometry the samples were desalted via HPLC on a Sephadex G25 column (GE Healthcare). The total mass was determined via ESI-MS on a maXis 4G UHR-QTOF MS system (Bruker Daltonik) equipped with a TriVersa NanoMate source (Advion).

Example 1: Identification of Oligonucleotide to Induce Splice Skipping of Exon in the Hamster XBP1 mRNA, to Make a XBP1 Protein Mimic that Works Similarly to the Naturally Processed XBP1 Protein

CHOK1 cells were obtained from the ATCC cell bank, and were grown and maintained according to ATCC guidelines. 40 ASOs complementary to a region around exon 4 of the XBP1 mRNA NM_001244047.1 were tested for the ability to induce exon skipping of exon 4.

5000 CHOK1 cells were seeded in a 96 well plate; 6 hours later the ASOs were added directly to the cell medium at a final concentration of 5 μM and 25 μM. Cells were cultivated and harvested after 6 days and total RNA was isolated using the RNeasy 96 well kit from Qiagen according to the manufacturer's instructions.

PCR was performed using the ddPCR supermix for probes (no dUTP) from Biorad according to the manufacturer's instructions.

The following primers and probes were used to measure the amount of mRNAs with exon skipping of exon 4 (XBP1Δ4 assay) and the amount of mRNA with normal joining of exon 4 and 5 (XBP1 WT); both were purchased from IDT technologies. The XBP1 WT assay detected both the IRE-1 processed and unprocessed mRNA.

XBP1 WT assay:

(SEQ ID NO: 1497)

Primer 2 (GTTCCTCCAGATTGGCAG)

(SEQ ID NO: 1498)

Primer 1 (CCAGGAGTTAAGAACTCGC)

(SEQ ID NO: 1499)

Probe /HEX/CGGAGTCCA /ZEN/ AGGGAAATGGAGTA/3IABkFQ/

XBP1Δ4 assay:

(SEQ ID NO: 1500)

Primer 2 (GTTCCTCCAGATTGGCAG)

(SEQ ID NO: 1501)

Primer 1 (CCAGGAGTTAAGAACTCGC)

(SEQ ID NO: 1502)

Probe /56-FAM/CGGAGTCCA /ZEN/ AGTCTGATATCCTTTTG/3IABkFQ/

Data were analyzed using the QuantaSoft™ Analysis Pro software from Biorad. The percentage of mRNAs with skipping of exon 4 was calculated as (concΔ4/(concΔ4+concWT))*100. The baseline percentage of mRNA with exon 4 skipping was calculated from the average of 14 control wells treated with PBS only. The average of the PBS wells was 0.6%. The data are shown in Table 2.
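The percentage calculation stated above can be sketched as a small helper. This is only an illustration of the formula; the function name and the example concentrations are hypothetical and do not come from the measured data.

```python
def percent_exon4_skipping(conc_delta4, conc_wt):
    """Percentage of XBP1 mRNA with exon 4 skipped.

    Implements (conc_delta4 / (conc_delta4 + conc_wt)) * 100, where both
    inputs are concentrations reported by the ddPCR analysis software for
    the exon 4 skipping assay and the WT assay, respectively.
    """
    total = conc_delta4 + conc_wt
    if total <= 0:
        # Assumption: a well without any detected target is treated as an error.
        raise ValueError("no XBP1 target detected in this well")
    return conc_delta4 / total * 100.0


# Hypothetical well: 1.0 copies/ul skipped and 3.0 copies/ul WT transcript.
print(percent_exon4_skipping(1.0, 3.0))
```

Because the WT assay detects both IRE-1 processed and unprocessed mRNA, the denominator approximates all XBP1 transcripts that either retain or skip exon 4.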

TABLE 2

% of Xbp1 mRNA with exon 4 skipping.

ASO Used
5 μM
25 μM

SEQ ID 8
0.0
2.4

SEQ ID 9
0.0
0.0

SEQ ID 10
0.0
1.0

SEQ ID 11
0.8
2.2

SEQ ID 12
0.8
0.0

SEQ ID 13
2.7
1.0

SEQ ID 14
0.0
6.1

SEQ ID 15
0.0
1.5

SEQ ID 16
7.9
10.6

SEQ ID 17
2.4
1.3

SEQ ID 18
1.3
3.3

SEQ ID 19
0.0
1.0

SEQ ID 20
0.0
0.0

SEQ ID 21
0.9
0.0

SEQ ID 22
0.0
0.0

SEQ ID 23
1.1
9.1

SEQ ID 24
11.2
25.9

SEQ ID 25
4.6
16.5

SEQ ID 26
4.6
5.3

SEQ ID 27
2.5
3.5

SEQ ID 28
4.7
11.1

SEQ ID 29
0.0
0.0

SEQ ID 30
0.0
0.9

SEQ ID 31
0.0
0.9

SEQ ID 32
1.0
1.8

SEQ ID 33
0.0
0.0

SEQ ID 34
0.0
3.2

SEQ ID 35
0.0
0.0

SEQ ID 36
0.0
1.6

SEQ ID 37
1.1
1.1

SEQ ID 38
0.0
0.0

SEQ ID 39
0.0
2.3

SEQ ID 40
0.0
6.8

SEQ ID 41
0.0
10.0

SEQ ID 42
1.1
2.1

SEQ ID 43
0.0
1.2

SEQ ID 44
0.0
0.0

SEQ ID 45
0.0
3.3

SEQ ID 46
1.6
0.0

SEQ ID 47
0.0
0.0

Example 2: Identification of ASO to Induce Splice Skipping of Exon in the Hamster XBP1 mRNA, to Make a XBP1 Protein Mimic that Works Similarly to the Naturally Processed XBP1 Protein, Now with an Extended Library Covering More Sequences Near Exon 4

CHOK1 cells were obtained from the ATCC cell bank, and were grown and maintained according to ATCC guidelines. 251 ASOs complementary to a region around exon 4 of the XBP1 mRNA NM_001244047.1 were tested for the ability to induce exon skipping of exon 4.

3000 CHOK1 cells were seeded in a 96 well plate; 24 hours later the ASOs were added directly to the cell medium at a final concentration of 5 μM and 25 μM. Cells were harvested after 6 days and total RNA was isolated using the RNeasy 96 well kit from Qiagen according to the manufacturer's instructions.

cDNA was generated using the iScript™ Advanced cDNA Synthesis Kit for RT-qPCR from Biorad. Relative mRNA expression was measured by droplet digital PCR using the QX200 ddPCR™ system from Biorad along with the automated droplet generator AutoDG from Biorad.

PCR was performed using the ddPCR supermix for probes (no dUTP) from Biorad according to the manufacturer's instructions.

The following primers and probes were used to measure the amount of mRNAs with exon skipping of exon 4 (XBP1Δ4 assay) and the amount of mRNA with normal joining of exon 4 and 5 (XBP1 WT); both were purchased from IDT technologies. The XBP1 WT assay detected both the IRE-1 processed and unprocessed mRNA.

XBP1 WT assay:

(SEQ ID NO: 1497)

Primer 2 (GTTCCTCCAGATTGGCAG)

(SEQ ID NO: 1498)

Primer 1 (CCAGGAGTTAAGAACTCGC)

(SEQ ID NO: 1499)

Probe /HEX/CGGAGTCCA /ZEN/ AGGGAAATGGAGTA/3IABkFQ/

XBP1Δ4 assay:

(SEQ ID NO: 1500)

Primer 2 (GTTCCTCCAGATTGGCAG)

(SEQ ID NO: 1501)

Primer 1 (CCAGGAGTTAAGAACTCGC)

(SEQ ID NO: 1502)

Probe /56-FAM/CGGAGTCCA /ZEN/ AGTCTGATATCCTTTTG/3IABkFQ/

Data were analyzed using the QuantaSoft™ Analysis Pro software from Biorad. The percentage of mRNAs with skipping of exon 4 was calculated as (concΔ4/(concΔ4+concWT))*100. The baseline percentage of mRNA with exon 4 skipping was calculated from the average of 170 control wells treated with PBS only. The average of the PBS wells was 0.1%. The data are shown in Table 3.

TABLE 3

% of Xbp1 mRNA with exon 4 skipping for the second library.

Oligo ID
5 μM
25 μM

SEQ ID 48
0.00
0.54

SEQ ID 49
0.00
0.18

SEQ ID 50
0.18
0.00

SEQ ID 51
0.00
0.00

SEQ ID 52
0.00
0.17

SEQ ID 53
0.00
0.00

SEQ ID 54
0.00
0.00

SEQ ID 55
0.00
0.00

SEQ ID 56
0.16
0.31

SEQ ID 57
0.15
0.32

SEQ ID 58
0.00
0.15

SEQ ID 59
0.00
0.14

SEQ ID 60
0.00
0.13

SEQ ID 61
0.00
0.00

SEQ ID 62
0.00
0.00

SEQ ID 63
0.33
0.00

SEQ ID 64
0.00
0.00

SEQ ID 65
0.00
0.16

SEQ ID 66
0.00
0.15

SEQ ID 67
0.00
0.00

SEQ ID 68
0.35
0.00

SEQ ID 69
0.13
0.00

SEQ ID 70
0.17
0.18

SEQ ID 71
0.00
0.00

SEQ ID 72
0.48
0.92

SEQ ID 73
0.14
0.00

SEQ ID 74
0.00
0.00

SEQ ID 75
0.00
0.00

SEQ ID 76
0.00
0.21

SEQ ID 77
0.32
0.33

SEQ ID 78
0.00
0.17

SEQ ID 79
0.00
0.17

SEQ ID 80
0.00
0.16

SEQ ID 81
0.00
0.13

SEQ ID 82
0.00
0.28

SEQ ID 83
0.18
0.00

SEQ ID 84
0.00
0.00

SEQ ID 85
0.00
0.00

SEQ ID 86
0.00
0.20

SEQ ID 87
0.00
0.00

SEQ ID 88
0.34
0.00

SEQ ID 89
0.28
0.46

SEQ ID 90
0.15
0.00

SEQ ID 91
0.00
0.16

SEQ ID 92
0.15
0.51

SEQ ID 93
0.63
0.15

SEQ ID 94
1.53
1.51

SEQ ID 95
0.64
2.35

SEQ ID 96
0.48
2.15

SEQ ID 97
1.78
4.35

SEQ ID 98
0.32
0.44

SEQ ID 99
3.74
3.42

SEQ ID 100
1.28
1.87

SEQ ID 101
5.32
7.39

SEQ ID 102
2.66
4.13

SEQ ID 103
10.10
13.44

SEQ ID 104
6.99
11.25

SEQ ID 105
1.99
4.99

SEQ ID 106
1.39
1.32

SEQ ID 107
0.16
1.90

SEQ ID 108
0.79
1.48

SEQ ID 109
0.47
0.79

SEQ ID 110
1.59
1.58

SEQ ID 111
1.82
1.13

SEQ ID 112
0.41
0.82

SEQ ID 113
0.65
0.76

SEQ ID 114
0.35
0.00

SEQ ID 115
0.00
0.00

SEQ ID 116
0.34
0.59

SEQ ID 117
0.29
0.16

SEQ ID 118
0.15
0.00

SEQ ID 119
0.14
0.53

SEQ ID 120
0.15
0.00

SEQ ID 121
0.15
0.26

SEQ ID 122
0.00
0.15

SEQ ID 123
0.24
0.00

SEQ ID 124
0.17
0.32

SEQ ID 125
0.00
0.16

SEQ ID 126
0.00
0.00

SEQ ID 127
0.16
0.00

SEQ ID 128
0.67
1.01

SEQ ID 129
0.15
0.00

SEQ ID 130
0.57
0.00

SEQ ID 131
0.24
0.52

SEQ ID 132
0.00
0.00

SEQ ID 133
0.21
0.16

SEQ ID 134
0.00
0.00

SEQ ID 135
0.33
0.18

SEQ ID 136
0.16
0.60

SEQ ID 137
0.32
0.00

SEQ ID 138
0.00
0.00

SEQ ID 139
0.00
0.44

SEQ ID 140
1.11
0.70

SEQ ID 141
0.23
1.69

SEQ ID 142
0.20
1.07

SEQ ID 143
1.04
0.36

SEQ ID 144
0.00
0.33

SEQ ID 145
0.00
0.15

SEQ ID 146
0.00
0.00

SEQ ID 147
0.40
1.00

SEQ ID 148
1.95
3.31

SEQ ID 149
6.23
11.11

SEQ ID 150
1.61
2.14

SEQ ID 151
0.94
1.19

SEQ ID 152
0.24
0.00

SEQ ID 153
0.00
0.00

SEQ ID 154
0.20
0.00

SEQ ID 155
0.30
0.21

SEQ ID 156
0.00
0.75

SEQ ID 157
0.40
0.69

SEQ ID 158
0.20
1.18

SEQ ID 159
0.19
0.00

SEQ ID 160
0.00
0.00

SEQ ID 161
0.54
0.00

SEQ ID 162
0.00
0.00

SEQ ID 163
0.00
0.00

SEQ ID 164
0.00
0.40

SEQ ID 165
0.00
0.00

SEQ ID 166
0.20
0.23

SEQ ID 167
0.00
0.00

SEQ ID 168
0.00
0.00

SEQ ID 169
0.00
0.20

SEQ ID 170
0.00
0.00

SEQ ID 171
0.00
0.00

SEQ ID 172
0.21
0.15

SEQ ID 173
0.14
0.00

SEQ ID 174
0.00
0.00

SEQ ID 175
0.00
0.00

SEQ ID 176
0.16
0.00

SEQ ID 177
0.00
0.00

SEQ ID 178
0.00
0.15

SEQ ID 179
0.00
0.00

SEQ ID 180
0.00
0.13

SEQ ID 181
0.14
0.15

SEQ ID 182
0.25
0.40

SEQ ID 183
0.00
0.28

SEQ ID 184
0.00
0.00

SEQ ID 185
0.00
0.15

SEQ ID 186
0.00
0.00

SEQ ID 187
0.00
0.00

SEQ ID 188
0.00
0.00

SEQ ID 189
0.00
0.13

SEQ ID 190
0.14
0.00

SEQ ID 191
0.13
0.35

SEQ ID 192
0.00
0.00

SEQ ID 193
0.51
1.29

SEQ ID 194
1.53
1.35

SEQ ID 195
1.37
3.95

SEQ ID 196
1.69
2.48

SEQ ID 197
1.05
1.74

SEQ ID 198
0.80
1.51

SEQ ID 199
1.53
2.59

SEQ ID 200
1.94
2.52

SEQ ID 201
3.77
5.27

SEQ ID 202
2.87
2.98

SEQ ID 203
1.46
1.52

SEQ ID 204
0.65
1.87

SEQ ID 205
2.37
4.44

SEQ ID 206
2.57
5.31

SEQ ID 207
2.99
5.41

SEQ ID 208
3.89
8.39

SEQ ID 209
7.75
13.36

SEQ ID 210
7.69
11.34

SEQ ID 211
7.07
8.04

SEQ ID 212
4.27
0.41

SEQ ID 213
0.20
0.17

SEQ ID 214
2.16
2.25

SEQ ID 215
1.67
2.89

SEQ ID 216
1.49
1.64

SEQ ID 217
2.32
0.00

SEQ ID 218
0.62
1.36

SEQ ID 219
1.27
0.20

SEQ ID 220
1.02
3.76

SEQ ID 221
1.67
4.62

SEQ ID 222
2.76
6.39

SEQ ID 223
0.53
0.97

SEQ ID 224
0.47
1.48

SEQ ID 225
0.00
0.00

SEQ ID 226
0.52
1.37

SEQ ID 227
0.22
0.00

SEQ ID 228
0.69
0.33

SEQ ID 229
1.47
0.71

SEQ ID 230
0.63
0.96

SEQ ID 231
0.00
0.00

SEQ ID 232
0.64
0.00

SEQ ID 233
0.24
0.83

SEQ ID 234
0.22
0.00

SEQ ID 235
0.54
0.44

SEQ ID 236
0.00
0.00

SEQ ID 237
0.00
0.22

SEQ ID 238
0.00
0.00

SEQ ID 239
0.34
0.00

SEQ ID 240
0.00
0.00

SEQ ID 241
0.00
0.00

SEQ ID 242
0.66
0.42

SEQ ID 243
0.18
0.00

SEQ ID 244
0.00
0.00

SEQ ID 245
0.00
0.17

SEQ ID 246
0.00
0.71

SEQ ID 247
0.00
0.00

SEQ ID 248
0.18
0.20

SEQ ID 249
0.00
0.22

SEQ ID 250
0.00
0.00

SEQ ID 251
0.00
0.00

SEQ ID 252
0.39
0.45

SEQ ID 253
0.53
0.19

SEQ ID 254
0.00
0.00

SEQ ID 255
0.18
0.21

SEQ ID 256
0.17
0.36

SEQ ID 257
0.00
0.00

SEQ ID 258
0.00
0.34

SEQ ID 259
0.37
0.97

SEQ ID 260
0.33
0.86

SEQ ID 261
0.36
0.22

SEQ ID 262
0.00
0.17

SEQ ID 263
0.17
0.15

SEQ ID 264
0.00
0.00

SEQ ID 265
0.35
0.00

SEQ ID 266
0.16
0.00

SEQ ID 267
0.00
0.00

SEQ ID 268
0.49
0.53

SEQ ID 269
0.00
0.00

SEQ ID 270
0.00
0.20

SEQ ID 271
0.00
0.15

SEQ ID 272
0.20
0.17

SEQ ID 273
0.29
0.00

SEQ ID 274
0.40
0.49

SEQ ID 275
0.20
0.17

SEQ ID 276
0.14
0.00

SEQ ID 277
0.00
0.00

SEQ ID 278
0.00
0.19

SEQ ID 279
0.16
0.17

SEQ ID 280
0.18
0.17

SEQ ID 281
0.00
1.90

SEQ ID 282
0.59
1.75

SEQ ID 283
0.15
0.38

SEQ ID 284
0.15
0.53

SEQ ID 285
2.63
5.34

SEQ ID 286
0.60
2.62

SEQ ID 287
0.33
0.51

SEQ ID 288
0.35
0.43

SEQ ID 289
0.19
0.53

SEQ ID 290
0.00
0.00

SEQ ID 291
0.20
0.37

SEQ ID 292
0.39
0.00

SEQ ID 293
0.00
0.00

SEQ ID 294
0.28
0.31

SEQ ID 295
0.00
0.09

SEQ ID 296
0.14
0.45

SEQ ID 297
1.93
1.52

SEQ ID 298
0.98
1.77

Example 3: Identification of ASO to Induce Splice Skipping of Exon in the Mouse XBP1 mRNA, to Make a XBP1 Protein Mimic that Works Similarly to the Naturally Processed XBP1 Protein

Ltk-11 (ATCC® CRL-10422™) cells were obtained from the ATCC cell bank, and were grown and maintained according to ATCC guidelines. 102 ASOs complementary to a region around exon 4 of the XBP1 mRNA NM_013842.3 (SeqID 2) were tested for the ability to induce exon skipping of exon 4.

2000 LTK cells were seeded in a 96 well plate; 24 hours later the ASOs were added directly to the cell medium at a final concentration of 5 μM and 25 μM. Cells were harvested after 3 days and total RNA was isolated using the RNeasy 96 well kit from Qiagen according to the manufacturer's instructions.

cDNA was generated using the iScript™ Advanced cDNA Synthesis Kit for RT-qPCR from Biorad. Relative mRNA expression was measured by droplet digital PCR using the QX200 ddPCR system from Biorad along with the automated droplet generator AutoDG from Biorad. PCR was performed using the ddPCR supermix for probes (no dUTP) from Biorad according to the manufacturer's instructions.

The following primers and probes were used to measure the amount of mRNAs with exon skipping of exon 4 (XBP1 delta 4 assay) and the amount of mRNA with normal joining of exon 4 and 5 (XBP1 WT); both were purchased from IDT technologies. The XBP1 WT assay detected both the IRE-1 processed and unprocessed mRNA.

XBP1 WT assay:

(SEQ ID NO: 1503)

Primer 2 (AGG GTC CAA CTT GTC C)

(SEQ ID NO: 1504)

Primer 1 (CTG GAT CCT GAC GAG GTT C)

(SEQ ID NO: 1505)

Probe /5HEX/CTT ACT CCA /ZEN/CTC CCC TTG GCC TCC A/3IABkFQ/

XBP1 delta 4 assay:

(SEQ ID NO: 1503)

Primer 2 (AGG GTC CAA CTT GTC C)

(SEQ ID NO: 1504)

Primer 1 (CTG GAT CCT GAC GAG GTT C)

(SEQ ID NO: 1506)

Probe /56-FAM/CCC AAA AGG /ZEN/ATA TCA GAC TTG GCC TCC A/3IABkFQ/

Data were analyzed using the QuantaSoft™ Analysis Pro software from Biorad. The percentage of mRNAs with skipping of exon 4 was calculated as (conc delta 4/(conc delta 4+concWT))*100. The baseline percentage of mRNA with exon 4 skipping was calculated from the average of 61 control wells treated with PBS only. The average of the PBS wells was 0.37% with a standard deviation of 0.17. The data are shown in Table 4.
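One way to read the tabulated values against the PBS background is to compare each value with the PBS mean plus a multiple of its standard deviation. The sketch below is an assumption made for illustration: the cutoff of three standard deviations and the function name are hypothetical and are not part of the described method.

```python
def exceeds_pbs_background(percent_skipping,
                           pbs_mean=0.37,
                           pbs_sd=0.17,
                           n_sd=3.0):
    """Return True if a value lies above pbs_mean + n_sd * pbs_sd.

    pbs_mean and pbs_sd default to the PBS control values stated in the text;
    n_sd = 3 is an illustrative choice, not a criterion from the source.
    """
    return percent_skipping > pbs_mean + n_sd * pbs_sd


# Two values as reported in Table 4 at 25 uM.
print(exceeds_pbs_background(19.67))  # SEQ ID 655
print(exceeds_pbs_background(0.33))   # SEQ ID 600
```

A stricter or looser cutoff can be tested by changing `n_sd`; the source itself reports only the PBS mean and standard deviation.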

TABLE 4

% of XBP1 exon 4 splice skipping

10 μM
25 μM

SEQ ID 597
1.80
3.24

SEQ ID 598
0.76
1.93

SEQ ID 599
0.40
0.86

SEQ ID 600
0.06
0.33

SEQ ID 601
2.61
3.56

SEQ ID 602
0.30
0.18

SEQ ID 603
0.43
0.71

SEQ ID 604
0.58
0.52

SEQ ID 605
0.27
0.54

SEQ ID 606
0.31
0.37

SEQ ID 607
0.39
0.28

SEQ ID 608
5.17
8.88

SEQ ID 609
0.67
0.54

SEQ ID 610
0.27
0.52

SEQ ID 611
1.03
1.01

SEQ ID 612
0.00
0.68

SEQ ID 613
0.57
0.62

SEQ ID 614
0.22
0.00

SEQ ID 615
0.26
0.22

SEQ ID 616
0.53
0.27

SEQ ID 617
0.48
0.09

SEQ ID 618
0.90
0.71

SEQ ID 619
0.38
0.16

SEQ ID 620
0.60
0.70

SEQ ID 621
0.59
0.24

SEQ ID 622
1.01
0.65

SEQ ID 623
0.22
0.30

SEQ ID 624
0.31
0.71

SEQ ID 625
0.34
0.60

SEQ ID 626
0.35
0.53

SEQ ID 627
0.74
1.03

SEQ ID 628
0.49
0.12

SEQ ID 629
0.24
0.12

SEQ ID 630
0.39
0.25

SEQ ID 631
0.25
0.22

SEQ ID 632
0.15
0.08

SEQ ID 633
0.31
0.26

SEQ ID 634
0.58
0.76

SEQ ID 635
0.07
0.19

SEQ ID 636
0.25
0.14

SEQ ID 637
0.74
1.76

SEQ ID 638
0.84
0.56

SEQ ID 639
0.66
1.17

SEQ ID 640
0.51
0.75

SEQ ID 641
0.97
1.82

SEQ ID 642
0.14
1.35

SEQ ID 643
2.56
3.91

SEQ ID 644
0.86
0.86

SEQ ID 645
0.67
0.14

SEQ ID 646
0.76
0.98

SEQ ID 647
0.68
1.00

SEQ ID 648
2.42
3.24

SEQ ID 649
1.86
3.57

SEQ ID 650
0.69
1.51

SEQ ID 651
1.84
3.76

SEQ ID 652
2.34
5.95

SEQ ID 653
1.97
3.85

SEQ ID 654
3.86
9.13

SEQ ID 655
11.01
19.67

SEQ ID 656
5.56
10.31

SEQ ID 657
4.77
7.55

SEQ ID 658
6.59
10.15

SEQ ID 659
1.59
4.31

SEQ ID 660
1.78
3.20

SEQ ID 661
1.57
3.49

SEQ ID 662
0.73
0.36

SEQ ID 663
0.58
0.91

SEQ ID 664
0.25
0.79

SEQ ID 665
0.19
0.19

SEQ ID 666
0.30
0.63

SEQ ID 667
0.27
0.29

SEQ ID 668
0.82
0.00

SEQ ID 669
0.76
0.50

SEQ ID 670
0.27
0.90

SEQ ID 671
1.04
0.84

SEQ ID 672
0.89
0.63

SEQ ID 673
0.72
0.23

SEQ ID 674
0.89
1.46

SEQ ID 675
0.84
0.80

SEQ ID 676
2.28
3.83

SEQ ID 677
0.28
0.23

SEQ ID 678
0.28
0.25

SEQ ID 679
2.24
4.35

SEQ ID 680
0.28
0.43

SEQ ID 681
1.15
2.02

SEQ ID 682
1.59
2.60

SEQ ID 683
1.76
2.90

SEQ ID 684
0.58
0.44

SEQ ID 685
0.65
1.36

SEQ ID 686
0.42
0.35

SEQ ID 687
0.76
1.26

SEQ ID 688
2.28
4.48

SEQ ID 689
2.46
7.24

SEQ ID 690
5.48
10.93

SEQ ID 691
2.82
4.68

SEQ ID 692
3.48
6.20

SEQ ID 693
2.76
5.11

SEQ ID 694
1.15
2.20

SEQ ID 695
4.86
6.35

SEQ ID 696
1.02
2.34

SEQ ID 697
3.20
4.73

SEQ ID 698
0.96
1.17

Example 4: Identification of ASO to Induce Splice Skipping of Exon in the Human XBP1 mRNA, to Make a XBP1 Protein Mimic that Works Similarly to the Naturally Processed XBP1 Protein

A549 cells were obtained from the ATCC cell bank, and were grown and maintained according to ATCC guidelines. 100 ASOs complementary to a region around exon 4 of the XBP1 mRNA NM_005080.4 (SeqID 2) were tested for the ability to induce exon skipping of exon 4.

4000 A549 cells were seeded in a 96 well plate; 24 hours later the ASOs were added directly to the cell medium at a final concentration of 25 μM. Cells were harvested after 3 days and total RNA was isolated using the RNeasy 96 well kit from Qiagen according to the manufacturer's instructions.

cDNA was generated using the iScript™ Advanced cDNA Synthesis Kit for RT-qPCR from Biorad. Relative mRNA expression was measured by droplet digital PCR using the QX200 ddPCR system from Biorad along with the automated droplet generator AutoDG from Biorad. PCR was performed using the ddPCR supermix for probes (no dUTP) from Biorad according to the manufacturer's instructions.

The following primers and probes were used to measure the amount of mRNAs with exon skipping of exon 4 (XBP1Δ4 assay) and the amount of mRNA with normal joining of exon 4 and 5 (XBP1 WT); both were purchased from IDT technologies. The XBP1 WT assay detected both the IRE-1 processed and unprocessed mRNA.

XBP1 WT assay:

(SEQ ID NO: 1503)

Primer 2 (CTG GGT CCA AGT TGT CCA GA)

(SEQ ID NO: 1504)

Primer 1 (ATG CCC TGG TTG CTG AAG)

(SEQ ID NO: 1505)

Probe /5HEX/TCA CTT CAT /ZEN/TCC CCT TGG CTT CCG C/3IABkFQ/

XBP1Δ4 assay:

(SEQ ID NO: 1503)

Primer 2 (CTG GGT CCA AGT TGT CCA GA)

(SEQ ID NO: 1504)

Primer 1 (ATG CCC TGG TTG CTG AAG)

(SEQ ID NO: 1506)

Probe /56-FAM/CCA ACA GGA /ZEN/TAT CAG ACT TGG CTT CCG C/3IABkFQ/

Data were analyzed using the QuantaSoft™ Analysis Pro software from Biorad. The percentage of mRNAs with skipping of exon 4 was calculated as (concΔ4/(concΔ4+concWT))*100. The baseline percentage of mRNA with exon 4 skipping was calculated from the average of 40 control wells treated with PBS only. The average of the PBS wells was 0.03% with a standard deviation of 0.05. The data are shown in Table 5.

TABLE 5

% of XBP1 exon4 skipping

Oligo used
% of XBP1 exon4 splice skipping

SEQ ID 808
0.00

SEQ ID 809
0.12

SEQ ID 810
0.00

SEQ ID 811
0.00

SEQ ID 812
0.13

SEQ ID 813
0.07

SEQ ID 814
0.05

SEQ ID 815
0.00

SEQ ID 816
0.00

SEQ ID 817
0.00

SEQ ID 818
0.11

SEQ ID 819
0.00

SEQ ID 820
0.06

SEQ ID 821
0.08

SEQ ID 822
0.11

SEQ ID 823
0.00

SEQ ID 824
0.11

SEQ ID 825
0.00

SEQ ID 826
0.06

SEQ ID 827
0.00

SEQ ID 828
0.00

SEQ ID 829
0.00

SEQ ID 830
0.00

SEQ ID 831
0.08

SEQ ID 832
0.00

SEQ ID 833
0.07

SEQ ID 834
0.12

SEQ ID 835
0.00

SEQ ID 836
0.18

SEQ ID 837
0.00

SEQ ID 838
0.00

SEQ ID 839
0.00

SEQ ID 840
0.07

SEQ ID 841
0.00

SEQ ID 842
0.19

SEQ ID 843
0.68

SEQ ID 844
0.91

SEQ ID 845
0.67

SEQ ID 846
0.85

SEQ ID 847
0.31

SEQ ID 848
0.25

SEQ ID 849
0.28

SEQ ID 850
0.35

SEQ ID 851
0.35

SEQ ID 852
0.47

SEQ ID 853
0.30

SEQ ID 854
1.67

SEQ ID 855
1.99

SEQ ID 856
2.93

SEQ ID 857
2.29

SEQ ID 858
6.67

SEQ ID 859
0.49

SEQ ID 860
0.00

SEQ ID 861
0.00

SEQ ID 862
0.27

SEQ ID 863
0.19

SEQ ID 864
0.00

SEQ ID 865
0.00

SEQ ID 866
0.00

SEQ ID 867
0.20

SEQ ID 868
0.00

SEQ ID 869
0.00

SEQ ID 870
0.00

SEQ ID 871
0.00

SEQ ID 872
0.00

SEQ ID 873
0.00

SEQ ID 874
0.21

SEQ ID 875
0.00

SEQ ID 876
0.07

SEQ ID 877
0.00

SEQ ID 878
0.06

SEQ ID 879
0.13

SEQ ID 880
0.09

SEQ ID 881
0.09

SEQ ID 882
0.06

SEQ ID 883
0.14

SEQ ID 884
0.00

SEQ ID 885
0.08

SEQ ID 886
0.25

SEQ ID 887
0.11

SEQ ID 888
0.25

SEQ ID 889
0.47

SEQ ID 890
0.08

SEQ ID 891
0.58

SEQ ID 892
0.60

SEQ ID 893
0.25

SEQ ID 894
0.69

SEQ ID 895
1.52

SEQ ID 896
0.07

SEQ ID 897
0.55

SEQ ID 898
0.14

SEQ ID 899
0.41

SEQ ID 900
0.25

Example 5: Plasmid Generation for Targeted Integration

In general, to construct the plasmids for RMCE, the respective expression cassettes for the antibody light chain and heavy chain were cloned into a first vector backbone flanked by L3 and LoxFas sequences, and a second vector flanked by LoxFas and 2L sequences and also further including a selectable marker. A Cre recombinase plasmid (see, e.g., Wong, E. T., et al., Nucl. Acids Res. 33 (2005) e147; O'Gorman, S., et al., Proc. Natl. Acad. Sci. USA 94 (1997) 14602-14607) was used for all RMCE processes.

The cDNAs encoding the respective polypeptides were generated by gene synthesis (Geneart, Life Technologies Inc.). The synthesized cDNAs and backbone vectors were digested with HindIII-HF and EcoRI-HF (NEB) at 37° C. for 1 h and separated by agarose gel electrophoresis. The bands comprising the DNA fragments of the insert and backbone, respectively, were cut out from the agarose gel and extracted with the QIAquick Gel Extraction Kit (Qiagen). The purified insert and backbone fragments were ligated using the Rapid Ligation Kit (Roche Diagnostics GmbH, Mannheim, Germany) following the manufacturer's protocol with an insert/backbone ratio of 3:1. The ligation mixture was then transformed into competent E. coli DH5α via heat shock and incubated for 1 h at 37° C. Thereafter the cells were plated out on agar plates with ampicillin for selection. Plates were incubated at 37° C. overnight.

On the following day clones were picked and incubated overnight at 37° C. under shaking for the Mini or Maxi-Preparation, which was performed with the EpMotion® 5075 (Eppendorf) or with the QIAprep Spin Mini-Prep Kit (Qiagen)/NucleoBond Xtra Maxi EF Kit (Macherey-Nagel), respectively. All constructs were sequenced to ensure correctness of the sequences.

In the second cloning step, the generated vectors were digested with KpnI-HF/SalI-HF and SalI-HF/MfeI-HF under the same conditions as outlined above. The respective RMCE (TI) backbone vector was digested with KpnI-HF and MfeI-HF. Separation and extraction were performed as described above. Ligation of the purified inserts and backbone was performed using T4 DNA Ligase (NEB) following the manufacturer's protocol with an insert/insert/backbone ratio of 1:1:1 overnight at 4° C. Thereafter the ligase was inactivated at 65° C. for 10 min. The following steps were performed as described above.

Example 6: Generation of Stable Cell Lines by Targeted Integration

CHO TI host cells comprising a GFP expression cassette in the TI landing site were propagated in disposable 125 ml vented shake flasks under standard humidified conditions (95% rH, 37° C., and 5% CO₂) at a constant agitation rate of 150 rpm in a DMEM/F12-based medium. Every 3-4 days the cells were seeded at a concentration of 3×10E5 cells/ml in chemically defined medium containing selection marker 1 and selection marker 2 in effective concentrations. Density and viability of the cultures were measured with a Cedex HiRes cell counter (F. Hoffmann-La Roche Ltd., Basel, Switzerland).

For stable transfection, equimolar amounts of first and second vector generated according to Example 5 were mixed. 1 μg Cre encoding nucleic acid was added per 5 μg of the mixture, i.e. 5 μg Cre expression plasmid or Cre mRNA was added to 25 μg of the vector mixture.
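The amounts in the preceding paragraph can be reproduced with a short calculation. This is a minimal sketch: the function names are hypothetical, and the plasmid sizes used to illustrate the equimolar split (10,000 bp and 15,000 bp) are invented for the example because the source does not state vector sizes.

```python
def cre_amount_ug(vector_mixture_ug, mixture_ug_per_ug_cre=5.0):
    """Cre nucleic acid to add: 1 ug per 5 ug of vector mixture (from the text)."""
    return vector_mixture_ug / mixture_ug_per_ug_cre


def equimolar_masses_ug(total_ug, size_bp_first, size_bp_second):
    """Split a total DNA mass so that both plasmids are equimolar.

    For double-stranded DNA the mass per mole scales with length, so the
    mass of each plasmid is proportional to its size in base pairs.
    """
    total_bp = size_bp_first + size_bp_second
    first_ug = total_ug * size_bp_first / total_bp
    second_ug = total_ug * size_bp_second / total_bp
    return first_ug, second_ug


mixture_ug = 25.0
cre_ug = cre_amount_ug(mixture_ug)
# Hypothetical vector sizes, for illustration only.
first_ug, second_ug = equimolar_masses_ug(mixture_ug, 10_000, 15_000)
print(cre_ug, mixture_ug + cre_ug, first_ug, second_ug)
```

With 25 μg of vector mixture this gives 5 μg of Cre nucleic acid, matching the figures stated in the text.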

Two days prior to transfection TI host cells were seeded in fresh medium at a density of about 4×10E5 cells/ml. Transfection was performed with the Nucleofector device using the Nucleofector Kit V (Lonza, Switzerland), according to the manufacturer's protocol. 3×10E7 cells were transfected with a total of 30 μg nucleic acid mixture, i.e. with 30 μg plasmid (5 μg Cre plasmid and 25 μg vector mixture). After transfection, the cells were seeded in 30 ml medium without selection agents.

On day 5 after seeding the cells were centrifuged and transferred at a cell density of 6×10E5 cells/ml to 80 mL chemically defined medium containing selection agent 1 and selection agent 2 at effective concentrations for selection of recombinant cells. The cells were incubated at 37° C., 150 rpm, 5% CO₂, and 85% humidity from this day on without splitting. Cell density and viability of the culture were monitored regularly. When the viability of the culture started to increase again, the concentrations of selection agents 1 and 2 were reduced to about half the amount used before. In more detail, to promote the recovery of the cells, the selection pressure was reduced if the viability was >40% and the viable cell density (VCD) was >0.5×10E6 cells/mL. To this end, 4×10E5 cells/ml were centrifuged and re-suspended in 40 ml selection media II (chemically-defined medium, ½ selection marker 1 & 2). The cells were incubated under the same conditions as before and were likewise not split.
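The criterion for reducing the selection pressure can be written as a simple check. The following is a sketch with a hypothetical function name; the two thresholds are the ones given in the paragraph above, and the example readings are invented.

```python
def may_reduce_selection(viability_percent, vcd_cells_per_ml,
                         min_viability_percent=40.0,
                         min_vcd_cells_per_ml=0.5e6):
    """True when both conditions from the text hold:
    viability > 40% and viable cell density > 0.5x10E6 cells/mL."""
    return (viability_percent > min_viability_percent
            and vcd_cells_per_ml > min_vcd_cells_per_ml)


# Hypothetical monitoring readings: (viability in %, VCD in cells/mL).
for viability, vcd in [(35.0, 1.0e6), (45.0, 0.4e6), (45.0, 0.6e6)]:
    print(viability, vcd, may_reduce_selection(viability, vcd))
```

Both conditions must be met at the same time, so a culture with recovered viability but low cell density stays under full selection.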

Ten days after starting selection, the success of RMCE was checked by flow cytometry measuring the expression of intracellular GFP and extracellular heterologous polypeptide sticking to the cell surface. An APC antibody (allophycocyanin-labeled F(ab′)2 Fragment goat anti-human IgG) against human antibody light and heavy chain was used for FACS staining. Flow cytometry was performed with a BD FACS Canto II flow cytometer (BD, Heidelberg, Germany). Ten thousand events per sample were measured. Living cells were gated in a plot of forward scatter (FSC) against side scatter (SSC). The live cell gate was defined with non-transfected TI host cells and applied to all samples by employing the FlowJo™ 7.6.5 EN software (TreeStar, Olten, Switzerland). Fluorescence of GFP was quantified in the FITC channel (excitation at 488 nm, detection at 530 nm). Antibody was measured in the APC channel (excitation at 645 nm, detection at 660 nm). Parental CHO cells, i.e. those cells used for the generation of the TI host cell, were used as a negative control with regard to GFP and antibody expression. Fourteen days after the selection had been started, the viability exceeded 90% and selection was considered as complete.

Example 7: FACS Screening

FACS analysis was performed to check the transfection and RMCE efficiency. 4×10E5 cells of each transfection approach were centrifuged (1200 rpm, 4 min) and washed twice with 1 mL PBS. After the washing steps, the pellet was re-suspended in 400 μL PBS and transferred into FACS tubes (Falcon 8 Round-Bottom Tubes with cell strainer cap; Corning). The measurement was performed with a FACS Canto II and the data were analyzed with the FlowJo™ software.

Example 8: Fed-Batch Cultivation with LNA Addition

All fed-batch cultures were performed in shake flasks or Ambr®15 vessels (Sartorius Stedim) with the same proprietary serum-free, chemically defined medium and under the same cultivation and feeding conditions.

The recombinant mammalian cells used in this example were obtained according to the procedure described in Example 6 and expressed a heterologous antibody (Protein 1: antibody-multimer-fusion).

The cell culture process consisted of a seed train cultivation, followed by an inoculation train (N-2 and N-1 cultures; pre-fermentation) and the main fermentation (N). The seed and inoculation trains for the Ambr®15 were performed in shake flasks with cell splits every 3 or 4 days.

The antisense oligonucleotides of SEQ ID NO 23 and SEQ ID NO 24 were chosen as the LNAs due to the high level of exon 4 skipping observed with these antisense oligonucleotides in initial studies (see Example 1).

The (main) cultivations (N) in Ambr®15 were performed with a starting cell density of about 2*10E6 cells/ml in a total volume of 13 ml. The cultivation temperature was controlled, the N2 gassing rate was set constant, oxygen supply was regulated via a PID controller to maintain a constant DO, the agitation rate was set to 1200-1400 rpm (down stirring), and the pH was set to pH 7.0. The pH control was performed by adding a 1 M sodium carbonate solution or sparging CO₂ into the bioreactor. The pH spots of the bioreactors were recalibrated every other day with the integrated analysis module of the Ambr®15. Defoamer was added one day before inoculation and daily during the cultivation. The cells were cultivated in a 14-day fed-batch process with glucose control and two different feeds, which were added as boluses at predefined time points. The cell count and viability measurements were performed at-line with a Cedex HiRes (Roche Diagnostics GmbH, Mannheim, Germany). A Cedex Bio HT Analyzer (Roche Diagnostics GmbH) was used to measure product and metabolite concentrations.
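For illustration, a discrete PID control loop of the general kind referred to above may be sketched as follows. The gains, the DO setpoint and the output limits are hypothetical placeholders; they do not reflect the settings of the Ambr®15 system, which are not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDController:
    """Minimal discrete PID controller (illustrative; all values hypothetical)."""
    kp: float
    ki: float
    kd: float
    setpoint: float            # e.g. target dissolved oxygen (DO) in % saturation
    out_min: float = 0.0       # lower limit of the actuator signal
    out_max: float = 100.0     # upper limit of the actuator signal
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def update(self, measurement: float, dt: float) -> float:
        """Return the clamped controller output for one time step of length dt."""
        error = self.setpoint - measurement
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0   # no derivative term on the first sample
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        raw = self.kp * error + self.ki * self._integral + self.kd * derivative
        return min(self.out_max, max(self.out_min, raw))


if __name__ == "__main__":
    pid = PIDController(kp=2.0, ki=0.5, kd=0.0, setpoint=40.0)
    for do_reading in (35.0, 38.0, 50.0):
        print(pid.update(do_reading, dt=1.0))
```

The output is clamped because an oxygen supply signal cannot be negative; when the measured DO exceeds the setpoint, the sketch simply returns the lower limit.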

The LNA additions at the beginning of the N-1 pre-cultivation (N-1), on the inoculation day (d0) or three days after the inoculation (d3) were performed by the liquid handling system of the Ambr®15 by adding a defined volume of a highly concentrated LNA stock solution.

The supernatant was harvested 14 days after start of fed-batch by centrifugation (10 min, 1000 rpm and 10 min, 4000 rpm) and cleared by filtration (0.22 μm). Day 14 titers were determined using protein A affinity chromatography with UV detection. Product quality was determined by Caliper's LabChip® (Caliper Life Sciences).

It appears that any degree of LNA-induced exon 4 skipping is sufficient to produce the effect of increased recombinant titer.

TABLE 6

Result of the 14-day fed-batch cultivation in Ambr®15; N-1 = LNA addition at the start of the pre-fermentation; d0 = LNA addition at day 0, i.e. start of the main fermentation; d3 = LNA addition at day 3 of the main fermentation.

exp. | SEQ ID | add LNA timing at | titer [mg/L] | eff. titer | rel. eff. titer | average rel. eff. titer
1 | no | — | 2300 | 1218.8 | 100% | 100%
2 | 23 | N-1 & d0 | 2676 | 1554.5 | 128% | 119%
3 | 23 | N-1 & d0 | 2477 | 1353.6 | 111% |
4 | 23 | d3 | 2335 | 1222.5 | 100% | 102%
5 | 23 | d3 | 2356 | 1261.1 | 103% |
6 | 24 | N-1 | 2444 | 1330.1 | 109% | 117%
7 | 24 | N-1 | 2670 | 1523.0 | 125% |
8 | 24 | d3 | 2316 | 1295.8 | 106% | 103%
9 | 24 | d3 | 2304 | 1223.9 | 100% |
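The relative values in Table 6 can be reproduced from the effective titers. The following Python sketch, given for illustration only, divides each effective titer by that of the control (exp. 1) and averages the two replicates of each condition. The variable and function names are assumptions, as is rounding to whole percent.

```python
# Effective titers taken from Table 6; keys are (SEQ ID NO, LNA addition timing).
CONTROL_EFF_TITER = 1218.8  # exp. 1, no LNA

EFF_TITERS = {
    ("23", "N-1 & d0"): [1554.5, 1353.6],
    ("23", "d3"): [1222.5, 1261.1],
    ("24", "N-1"): [1330.1, 1523.0],
    ("24", "d3"): [1295.8, 1223.9],
}


def relative_percent(eff_titer: float, control: float = CONTROL_EFF_TITER) -> float:
    """Effective titer expressed as a percentage of the control."""
    return 100.0 * eff_titer / control


def summarize(replicates):
    """Return (rounded per-replicate percentages, rounded mean percentage)."""
    rel = [relative_percent(v) for v in replicates]
    return [round(r) for r in rel], round(sum(rel) / len(rel))


RESULTS = {key: summarize(values) for key, values in EFF_TITERS.items()}

if __name__ == "__main__":
    for (seq_id, timing), (per_run, mean) in RESULTS.items():
        print(f"SEQ ID NO {seq_id}, {timing}: runs {per_run} %, mean {mean} %")
```

Averaging the unrounded percentages before rounding matches the tabulated averages; averaging the already rounded values would give 119.5% for the first condition.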

Example 9: Fed-Batch Cultivation with Stable XBP1Δ4 Expression—Comparative Example

All fed-batch cultures were performed in shake flasks or Ambr®15 vessels (Sartorius Stedim) with the same proprietary serum-free, chemically defined medium.

The recombinant mammalian cells used in this example were obtained according to the procedure described in Example 6 and stably expressed a heterologous antibody as well as the XBP1 splice variant XBP1Δ4 with an amino acid sequence as depicted in SEQ ID NO: 7.

The (main) cultivations (N) in Ambr®15 were performed with a starting cell density of about 2*10E6 cells/ml in a total volume of 13 ml. The cultivation temperature was controlled, the N2 gassing rate was set constant, oxygen supply was regulated via a PID controller to maintain a constant DO, the agitation rate was set to 1200-1400 rpm (down stirring), and the pH was set to pH 7.0. The pH control was performed by adding a 1 M sodium carbonate solution or sparging CO₂ into the bioreactor. The pH spots of the bioreactors were recalibrated every other day with the integrated analysis module of the Ambr®15. Defoamer was added one day before inoculation and daily during the cultivation. The cells were cultivated in a 14-day fed-batch process with glucose control and two different feeds, which were added as boluses at predefined time points. The cell count and viability measurements were performed at-line with a Cedex HiRes (Roche Diagnostics GmbH, Mannheim, Germany). A Cedex Bio HT Analyzer (Roche Diagnostics GmbH) was used to measure product and metabolite concentrations.

TABLE 7

Result of the 14-day fed-batch cultivation in Ambr®15 of recombinant mammalian CHO cells stably transfected with antibody (Protein 1: antibody-multimer-fusion) and XBP1Δ4 variant encoding nucleic acids. exp. = experiment number, eff. titer = effective titer (product of titer and main peak as determined by capillary electrophoresis or SEC), rel. eff. titer = relative effective titer (relative titer normalized to exp. 1).

exp. | extra factor | titer [mg/L] | eff. titer | rel. eff. titer | average rel. eff. titer
1 | — | 2300 | 1218.8 | 100% | 100%
2 | XBP1Δ4 variant | 1135 | 414.7 | 34% | 38.5%
3 | XBP1Δ4 variant | 969 | 523.8 | 43% |

Example 10: Fed-Batch Cultivation with LNA Addition

The same fed-batch cultivation conditions as described in Example 8 above were used here. Example 10 differs from Example 8 only in the expressed protein and the time of LNA addition.

Likewise, the recombinant CHO cells used in this Example were obtained with the method according to Example 6.

Protein 1: antibody-multimer-fusion

TABLE 8

Data for pool:

exp. | SEQ ID | add LNA timing at | titer [mg/L] | rel. eff. titer
1 | no | — | 899 | 100%
2 | 23 | N-1 & d0 | 1185 | 131.9%
3 | 23 | d0 | 1078 | 119.9%
4 | 23 | d0 & d3 | 1106 | 123.1%
5 | 23 | d3 | 1142 | 127%

TABLE 9

Data for single clone:

exp. | SEQ ID | add LNA timing at | titer [mg/L] | rel. eff. titer
1 | no | — | 2468 | 100%
2 | 23 | N-1 & d0 | 3375 | 136.8%
3 | 23 | d0 | 3049 | 123.6%
4 | 23 | d0 & d3 | 3041 | 123.2%
5 | 23 | d3 | 3285 | 133.1%
6 | 24 | N-1 & d0 | 3371 | 136.6%
7 | 24 | d0 & d3 | 2522 | 102.2%
8 | 24 | d3 | 2914 | 118.1%

Protein 2: bispecific, trivalent antibody comprising a full-length antibody binding to human A-beta protein and additional heavy chain C-terminal Fab fragment with domain exchange binding to human transferrin receptor (see WO 2017/055540)

TABLE 10

Data for single clone:

exp. | SEQ ID | add LNA timing at | titer [mg/L] | rel. eff. titer
1 | no | — | 1312 | 100%
2 | 23 | N-1 & d0 | 1522 | 116.0%
3 | 23 | d0 | 1533 | 116.9%
4 | 23 | d0 & d3 | 1433 | 109.3%
5 | 23 | d3 | 1572 | 119.8%
6 | 24 | N-1 & d0 | 1497 | 114.1%
7 | 24 | d0 | 1554 | 118.4%
8 | 24 | d0 & d3 | 1484 | 113.1%
9 | 24 | d3 | 1434 | 109.3%

Protein 3: tetravalent, bispecific antibody with domain exchange

TABLE 11

Data for single clone:

exp. | SEQ ID | add LNA timing at | titer [mg/L] | rel. eff. titer
1 | no | — | 1339 | 100%
2 | 23 | N-1 & d0 | 1434 | 107.1%
3 | 23 | d0 | 1518 | 113.3%
4 | 23 | d0 & d3 | 1431 | 106.8%
5 | 23 | d3 | 1509 | 112.7%
6 | 24 | N-1 & d0 | 1241 | 92.7%
7 | 24 | d0 | 1472 | 109.9%
8 | 24 | d0 & d3 | 1324 | 98.9%
9 | 24 | d3 | 1379 | 103.0%
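The HELM annotations listed in Table 12 can be reduced to their plain base sequence for cross-checking against the motif and target site columns of that table. The following Python sketch is illustrative only. It assumes that [LR] denotes an LNA nucleoside, [dR] a DNA nucleoside, [sP] a phosphorothioate linkage and [5meC] 5-methylcytosine, and the function names are hypothetical.

```python
import re

# One nucleoside monomer: sugar in square brackets, base in parentheses.
_MONOMER = re.compile(r"\[(LR|dR)\]\((\[5meC\]|[ACGT])\)")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def helm_to_sequence(helm: str) -> str:
    """Plain base sequence (5'->3') of a HELM string; 5-methyl-C is read as C."""
    return "".join("C" if base == "[5meC]" else base
                   for _sugar, base in _MONOMER.findall(helm))


def lna_positions(helm: str) -> list:
    """1-based positions of the nucleosides annotated with the [LR] sugar."""
    return [i for i, (sugar, _base) in enumerate(_MONOMER.findall(helm), start=1)
            if sugar == "LR"]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


# Compound of SEQ ID NO 1011 as listed in Table 12.
HELM_1011 = (
    "[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)"
    "[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)[sP].[dR]"
    "(A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)"
)

if __name__ == "__main__":
    motif = helm_to_sequence(HELM_1011)
    print(motif, reverse_complement(motif), lna_positions(HELM_1011))
```

A compound whose extracted sequence differs from its motif sequence, or whose reverse complement differs from the listed target site, would indicate a transcription error in the corresponding row.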

TABLE 12

Compound Table; compounds comprising modified nucleosides (SEQ ID NOs: 1011-1496).

Motif SEQ
Motif
Target site

SEQ ID

ID
Sequence
sequence
Target SEQ
Compounds (HELM ANNOTATION)
NO.

HAMSTER

SEQ ID 8
CCC
CTTTCCTTCCAGG
SEQ ID 299
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)
1011

TGG
G

[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)[sP].[dR]

AAG

(A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)

GAA

AG

SEQ ID 9
TTC
TTCCTTCCAGGG
SEQ ID 300
[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)
1012

CCT
AA

sP].[LR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR]

GGA

(G)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)

AGG

AA

SEQ ID 10
TTTC
TCCTTCCAGGGA
SEQ ID 301
[LR](T)[sP].[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])
1013

CCT
AA

sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)[sP].[dR]

GGA

(A)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)

AGG

A

SEQ ID 11
ATT
CCTTCCAGGGAA
SEQ ID 302
[LR](A)[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]
1014

TCC
AT

([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR]

CTG

(A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)

GAA

GG

SEQ ID 12
CAT
CTTCCAGGGAAA
SEQ ID 303
[LR]([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[LR]
1015

TTC
TG

([5meC])[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)[sP].

CCT

[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)

GGA

AG

SEQ ID 13
CCA
TTCCAGGGAAAT
SEQ ID 304
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)
1016

TTTC
GG

[sP].[LR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR]

CCT

(T)[sP].[LR][G][sP].[dR](G)[sP].[LR](A)[sP].[LR](A)

GGA

A

SEQ ID 14
TCC
TCCAGGGAAATG
SEQ ID 305
[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR](T)[sP].[dR](T)
1017

ATT
GA

[sP].[LR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR]

TCC

(T)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)

CTG

GÅ

SEQ ID 15
CTC
CCAGGGAAATGG
SEQ ID 306
[LR]([5meC])[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].
1018

CAT
AG

[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])

TTC

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)

CCT

GG

SEQ ID 16
ACT
CAGGGAAATGGA
SEQ ID 307
[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR]
1019

CCA
GT

(A)[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]

TTTC

([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)

CCT

G

SEQ ID 17
TAC
AGGGAAATGGA
SEQ ID 308
[LR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](C)[sP].[dR]
1020

TCC
GTA

(C)[sP].[LR](A)[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])

ATT

[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)

TCC

CT

SEQ ID 18
TTA
GGGAAATGGAGT
SEQ ID 309
[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1021

CTC
AA

[sP].[LR]([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)[sP].

CAT

[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])

TTC

CC

SEQ ID 19
CTT
GGAAATGGAGTA
SEQ ID 310
[LR]([5meC])[sP].[LR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].
1022

ACT
AG

[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](T)

CCA

[sP].[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])

TTTC

C

SEQ ID 20
CCT
GAAATGGAGTAA
SEQ ID 311
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[dR](A)
1023

TAC
GG

[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)

TCC

[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])

ATT

TC

SEQ ID 21
GCC
AAATGGAGTAAG
SEQ ID 312
[LR](G)[sP].[dR][C)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](T)[sP].
1024

TTA
GC

[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]

CTC

([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)

CAT

TT

SEQ ID 22
GGC
AATGGAGTAAGG
SEQ ID 313
[LR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](T)
1025

CTT
CC

[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[LR]([5meC])

ACT

[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].[LR](T)

CCA

TT

SEQ ID 23
GAA
CACTTTGGTCTTT
SEQ ID 314
[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](A)
1026

AGA
C

[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].

CCA

[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[LR](G)

AAG

TG

SEQ ID 24
AGG
CTTTGGTCTTTCC
SEQ ID 315
[LR](A)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)
1027

AAA
T

[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].

GAC

[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)

CAA

AG

SEQ ID 25
GAA
TTGGTCTTTCCTT
SEQ ID 316
[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)[sP].[dR](A)
1028

GGA
C

[sP].[LR](A)[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR]([5meC])

AAG

[sP].[dR][C][sP].[LR](A)[sP].[LR](A)

ACC

AA

SEQ ID 26
GGA
TGGTCTTTCCTTC
SEQ ID 317
[LR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](G)
1029

AGG
C

[sP].[LR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](A)[sP].

AAA

[dR](C)[sP].[LR]([5meC])[sP].[LR](A)

GAC

CA

SEQ ID 27
TGG
GGTCTTTCCTTCC
SEQ ID 318
[LR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR](G)
1030

AAG
A

[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].

GAA

[dR](A)[sP][LR]([5meC])[sP].[LR]([5meC])

AGA

CC

SEQ ID 28
CTG
GTCTTTCCTTCCA
SEQ ID 319
[LR]([5meC])[sP].[LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)[sP].
1031

GAA
G

[dR](A)[sP].[LR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)

GGA

[sP].[dR](G)[sP].[LR](A)[sP].[LR]([5meC])

AAG

AC

SEQ ID 29
CCT
TCTTTCCTTCCAG
SEQ ID 320
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)
1032

GGA
G

[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](G)[sP].[LR](A)[sP].

AGG

[LR](A)[sP].[dR](A)[sP].[LR][G)[sP].[LR](A)

AAA

GA

SEQ ID 30
CCC
CTTTCCTTCCAGG
SEQ ID 321
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)
1033

TGG
G

[sP].[dR](G)[sP].[LR](A)[sP].[LR][A][sP].[dR](G)[sP].[LR](G)[sP].

AAG

[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)

GAA

AG

SEQ ID 31
TCC
TTTCCTTCCAGGG
SEQ ID 322
[LR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)
1034

CTG
A

[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR]

GAA

(G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)

GGA

AA

SEQ ID 32
TTC
TTCCTTCCAGGG
SEQ ID 323
[LR](T)[sP].[LR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])
1035

CCT
AA

[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR]

GGA

(G)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)

AGG

AA

SEQ ID 33
TTTC
TCCTTCCAGGGA
SEQ ID 324
[LR](T)[sP].[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])
1036

CCT
AA

[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

GGA

(A)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)

AGG

A

SEQ ID 34
ATT
CCTTCCAGGGAA
SEQ ID 325
[LR](A)[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])[sP].[dR]
1037

TCC
AT

(C)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR]

CTG

(A)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)

GAA

GG

SEQ ID 35
CAT
CTTCCAGGGAAA
SEQ ID 326
[LR]([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP].[LR](T)[sP].[dR]
1038

TTC
TG

(C)[sP].[LR]([5meC][sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].

CCT

[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)

GGA

AG

SEQ ID 36
CCA
TTCCAGGGAAAT
SEQ ID 327
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](T)[sP].[dR](T)
1039

TTTC
GG

[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR]

CCT

(T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)

GGA

A

SEQ ID 37
TCC
TCCAGGGAAATG
SEQ ID 328
[LR](T)[sP].[LR]([5meC])[sP].[dR][C)[sP].[LR](A)[sP].[LR](T)[sP].
1040

ATT
GA

[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](C)

TCC

[sP].[LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)

CTG

GA

SEQ ID 38
CTC
CCAGGGAAATGG
SEQ ID 329
[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](A)
1041

CAT
AG

[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]

TTC

([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)

CCT

GG

SEQ ID 39
ACT
CAGGGAAATGGA
SEQ ID 330
[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](C)
1042

CCA
GT

[sP].[LR](A)[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])

TTTC

[sP][LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)

CCT

G

SEQ ID 40
TAC
AGGGAAATGGA
SEQ ID 331
[LR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].
1043

TCC
GTA

[LR]([5meC])[sP].[dR](A)[sP].[LR](T)[sP].[LR](T)[sP].[dR](T)[sP].[LR]

ATT

([5meC])[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)

TCC

CT

SEQ ID 41
TTA
GGGAAATGGAGT
SEQ ID 332
[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](T)[sP].
1044

CTC
AA

[dR](C)[sP].[LR]([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR]

CAT

(T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])

TTC

CC

SEQ ID 42
CTT
GGAAATGGAGTA
SEQ ID 333
[LR]([5meC])[sP].[LR](T)[sP].[dR](T)[sP].[LR](A)[sP].[LR]([5meC])
1045

ACT
AG

[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR]

CCA

(T)[sP].[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])

TTTC

C

SEQ ID 43
CCT
GAAATGGAGTAA
SEQ ID 334
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[LR](A)
1046

TAC
GG

[sP].[dR](C)[sP].[LR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](A)

TCC

[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC]

ATT

TC

SEQ ID 44
GCC
AAATGGAGTAAG
SEQ ID 335
[LR](G)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](T)[sP].[LR](T)[sP].
1047

TTA
GC

[dR](A)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](C)[sP].[LR]([5meC])

CTC

[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)

CAT

TT

SEQ ID 45
GGC
AATGGAGTAAGG
SEQ ID 336
[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)[sP].
1048

CTT
CC

[dR](T)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])

ACT

[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](T)[sP].[LR](T)

CCA

TT

SEQ ID 46
CCG
TGGAGTAAGGCC
SEQ ID 337
[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)
1049

GCC
GG

[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])

TTA

[sP].[LR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](A)

CTC

CA

SEQ ID 47
CAC
GAGTAAGGCCGG
SEQ ID 338
[LR]([5meC])[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].
1050

CGG
TG

[dR](G)[sP].[LR]([5meC])[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)

CCT

[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[LR]([5meC])

TAC

TC

SEQ ID 48
CCA
TTTCTTAATTTCC
SEQ ID 339
LR]([5meC])[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)
1051

CTG
AGTGG

[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].

GAA

[dR](T)[sP].[LR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)

ATT

[sP].[LR](A)[sP].[LR](A)

AAG

AAA

SEQ ID 49
CTG
CACTTTCTTAATT
SEQ ID 340
LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)[sP].[dR](A)[sP].
1052

GAA
TCCAG

[LR](A)[sP][LR](A)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)

ATT

[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR]

AAG

(T)[sP].[LR](G)

AAA

GTG

SEQ ID 50
GAA
AGTCACTTTCTTA
SEQ ID 341
LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)
1053

ATT
ATTTC

[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

AAG

(A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](A)[sP].[LR]([5meC])

AAA

[sP].[LR](T)

GTG

ACT

SEQ ID 51
ATT
AGCAGTCACTTTC
SEQ ID 342
LR](A)[sP].[LR](T)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)[sP].[dR](G)
1054

AAG
TTAAT

[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR]{G)[sP].[dR](T)[sP].[LR]

AAA

(G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

GTG

[sP].[LR](T)

ACT

GCT

SEQ ID 52
AAG
GTAAGCAGTCAC
SEQ ID 343
LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)
1055

AAA
TTTCTT

[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

GTG

(T)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](T)[sP].[LR]

ACT

(A)[sP].[LR]([5meC])

GCT

TAC

SEQ ID 53
AAA
GGGGTAAGCAGT
SEQ ID 344
LR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR][G)[sP].[dR](T)[sP].[LR](G)
1056

GTG
CACTTT

[sP].[dR](A)[sP].[dR][C)[sP].[dR](T)[sP].[LR][G)[sP].[dR](C)[sP].[dR]

ACT

(T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

GCT

[sP].[LR]([5meC.])

TAC

CCC

SEQ ID 54
GTG
TTAGGGGTAAGC
SEQ ID 345
LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)
1057

ACT
AGTCAC

[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR]

GCT

(C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

TAC

(A)[sP].[LR](A)

CCC

TAA

SEQ ID 55
ACT
GGGTTAGGGGTA
SEQ ID 346
LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)
1058

GCT
AGCAGT

[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)

TAC

[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR][C)[sP].[LR]

CCC

([5meC])[sP].[LR]([5meC])

TAA

CCC

SEQ ID 56
GCT
GTAGGGTTAGGG
SEQ ID 347
LR](G)[sP].[dR][C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)
1059

TAC
GTAAGC

[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)

CCC

[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].

TAA

[LR](A)[sP].[LR]([5meC])

CCC

TAC

SEQ ID 57
TAC
TTGGTAGGGTTA
SEQ ID 348
LRI(T)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]
1060

CCC
GGGGTA

(C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])

TAA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR][C)[sP].[dR](C)[sP].

CCC

[LR](A)[sP].[LR](A)

TAC

CAA

SEQ ID 58
CCC
GTATTGGTAGGG
SEQ ID 349
LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR]
1061

TAA
TTAGGG

(A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR]

CCC

(A)[sP].[dR](C)[sP].[dR][C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].

TAC

[LR](A)[sP].[LR]([5meC])

CAA

TAC

SEQ ID 59
TAA
TTAGTATTGGTA
SEQ ID 350
LR](T)[sP].[dR](A)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]
1062

CCC
GGGTTA

(C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]

TAC

(A)[sP].[dR](A)[sP].[LR](T)[sP].[dR][A][sP].[dR](C)[sP].[dR](T)[sP].

CAA

[LR](A)[sP][LR](A)

TAC

TAA

SEQ ID 60
CCC
AAGTTAGTATTG
SEQ ID 351
LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR]
1063

TAC
GTAGGG

(C)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].

CAA

[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

TAC

T)[sP].[LR](T)

TAA

CTT

SEQ ID 61
TAC
AAGAAGTTAGTA
SEQ ID 352
LR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[LR]
1064

CAA
TTGGTA

(A)[sP].[dR](T)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]

TAC

(A)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](T)[sP].[dR]

TAA

(C)[sP].[LR](T)[sP].[LR](T)

CTT

CTT

SEQ ID 62
CAA
AGAAAGAAGTTA
SEQ ID 353
LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].[dR]
1065

TAC
GTATTG

(C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR]

TAA

(T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].

CTT

[LR]([5meC])[sP].[LR](T)

CTTT

CT

SEQ ID 63
TAC
TGGAGAAAGAAG
SEQ ID 354
LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)
1066

TAA
TTAGTA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

CTT

(T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])

CTTT

[sP][R](A)

CTC

CA

SEQ ID 64
TAA
AAATGGAGAAAG
SEQ ID 355
LR](T)[sP].[dR](A)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1067

CTT
AAGTTA

(T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].

CTTT

[LR](T)[sP].[dR][C)[sP].[dR][C][sP].[LR](A)[sP].[dR](T)[sP].[LR](T)

CTC

[sP].[LR](T)

CAT

TT

SEQ ID 65
CTT
GGCAAATGGAGA
SEQ ID 356
LR]([5meC])[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR]
1068

CTTT
AAGAAG

(T)[sP].[dR](T)[sP].[dR][C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP]

CTC

.[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

CAT

([5meC])[sP].[LR]([5meC])

TTG

CC

SEQ ID 66
CTTT
CCAGGCAAATGG
SEQ ID 357
LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]
1069

CTC
AGAAAG

(T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP]

CAT

.[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

TTG

(G)[sP].[LR](G)

CCT

GG

SEQ ID 67
TCT
TAGCCAGGCAAA
SEQ ID 358
LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].
1070

CCA
TGGAGA

[dR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR][G][sP].[dR](C)[sP].[dR]

TTT

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].

GCC

[LR](A)

TGG

CTA

SEQ ID 68
CCA
GCCTAGCCAGGC
SEQ ID 359
LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP].[dR]
1071

TTT
AAATGG

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR][C)[sP].[dR](T)[sP].[LR](G)[sP].

GCC

[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR][G)[sP].[LR]

TGG

(G)[sP].[LR]([5meC])

CTA

GGC

SEQ ID 69
TTT
CATGCCTAGCCA
SEQ ID 360
LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)
1072

GCC
GGCAAA

[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

TGG

(A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR][A][sP].[LR](T)[sP].

CTA

[LR](G)

GGC

ATG

SEQ ID 70
GCC
TGACATGCCTAG
SEQ ID 361
LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)
1073

TGG
CCAGGC

[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

CTA

(C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR]([5meC])

GGC

[sP].[LR](A)

ATG

TCA

SEQ ID 71
TGG
GTGTGACATGCC
SEQ ID 362
LR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR][C)[sP].[dR](T)[sP].[LR](A)
1074

CTA
TAGCCA

[sP].[dR][G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR]

GGC

(G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].

ATG

[LR]([5meC])

TCA

CAC

SEQ ID 72
CTA
GATGTGTGACAT
SEQ ID 363
LR]([5meC])[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR][G)[sP].[dR]
1075

GGC
GCCTAG

(C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP).

ATG

[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

TCA

(T)[sP].[LR]([5meC])

CAC

ATC

SEQ ID 73
GGC
TATGATGTGTGA
SEQ ID 364
LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)
1076

ATG
CATGCC

[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR]

TCA

(C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].

CAC

[LR](A)

ATC

ATA

SEQ ID 74
ATG
ATATATGATGTG
SEQ ID 365
LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)
1077

TCA
TGACAT

[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].

CAC

[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].[dR](T)[sP].[LR]

ATC

(A)[sP].[LR](T)

ATA

TAT

SEQ ID 75
TCA
TATATATATGATG
SEQ ID 366
LR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR]
1078

CAC
TGTGA

([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

ATC

(T)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].

ATA

[LR](T)[sP].[LR](A)

TAT

ATA

SEQ ID 76
CAC
TAGTATATATATG
SEQ ID 367
LR]([5meC])[sP].[LR](A)[sP].[dR][C)[sP].[LR](A)[sP].[dR](T)[sP].[dR]
1079

ATC
ATGTG

(C)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].

ATA

[LR](T)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

TAT

(T)[sP].[LR](A)

ATA

CTA

SEQ ID 77
ATC
CTGTAGTATATAT
SEQ ID 368
LR](A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].
1080

ATA
ATGAT

[dR](T)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].[dR](T)[sP].[LR]

TAT

(A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](A)

ATA

[sP].[LR](G)

CTA

CAG

SEQ ID 78
ATA
ATTCTGTAGTATA
SEQ ID 369
LR](A)[sP].[LR](T)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].[LR](T)[sP].
1081

TAT
TATAT

[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

ATA

(A)[sP].[LR]([5meC][sP].[LR](A)[sP].[dR][G)[sP].[LR](A)[sP].[LR](A)

CTA

[sP].[LR](T)

CAG

AAT

SEQ ID 79
TAT
CTCATTCTGTAGT
SEQ ID 370
LR](T)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[LR](T)[sP].[dR](A)[sP].
1082

ATA
ATATA

[dR][C)[sP].[LR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](A)[sP].

CTA

[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)

CAG

[sP].[LR](G)

AAT

GAG

SEQ ID 80
ATA
TGGCTCATTCTGT
SEQ ID 371
LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].
1083

CTA
AGTAT

[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR]

CAG

(T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[LR]([5meC])

AAT

[sP].[LR](A)

GAG

CCA

SEQ ID 81
CTA
GATTGGCTCATTC
SEQ ID 372
LR]([5meC])[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC][sP].[dR](A)
1084

CAG
TGTAG

[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

AAT

(A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR]

GAG

(A)[sP].[LR](T)[sP].[LR]([5meC])

CCA

ATC

SEQ ID 82
CAG
TAAGATTGGCTC
SEQ ID 373
LR]([5meC])[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR]
1085

AAT
ATTCTG

(T)[sP][LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].

GAG

[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

CCA

(T)[sP].[LR](A)

ATC

TTA

SEQ ID 83
AAT
GTGTAAGATTGG
SEQ ID 374
LR](A)[sP].[dR](A)[sP].[LR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1086

GAG
CTCATT

[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].

CCA

[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

ATC

(A)[sP].[LR]([5meC])

TTA

CAC

SEQ ID 84
GAG
ACTGTGTAAGAT
SEQ ID 375
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR][C)[sP].[LR]([5meC])[sP].
1087

CCA
TGGCTC

[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)

ATC

[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

TTA

(G)[sP].[LR](T)

CAC

AGT

SEQ ID 85
CCA
AGCACTGTGTAA
SEQ ID 376
LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR]
1088

ATC
GATTGG

(C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].

TTA

[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

CAC

([5meC])[sP].[LR](T)

AGT

GCT

SEQ ID 86
ATC
TAAAGCACTGTG
SEQ ID 377
LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR][A)[sP].
1089

TTA
TAAGAT

[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

CAC

(T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].

AGT

[LR](A)

GCT

TTA

SEQ ID 87
TTA
TACTAAAGCACT
SEQ ID 378
LR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].
1090

CAC
GTGTAA

[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[dR]

AGT

(T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

GCT

[LR](A)

TTA

GTA

SEQ ID 88
CAC
CCTTACTAAAGCA
SEQ ID 379
LR]([5meC])[sP].[dR](A)[sP].[dR][C)[sP].[LR](A)[sP].[dR][G][sP].[dR]
1091

AGT
CTGTG

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](T)[sP].

GCT

[dR](A)[sP].[dR][G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

TTA

(G)[sP].[LR](G)

GTA

AGG

SEQ ID 89
AGT
TTGCCTTACTAAA
SEQ ID 380
LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR]
1092

GCT
GCACT

(T)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].

TTA

[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

GTA

(A)[sP].[LR](A)

AGG

CAA

SEQ ID 90
GCT
TGTTTGCCTTACT
SEQ ID 381
LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].
1093

TTA
AAAGC

[dR](G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR][G][sP].[dR]

GTA

(G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](A)[sP].[LR]

AGG

([5meC])[sP].[LR](A)

CAA

ACA

SEQ ID 91
TTA
GCTTGTTTGCCTT
SEQ ID 382
LR](T)[sP].[dR](T)[sP].[LR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].
1094

GTA
ACTAA

[dR](A)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

AGG

(A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].

CAA

[LR]([5meC])

ACA

AGC

SEQ ID 92
GTA
AGAGCTTGTTTG
SEQ ID 383
LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[dR][G)[sP].[dR](G)
1095

AGG
CCTTAC

[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

CAA

(A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])

ACA

[sP].[LR](T)

AGC

TCT

SEQ ID 93
AGG
GGTAGAGCTTGT
SEQ ID 384
LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)
1096

CAA
TTGCCT

[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR]

ACA

(C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])

AGC

[sP].[LR]([5meC])

TCT

ACC

SEQ ID 94
CAA
CGAGGTAGAGCT
SEQ ID 385
LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]
1097

ACA
TGTTTG

(A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].

AGC

[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

TCT

([5meC])[sP].[LR](G)

ACC

TCG

SEQ ID 95
ACA
CTCCGAGGTAGA
SEQ ID 386
LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)
1098

AGC
GCTTGT

[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

TCT

(C)[sP].[LR](T)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR]

ACC

(A)[sP].[LR](G)

TCG

GAG

SEQ ID 96
AGC
AGACTCCGAGGT
SEQ ID 387
LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)
1099

TCT
AGAGCT

[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

ACC

([5meC])[sP].[LR](T)

TCG

GAG

TCT

SEQ ID 97
TCT
TTCAGACTCCGA
SEQ ID 388
LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].
1100

ACC
GGTAGA

[LR](T)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].

TCG

[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

GAG

(A)[sP].[LR](A)

TCT

GAA

SEQ ID 98
ACC
CTCTTCAGACTCC
SEQ ID 389
LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR]
1101

TCG
GAGGT

(G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].

GAG

[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

TCT

(A)[sP].[LR](G)

GAA

GAG

SEQ ID 99
TCG
TGACTCTTCAGAC
SEQ ID 390
LR](T)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR]
1102

GAG
TCCGA

(G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].

TCT

[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

GAA

([5meC])[sP].[LR](A)

GAG

TCA

SEQ ID
GAG
TGTTGACTCTTCA
SEQ ID 391
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)
1103

100
TCT
GACTC

[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

GAA

(G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[LR]

GAG

([5meC])[sP].[LR](A)

TCA

ACA

SEQ ID
TCT
CACTGTTGACTCT
SEQ ID 392
LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)
1104

101
GAA
TCAGA

[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR]

GAG

(A)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].

TCA

[LR](G)

ACA

GTG

SEQ ID
GAA
TGACACTGTTGA
SEQ ID 393
LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1105

102
GAG
CTCTTC

[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

TCA

(A)[sP].[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[LR]([5meC])

ACA

[sP].[LR](A)

GTG

TCA

SEQ ID
GAG
TTCTGACACTGTT
SEQ ID 394
LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR]
1106

103
TCA
GACTC

(A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].

ACA

[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

GTG

(A)[sP].[LR](A)

TCA

GAA

SEQ ID
TCA
GGATTCTGACAC
SEQ ID 395
LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)
1107

104
ACA
TGTTGA

[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR]

GTG

(A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR]([5meC])

TCA

[sP].[LR]([5meC])

GAA

TCC

SEQ ID
ACA
CATGGATTCTGA
SEQ ID 396
LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)
1108

105
GTG
CACTGT

[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR]

TCA

(A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].

GAA

[LR](G)

TCC

ATG

SEQ ID
GTG
TCCCATGGATTCT
SEQ ID 397
LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR]
1109

106
TCA
GACAC

(A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].

GAA

[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

TCC

(G)[sP].[LR](A)

ATG

GGA

SEQ ID
TCA
TCTTCCCATGGAT
SEQ ID 398
LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)
1110

107
GAA
TCTGA

[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR]

TCC

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].

ATG

[LR](A)

GGA

AGA

SEQ ID
GAA
ACATCTTCCCATG
SEQ ID 399
LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)
1111

108
TCC
GATTC

[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

ATG

(A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].

GGA

[LR](T)

AGA

TGT

SEQ ID
TCC
AGAACATCTTCCC
SEQ ID 400
LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1112

109
ATG
ATGGA

[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

GGA

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])

AGA

[sP].[LR](T)

TGT

TCT

SEQ ID
ATG
CCCAGAACATCTT
SEQ ID 401
LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)
1113

110
GGA
CCCAT

[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR]

AGA

(T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].

TGT

[LR](G)

TCT

GGG

SEQ ID
GGA
CTCCCCAGAACAT
SEQ ID 402
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1114

111
AGA
CTTCC

[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR]

TGT

(T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

TCT

[LR](G)

GGG

GAG

SEQ ID
AGA
CACCTCCCCAGA
SEQ ID 403
LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)
1115

112
TGT
ACATCT

[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

TCT

(G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](T)[sP].

GGG

[LR](G)

GAG

GTG

SEQ ID
TGT
TGTCACCTCCCCA
SEQ ID 404
LR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].
1116

113
TCT
GAACA

[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR]

GGG

(G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])

GAG

[sP].[LR](A)

GTG

ACA

SEQ ID
TCT
AGTTGTCACCTCC
SEQ ID 405
LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)
1117

114
GGG
CCAGA

[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

GAG

(G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[LR]([5meC])

GTG

[sP].[LR](T)

ACA

ACT

SEQ ID
GGG
CCCAGTTGTCACC
SEQ ID 406
LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1118

115
GAG
TCCCC

[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

GTG

(A)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].

ACA

[LR](G)

ACT

GGG

SEQ ID
GAG
AGGCCCAGTTGT
SEQ ID 407
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[LR](G)
1119

116
GTG
CACCTC

[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

ACA

(T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])

ACT

[sP].[LR](T)

GGG

CCT

SEQ ID
GTG
TGCAGGCCCAGT
SEQ ID 408
LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)
1120

117
ACA
TGTCAC

[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]

ACT

(G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[LR]

GGG

([5meC])[sP].[LR](A)

CCT

GCA

SEQ ID
ACA
AGGTGCAGGCCC
SEQ ID 409
LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)
1121

118
ACT
AGTTGT

[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].

GGG

[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

CCT

([5meC])[sP].[LR](T)

GCA

CCT

SEQ ID
ACT
AGCAGGTGCAGG
SEQ ID 410
LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)
1122

119
GGG
CCCAGT

[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].

CCT

[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].

GCA

[LR]([5meC])[sP].[LR](T)

CCT

GCT

SEQ ID
GGG
TGCAGCAGGTGC
SEQ ID 411
LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].
1123

120
CCT
AGGCCC

[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)

GCA

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

CCT

([5meC])[sP].[LR](A)

GCT

GCA

SEQ ID
CCT
CTCTGCAGCAGG
SEQ ID 412
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]
1124

121
GCA
TGCAGG

(A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].

CCT

[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

GCT

(A)[sP].[LR](G)

GCA

GAG

SEQ ID
GCA
CACCTCTGCAGC
SEQ ID 413
LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)
1125

122
CCT
AGGTGC

[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR]

GCT

(A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](T)[sP].

GCA

[LR](G)

GAG

GTG

SEQ ID
CCT
GTGCACCTCTGC
SEQ ID 414
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR]
1126

123
GCT
AGCAGG

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].

GCA

[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

GAG

(A)[sP].[LR]([5meC])

GTG

CAC

SEQ ID
GCT
TACGTGCACCTCT
SEQ ID 415
LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[R](A)
1127

124
GCA
GCAGC

[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

GAG

(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR]

GTG

(T)[sP].[LR](A)

CAC

GTA

SEQ ID
GCA
GACTACGTGCAC
SEQ ID 416
LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1128

125
GAG
CTCTGC

[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR]

GTG

(C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

CAC

[LR]([5meC])

GTA

GTC

SEQ ID
GAG
TCAGACTACGTG
SEQ ID 417
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)
1129

126
GTG
CACCTC

[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR]

CAC

(A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

GTA

[LR](A)

GTC

TGA

SEQ ID
GTG
CACTCAGACTAC
SEQ ID 418
LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)
1130

127
CAC
GTGCAC

[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

GTA

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

GTC

[LR](G)

TGA

GTG

SEQ ID
CAC
CAGCACTCAGAC
SEQ ID 419
LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR]
1131

128
GTA
TACGTG

(A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

GTC

[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

TGA

(T)[sP].[LR](G)

GTG

CTG

SEQ ID
GTA
CCGCAGCACTCA
SEQ ID 420
LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)
1132

129
GTC
GACTAC

[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

TGA

(G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](G)[sP].

GTG

[LR](G)

CTG

CGG

SEQ ID
GTC
AGTCCGCAGCAC
SEQ ID 421
LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1133

130
TGA
TCAGAC

[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR]

GTG

(G)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

CTG

([5meC])[sP].[LR](T)

CGG

ACT

SEQ ID
TGA
CTGAGTCCGCAG
SEQ ID 422
LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](G)
1134

131
GTG
CACTCA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].

CTG

[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

CGG

(A)[sP].[LR](G)

ACT

CAG

SEQ ID
GTG
CTGCTGAGTCCG
SEQ ID 423
LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1135

132
CTG
CAGCAC

[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].

CGG

[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

ACT

(A)[sP].[LR](G)

CAG

CAG

SEQ ID
CTG
GGTCTGCTGAGT
SEQ ID 424
LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)
1136

133
CGG
CCGCAG

[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

ACT

(A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].

CAG

[LR]([5meC])[sP].[LR]([5meC])

CAG

ACC

SEQ ID
CGG
CCGGGTCTGCTG
SEQ ID 425
LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].
1137

134
ACT
AGTCCG

[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)

CAG

[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

CAG

(G)[sP].[LR](G)

ACC

CGG

SEQ ID
ACT
TGGCCGGGTCTG
SEQ ID 426
LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)
1138

135
CAG
CTGAGT

[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

CAG

(C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

ACC

([5meC])[sP].[LR](A)

CGG

CCA

SEQ ID
CAG
CGGTGGCCGGGT
SEQ ID 427
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].
1139

136
CAG
CTGCTG

[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)

ACC

[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

CGG

([5meC])[sP].[LR](G)

CCA

CCG

SEQ ID
CAG
GGCCGGTGGCCG
SEQ ID 428
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR]
1140

137
ACC
GGTCTG

(C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].

CGG

[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].

CCA

[LR]([5meC])[sP].[LR]([5meC])

CCG

GCC

SEQ ID
ACC
TAAGGCCGGTGG
SEQ ID 429
LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)
1141

138
CGG
CCGGGT

[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR]([5meC])

CCA

[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

CCG

(T)[sP].[LR](A)

GCC

TTA

SEQ ID
CGG
GAGTAAGGCCGG
SEQ ID 430
LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])
1142

139
CCA
TGGCCG

[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR]

CCG

(C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)[sP].

GCC

[LR](T)[sP].[LR]([5meC])

TTA

CTC

SEQ ID
CCA
ATGGAGTAAGGC
SEQ ID 431
LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]
1143

140
CCG
CGGTGG

(G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].

GCC

[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

TTA

(A)[sP].[LR](T)

CTC

CAT

SEQ ID
CCG
GAAATGGAGTAA
SEQ ID 432
LR]([5meC])[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])
1144

141
GCC
GGCCGG

[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)

TTA

[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].

CTC

[dR](T)[sP].[LR](T)[sP].[LR]([5meC])

CAT

TTC

SEQ ID
GCC
AGGGAAATGGA
SEQ ID 433
LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)
1145

142
TTA
GTAAGGC

[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

CTC

(T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

CAT

[sP].[LR](T)

TTC

CCT

SEQ ID
TTA
TCCAGGGAAATG
SEQ ID 434
LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].
1146

143
CTC
GAGTAA

[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)

CAT

[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)

TTC

[sP].[LR](A)

CCT

GGA

SEQ ID
CTC
CCTTCCAGGGAA
SEQ ID 435
LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1147

144
CAT
ATGGAG

(T)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]

TTC

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].

CCT

[LR](G)[sP].[LR](G)

GGA

AGG

SEQ ID
CAT
TTTCCTTCCAGGG
SEQ ID 436
LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[dR](T)[sP].[LR](T)[sP].[dR]
1148

145
TTC
AAATG

(C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].

CCT

[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

GGA

(A)[sP].[LR](A)

AGG

AAA

SEQ ID
TTC
GTCTTTCCTTCCA
SEQ ID 437
LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1149

146
CCT
GGGAA

(T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].

GGA

[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

AGG

(A)[sP].[LR]([5meC])

AAA

GAC

SEQ ID
CCT
TTGGTCTTTCCTT
SEQ ID 438
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]
1150

147
GGA
CCAGG

(A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].

AGG

[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

AAA

(A)[sP].[LR](A)

GAC

CAA

SEQ ID
GGA
ACTTTGGTCTTTC
SEQ ID 439
LR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)
1151

148
AGG
CTTCC

[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR]

AAA

(C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

GAC

(G)[sP].[LR](T)

CAA

AGT

SEQ ID
AGG
CTCACTTTGGTCT
SEQ ID 440
LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)
1152

149
AAA
TTCCT

[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR]

GAC

(A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)[sP].

CAA

[LR](G)

AGT

GAG

SEQ ID
AAA
TCCCTCACTTTGG
SEQ ID 441
LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)
1153

150
GAC
TCTTT

[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)[sP].[dR](G)[sP].

CAA

[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

AGT

(G)[sP].[LR](A)

GAG

GGA

SEQ ID
GAC
TGTTCCCTCACTT
SEQ ID 442
LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)
1154

151
CAA
TGGTC

[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

AGT

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR]([5meC])

GAG

[sP].[LR](A)

GGA

ACA

SEQ ID
CAA
GAGTGTTCCCTC
SEQ ID 443
LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].
1155

152
AGT
ACTTTG

[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)

GAG

[sP].[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

GGA

(T)[sP].[LR]([5meC])

ACA

CTC

SEQ ID
AGT
TTGGAGTGTTCC
SEQ ID 444
LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1156

153
GAG
CTCACT

[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

GGA

(A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].

ACA

[LR](A)

CTC

CAA

SEQ ID
GAG
TCCTTGGAGTGTT
SEQ ID 445
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[R](A)
1157

154
GGA
CCCTC

[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR]

ACA

(C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

CTC

(G)[sP].[LR](A)

CAA

GGA

SEQ ID
GGA
ATTTCCTTGGAGT
SEQ ID 446
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)
1158

155
ACA
GTTCC

[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR]

CTC

(A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].

CAA

[LR](T)

GGA

AAT

SEQ ID
ACA
TGCATTTCCTTGG
SEQ ID 447
LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1159

156
CTC
AGTGT

[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

CAA

(A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

GGA

[sP].[LR](A)

AAT

GCA

SEQ ID
CTC
GTGTGCATTTCCT
SEQ ID 448
LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1160

157
CAA
TGGAG

(A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].

GGA

[dR](T)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].

AAT

[LR](A)[sP].[LR]([5meC])

GCA

CAC

SEQ ID
CAA
CTGGTGTGCATTT
SEQ ID 449
LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].
1161

158
GGA
CCTTG

[dR](A)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)

AAT

[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

GCA

(A)[sP].[LR](G)

CAC

CAG

SEQ ID
GGA
AGCCTGGTGTGC
SEQ ID 450
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)
1162

159
AAT
ATTTCC

[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR]

GCA

(C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

CAC

([5meC])[sP].[LR](T)

CAG

GCT

SEQ ID
AAT
CATAGCCTGGTG
SEQ ID 451
LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)
1163

160
GCA
TGCATT

[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

CAC

(G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR](T)[sP].

CAG

[LR](G)

GCT

ATG

SEQ ID
GCA
TCCCATAGCCTG
SEQ ID 452
LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)
1164

161
CAC
GTGTGC

[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR]

CAG

(T)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].

GCT

[LR](A)

ATG

GGA

SEQ ID
CAC
ACCTCCCATAGCC
SEQ ID 453
LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1165

162
CAG
TGGTG

(G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].

GCT

[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

ATG

(G)[sP].[LR](T)

GGA

GGT

SEQ ID
CAG
GCCACCTCCCATA
SEQ ID 454
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].
1166

163
GCT
GCCTG

[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)

ATG

[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

GGA

(G)[sP].[LR]([5meC])

GGT

GGC

SEQ ID
GCT
TGAGCCACCTCCC
SEQ ID 455
LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1167

164
ATG
ATAGC

[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR]

GGA

(T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])

GGT

[sP].[LR](A)

GGC

TCA

SEQ ID
ATG
CTCTGAGCCACCT
SEQ ID 456
LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)
1168

165
GGA
CCCAT

[sP].[dR](G)[sP].[dR](G)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

GGT

(C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].

GGC

[LR](G)

TCA

GAG

SEQ ID
GGA
ATGCTCTGAGCC
SEQ ID 457
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)
1169

166
GGT
ACCTCC

[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](C)

GGC

[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

TCA

(A)[sP].[LR](T)

GAG

CAT

SEQ ID
GGT
CTTATGCTCTGAG
SEQ ID 458
LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)
1170

167
GGC
CCACC

[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

TCA

(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].

GAG

[LR](G)

CAT

AAG

SEQ ID
GGC
AGGCTTATGCTCT
SEQ ID 459
LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)
1171

168
TCA
GAGCC

[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

GAG

(T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])

CAT

[sP].[LR](T)

AAG

CCT

SEQ ID
TCA
AGCAGGCTTATG
SEQ ID 460
LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1172

169
GAG
CTCTGA

[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR]

CAT

(G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[LR]

AAG

([5meC])[sP].[LR](T)

CCT

GCT

SEQ ID
GAG
ACAAGCAGGCTT
SEQ ID 461
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)
1173

170
CAT
ATGCTC

[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR]

AAG

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].

CCT

[LR](T)

GCT

TGT

SEQ ID
CAT
CTAACAAGCAGG
SEQ ID 462
LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR]
1174

171
AAG
CTTATG

(G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR]

CCT

(C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].

GCT

[LR](A)[sP].[LR](G)

TGT

TAG

SEQ ID
AAG
TGCCTAACAAGC
SEQ ID 463
LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1175

172
CCT
AGGCTT

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].

GCT

[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

TGT

([5meC])[sP].[LR](A)

TAG

GCA

SEQ ID
CCT
GCTTGCCTAACA
SEQ ID 464
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR]
1176

173
GCT
AGCAGG

(T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].

TGT

[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

TAG

(G)[sP].[LR]([5meC])

GCA

AGC

SEQ ID
GCT
GATGCTTGCCTA
SEQ ID 465
LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)
1177

174
TGT
ACAAGC

[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

TAG

(A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[LR](T)[sP].

GCA

[LR]([5meC])

AGC

ATC

SEQ ID
TGT
ATTGATGCTTGCC
SEQ ID 466
LR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)
1178

175
TAG
TAACA

[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR]

GCA

(C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)[sP].

AGC

[LR](T)

ATC

AAT

SEQ ID
TAG
TACATTGATGCTT
SEQ ID 467
LR](T)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR]
1179

176
GCA
GCCTA

(A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]

AGC

(T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].

ATC

[LR](T)[sP].[LR](A)

AAT

GTA

SEQ ID
GCA
TTTTACATTGATG
SEQ ID 468
LR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR]
1180

177
AGC
CTTGC

(C)[sP].[dR](A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](A)[sP].

ATC

[LR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[LR]

AAT

(A)[sP].[LR](A)

GTA

AAA

SEQ ID
AGC
AAATTTTACATTG
SEQ ID 469
LR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR](A)[sP].[dR](T)[sP].[LR]
1181

178
ATC
ATGCT

([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](G)[sP].[LR]

AAT

(T)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].

GTA

[LR](T)[sP].[LR](T)

AAA

TTT

SEQ ID
ATC
TCCAAATTTTACA
SEQ ID 470
LR](A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].
1182

179
AAT
TTGAT

[LR](G)[sP].[LR](T)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

GTA

(A)[sP].[LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)[sP].

AAA

[LR](A)

TTT

GGA

SEQ ID
AAT
TGCTCCAAATTTT
SEQ ID 471
LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](A)
1183

180
GTA
ACATT

[sP].[LR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[LR]

AAA

(T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[LR]([5meC])

TTT

[sP].[LR](A)

GGA

GCA

SEQ ID
GTA
TCATGCTCCAAAT
SEQ ID 472
LR](G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](A)
1184

181
AAA
TTTAC

[sP].[LR](T)[sP].[LR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]

TTT

(A)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].[LR](G)[sP].

GGA

[LR](A)

GCA

TGA

SEQ ID
AAA
CTGTCATGCTCCA
SEQ ID 473
LR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[dR](T)[sP].[LR](T)[sP].
1185

182
TTT
AATTT

[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

GGA

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR]

GCA

(A)[sP].[LR](G)

TGA

CAG

SEQ ID
TTT
CAACTGTCATGCT
SEQ ID 474
LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)
1186

183
GGA
CCAAA

[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

GCA

(A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[LR](T)[sP].

TGA

[LR](G)

CAG

TTG

SEQ ID
GGA
GCACAACTGTCA
SEQ ID 475
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].
1187

184
GCA
TGCTCC

[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)

TGA

[sP].[dR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[LR]

CAG

(G)[sP].[LR]([5meC])

TTG

TGC

SEQ ID
GCA
CAGGCACAACTG
SEQ ID 476
LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1188

185
TGA
TCATGC

[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR]

CAG

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].

TTG

[LR](G)

TGC

CTG

SEQ ID
TGA
ATACAGGCACAA
SEQ ID 477
LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)
1189

186
CAG
CTGTCA

[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

TTG

(C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[LR](A)[sP].

TGC

[LR](T)

CTG

TAT

SEQ ID
CAG
GTTATACAGGCA
SEQ ID 478
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)[sP].[dR]
1190

187
TTG
CAACTG

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].

TGC

[LR](G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[LR]

CTG

(A)[sP].[LR]([5meC])

TAT

AAC

SEQ ID
TTG
GGGGTTATACAG
SEQ ID 479
LR](T)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1191

188
TGC
GCACAA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR]

CTG

(T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

TAT

[sP].[LR]([5meC])

AAC

CCC

SEQ ID
TGC
GTTGGGGTTATA
SEQ ID 480
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1192

189
CTG
CAGGCA

(G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].

TAT

[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].

AAC

[LR](A)[sP].[LR]([5meC])

CCC

AAC

SEQ ID
CTG
AGTGTTGGGGTT
SEQ ID 481
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](A)[sP].[dR]
1193

190
TAT
ATACAG

(T)[sP].[dR](A)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]

AAC

(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].

CCC

[LR]([5meC])[sP].[LR](T)

AAC

ACT

SEQ ID
TAT
CTCAGTGTTGGG
SEQ ID 482
LR](T)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)
1194

191
AAC
GTTATA

[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].

CCC

[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

AAC

(A)[sP].[LR](G)

ACT

GAG

SEQ ID
AAC
TCCCTCAGTGTTG
SEQ ID 483
LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]
1195

192
CCC
GGGTT

(C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].

AAC

[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

ACT

(G)[sP].[LR](A)

GAG

GGA

SEQ ID
CAA
ACTCCGAGGTAG
SEQ ID 484
LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])
1196

193
GCT
AGCTTG

[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

CTA

(C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR]

CCT

(A)[sP].[LR](G)[sP].[LR](T)

CGG

AGT

SEQ ID
AAG
GACTCCGAGGTA
SEQ ID 485
LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1197

194
CTC
GAGCTT

[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

TAC

([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

CTC

(T)[sP].[LR]([5meC])

GGA

GTC

SEQ ID
GCT
CAGACTCCGAGG
SEQ ID 486
LR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].
1198

195
CTA
TAGAGC

[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]([5meC])[sP].[dR](G)[sP].

CCT

[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

CGG

(T)[sP].[LR](G)

AGT

CTG

SEQ ID
CTC
TCAGACTCCGAG
SEQ ID 487
LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR]
1199

196
TAC
GTAGAG

(C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR]

CTC

(G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].

GGA

[LR](G)[sP].[LR](A)

GTC

TGA

SEQ ID
CTA
CTTCAGACTCCGA
SEQ ID 488
LR]([5meC])[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR]
1200

197
CCT
GGTAG

(T)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

CGG

(G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].

AGT

[LR](A)[sP].[LR](G)

CTG

AAG

SEQ ID
TAC
TCTTCAGACTCCG
SEQ ID 489
LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]([5meC])
1201

198
CTC
AGGTA

[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)

GGA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR]

GTC

(G)[sP].[LR](A)

TGA

AGA

SEQ ID
CCT
ACTCTTCAGACTC
SEQ ID 490
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](G)[sP].[dR]
1202

199
CGG
CGAGG

(G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].

AGT

[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

CTG

(G)[sP].[LR](T)

AAG

AGT

SEQ ID
CTC
GACTCTTCAGACT
SEQ ID 491
LR]([5meC])[sP].[dR](T)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)
1203

200
GGA
CCGAG

[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

GTC

(G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)

TGA

[sP].[LR](T)[sP].[LR]([5meC])

AGA

GTC

SEQ ID
CGG
TTGACTCTTCAGA
SEQ ID 492
LR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].
1204

201
AGT
CTCCG

[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)

CTG

[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

AAG

(A)[sP].[LR](A)

AGT

CAA

SEQ ID
GGA
GTTGACTCTTCAG
SEQ ID 493
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)
1205

202
GTC
ACTCC

[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

TGA

(A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].

AGA

[LR]([5meC])

GTC

AAC

SEQ ID
AGT
CTGTTGACTCTTC
SEQ ID 494
LR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1206

203
CTG
AGACT

[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

AAG

(T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].

AGT

[LR](G)

CAA

CAG

SEQ ID
GTC
ACTGTTGACTCTT
SEQ ID 495
LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1207

204
TGA
CAGAC

[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

AGA

(C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].

GTC

[LR](T)

AAC

AGT

SEQ ID
CTG
ACACTGTTGACTC
SEQ ID 496
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR]
1208

205
AAG
TTCAG

(G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].

AGT

[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

CAA

(G)[sP].[LR](T)

CAG

TGT

SEQ ID
TGA
GACACTGTTGAC
SEQ ID 497
LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1209

206
AGA
TCTTCA

[sP].[LR](G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].

GTC

[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

AAC

(T)[sP].[LR]([5meC])

AGT

GTC

SEQ ID
AAG
CTGACACTGTTG
SEQ ID 498
LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)
1210

207
AGT
ACTCTT

[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

CAA

(G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].

CAG

[LR](G)

TGT

CAG

SEQ ID
AGA
TCTGACACTGTTG
SEQ ID 499
LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)
1211

208
GTC
ACTCT

[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR]

AAC

(T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].

AGT

[LR](A)

GTC

AGA

SEQ ID
AGT
ATTCTGACACTGT
SEQ ID 500
LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[LR]
1212

209
CAA
TGACT

(A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].

CAG

[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR]

TGT

(A)[sP].[LR](T)

CAG

AAT

SEQ ID
GTC
GATTCTGACACT
SEQ ID 501
LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)[sP].[dR](C)
1213

210
AAC
GTTGAC

[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](G)[sP].[dR](T)[sP].[dR]

AGT

(C)[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].

GTC

[LR]([5meC])

AGA

ATC

SEQ ID
CAA
TGGATTCTGACA
SEQ ID 502
LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1214

211
CAG
CTGTTG

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].

TGT

[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

CAG

([5meC])[sP].[LR](A)

AAT

CCA

SEQ ID
AAC
ATGGATTCTGAC
SEQ ID 503
LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)
1215

212
AGT
ACTGTT

[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

GTC

(A)[sP].[LR](A)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR]

AGA

(A)[sP].[LR](T)

ATC

CAT

SEQ ID
CAG
CCATGGATTCTG
SEQ ID 504
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1216

213
TGT
ACACTG

(T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].

CAG

[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR]

AAT

(G)[sP].[LR](G)

CCA

TGG

SEQ ID
AGT
CCCATGGATTCT
SEQ ID 505
LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)
1217

214
GTC
GACACT

[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR]

AGA

(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].

ATC

[LR](G)

CAT

GGG

SEQ ID
TGT
TTCCCATGGATTC
SEQ ID 506
LR](T)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)
1218

215
CAG
TGACA

[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

AAT

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

CCA

[LR](A)

TGG

GAA

SEQ ID
GTC
CTTCCCATGGATT
SEQ ID 507
LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)
1219

216
AGA
CTGAC

[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

ATC

(T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].

CAT

[LR](G)

GGG

AAG

SEQ ID
CAG
ATCTTCCCATGGA
SEQ ID 508
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR]
1220

217
AAT
TTCTG

(T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].

CCA

[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

TGG

(A)[sP].[LR](T)

GAA

GAT

SEQ ID
AGA
CATCTTCCCATGG
SEQ ID 509
LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)
1221

218
ATC
ATTCT

[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR]

CAT

(G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[LR](T)[sP].

GGG

[LR](G)

AAG

ATG

SEQ ID
AAT
AACATCTTCCCAT
SEQ ID 510
LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1222

219
CCA
GGATT

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].

TGG

[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

GAA

(T)[sP].[LR](T)

GAT

GTT

SEQ ID
ATC
GAACATCTTCCCA
SEQ ID 511
LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)
1223

220
CAT
TGGAT

[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR]

GGG

(G)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[LR](T)[sP].

AAG

[LR]([5meC])

ATG

TTC

SEQ ID
CCA
CAGAACATCTTCC
SEQ ID 512
LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR]
1224

221
TGG
CATGG

(G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].

GAA

[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

GAT

(T)[sP].[LR](G)

GTT

CTG

SEQ ID
CAT
CCAGAACATCTTC
SEQ ID 513
LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]
1225

222
GGG
CCATG

(G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].

AAG

[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

ATG

(G)[sP].[LR](G)

TTCT

GG

SEQ ID
TGG
CCCCAGAACATCT
SEQ ID 514
LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)
1226

223
GAA
TCCCA

[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

GAT

(T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].

GTT

[LR](G)

CTG

GGG

SEQ ID
GGG
TCCCCAGAACATC
SEQ ID 515
LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)
1227

224
AAG
TTCCC

[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[dR]

ATG

(C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].

TTCT

[LR](A)

GGG

GA

SEQ ID
GAA
CCTCCCCAGAAC
SEQ ID 516
LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)
1228

225
GAT
ATCTTC

[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR]

GTT

(G)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].

CTG

[LR](G)

GGG

AGG

SEQ ID
AAG
ACCTCCCCAGAA
SEQ ID 517
LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1229

226
ATG
CATCTT

[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

TTCT

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].

GGG

[LR](T)

GAG

GT

SEQ ID
GAT
TCACCTCCCCAGA
SEQ ID 518
LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)
1230

227
GTT
ACATC

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

CTG

(G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].

GGG

[LR](A)

AGG

TGA

SEQ ID
ATG
GTCACCTCCCCAG
SEQ ID 519
LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)
1231

228
TTCT
AACAT

[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR]

GGG

(A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].

GAG

[LR]([5meC])

GTG

AC

SEQ ID
GTT
TTGTCACCTCCCC
SEQ ID 520
LR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].
1232

229
CTG
AGAAC

[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR]

GGG

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].

AGG

[LR](A)

TGA

CAA

SEQ ID
TTCT
GTTGTCACCTCCC
SEQ ID 521
LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)
1233

230
GGG
CAGAA

[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

GAG

(T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].

GTG

[LR]([5meC])

ACA

AC

SEQ ID
CTG
CAGTTGTCACCTC
SEQ ID 522
LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].
1234

231
GGG
CCCAG

[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)

AGG

[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

TGA

(T)[sP].[LR](G)

CAA

CTG

SEQ ID
TGG
CCAGTTGTCACCT
SEQ ID 523
LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)
1235

232
GGA
CCCCA

[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

GGT

(C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

GAC

[LR](G)

AAC

TGG

SEQ ID
GGG
GCCCAGTTGTCA
SEQ ID 524
LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)
1236

233
AGG
CCTCCC

[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

TGA

(A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].

CAA

[LR]([5meC])

CTG

GGC

SEQ ID
GGA
GGCCCAGTTGTC
SEQ ID 525
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)
1237

234
GGT
ACCTCC

[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR]

GAC

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])

AAC

[sP].[LR]([5meC])

TGG

GCC

SEQ ID
AGG
CAGGCCCAGTTG
SEQ ID 526
LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1238

235
TGA
TCACCT

[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

CAA

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].

CTG

[LR](G)

GGC

CTG

SEQ ID
GGT
GCAGGCCCAGTT
SEQ ID 527
LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)
1239

236
GAC
GTCACC

[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

AAC

(G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]

TGG

(G)[sP].[LR]([5meC])

GCC

TGC

SEQ ID
TGA
GTGCAGGCCCAG
SEQ ID 528
LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)
1240

237
CAA
TTGTCA

[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

CTG

(C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

GGC

(A)[sP].[LR]([5meC])

CTG

CAC

SEQ ID
GAC
GGTGCAGGCCCA
SEQ ID 529
LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)
1241

238
AAC
GTTGTC

[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

TGG

([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

GCC

([5meC])[sP].[LR]([5meC])

TGC

ACC

SEQ ID
CAA
CAGGTGCAGGCC
SEQ ID 530
LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]
1242

239
CTG
CAGTTG

(G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]

GGC

(T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].

CTG

[LR](T)[sP].[LR](G)

CAC

CTG

SEQ ID
AAC
GCAGGTGCAGGC
SEQ ID 531
LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)
1243

240
TGG
CCAGTT

[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].

GCC

[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

TGC

(G)[sP].[LR]([5meC])

ACC

TGC

SEQ ID
CTG
CAGCAGGTGCAG
SEQ ID 532
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].
1244

241
GGC
GCCCAG

[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)

CTG

[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

CAC

(T)[sP].[LR](G)

CTG

CTG

SEQ ID
TGG
GCAGCAGGTGCA
SEQ ID 533
LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)
1245

242
GCC
GGCCCA

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

TGC

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

ACC

[LR]([5meC])

TGC

TGC

SEQ ID
GGC
CTGCAGCAGGTG
SEQ ID 534
LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1246

243
CTG
CAGGCC

(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].

CAC

[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

CTG

(A)[sP].[LR](G)

CTG

CAG

SEQ ID
GCC
TCTGCAGCAGGT
SEQ ID 535
LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1247

244
TGC
GCAGGC

[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

ACC

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].

TGC

[LR](A)

TGC

AGA

SEQ ID
CTG
CCTCTGCAGCAG
SEQ ID 536
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1248

245
CAC
GTGCAG

(C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].

CTG

[dR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].

CTG

[LR](G)[sP].[LR](G)

CAG

AGG

SEQ ID
TGC
ACCTCTGCAGCA
SEQ ID 537
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1249

246
ACC
GGTGCA

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

TGC

(C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].

TGC

[LR](T)

AGA

GGT

SEQ ID
CAC
GCACCTCTGCAG
SEQ ID 538
LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]
1250

247
CTG
CAGGTG

(G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].

CTG

[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

CAG

(G)[sP].[LR]([5meC])

AGG

TGC

SEQ ID
ACC
TGCACCTCTGCA
SEQ ID 539
LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1251

248
TGC
GCAGGT

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR]

TGC

(A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

AGA

[sP].[LR](A)

GGT

GCA

SEQ ID
CTG
CGTGCACCTCTG
SEQ ID 540
LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]
1252

249
CTG
CAGCAG

(G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)

CAG

[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

AGG

([5meC])[sP].[LR](G)

TGC

ACG

SEQ ID
TGC
ACGTGCACCTCT
SEQ ID 541
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1253

250
TGC
GCAGCA

[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

AGA

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].

GGT

[LR](T)

GCA

CGT

SEQ ID
CTG
CTACGTGCACCTC
SEQ ID 542
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]
1254

251
CAG
TGCAG

(G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)

AGG

[sP].[dR](C)[sP].[LR](A)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](T)[sP].

TGC

[LR](A)[sP].[LR](G)

ACG

TAG

SEQ ID
TGC
ACTACGTGCACCT
SEQ ID 543
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)
1255

252
AGA
CTGCA

[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

GGT

(A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](G)[sP].

GCA

[LR](T)

CGT

AGT

SEQ ID
CAG
AGACTACGTGCA
SEQ ID 544
LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].
1256

253
AGG
CCTCTG

[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

TGC

([5meC])[sP].[dR](G)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)

ACG

[sP].[LR]([5meC])[sP].[LR](T)

TAG

TCT

SEQ ID
AGA
CAGACTACGTGC
SEQ ID 545
LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)
1257

254
GGT
ACCTCT

[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]([5meC])[sP].[dR](G)[sP].

GCA

[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

CGT

(T)[sP].[LR](G)

AGT

CTG

SEQ ID
AGG
CTCAGACTACGT
SEQ ID 546
LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1258

255
TGC
GCACCT

[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR]

ACG

(G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].

TAG

[LR](G)

TCT

GAG

SEQ ID
GGT
ACTCAGACTACG
SEQ ID 547
LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)
1259

256
GCA
TGCACC

[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

CGT

(T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].

AGT

[LR](T)

CTG

AGT

SEQ ID
TGC
GCACTCAGACTA
SEQ ID 548
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](G)
1260

257
ACG
CGTGCA

[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR]

TAG

(T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].

TCT

[LR]([5meC])

GAG

TGC

SEQ ID
GCA
AGCACTCAGACT
SEQ ID 549
LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)
1261

258
CGT
ACGTGC

[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR]

AGT

(G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

CTG

[sP].[LR](T)

AGT

GCT

SEQ ID
ACG
GCAGCACTCAGA
SEQ ID 550
LR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)
1262

259
TAG
CTACGT

[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

TCT

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

GAG

[LR]([5meC])

TGC

TGC

SEQ ID
CGT
CGCAGCACTCAG
SEQ ID 551
LR]([5meC])[sP].[dR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](G)[sP].
1263

260
AGT
ACTACG

[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].

CTG

[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

AGT

([5meC])[sP].[LR](G)

GCT

GCG

SEQ ID
TAG
TCCGCAGCACTC
SEQ ID 552
LR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)
1264

261
TCT
AGACTA

[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

GAG

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR]

TGC

(G)[sP].[LR](A)

TGC

GGA

SEQ ID
AGT
GTCCGCAGCACT
SEQ ID 553
LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1265

262
CTG
CAGACT

[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR]

AGT

(T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR]

GCT

(A)[sP].[LR]([5meC])

GCG

GAC

SEQ ID
TCT
GAGTCCGCAGCA
SEQ ID 554
LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1266

263
GAG
CTCAGA

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

TGC

([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

TGC

(T)[sP].[LR]([5meC])

GGA

CTC

SEQ ID
CTG
TGAGTCCGCAGC
SEQ ID 555
LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR]
1267

264
AGT
ACTCAG

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])

GCT

[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].

GCG

[LR]([5meC])[sP].[LR](A)

GAC

TCA

SEQ ID
GAG
GCTGAGTCCGCA
SEQ ID 556
LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1268

265
TGC
GCACTC

[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].

TGC

[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

GGA

(G)[sP].[LR]([5meC])

CTC

AGC

SEQ ID
AGT
TGCTGAGTCCGC
SEQ ID 557
LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)
1269

266
GCT
AGCACT

[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

GCG

[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

GAC

([5meC])[sP].[LR](A)

TCA

GCA

SEQ ID
TGC
TCTGCTGAGTCC
SEQ ID 558
LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1270

267
TGC
GCAGCA

([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)

GGA

[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

CTC

(G)[sP].[LR](A)

AGC

AGA

SEQ ID
GCT
GTCTGCTGAGTC
SEQ ID 559
LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR]
1271

268
GCG
CGCAGC

(G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].

GAC

[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

TCA

(A)[sP].[LR]([5meC])

GCA

GAC

SEQ ID
TGC
GGGTCTGCTGAG
SEQ ID 560
LR](T)[sP].[dR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].
1272

269
GGA
TCCGCA

[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]

CTC

(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].

AGC

[LR]([5meC])[sP].[LR]([5meC])

AGA

CCC

SEQ ID
GCG
CGGGTCTGCTGA
SEQ ID 561
LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].
1273

270
GAC
GTCCGC

[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)

TCA

[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

GCA

([5meC])[sP].[LR](G)

GAC

CCG

SEQ ID
GGA
GCCGGGTCTGCT
SEQ ID 562
LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1274

271
CTC
GAGTCC

[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)

AGC

[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]([5meC])[sP].[dR]

AGA

(G)[sP].[LR](G)[sP].[LR]([5meC])

CCC

GGC

SEQ ID
GAC
GGCCGGGTCTGC
SEQ ID 563
LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR]
1275

272
TCA
TGAGTC

(A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].

GCA

[dR](C)[sP].[LR]([5meC])[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR]

GAC

(G)[sP].[LR]([5meC])[sP].[LR]([5meC])

CCG

GCC

SEQ ID
CTC
GTGGCCGGGTCT
SEQ ID 564
LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR]
1276

273
AGC
GCTGAG

(C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].

AGA

[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

CCC

(A)[sP].[LR]([5meC])

GGC

CAC

SEQ ID
TCA
GGTGGCCGGGTC
SEQ ID 565
LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)
1277

274
GCA
TGCTGA

[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].

GAC

[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].

CCG

[LR]([5meC])[sP].[LR]([5meC])

GCC

ACC

SEQ ID
AGC
CCGGTGGCCGGG
SEQ ID 566
LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1278

275
AGA
TCTGCT

[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].

CCC

[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].

GGC

[LR](G)[sP].[LR](G)

CAC

CGG

SEQ ID
GCA
GCCGGTGGCCGG
SEQ ID 567
LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)
1279

276
GAC
GTCTGC

[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR]

CCG

(C)[sP].[LR](A)[sP].[dR](C)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR]

GCC

(G)[sP].[LR]([5meC])

ACC

GGC

SEQ ID
AGA
AGGCCGGTGGCC
SEQ ID 568
LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)
1280

277
CCC
GGGTCT

[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

GGC

(C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

CAC

([5meC])[sP].[LR](T)

CGG

CCT

SEQ ID
GAC
AAGGCCGGTGGC
SEQ ID 569
LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1281

278
CCG
CGGGTC

(G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]

GCC

(C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].

ACC

[LR](T)[sP].[LR](T)

GGC

CTT

SEQ ID
CCC
GTAAGGCCGGTG
SEQ ID 570
LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR]
1282

279
GGC
GCCGGG

(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].

CAC

[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[LR]

CGG

(A)[sP].[LR]([5meC])

CCT

TAC

SEQ ID
CCG
AGTAAGGCCGGT
SEQ ID 571
LR]([5meC])[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR]
1283

280
GCC
GGCCGG

(C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].

ACC

[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[LR]

GGC

([5meC])[sP].[LR](T)

CTT

ACT

SEQ ID
GGC
GGAGTAAGGCCG
SEQ ID 572
LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)
1284

281
CAC
GTGGCC

[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

CGG

(T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])

CCT

[sP].[LR]([5meC])

TAC

TCC

SEQ ID
GCC
TGGAGTAAGGCC
SEQ ID 573
LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1285

282
ACC
GGTGGC

[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

GGC

(T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])

CTT

[sP].[LR](A)

ACT

CCA

SEQ ID
CAC
AATGGAGTAAGG
SEQ ID 574
LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR]
1286

283
CGG
CCGGTG

(G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].

CCT

[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

TAC

(T)[sP].[LR](T)

TCC

ATT

SEQ ID
ACC
AAATGGAGTAAG
SEQ ID 575
LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)
1287

284
GGC
GCCGGT

[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

CTT

(T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].

ACT

[LR](T)

CCA

TTT

SEQ ID
CGG
GGAAATGGAGTA
SEQ ID 576
LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])
1288

285
CCT
AGGCCG

[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

TAC

(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP].[dR](T)[sP].

TCC

[LR]([5meC])[sP].[LR]([5meC])

ATT

TCC

SEQ ID
GGC
GGGAAATGGAGT
SEQ ID 577
LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)
1289

286
CTT
AAGGCC

[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

ACT

(A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])

CCA

[sP].[LR]([5meC])

TTTC

CC

SEQ ID
CCT
CAGGGAAATGGA
SEQ ID 578
LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR]
1290

287
TAC
GTAAGG

(C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]

TCC

(T)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].

ATT

[LR](T)[sP].[LR](G)

TCC

CTG

SEQ ID
CTT
CCAGGGAAATGG
SEQ ID 579
LR]([5meC])[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]
1291

288
ACT
AGTAAG

(T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP].

CCA

[dR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].

TTTC

[LR](G)[sP].[LR](G)

CCT

GG

SEQ ID
TAC
TTCCAGGGAAAT
SEQ ID 580
LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].
1292

289
TCC
GGAGTA

[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR]

ATT

(C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)

TCC

[sP].[LR](A)

CTG

GAA

SEQ ID
ACT
CTTCCAGGGAAA
SEQ ID 581
LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1293

290
CCA
TGGAGT

(A)[sP].[dR](T)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].

TTTC

[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

CCT

(A)[sP].[LR](G)

GGA

AG

SEQ ID
TCC
TCCTTCCAGGGA
SEQ ID 582
LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].
1294

291
ATT
AATGGA

[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].

TCC

[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

CTG

(G)[sP].[LR](A)

GAA

GGA

SEQ ID
CCA
TTCCTTCCAGGG
SEQ ID 583
LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR]
1295

292
TTTC
AAATGG

(T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]

CCT

(G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].

GGA

[LR](A)[sP].[LR](A)

AGG

AA

SEQ ID
ATT
CTTTCCTTCCAGG
SEQ ID 584
LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].
1296

293
TCC
GAAAT

[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

CTG

[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

GAA

(A)[sP].[LR](G)

GGA

AAG

SEQ ID
TTTC
TCTTTCCTTCCAG
SEQ ID 585
LR](T)[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1297

294
CCT
GGAAA

(C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].

GGA

[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](A)[sP].[LR]

AGG

(G)[sP].[LR](A)

AAA

GA

SEQ ID
TCC
GGTCTTTCCTTCC
SEQ ID 586
LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)
1298

295
CTG
AGGGA

[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

GAA

(A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR]([5meC])

GGA

[sP].[LR]([5meC])

AAG

ACC

SEQ ID
CCC
TGGTCTTTCCTTC
SEQ ID 587
LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1299

296
TGG
CAGGG

(G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

AAG

[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

GAA

([5meC])[sP].[LR](A)

AGA

CCA

SEQ ID
CTG
TTTGGTCTTTCCT
SEQ ID 588
LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[LR]
1300

297
GAA
TCCAG

(A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].

GGA

[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].

AAG

[LR](A)[sP].[LR](A)

ACC

AAA

SEQ ID
TGG
CTTTGGTCTTTCC
SEQ ID 589
LR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)
1301

298
AAG
TTCCA

[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR]

GAA

(A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

AGA

(A)[sP].[LR](G)

CCA

AAG

MOUSE

SEQ ID 597
TCCAT
AGAACATCTTCCCA
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1302

699
[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR]([5meC])

[sP].[LR](T)

SEQ ID 598
GGAA
CTCCCCAGAACATC
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1303

700
[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 599
TGTTC
TGTCACCTCCCCAG
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)
1304

701
[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR]

(G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])

[sP].[LR](A)

SEQ ID 600
GGGG
CCCAGTTGTCACCT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1305

702
[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].

[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)

[sP].[LR](G)

SEQ ID 601
GTGA text missing or illegible when filed

TGCAGGCCCAGTT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)
1306

703
[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]

(G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[LR]

([5meC])[sP].[LR](A)

SEQ ID 602
ACTG text missing or illegible when filed

AGCAGGTGCAGGC
SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)
1307

704
[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)

[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].

[LR]([5meC])[sP].[LR](T)

SEQ ID 603
CCTG text missing or illegible when filed

CTCTGCAGCAGGT text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].
1308

705
[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)

[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 604
CCTG text missing or illegible when filed

GTGCACCTCTGCA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].
1309

706
[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)

[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

(A)[sP].[LR]([5meC])

SEQ ID 605
TGAG text missing or illegible when filed

CTGAGTCCGCAGCA
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](G)
1310

707
[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)

[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 606
CTGC text missing or illegible when filed

GGTCTGCTGAGTC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)
1311

708
[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].

[LR]([5meC])[sP].[LR]([5meC])

SEQ ID 607
ACTCA
TGGCCGGGTCTGC text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)
1312

709
[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[LR](A)

SEQ ID 608
CATG text missing or illegible when filed

CCAGAACATCTTC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].
1313

710
[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)

[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

(G)[sP].[LR](G)

SEQ ID 609
TGGG
CCCCAGAACATCTT
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)
1314

711
[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

(T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].

[LR](G)

SEQ ID 610
AAGA text missing or illegible when filed

ACCTCCCCAGAACA
SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1315

712
[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].

[LR](T)

SEQ ID 611
GATG text missing or illegible when filed

TCACCTCCCCAGAA
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].[dR](T)
1316

713
[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

(G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].

[LR](A)

SEQ ID 612
TTCTG
GTTGTCACCTCCCC
SEQ ID
[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)
1317

714
[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].

[LR]([5meC])

SEQ ID 613
CTGG text missing or illegible when filed

CAGTTGTCACCTCC
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].
1318

715
[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)

[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

(T)[sP].[LR](G)

SEQ ID 614
GGAG
GGCCCAGTTGTCA text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)
1319

716
[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR]

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])

[sP].[LR]([5meC])

SEQ ID 615
AGGT text missing or illegible when filed

CAGGCCCAGTTGT text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1320

717
[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 616
GACA text missing or illegible when filed

GGTGCAGGCCCAG
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)
1321

718
[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

([5meC])[sP].[LR]([5meC])

SEQ ID 617
CAACT
CAGGTGCAGGCCC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].
1322

719
[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].

[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)

[sP].[LR](T)[sP].[LR](G)

SEQ ID 618
TGGG text missing or illegible when filed

GCAGCAGGTGCAG
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)
1323

720
[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

[LR]([5meC])

SEQ ID 619
GGCC text missing or illegible when filed

CTGCAGCAGGTGC text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].
1324

721
[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)

[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 620
CACCT
GCACCTCTGCAGCA
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].
1325

722
[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)

[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

(G)[sP].[LR]([5meC])

SEQ ID 621
AGTC text missing or illegible when filed

GTCCGCAGCACTCA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1326

723
[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR]

(A)[sP].[LR]([5meC])

SEQ ID 622
TCTGA
GAGTCCGCAGCAC
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1327

724
[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

(T)[sP].[LR]([5meC])

SEQ ID 623
AGTG text missing or illegible when filed

TGCTGAGTCCGCA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)
1328

725
[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)

[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

([5meC])[sP].[LR](A)

SEQ ID 624
TGCT text missing or illegible when filed

TCTGCTGAGTCCG text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])
1329

726
[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)

[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

(G)[sP].[LR](A)

SEQ ID 625
GCGG
CGGGTCTGCTGAG text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].
1330

727
[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)

[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[LR](G)

SEQ ID 626
GGAC
GCCGGGTCTGCTG text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1331

728
[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)

[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR]([5meC])[sP].[dR]

(G)[sP].[LR](G)[sP].[LR]([5meC])

SEQ ID 627
TCAG text missing or illegible when filed

GGTGGCCGGGTCT
SEQ ID
[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)
1332

729
[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].

[LR]([5meC])[sP].[LR]([5meC])

SEQ ID 628
GTGT text missing or illegible when filed

CTAAGTGGCGTGT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)
1333

730
[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 629
GTCA text missing or illegible when filed

GCCTAAGTGGCGT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)
1334

731
[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

(C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].

[LR]([5meC])

SEQ ID 630
CACA text missing or illegible when filed

TAGCCTAAGTGGC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].
1335

732
[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)

[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

(T)[sP].[LR](A)

SEQ ID 631
CACG text missing or illegible when filed

TGTAGCCTAAGTG text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].
1336

733
[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)

[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR]

([5meC])[sP].[LR](A)

SEQ ID 632
CGCC text missing or illegible when filed

TCTGTAGCCTAAGT
SEQ ID
[LR]([5meC])[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].
1337

734
[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)

[sP].[dR](C)[sP].[LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

(G)[sP].[LR](A)

SEQ ID 633
CCACT
ATTCTGTAGCCTAA
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].
1338

735
[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)

[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].

[LR](A)[sP].[LR](T)

SEQ ID 634
ACTTA
TTATTCTGTAGCCT text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)
1339

736
[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR](T)[sP].[dR](A)[sP].[dR](C)

[sP].[dR](A)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR]

(A)[sP].[LR](A)

SEQ ID 635
TTAG text missing or illegible when filed

GCTTATTCTGTAGC
SEQ ID
[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)
1340

737
[sP].[dR](T)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)

[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR]

(G)[sP].[LR]([5meC])

SEQ ID 636
AGGC
GAGCTTATTCTGTA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)
1341

738
[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR]

(T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].

[LR]([5meC])

SEQ ID 637
GCTA text missing or illegible when filed

TAGAGCTTATTCTG
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)
1342

739
[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR]

(A)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].

[LR](A)

SEQ ID 638
TACA text missing or illegible when filed

GGTAGAGCTTATT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)
1343

740
[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR]

(C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])

[sP].[LR]([5meC])

SEQ ID 639
CAGA
GAGGTAGAGCTTA
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].
1344

741
[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)

[sP].[dR](C)[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].

[LR](T)[sP].[LR]([5meC])

SEQ ID 640
GAAT text missing or illegible when filed

CTGAGGTAGAGCT
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)
1345

742
[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

(A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 641
ATAA text missing or illegible when filed

TTCTGAGGTAGAG text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)
1346

743
[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

(A)[sP].[LR](A)

SEQ ID 642
AAGC
GATTCTGAGGTAG text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1347

744
[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

(C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].

[LR]([5meC])

SEQ ID 643
GCTC text missing or illegible when filed

CAGATTCTGAGGTA
SEQ ID
[LR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)
1348

745
[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].

[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

(T)[sP].[LR](G)

SEQ ID 644
TCTA text missing or illegible when filed

TTCAGATTCTGAGG
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1349

746
[sP].[LR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].

[LR](A)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].

[LR](A)[sP].[LR](A)

SEQ ID 645
TACCT
TCTTCAGATTCTGA
SEQ ID
[LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1350

747
[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR]

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].

[LR](A)

SEQ ID 646
CCTCA
CCTCTTCAGATTCT text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].
1351

748
[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)

[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

(G)[sP].[LR](G)

SEQ ID 647
TCAGA
TGCCTCTTCAGATT
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)
1352

749
[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

(A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])

[sP].[LR](A)

SEQ ID 648
AGAAT
GTTGCCTCTTCAGA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[LR]([5meC])
1353

750
[sP].[dR](T)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].

[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](A)

[sP].[LR](A)[sP].[LR]([5meC])

SEQ ID 649
AATCT
CTGTTGCCTCTTCA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)
1354

751
[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

(G)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 650
TCTGA
CACTGTTGCCTCTT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)
1355

752
[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 651
TGAA text missing or illegible when filed

GACACTGTTGCCT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)
1356

753
[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].

[LR]([5meC])

SEQ ID 652
AAGA
CTGACACTGTTGC text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)
1357

754
[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 653
GAGG
CTCTGACACTGTTG
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].
1358

755
[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)

[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 654
GGCA
GACTCTGACACTGT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)
1359

756
[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

[LR]([5meC])

SEQ ID 655
CAAC text missing or illegible when filed

TGGACTCTGACACT
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].
1360

757
[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)

[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[LR](A)

SEQ ID 656
ACAG text missing or illegible when filed

CATGGACTCTGACA
SEQ ID
[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)
1361

758
[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR]

(G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 657
AGTG text missing or illegible when filed

CCCATGGACTCTGA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)
1362

759
[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].

[LR](G)

SEQ ID 658
TGTCA
TTCCCATGGACTCT
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)
1363

760
[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

[LR](A)

SEQ ID 659
TCAGA
TCTTCCCATGGACT
SEQ ID
[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)
1364

761
[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR]

(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].

[LR](A)

SEQ ID 660
AGAG
CATCTTCCCATGGA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)
1365

762
[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR]

(G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 661
AGTC text missing or illegible when filed

AACATCTTCCCATG
SEQ ID
[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)
1366

763
[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

(A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](T)[sP].

[LR](T)

SEQ ID 662
TGCT text missing or illegible when filed

ATGTGCACCTCTG text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1367

764
[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].

[LR](T)

SEQ ID 663
CTGCA
CTATGTGCACCTCT
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].
1368

765
[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)

[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR](T)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 664
GCAG
GACTATGTGCACCT
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1369

766
[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].

[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)

[sP].[LR]([5meC])

SEQ ID 665
AGAG
CAGACTATGTGCA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)
1370

767
[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR]

(T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 666
AGGT text missing or illegible when filed

CTCAGACTATGTGC
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)
1371

768
[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR]

(G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 667
GTGC text missing or illegible when filed

CACTCAGACTATGT
SEQ ID
[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)
1372

769
[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

[LR](G)

SEQ ID 668
GCAC
AGCACTCAGACTAT
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)
1373

770
[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR]

(G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

[sP].[LR](T)

SEQ ID 669
ACATA
GCAGCACTCAGAC text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)
1374

771
[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

(G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

[LR]([5meC])

SEQ ID 670
ATAG text missing or illegible when filed

CCGCAGCACTCAG text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)
1375

772
[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](G)[sP].

[LR](G)

SEQ ID 671
AGCA text missing or illegible when filed

CTGGTGGCCGGGT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1376

773
[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](G)[sP].[dR](G)

[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].

[LR](A)[sP].[LR](G)

SEQ ID 672
CAGA text missing or illegible when filed

GGCTGGTGGCCGG
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].
1377

774
[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)

[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

([5meC])[sP].[LR]([5meC])

SEQ ID 673
GACC text missing or illegible when filed

AAGGCTGGTGGCC
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].
1378

775
[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].

[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)

[sP].[LR](T)[sP].[LR](T)

SEQ ID 674
CCCG text missing or illegible when filed

GTAAGGCTGGTGG
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].
1379

776
[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)

[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[LR]

(A)[sP].[LR]([5meC])

SEQ ID 675
CGGC text missing or illegible when filed

GAGTAAGGCTGGT
SEQ ID
[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])
1380

777
[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].

[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)

[sP].[LR](T)[sP].[LR]([5meC])

SEQ ID 676
GCCA text missing or illegible when filed

TGGAGTAAGGCTG
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1381

778
[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

(T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[LR](A)

SEQ ID 677
CACCA text missing or illegible when filed

AGTGGAGTAAGGC
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].
1382

779
[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[dR](A)

[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR]

([5meC])[sP].[LR](T)

SEQ ID 678
CCAG text missing or illegible when filed

GGAGTGGAGTAAG
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])
1383

780
[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

(T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR]

(T)[sP].[LR]([5meC])[sP].[LR]([5meC])

SEQ ID 679
AGCC text missing or illegible when filed

GGGGAGTGGAGTA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)
1384

781
[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[LR]([5meC])

SEQ ID 680
CCTTA
CAGGGGAGTGGA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].
1385

782
[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]

(C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR](T)[sP].[LR](G)

SEQ ID 681
TTACT
TCCAGGGGAGTGG
SEQ ID
[LR](T)[sP].[dR](T)[sP].[dR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)
1386

783
[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]

(G)[sP].[LR](A)

SEQ ID 682
ACTCC
CTTCCAGGGGAGT text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)
1387

784
[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].

[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 683
TCCA text missing or illegible when filed

TCCTTCCAGGGGA text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)
1388

785
[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].

[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(G)[sP].[LR](A)

SEQ ID 684
CACTC
TTTCCTTCCAGGGG
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])
1389

786
[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR]

(G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)

[sP].[LR](A)[sP].[LR](A)

SEQ ID 685
CTCCC
TCTTTCCTTCCAGG text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])
1390

787
[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

(A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](A)[sP].

[LR](G)[sP].[LR](A)

SEQ ID 686
CCCCT
GGTCTTTCCTTCCA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].
1391

788
[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)

[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR]

([5meC])[sP].[LR]([5meC])

SEQ ID 687
CCTG text missing or illegible when filed

GTGGTCTTTCCTTC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].
1392

789
[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)

[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

(A)[sP].[LR]([5meC])

SEQ ID 688
TGGA text missing or illegible when filed

CTGTGGTCTTTCCT
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)
1393

790
[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

(A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR]

(A)[sP].[LR](G)

SEQ ID 689
GAAG
CACTGTGGTCTTTC
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)
1394

791
[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])

[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]

(T)[sP].[LR](G)

SEQ ID 690
AGGA
CTCACTGTGGTCTT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)
1395

792
[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 691
GAAA text missing or illegible when filed

TACTCACTGTGGTC
SEQ ID
[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)
1396

793
[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](A)[sP].[dR]

(G)[sP].[LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

[LR](A)

SEQ ID 692
AAGA text missing or illegible when filed

TTTACTCACTGTGG
SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1397

794
[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].

[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](A)[sP].[LR]

(A)[sP].[LR](A)

SEQ ID 693
GACC text missing or illegible when filed

CTTTTACTCACTGT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].
1398

795
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

(A)[sP].[dR](G)[sP].[LR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].

[LR](A)[sP].[LR](G)

SEQ ID 694
CCACA
AACTTTTACTCACT text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].
1399

796
[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)

[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR]

(T)[sP].[LR](T)

SEQ ID 695
ACAGT
GCAACTTTTACTCA
SEQ ID
[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)
1400

797
[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR]

(A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR](G)[sP].

[LR]([5meC])

SEQ ID 696
AGTGT
TGGCAACTTTTACT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)
1401

798
[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR]

(G)[sP].[LR](T)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[LR](A)

SEQ ID 697
TGAG text missing or illegible when filed

CTTGGCAACTTTTA
SEQ ID
[LR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)
1402

799
[sP].[LR](A)[sP].[LR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].

[LR](G)

SEQ ID 698
AGTA text missing or illegible when filed

TCCTTGGCAACTTT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)
1403

800
[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](G)[sP].[dR]

(C)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(G)[sP].[LR](A)

SEQ ID 808
ACAA text missing or illegible when filed

AGGTGCAGGCCCA
SEQ ID
[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1404

901
[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].

[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 809
GGGC
TGCAGCAGGTGCA
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1405

902
(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR](A)}

SEQ ID 810
GCAC text missing or illegible when filed

CACCTCTGCAGCA text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1406

903
[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

(A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](T)[sP].[LR](G)}

SEQ ID 811
GAGG
TCAGACTACGTGCA
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)
1407

904
[sP].[LR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR]

(T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 812
CACG text missing or illegible when filed

CAGCACTCAGACTA
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR]
1408

905
(G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].

[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)}

SEQ ID 813
GTCT text missing or illegible when filed

AGTCCGCAGCACT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1409

906
[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])[sP].

[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 814
GTGC text missing or illegible when filed

CTGCTGAGTCCGCA
SEQ ID
[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1410

907
([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)

[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 815
CGGA text missing or illegible when filed

CCGGGTCTGCTGA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR]
1411

908
(C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].

[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 816
GGTG text missing or illegible when filed

GCAGGCCCAGTTG text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)
1412

909
[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR]

(C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 817
AACT text missing or illegible when filed

GCAGGTGCAGGCC
SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)
1413

910
[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].

[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 818
CTGCA
CCTCTGCAGCAGG text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR]
1414

911
(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])

[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 819
CTGCT
CGTGCACCTCTGCA
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1415

912
(C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].

[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](G)}

SEQ ID 820
TGCA text missing or illegible when filed

ACTACGTGCACCT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)
1416

913
[sP].[LR](G)[sP].[dR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR]

(G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](G)[sP].[LR](T)}

SEQ ID 821
CAGA text missing or illegible when filed

AGACTACGTGCAC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](G)[sP].[dR]
1417

914
(T)[sP].[LR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR]

(T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 822
GGTG text missing or illegible when filed

ACTCAGACTACGT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)
1418

915
[sP].[LR](G)[sP].[dR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[LR](T)}

SEQ ID 823
TGCA text missing or illegible when filed

GCACTCAGACTAC text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](G)[sP].[dR](T)
1419

916
[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

(A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 824
CGTA text missing or illegible when filed

CGCAGCACTCAGA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](G)[sP].[dR](T)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR]
1420

917
(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].

[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR](G)}

SEQ ID 825
TAGT text missing or illegible when filed

TCCGCAGCACTCA text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)
1421

918
[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 826
CTGA text missing or illegible when filed

TGAGTCCGCAGCA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].
1422

919
[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR]

(G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR](A)}

SEQ ID 827
GAGT text missing or illegible when filed

GCTGAGTCCGCAG text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](C)[sP].[dR](T)
1423

920
[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].

[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 828
GCTG text missing or illegible when filed

GTCTGCTGAGTCC text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR]
1424

921
(G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].

[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 829
TGCG text missing or illegible when filed

GGGTCTGCTGAGT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR]([5meC])[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].
1425

922
[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])}

SEQ ID 830
GACT text missing or illegible when filed

GGCCGGGTCTGCT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]
1426

923
(G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](G)[sP].[LR]([5meC])[sP].[LR]([5meC])}

SEQ ID 831
CTCA text missing or illegible when filed

GTGGCCGGGTCTG
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[dR]
1427

924
(A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].

[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 832
CTGG text missing or illegible when filed

AAAGAGCTATATA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR]
1428

925
(A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].

[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)}

SEQ ID 833
GGTTA
TTAAAGAGCTATAT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)
1429

926
[sP].[dR](T)[sP].[LR](A)[sP].[dR](G)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].

[LR](T)[sP].[dR](T)[sP].[LR](T)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 834
TTATA
TATTAAAGAGCTAT
SEQ ID
[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[LR](T)[sP].[dR](A)
1430

927
[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)

[sP].[LR](T)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)}

SEQ ID 835
ATATA
CTTATTAAAGAGCT
SEQ ID
[LR](A)[sP].[LR](T)[sP].[dR](A)[sP].[LR](T)[sP].[dR](A)[sP].[dR](G)[sP].[LR]([5meC])
1431

928
[sP].[LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[R](T)[sP].[dR](T)[sP].[LR](A)[sP].

[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 836
ATAG text missing or illegible when filed

GACTTATTAAAGA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[LR](G)[sP].[dR](C)[sP].[LR](T)[sP].[LR]([5meC])
1432

929
[sP].[dR](T)[sP].[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].

[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 837
AGCT text missing or illegible when filed

CTGACTTATTAAAG
SEQ ID
[LR](A)[sP].[dR](G)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].
1433

930
[dR](T)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)

[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 838
CTCTT
TTCTGACTTATTAA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[LR](T)[sP].[dR](C)[sP].[LR](T)[sP].[LR](T)[sP].[dR](T)[sP].[dR]
1434

931
(A)[sP].[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].

[LR]([5meC])[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 839
CTTTA
CATTCTGACTTATT text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].[dR](A)[sP].[LR]
1435

932
(T)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[LR]([5meC])[sP].[dR](A)

[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](T)[sP].[LR](G)}

SEQ ID 840
TTAAT
ATCATTCTGACTTA
SEQ ID
[LR](T)[sP].[LR](T)[sP].[dR](A)[sP].[LR](A)[sP].[LR](T)[sP].[dR](A)[sP].[LR](A)[sP].
1436

933
[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)

[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)[sP].[LR](T)}

SEQ ID 841
AATAA
GGATCATTCTGACT
SEQ ID
[LR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[dR](T)[sP].
1437

934
[LR]([5meC])[sP].[dR](A)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].

[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR]([5meC])}

SEQ ID 842
TAAGT
AGGGATCATTCTGA
SEQ ID
[LR](T)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].
1438

935
[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)

[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 843
AGTCA
GTAGGGATCATTCT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)
1439

936
[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]

([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 844
TCAGA
AGGTAGGGATCAT
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].
1440

937
[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].

[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 845
AGAAT
AGAGGTAGGGATC
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].
1441

938
[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].

[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 846
AATGA
TCAGAGGTAGGGA
SEQ ID
[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](T)[sP].[dR](C)[sP].
1442

939
[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].

[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 847
ATCCC
AGATTCAGAGGTA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](T)[sP].[LR]
1443

940
(A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

[dR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 848
CCCTA
TCAGATTCAGAGGT
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR]
1444

941
(C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].

[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 849
CTACC
CTTCAGATTCAGAG
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR]
1445

942
(C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR](C)[sP].

[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 850
ACCT text missing or illegible when filed

CTCTTCAGATTCAG
SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR]
1446

943
(G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].

[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 851
CTCTG
GACTCTTCAGATTC
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](A)
1447

944
[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](A)[sP].[dR]

(A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 852
CTGAA
TTGACTCTTCAGAT
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[dR]
1448

945
(C)[sP].[LR](T)[sP].[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](A)[sP].

[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 853
ATCT text missing or illegible when filed

GGTATTGACTCTTC
SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].
1449

946
[dR](G)[sP].[LR](A)[sP].[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)

[sP].[dR](T)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR]([5meC])}

SEQ ID 854
CTGAA
GCGGTATTGACTCT
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR]
1450

947
(A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].

[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 855
GAAG
TGGCGGTATTGAC text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)
1451

948
[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](A)}

SEQ ID 856
AGAG
TCTGGCGGTATTGA
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].
1452

949
[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)

[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 857
AGTCA
ATTCTGGCGGTATT
SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].
1453

950
[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)

[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](T)}

SEQ ID 858
TCAAT
GGATTCTGGCGGT text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](C)[sP].
1454

951
[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[dR](A)

[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 859
TACCC
CCATGGATTCTGG text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].
1455

952
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 860
CCGC text missing or illegible when filed

CCCCATGGATTCTG
SEQ ID
[LR]([5meC])[sP].[dR]([5meC])[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR]
1456

953
(A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].

[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 861
CAGA text missing or illegible when filed

ATCTCCCCATGGAT
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR]
1457

954
(C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].

[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR](T)}

SEQ ID 862
GAAT text missing or illegible when filed

ACATCTCCCCATGG
SEQ ID
[LR](G)[sP].[dR](A)[sP].[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)
1458

955
[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR]

(G)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[LR](T)}

SEQ ID 863
ATCCA
GAACATCTCCCCAT
SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].
1459

956
[dR](G)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR]

(T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 864
ATGG text missing or illegible when filed

TCCAGAACATCTCC
SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)
1460

957
[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR]

(C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 865
GGGG
CCTCCAGAACATCT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](G)[sP].[dR](A)
1461

958
[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](T)[sP].[dR]

(G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 866
GGAG
CCCCTCCAGAACAT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](T)[sP].[dR](G)
1462

959
[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR]

(A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 867
AGAT text missing or illegible when filed

CACCCCTCCAGAA text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](G)[sP].[dR](A)[sP].[dR](T)[sP].[LR](G)[sP].[dR](T)[sP].[dR](T)
1463

960
[sP].[dR](C)[sP].[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR]

(G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](T)[sP].[LR](G)}

SEQ ID 868
ATGTT
GTCACCCCTCCAGA
SEQ ID
[LR](A)[sP].[dR](T)[sP].[dR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)
1464

961
[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR]

(G)[sP].[dR](T)[sP].[dR](G)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 869
GTTCT
TTGTCACCCCTCCA
SEQ ID
[LR](G)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].
1465

962
[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 870
TCTG text missing or illegible when filed

AGTTGTCACCCCTC
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)
1466

963
[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)[sP].[dR]

(C)[sP].[LR](A)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 871
GAGG
GCCCAGTTGTCAC text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](T)
1467

964
[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR]

(T)[sP].[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 872
GGGG
AGGCCCAGTTGTCA
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](T)[sP].[LR](G)[sP].[dR](A)
1468

965
[sP].[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]

(G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 873
CAGCA
CAGTGGCCGGGTC
SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)[sP].[LR]
1469

966
(A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].

[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[LR](T)[sP].[LR](G)}

SEQ ID 874
GCAG text missing or illegible when filed

GCCAGTGGCCGGG
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)
1470

967
[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR]

(C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[LR]([5meC])}

SEQ ID 875
AGAC text missing or illegible when filed

AGGCCAGTGGCCG
SEQ ID
[LR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)
1471

968
[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 876
ACCC text missing or illegible when filed

TGAGGCCAGTGGC
SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)
1472

969
[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[dR](G)[sP].[LR](G)[sP].[dR]

(C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[LR](A)}

SEQ ID 877
CCGG text missing or illegible when filed

AGTGAGGCCAGTG
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].
1473

970
[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)

[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 878
GGCC text missing or illegible when filed

GAAGTGAGGCCAG
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)
1474

971
[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 879
CCACT
ATGAAGTGAGGCC
SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR]
1475

972
(G)[sP].[dR](C)[sP].[dR](C)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](C)[sP].

[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](T)}

SEQ ID 880
ACTG text missing or illegible when filed

GAATGAAGTGAGG
SEQ ID
[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[dR](C)[sP].[dR](C)
1476

973
[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[dR]

(C)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 881
TGGC text missing or illegible when filed

GGGAATGAAGTGA
SEQ ID
[LR](T)[sP].[dR](G)[sP].[dR](G)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1477

974
(C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[LR](A)[sP].

[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR]([5meC])}

SEQ ID 882
GCCT text missing or illegible when filed

AGGGGAATGAAGT
SEQ ID
[LR](G)[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR]
1478

975
(C)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].

[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 883
CTCA text missing or illegible when filed

CCAGGGGAATGAA
SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR]
1479

976
(T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 884
CACTT
TCCCAGGGGAATG text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](A)[sP].[dR](C)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR]
1480

977
(A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])

[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 885
CTTCA
CCTCCCAGGGGAA text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[dR](T)[sP].[dR]
1481

978
(T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR]

(G)[sP].[dR](G)[sP].[dR](G)[sP].[dR](A)[sP].[LR](G)[sP].[LR](G)}

SEQ ID 886
TCATT
TTCCTCCCAGGGGA
SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)
1482

979
[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].

[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 887
ATTCC
CTTTCCTCCCAGGG
SEQ ID
[LR](A)[sP].[dR](T)[sP].[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]
1483

980
([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)

[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)}

SEQ ID 888
TCCCC
GTCTTTCCTCCCAG text missing or illegible when filed

SEQ ID
[LR](T)[sP].[dR](C)[sP].[dR](C)[sP].[dR](C)[sP].[LR]([5meC])[sP].[dR](T)[sP].[dR]
1484

981
(G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].

[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 889
CCCTG
TGGTCTTTCCTCCC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](C)[sP].[dR](C)[sP].[dR](T)[sP].[LR](G)[sP].[dR](G)[sP].[LR]
1485

982
(G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].

[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](A)}

SEQ ID 890
CTGG text missing or illegible when filed

TTTGGTCTTTCCTC text missing or illegible when filed

SEQ ID
[LR]([5meC])[sP].[dR](T)[sP].[dR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR]
1486

983
(G)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].

[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 891
GGGA
ACTTTGGTCTTTCC text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](G)[sP].[LR](G)[sP].[dR](A)[sP].[dR](G)[sP].[dR](G)[sP].[LR](A)
1487

984
[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR]

(A)[sP].[dR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](T)}

SEQ ID 892
GAGG
TCACTTTGGTCTTT text missing or illegible when filed

SEQ ID
[LR](G)[sP].[dR](A)[sP].[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)
1488

985
[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)[sP].[dR]

(A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)}

SEQ ID 893
GGAA
ATTCACTTTGGTCT
SEQ ID
[LR](G)[sP].[dR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](A)
1489

986
[sP].[LR]([5meC])[sP].[dR](C)[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].

[dR](T)[sP].[dR](G)[sP].[dR](A)[sP].[LR](A)[sP].[LR](T)}

SEQ ID 894
AAAG text missing or illegible when filed

TTATTCACTTTGGT text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[LR](G)[sP].[dR](A)[sP].[LR]([5meC])[sP].[LR]
1490

987
([5meC])[sP].[dR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]

(G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 895
AGAC text missing or illegible when filed

GTTTATTCACTTTG text missing or illegible when filed

SEQ ID
[LR](A)[sP].[LR](G)[sP].[dR](A)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](A)[sP].[dR]
1491

988
(A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].

[LR](T)[sP].[dR](A)[sP].[LR](A)[sP].[LR](A)[sP].[LR]([5meC])}

SEQ ID 896
CAAA text missing or illegible when filed

AGCTGTTTATTCAC
SEQ ID
[LR]([5meC])[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](G)[sP].[LR](T)[sP].[dR]
1492

989
(G)[sP].[LR](A)[sP].[dR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].

[dR](C)[sP].[dR](A)[sP].[LR](G)[sP].[LR]([5meC])[sP].[LR](T)}

SEQ ID 897
AAGT text missing or illegible when filed

GAAGCTGTTTATTC
SEQ ID
[LR](A)[sP].[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR](A)[sP].[dR](A)[sP].
1493

990
[LR](T)[sP].[LR](A)[sP].[dR](A)[sP].[dR](A)[sP].[dR](C)[sP].[LR](A)[sP].[dR](G)

[sP].[LR]([5meC])[sP].[dR](T)[sP].[LR](T)[sP].[LR]([5meC])}

SEQ ID 898
GTGA text missing or illegible when filed

TTGAAGCTGTTTAT
SEQ ID
[LR](G)[sP].[LR](T)[sP].[dR](G)[sP].[LR](A)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].
1494

991
[dR](A)[sP].[LR](A)[sP].[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].

[LR](T)[sP].[dR](T)[sP].[dR](C)[sP].[LR](A)[sP].[LR](A)}

SEQ ID 899
GAAT text missing or illegible when filed

ACTTGAAGCTGTTT
SEQ ID
[LR](G)[sP].[dR](A)[sP].[dR](A)[sP].[LR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].
1495

992
[LR]([5meC])[sP].[dR](A)[sP].[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[LR](T)[sP].

[dR](C)[sP].[LR](A)[sP].[dR](A)[sP].[LR](G)[sP].[LR](T)}

SEQ ID 900
ATAA text missing or illegible when filed

GCACTTGAAGCTG text missing or illegible when filed

SEQ ID
[LR](A)[sP].[dR](T)[sP].[LR](A)[sP].[dR](A)[sP].[LR](A)[sP].[dR](C)[sP].[LR](A)[sP].
1496

993
[dR](G)[sP].[dR](C)[sP].[LR](T)[sP].[dR](T)[sP].[LR]([5meC])[sP].[dR](A)[sP].

[LR](A)[sP].[dR](G)[sP].[dR](T)[sP].[LR](G)[sP].[LR]([5meC])}

Helm Annotation Key:

[LR](G) is a beta-D-oxy-LNA guanine nucleoside,

[LR](T) is a beta-D-oxy-LNA thymine nucleoside,

[LR](A) is a beta-D-oxy-LNA adenine nucleoside,

[LR]([5meC]) is a beta-D-oxy-LNA 5-methyl cytosine nucleoside,

[dR](G) is a DNA guanine nucleoside,

[dR](T) is a DNA thymine nucleoside,

[dR](A) is a DNA adenine nucleoside,

[dR](C) is a DNA cytosine nucleoside,

[mR](G) is a 2′-O-methyl RNA guanine nucleoside,

[mR](U) is a 2′-O-methyl RNA uracil nucleoside,

[mR](A) is a 2′-O-methyl RNA adenine nucleoside,

[mR](C) is a 2′-O-methyl RNA cytosine nucleoside,

[sP] is a phosphorothioate internucleoside linkage.

"text missing or illegible when filed" indicates data missing or illegible when filed.
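The annotation key above lends itself to mechanical reading. The following is a minimal sketch only, assuming that each monomer is written as `[sugar](base)` and that monomers are joined by `[sP].`; the names `parse_helm` and `base_sequence`, and the short example string, are illustrative assumptions and are not taken from the filing.

```python
import re

# One monomer: a bracketed sugar code followed by a base in parentheses.
# The base is either a single letter or the bracketed modified base [5meC].
MONOMER = re.compile(r"\[(LR|dR|mR)\]\((\[5meC\]|[ACGTU])\)")

# Sugar codes as defined in the annotation key.
SUGARS = {
    "LR": "beta-D-oxy-LNA",
    "dR": "DNA",
    "mR": "2'-O-methyl RNA",
}


def parse_helm(helm):
    """Return a list of (sugar description, base) pairs, in 5' to 3' order."""
    return [(SUGARS[sugar], base.strip("[]")) for sugar, base in MONOMER.findall(helm)]


def base_sequence(helm):
    """Return the plain nucleobase sequence, writing 5-methyl cytosine as C."""
    return "".join("C" if base == "5meC" else base for _, base in parse_helm(helm))


# Hypothetical five-monomer string written in the notation of the key.
example = "[LR](G)[sP].[dR](T)[sP].[dR](C)[sP].[LR]([5meC])[sP].[LR](T)"
print(parse_helm(example))
print(base_sequence(example))
```

The sketch ignores the linkage tokens, because the key defines only one linkage type; a string containing a monomer outside the key is silently skipped rather than rejected.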

SEQUENCES

HAMSTER

SEQ ID 1: Hamster XBP1 gene

ATGGTGGTGGTGGCAGCGTCGCCGAGCGCGGCCACGGCGGCCCCGAAAGTACTGCTTCTATCGGGCCAGCCC

GCCGCGGACGGCCGGGCGCTGCCACTCATGGTTCCAGGCTCGCGGGCAGCAGGGTCCGAGGCGAACGGGGC

GCCACAGGCTCGCAAGCGGCAGCGCCTCACGCACCTGAGCCCGGAGGAGAAGGCGCTGCGGAGgtgggctcgg

cgggcggggcggcaaggccgggcatgggaccctttctcgtgtggcggtcgggagggctctgtggggtggcgtagatgagcctctagtacctattt

ctggagggaggcacggagctgaggtgacagcccctccgaaggtctgcttagtctgtgtcggggagtctaacacttgtcagacgggacctgacgc

tcagccctctgtgaatgcttgctcttcttggaggacccatggcagggtccgctctggctgttgttgcagccgcttgggaacttaacactgggatccg

agtcaccatcctccggcagcccgagttgagcttggggagggacggttggtagcgcccccgccgccttcacggagcctgttggacagaatcggaa

ctagaaagccgcgggggaggagggaagatgcttatgacgcaacgggaatgtgtgtcagcccggtggtaaaataagactcgagtggacagcaa

catgggagagaatcgagcaagtcttcaaggcccacgggcagaaaagctgtggtttttgtctttttgagaggaggagcctcagaatgtgtttacca

ctgtttagtcttattctgtaaagtcagcgaaagcaccagctggccacatttacaaatgaagatacaggaaagctgaagatgactcggttcgttat

gtgccctgtcttccttcagGAAACTGAAAAACAGAGTAGCAGCGCAGACTGCCCGAGATCGAAAGAAAGCCCGGAT

GAGCGAGCTGGAACAGCAAGTGGTGGATTTGGAAGAAGAGgtaaagggatttaaggccatgctttcttctctgcccattcta

agctgctgcagccctttagaatacaactaaagtgccatttaaagtttaactagcttagcagataggtggtgaaggcagacatgactcactcctga

cagctagatactatcgatagaagttgctcagagattagccaggtcagatagatcctggcttaaccttcagtactcttgctcttgccaaaggctcac

tagaattgccttccttctagggttctcttgttatctaatctgagcaagggctattgttttaaaagttttaatcatcagctggttcttagaagaaatgtg

ggtcatatcagtagcagtttaaaaaaaatattttgttaggtatagcccaccattcccactttgtttttatactcagcatacagagtattaggacattt

tcaaacagcgtgttttagttaattgattcttcctgccattttccctacacccccagtatccttttaccttctcttggacttctagttgttttttaaggcc

ttacacacatttacatccattcatatgcattcacactctcacacacagtaaggtctacatatgcaagaaactcttggttctgtttgggccacctcactt

aaaatatttaacaaatctacacatcttcctgccaacttctattttctttatagccgagtaacattcttctgtgcacatgtaccatattttcatctgtttc

attggtgtctcccaattgctggtgttacaggcatgagccacccatgctagttttatgtagagctggaggctgaacccagggcttcatgtgtagtag

ggcaagcactcttaccaactgatctacaccattagccaccagtgttgcaacagttatgaacgactgcatatgcacagaatttatcagttcaatga

ggaaaccaactgtaacaaatcacgttttaatagcctcttctggattttcttacagAACCAAAAACTTCTGTTAGAAAATCAGCTTTT

GAGAGAGAAAACTCATGGCCTTGTAATTGAGAACCAGGAGTTAAGAACTCGCTTGGGAATGGATGTGCTGAC

TACTGAAGAGGCTCCAGAGACGGAGTCCAAGgtaaatcttatgagacttggttgtgacatgaacggattgtatttgtgatcccaac

ctctatcaagccttccttttctcttttccttcttttgagacagggtcttaatttcttaattttggatggtcttgaaattgtatcagttttatggcctctg

cctccaaagtaatggaactagacatgtgccaccatgcctagctgatcagtcttgaaaatttctccacatttccaacagacctgttcagtcttcagtgac

tcattcttcaagtgtgtaatgaagtgttactaagccctaataatcctaataatttacatagctctctcagaataagtgctaacaccagtagccagca

agctataccatgcaggcatcaaatagaatgagactgtaagggctagtcagatttgggagattttgatcttgttttgagacagagtctctgtatata

attaacccaggttggctttggactcatcctctggccatagcctcccaggtgctgggattttaggcactacaattggcttgtttcctggacttttgaca

gccctcatgtggcctaggttggtcttaaacttgatatgttagctgataattctgtctctgctttccaagtgttaagatacgggcacatactactttatc

tggcggagttatgtaggcatggtgtttgtgtacatgagtatcttactaaatctggagctaggctggtggctagcaaatcctggtgatcctcttgtctc

tgtctccctcagtgttggggttatacaggcacaactgtcatgctccaaattttacattgatgcttgcctaacaagcaggcttatgctctgagccacct

cccatagcctggtgtgcatttccttggagtgttccctcactttggtctttccttccagGGAAATGGAGTAAGGCCGGTGGCCGGGTCT

GCTGAGTCCGCAGcactcagactacgtgcacctctgcagCAGGTGCAGGCCCAGTTGTCACCTCCCCAGAACATCTTCC

CATGGATTCTGACACTGTTGACTCTTCAGACTCCGAGgtagagcttgtttgccttactaaagcactgtgtaagattggctcattct

gtagtatatatatgatgtgtgacatgcctagccaggcaaatggagaaagaagttagtattggtagggttaggggtaagcagtcactttcttaattt

ccagtggtttaggtcatggagtcgggagaagctgttctgatgggtgtgtccttcgatctgacagcataaggcctaactgacattgtggaactcagt

actaagtgtttctggtagaccatcacattctaatagtgaactttttttgtcttacctcttgcagTCTGATATCCTTTTGGGCATTCTGGAC

AAGTTGGACCCTGTCATGTTTTTCAAATGTCCATCCCCAGAGTCTGCCAATCTGGAGGAACTCCCAGAGGTCTA

CCCAGGACCTAGTTCCTTACCAGCCTCCCTTTCTCTGTCAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTA

ATGAACTCATTCGCTTTGACCATGTATACACCAAGCCTCTAGTCTTAGAGATCCCTTCTGAGACAGAGAGTCAA

ACTAATGTGGTAGTGAAAATTGAGGAAGCACCTCTCAGCTCTTCAGAGGAGGATCACCCTGAATTCATTGTCT

CAGTGAAGAAAGAACCTTTGGAAGAAGACTTCATTCCAGAGCCGGGCATCTCAAACCTGCTTTCATCCAGCCA

CTGTCTGAAACCATCTTCCTGCCTGCTGGATGCTTATAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCTTCA

GTGACATGTCTTCTCCACTTGGTATAGACCATTCTTGGGAGGACACTTTTGCCAATGAACTCTTTCCCCAGCTA

ATTAGTGTCTAA

SEQ ID 2: Hamster Xbp1-202 (Xbp-1u)

ATGGTGGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCGAAAGTACTGCTTCTATCGGGCCAGC

CCGCCGCGGACGGCCGGGCGCTGCCACTCATGGTTCCAGGCTCGCGGGCAGCAGGGTCCGAGGCGAACGG

GGCGCCACAGGCTCGCAAGCGGCAGCGCCTCACGCACCTGAGCCCGGAGGAGAAGGCGCTGCGGAGGAAA

CTGAAAAACAGAGTAGCAGCGCAGACTGCCCGAGATCGAAAGAAAGCCCGGATGAGCGAGCTGGAACAGC

AAGTGGTGGATTTGGAAGAAGAGAACCAAAAACTTCTGTTAGAAAATCAGCTTTTGAGAGAGAAAACTCA

TGGCCTTGTAATTGAGAACCAGGAGTTAAGAACTCGCTTGGGAATGGATGTGCTGACTACTGAAGAGGCT

CCAGAGACGGAGTCCAAGGGAAATGGAGTAAGGCCGGTGGCCGGGTCTGCTGAGTCCGCAGCACTCAGAC

TACGTGCACCTCTGCAGCAGGTGCAGGCCCAGTTGTCACCTCCCCAGAACATCTTCCCATGGATTCTGAC

ACTGTTGACTCTTCAGACTCCGAGTCTGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGT

TTTTCAAATGTCCATCCCCAGAGTCTGCCAATCTGGAGGAACTCCCAGAGGTCTACCCAGGACCTAGTTC

CTTACCAGCCTCCCTTTCTCTGTCAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATT

CGCTTTGACCATGTATACACCAAGCCTCTAGTCTTAGAGATCCCTTCTGAGACAGAGAGTCAAACTAATG

TGGTAGTGAAAATTGAGGAAGCACCTCTCAGCTCTTCAGAGGAGGATCACCCTGAATTCATTGTCTCAGT

GAAGAAAGAACCTTTGGAAGAAGACTTCATTCCAGAGCCGGGCATCTCAAACCTGCTTTCATCCAGCCAC

TGTCTGAAACCATCTTCCTGCCTGCTGGATGCTTATAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCT

TCAGTGACATGTCTTCTCCACTTGGTATAGACCATTCTTGGGAGGACACTTTTGCCAATGAACTCTTTCC

CCAGCTAATTAGTGTCTAA

SEQ ID 3: Hamster predicted protein from SEQ ID 2

MVVVAAAPSAATAAPKVLLLSGQPAADGRALPLMVPGSRAAGSEANGAPQARKRQRLTHLSPEEKALRRKLKNR

VAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVIENQELRTRLGMDVLTTEEAPETESKGNG

VRPVAGSAESAALRLRAPLQQVQAQLSPPQNIFPWILTLLTLQTPSLISFWAFWTSWTLSCFSNVHPQSLPIWRNS

QRSTQDLVPYQPPFLCQWGPHQPSWKPLMNSFALTMYTPSL
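The relationship between a coding sequence and its predicted protein, such as SEQ ID 2 and SEQ ID 3, can be checked by translation. The following is a minimal sketch, assuming the standard genetic code and reading from the first nucleotide; the function name `translate` is an illustrative assumption, and the input shown is only the first 27 nucleotides of SEQ ID 2.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 27 nucleotides of SEQ ID 2 as listed above.
seq_id_2_start = "ATGGTGGTGGTGGCAGCGGCGCCGAGC"
print(translate(seq_id_2_start))  # MVVVAAAPS, the first nine residues of SEQ ID 3
```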

SEQ ID 4: Hamster Xbp1-201 (Xbp-1s)

ATGGTGGTGGTGGCAGCGTCGCCGAGCGCGGCCACGGCGGCCCCGAAAGTACTGCTTCTATCGGGCCAGC

CCGCCGCGGACGGCCGGGCGCTGCCACTCATGGTTCCAGGCTCGCGGGCAGCAGGGTCCGAGGCGAACGG

GGCGCCACAGGCTCGCAAGCGGCAGCGCCTCACGCACCTGAGCCCGGAGGAGAAGGCGCTGCGGAGGAAA

CTGAAAAACAGAGTAGCAGCGCAGACTGCCCGAGATCGAAAGAAAGCCCGGATGAGCGAGCTGGAACAGC

AAGTGGTGGATTTGGAAGAAGAGAACCAAAAACTTCTGTTAGAAAATCAGCTTTTGAGAGAGAAAACTCA

TGGCCTTGTAATTGAGAACCAGGAGTTAAGAACTCGCTTGGGAATGGATGTGCTGACTACTGAAGAGGCT

CCAGAGACGGAGTCCAAGGGAAATGGAGTAAGGCCGGTGGCCGGGTCTGCTGAGTCCGCAGCAGGTGCAG

GCCCAGTTGTCACCTCCCCAGAACATCTTCCCATGGATTCTGACACTGTTGACTCTTCAGACTCCGAGTC

TGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGTTTTTCAAATGTCCATCCCCAGAGTCT

GCCAATCTGGAGGAACTCCCAGAGGTCTACCCAGGACCTAGTTCCTTACCAGCCTCCCTTTCTCTGTCAG

TGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATTCGCTTTGACCATGTATACACCAAGCC

TCTAGTCTTAGAGATCCCTTCTGAGACAGAGAGTCAAACTAATGTGGTAGTGAAAATTGAGGAAGCACCT

CTCAGCTCTTCAGAGGAGGATCACCCTGAATTCATTGTCTCAGTGAAGAAAGAACCTTTGGAAGAAGACT

TCATTCCAGAGCCGGGCATCTCAAACCTGCTTTCATCCAGCCACTGTCTGAAACCATCTTCCTGCCTGCT

GGATGCTTATAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCTTCAGTGACATGTCTTCTCCACTTGGT

ATAGACCATTCTTGGGAGGACACTTTTGCCAATGAACTCTTTCCCCAGCTGATTAGTGTCTAA

SEQ ID 5: Hamster predicted protein from SEQ ID 4

MVVVAASPSAATAAPKVLLLSGQPAADGRALPLMVPGSRAAGSEANGAPQARKRQRLTHLSPEEKALRRKLKNR

VAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVIENQELRTRLGMDVLTTEEAPETESKGNG

VRPVAGSAESAAGAGPVVTSPEHLPMDSDTVDSSDSESDILLGILDKLDPVMFFKCPSPESANLEELPEVYPGPSSLP

ASLSLSVGTSSAKLEAINELIRFDHVYTKPLVLEIPSETESQTNVVVKIEEAPLSSSEEDHPEFIVSVKKEPLEEDFIPEPGI

SNLLSSSHCLKPSSCLLDAYSDCGYEGSPSPFSDMSSPLGIDHSWEDTFANELFPQLISV

SEQ ID 6: Hamster XBP1 delta 4

ATGGTGGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCGAAAGTACTGCTTCTATCGGGCCAGC

CCGCCGCGGACGGCCGGGCGCTGCCACTCATGGTTCCAGGCTCGCGGGCAGCAGGGTCCGAGGCGAACGG

GGCGCCACAGGCTCGCAAGCGGCAGCGCCTCACGCACCTGAGCCCGGAGGAGAAGGCGCTGCGGAGGAAA

CTGAAAAACAGAGTAGCAGCGCAGACTGCCCGAGATCGAAAGAAAGCCCGGATGAGCGAGCTGGAACAGC

AAGTGGTGGATTTGGAAGAAGAGAACCAAAAACTTCTGTTAGAAAATCAGCTTTTGAGAGAGAAAACTCA

TGGCCTTGTAATTGAGAACCAGGAGTTAAGAACTCGCTTGGGAATGGATGTGCTGACTACTGAAGAGGCT

CCAGAGACGGAGTCCAAGTCTGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGT

TTTTCAAATGTCCATCCCCAGAGTCTGCCAATCTGGAGGAACTCCCAGAGGTCTACCCAGGACCTAGTTC

CTTACCAGCCTCCCTTTCTCTGTCAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATT

CGCTTTGACCATGTATACACCAAGCCTCTAGTCTTAGAGATCCCTTCTGAGACAGAGAGTCAAACTAATG

TGGTAGTGAAAATTGAGGAAGCACCTCTCAGCTCTTCAGAGGAGGATCACCCTGAATTCATTGTCTCAGT

GAAGAAAGAACCTTTGGAAGAAGACTTCATTCCAGAGCCGGGCATCTCAAACCTGCTTTCATCCAGCCAC

TGTCTGAAACCATCTTCCTGCCTGCTGGATGCTTATAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCT

TCAGTGACATGTCTTCTCCACTTGGTATAGACCATTCTTGGGAGGACACTTTTGCCAATGAACTCTTTCC

CCAGCTAATTAGTGTCTAA

SEQ ID 7: Hamster predicted protein from SEQ ID 6

MVVVAAAPSAATAAPKVLLLSGQPAADGRALPLMVPGSRAAGSEANGAPQARKRQRLTHLSPEEKALRRKLKNR

VAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVIENQELRTRLGMDVLTTEEAPETESKSDILL

GILDKLDPVMFFKCPSPESANLEELPEVYPGPSSLPASLSLSVGTSSAKLEAINELIRFDHVYTKPLVLEIPSETESQTNV

VVKIEEAPLSSSEEDHPEFIVSVKKEPLEEDFIPEPGISNLLSSSHCLKPSSCLLDAYSDCGYEGSPSPFSDMSSPLGIDHS

WEDTFANELFPQLISV

MOUSE

SEQ ID 590: Mouse XBP1 gene

CTAGGGTAAAACCGTGAGACTCGGTCTGGAAATCTGGCCTGAGAGGACAGCCTGGCAATCCTCAGCCGGGGT

GGGGACGTCTGCCGAAGATCCTTGGACTCCAGCAACCAGTGGTCGCCACCGTCCATCCACCCTAAGGCCCAGT

TTGCACGGCGGAGAACAGCTGTGCAGCCACGCTGGACACTCACCCCGCCCGAGTTGAGCCCGCCCCCGGGAC

TACAGGACCAATAAGTGATGAATATACCCGCGCGTCACGGAGCACCGGCCAATCGCGGACGGCCACGACCCT

AGAAAGGCTGGGGGGGGCAGGAGGCCACGGGGCGGTGGGGGCGCTGGCGTAGACGTTTCCTGGCTATGGT

GGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCCAAAGTGCTACTCTTATCTGGCCAGCCCGCCTC

CGGCGGCCGGGCGCTGCCGCTCATGGTACCCGGTCCGCGGGCAGCAGGGTCGGAGGCGAGCGGGACACCG

CAGGCTCGCAAGCGGCAGCGGCTCACGCACCTGAGCCCGGAGGAGAAAGCGCTGCGGAGGTGGGCCCGGC

GGGCAAGGCTGGGGCGCGGGGCGGCAGGACTGGGATTGGGACTCTCTCGTGTGTGCCAGCTGGTGGGCTCC

GTACGGTGGGTTAGATTCACCTCTAGTGTCTAACCTGGGAAGCGGAGCTGAGGGGGATGCCCCTCCGAAGGT

CTGCGTCGGGGGTGTGTGCAGGAGCTCCCGACACAGGCACAGAAGAAGGTGCCCGACGCCCAGTCCTCTGTA

AATGCTCGCTCTTTGTGGTCGTAGGGTAGGAACCGCTCCAGCTGTCATTGCAGCCACTTGGGAACCCCACCCT

GGGAACCGAGTCCACAGCGTCCGGCATCCCGAGAGTTTGGCTTGGGGAGGGACAGTTGGTAGCGTCCCCGCC

GCCTTCACGGATATCGCTCTAGCAAGGAGCCTGTGGGACGGAATTGGACCCAGAAAGTAGCGGGGGAGGAG

GGAAGAAGCATATGACGCAACGGGAATGTATCAGCCCGGTGGTAAAATGAGATCCGGGTGGACAGCCGCAC

GGGAGAGAATCAAGCAAGTCTTCAAGGCCTGTGGATAGAAAGCAGCGTGTGTATGCGTGTGCGTGTGCGTTT

TGATAGGAGCTTTAAGCGTGTTTACTTGCTAAGCCTTATTCTGTAAAGTCAACGAAAGCACCAGCTGGCCACG

TCTACAAATGAAGACACATGAAAGCTGGAGATGACTCAGTTATGTTCCCTGTCTCCTCCCCAAGGAAACTGAA

AAACAGAGTAGCAGCGCAGACTGCTCGAGATAGAAAGAAAGCCCGGATGAGCGAGCTGGAGCAGCAAGTG

GTGGATTTGGAAGAAGAGGTAAAGGGACTTCAGGCCATGCTTTCATCCCATCCATATCAGGGCCCATCCTAAA

CTGCTTCAGCCCTTTAGAATACAACCCAAAGTGCCATTTAAAGTTTAACCAGCCTAGCAGATAGGCCGTGAAA

GCAGACGTGACTCACCCTGGCCTGCCCTCCCCTCGGAGATTAGCCAGGTTGGATAGATCATTGGTTGCTTAAG

CTGTAGCGCCGCCTGTCTTTGCCAAAGGCTCACTAACGCTGCCCTTCCTTCTGGGATCCCCCCCCCCCCGCGCG

CCCCCAATCCTCCCACCCTCTGTATCCTTTCTGCTGTCAGTGCCCTTTTGTGCCCCTCCACCCCGGCATCCTTTTA

CCCTTTGGGGAGTTATTTTAGTTTCTAAGTTAAGTTTAGTTAACTTTAGCTATTTCTAGCGTTTCTAGGCATTGC

CACATTTACGTCCATTTATATGCGCACGTGCGCCCTGGTTTGAGTTTGGGTCACCTCACTTTGTAATACACTTTC

CAAATTTATACATTTTCCCTGCTAGTTTCCTTTCTCTATACAGGCGAGTGGTACCTCACTGTGTGTGCACCCCAC

TTTCACGGTTCTCTGGGCATCTGTGCTCAGCATCTAGGCTGCCACCATTTCTTTGCCATTGGACCACTACCACTT

GCACCAACACTTGCCATTTCAAGACAGGATGGTGAATTATTTAAAGATTATTTTTAGATAGGGTCTTAGGTTGG

CCTGTAACTCATGGCATGCCTCCTGTTTTACCATGCTGACATTACAGGCAGTGAACCACCTTGCCATACTTTTTT

TTTTTAAAGGTAGTGTATTAACACAACTGTAAATTCAAGCTGCAAGTGACCTTTTTTTTTGGCTGAAATCTGCG

AGTAGTACTTGTAGGCATTATGTTGTTTCTGTCACCATTGAAAACACTTTTGTTTTCTTCAGAGATTGGCCTTGA

ATAAACTTGCTTCTCCCGCCTCAGCCTGCTTGAGTGTTCAATGGCATTTTTGGGGGGACAGCTTGATGTCTCCC

AGGCTGTGCTCTAACTTGCTGTGTAGCCAAAGATGACCCCAAATTTGTTTCTCTTGCTGCTATGTCCCAGGTGC

TGGGATTACAGTTTATGCAGAGCTGAAGATGGAGCCCAGGGCTGCAAGCCTGGGAGGGCAGGCCTTCTCCCA

ACTCCTCTGTCCCATTAGCCACCGGTGACAGAATGGCTGTGACCCGCACCAGCAGGGAAACAGCTGGAGCAG

AACTTGCAGTGGATTCTTTAGTGACGGAACCACACGGTCTAACCGCACGGCCTCTTATGTGATTCCTTACAGAA

CCACAAACTCCAGCTAGAAAATCAGCTTTTACGGGAGAAAACTCACGGCCTTGTGGTTGAGAACCAGGAGTTA

AGAACACGCTTGGGAATGGACACGCTGGATCCTGACGAGGTTCCAGAGGTGGAGGCCAAGGTAAGTATTGG

GAGACCTGGCTGCAGCACTACCTGGCTGCAGGTTTGTGTTCTGGACCTCCAATCAAATCCTTTTCTCTTTTCCTT

TATGAGACAAGGTCTTAATGTCTAATTTTGGCTGGTCTTGAACTTGTGTCAGTTCTTTTGCTTCTAAGTAGTAG

GACTATAAGCACCTGCCCCTGTGCCTAGCTGAGGAATCCTGAATTTTCCCTGTTTCCTTGAACTAAACTTATGAT

CTTCTTGCCTTAGCCTTCCAAGCGCTGGAATTACATGCATGAACAAGTGGTTTGTTTCTTGGCTTTTTTGGGGG

ATAGGGTGTCATGTAGTCCAGGTTGGCCTCAAACTTGCTCTGTAGCTGATAATCCTACCTCCACCTTCCAGATG

TTACCATTACAGGCAGATGTTCCTTTGTGTGGTTATGTAGGTGTGTATGTGTACATGGGTGTGGGTTTATACAC

ATCTCTGCTTACGTACAGAGGCCTAAGGAGCATATAGATGTCTTGCCCTAGCACTGTCCACCCTGCTCCTCTGC

AGCAGAGTGTCTCACTGAATCTGGGGCTAGGCAGGTGGACAGCAAGCCCTGGTGAACTTCCTGTTTCTGCCTC

CCTTGATGCTGAGGATTTGAACTTGGGTCTTCAGGATTGTACAGCAAGCACATTATATTCAGAGCCACCTCCCC

AGTTCCTTTCGAGCCCTTTGAGGAGCAGAGACTCACAGCTACCCAGCATGTATATCCTTGGCAACTTTTACTCA

CTGTGGTCTTTCCTTCCAGGGGAGTGGAGTAAGGCTGGTGGCCGGGTCTGCTGAGTCCGCAGCACTCAGACT

ATGTGCACCTCTGCAGCAGGTGCAGGCCCAGTTGTCACCTCCCCAGAACATCTTCCCATGGACTCTGACACTGT

TGCCTCTTCAGATTCTGAGGTAGAGCTTATTCTGTAGCCTAAGTGGCGTGTGACACGCTTAGCCAGGCAAACG

GAGAAGTTAGTATTGGTGGGGTTAGGATTAAGCACTTTCCTAGTCTGCTTAAGTGGATGGAGTAGGGGGAAA

CTGTTCCGTGGGTGGGTCCTATGATCTGAGAGCATAAGTCTGGTGGATGGCTGGGTCCTGTGATCTGAGAGT

GTAAGCCCTAAGTAACATTGTGGAACCCAGTACTAAAAGTATTTCTGGTAGACTGTCACATTCATTCTAATAGT

GAACTCTTTTGTGTTTTGCCTCTTGTAGTCTGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGTT

TTTCAAATGTCCTTCCCCAGAGTCTGCTAGTCTGGAGGAACTCCCAGAGGTCTACCCAGAAGGACCTAGTTCCT

TACCAGCCTCCCTTTCTCTGTCAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATTCGTTTT

GACCATGTATACACCAAGCCTCTAGTTTTAGAGATCCCCTCTGAGACAGAGAGTCAAACTAACGTGGTAGTGA

AAATTGAGGAAGCACCTCTAAGCTCTTCAGAAGAGGATCACCCTGAATTCATTGTCTCAGTGAAGAAAGAGCC

TTTGGAAGATGACTTCATCCCAGAGCTGGGCATCTCAAACCTGCTTTCATCCAGCCATTGTCTGAGACCACCTT

CTTGCCTGCTGGACGCTCACAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCTTCAGTGACATGTCTTCTCCA

CTTGGTACAGACCACTCCTGGGAGGATACTTTTGCCAATGAACTTTTCCCCCAGCTGATTAGTGTCTAAAGAGC

CACATAACACTGGGCCCCTTTCCCTGACCATCACATTGCCTAGAGGATAGCATAGGCCTGTCTCTTTCGTTAAA

AGCCAAAGTAGAGGCTGTCTGGCCTTAGAAGAATTCCTCTAAAGTATTTCAAATCTCATAGATGACTTCCAAGT

ATTGTCGTTTGACACTCAGCTGTCTAAGGTATTCAAAGGTATTCCAGTACTACAGCTTTTGAGATTCTAGTTTAT

CTTAAAGGTGGTAGTATACTCTAAATCGCAGGGAGGGTCATTTGACAGTTTTTTCCCAGCCTGGCTTCAAACTA

TGTAGCCGAGGCTAGGCAGAAACTTCTGACCCTCTTGACCCCACCTCCCAAGTGCTGGGCTTCACCAGGTGTG

CACCTCCACACCTGCCCCCCCGACATGTCAGGTGGACATGGGATTCATGAATGGCCCTTAGCATTTCTTTCTCC

ACTCTCTGCTTCCCAGGTTTCGTAACCTGAGGGGGCTTGTTTTCCCTTATGTGCATTTTAAATGAAGATCAAGA

ATCTTTGTAAAATGATGAAAATTTACTATGTAAATGCTTGATGGATCTTCTTGCTAGTGTAGCTTCTAGAAGGT

GCTTTCTCCATTTATTTAAAACTACCCTTGCAATTAAAAAAAAAGCAACACAGCGTCCTGTTCTGTGATTTCTAG

GGCTGTTGTAATTTCTCTTTATTGTTGGCTAAAGGAGTAATTTATCCAACTAAAGTGAGCATACCACTTTTTAAA

GTCA

SEQ ID 591: Mouse Xbp1, transcript variant 1, mRNA (not processed by IRE1)

CTAGGGTAAAACCGTGAGACTCGGTCTGGAAATCTGGCCTGAGAGGACAGCCTGGCAATCCTCAGCCGGGGT

GGGGACGTCTGCCGAAGATCCTTGGACTCCAGCAACCAGTGGTCGCCACCGTCCATCCACCCTAAGGCCCAGT

TTGCACGGCGGAGAACAGCTGTGCAGCCACGCTGGACACTCACCCCGCCCGAGTTGAGCCCGCCCCCGGGAC

TACAGGACCAATAAGTGATGAATATACCCGCGCGTCACGGAGCACCGGCCAATCGCGGACGGCCACGACCCT

AGAAAGGCTGGGCGCGGCAGGAGGCCACGGGGCGGTGGCGGCGCTGGCGTAGACGTTTCCTGGCTATGGT

GGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCCAAAGTGCTACTCTTATCTGGCCAGCCCGCCTC

CGGCGGCCGGGCGCTGCCGCTCATGGTACCCGGTCCGCGGGCAGCAGGGTCGGAGGCGAGCGGGACACCG

CAGGCTCGCAAGCGGCAGCGGCTCACGCACCTGAGCCCGGAGGAGAAAGCGCTGCGGAGGAAACTGAAAAA

CAGAGTAGCAGCGCAGACTGCTCGAGATAGAAAGAAAGCCCGGATGAGCGAGCTGGAGCAGCAAGTGGTG

GATTTGGAAGAAGAGAACCACAAACTCCAGCTAGAAAATCAGCTTTTACGGGAGAAAACTCACGGCCTTGTG

GTTGAGAACCAGGAGTTAAGAACACGCTTGGGAATGGACACGCTGGATCCTGACGAGGTTCCAGAGGTGGA

GGCCAAGGGGAGTGGAGTAAGGCTGGTGGCCGGGTCTGCTGAGTCCGCAGCACTCAGACTATGTGCACCTCT

GCAGCAGGTGCAGGCCCAGTTGTCACCTCCCCAGAACATCTTCCCATGGACTCTGACACTGTTGCCTCTTCAGA

TTCTGAGTCTGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGTTTTTCAAATGTCCTTCCCCAGA

GTCTGCTAGTCTGGAGGAACTCCCAGAGGTCTACCCAGAAGGACCTAGTTCCTTACCAGCCTCCCTTTCTCTGT

CAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATTCGTTTTGACCATGTATACACCAAGCC

TCTAGTTTTAGAGATCCCCTCTGAGACAGAGAGTCAAACTAACGTGGTAGTGAAAATTGAGGAAGCACCTCTA

AGCTCTTCAGAAGAGGATCACCCTGAATTCATTGTCTCAGTGAAGAAAGAGCCTTTGGAAGATGACTTCATCC

CAGAGCTGGGCATCTCAAACCTGCTTTCATCCAGCCATTGTCTGAGACCACCTTCTTGCCTGCTGGACGCTCAC

AGTGACTGTGGATATGAGGGCTCCCCTTCTCCCTTCAGTGACATGTCTTCTCCACTTGGTACAGACCACTCCTG

GGAGGATACTTTTGCCAATGAACTTTTCCCCCAGCTGATTAGTGTCTAAAGAGCCACATAACACTGGGCCCCTT

TCCCTGACCATCACATTGCCTAGAGGATAGCATAGGCCTGTCTCTTTCGTTAAAAGCCAAAGTAGAGGCTGTCT

GGCCTTAGAAGAATTCCTCTAAAGTATTTCAAATCTCATAGATGACTTCCAAGTATTGTCGTTTGACACTCAGCT

GTCTAAGGTATTCAAAGGTATTCCAGTACTACAGCTTTTGAGATTCTAGTTTATCTTAAAGGTGGTAGTATACT

CTAAATCGCAGGGAGGGTCATTTGACAGTTTTTTCCCAGCCTGGCTTCAAACTATGTAGCCGAGGCTAGGCAG

AAACTTCTGACCCTCTTGACCCCACCTCCCAAGTGCTGGGCTTCACCAGGTGTGCACCTCCACACCTGCCCCCC

CGACATGTCAGGTGGACATGGGATTCATGAATGGCCCTTAGCATTTCTTTCTCCACTCTCTGCTTCCCAGGTTTC

GTAACCTGAGGGGGCTTGTTTTCCCTTATGTGCATTTTAAATGAAGATCAAGAATCTTTGTAAAATGATGAAAA

TTTACTATGTAAATGCTTGATGGATCTTCTTGCTAGTGTAGCTTCTAGAAGGTGCTTTCTCCATTTATTTAAAAC

TACCCTTGCAATTAAAAAAAAAGCAACACAGCGTCCTGTTCTGTGATTTCTAGGGCTGTTGTAATTTCTCTTTAT

TGTTGGCTAAAGGAGTAATTTATCCAACTAAAGTGAGCATACCACTTTTTAAAGTCAAAAAAAAAAAAAAAAAA

SEQ ID 592: Mouse X-box-binding protein 1 isoform XBP1(U)

MVVVAAAPSAATAAPKVLLLSGQPASGGRALPLMVPGPRAAGSEASGTPQARKRQRLTHLSPEEKALRRKLKNRV

AAQTARDRKKARMSELEQQVVDLEEENHKLQLENQLLREKTHGLVVENQELRTRLGMDTLDPDEVPEVEAKGSG

VRLVAGSAESAALRLCAPLQQVQAQLSPPQNIFPWTLTLLPLQILSLISFWAFWTSWTLSCFSNVLPQSLLVWRNSQ

RSTQKDLVPYQPPFLCQWGPHQPSWKPLMNSFVLTMYTPSL

SEQ ID 593: Mouse X-box binding protein 1 (Xbp1), transcript variant 2, mRNA

CTAGGGTAAAACCGTGAGACTCGGTCTGGAAATCTGGCCTGAGAGGACAGCCTGGCAATCCTCAGCCGGGGT

GGGGACGTCTGCCGAAGATCCTTGGACTCCAGCAACCAGTGGTCGCCACCGTCCATCCACCCTAAGGCCCAGT

TTGCACGGCGGAGAACAGCTGTGCAGCCACGCTGGACACTCACCCCGCCCGAGTTGAGCCCGCCCCCGGGAC

TACAGGACCAATAAGTGATGAATATACCCGCGCGTCACGGAGCACCGGCCAATCGCGGACGGCCACGACCCT

AGAAAGGCTGGGCGCGGCAGGAGGCCACGGGGCGGTGGCGGCGCTGGCGTAGACGTTTCCTGGCTATGGT

GGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCCAAAGTGCTACTCTTATCTGGCCAGCCCGCCTC

CGGCGGCCGGGCGCTGCCGCTCATGGTACCCGGTCCGCGGGCAGCAGGGTCGGAGGCGAGCGGGACACCG

CAGGCTCGCAAGCGGCAGCGGCTCACGCACCTGAGCCCGGAGGAGAAAGCGCTGCGGAGGAAACTGAAAAA

CAGAGTAGCAGCGCAGACTGCTCGAGATAGAAAGAAAGCCCGGATGAGCGAGCTGGAGCAGCAAGTGGTG

GATTTGGAAGAAGAGAACCACAAACTCCAGCTAGAAAATCAGCTTTTACGGGAGAAAACTCACGGCCTTGTG

GTTGAGAACCAGGAGTTAAGAACACGCTTGGGAATGGACACGCTGGATCCTGACGAGGTTCCAGAGGTGGA

GGCCAAGGGGAGTGGAGTAAGGCTGGTGGCCGGGTCTGCTGAGTCCGCAGCAGGTGCAGGCCCAGTTGTCA

CCTCCCCAGAACATCTTCCCATGGACTCTGACACTGTTGCCTCTTCAGATTCTGAGTCTGATATCCTTTTGGGCA

TTCTGGACAAGTTGGACCCTGTCATGTTTTTCAAATGTCCTTCCCCAGAGTCTGCTAGTCTGGAGGAACTCCCA

GAGGTCTACCCAGAAGGACCTAGTTCCTTACCAGCCTCCCTTTCTCTGTCAGTGGGGACCTCATCAGCCAAGCT

GGAAGCCATTAATGAACTCATTCGTTTTGACCATGTATACACCAAGCCTCTAGTTTTAGAGATCCCCTCTGAGA

CAGAGAGTCAAACTAACGTGGTAGTGAAAATTGAGGAAGCACCTCTAAGCTCTTCAGAAGAGGATCACCCTG

AATTCATTGTCTCAGTGAAGAAAGAGCCTTTGGAAGATGACTTCATCCCAGAGCTGGGCATCTCAAACCTGCT

TTCATCCAGCCATTGTCTGAGACCACCTTCTTGCCTGCTGGACGCTCACAGTGACTGTGGATATGAGGGCTCCC

CTTCTCCCTTCAGTGACATGTCTTCTCCACTTGGTACAGACCACTCCTGGGAGGATACTTTTGCCAATGAACTTT

TCCCCCAGCTGATTAGTGTCTAAAGAGCCACATAACACTGGGCCCCTTTCCCTGACCATCACATTGCCTAGAGG

ATAGCATAGGCCTGTCTCTTTCGTTAAAAGCCAAAGTAGAGGCTGTCTGGCCTTAGAAGAATTCCTCTAAAGT

ATTTCAAATCTCATAGATGACTTCCAAGTATTGTCGTTTGACACTCAGCTGTCTAAGGTATTCAAAGGTATTCCA

GTACTACAGCTTTTGAGATTCTAGTTTATCTTAAAGGTGGTAGTATACTCTAAATCGCAGGGAGGGTCATTTGA

CAGTTTTTTCCCAGCCTGGCTTCAAACTATGTAGCCGAGGCTAGGCAGAAACTTCTGACCCTCTTGACCCCACC

TCCCAAGTGCTGGGCTTCACCAGGTGTGCACCTCCACACCTGCCCCCCCGACATGTCAGGTGGACATGGGATT

CATGAATGGCCCTTAGCATTTCTTTCTCCACTCTCTGCTTCCCAGGTTTCGTAACCTGAGGGGGCTTGTTTTCCC

TTATGTGCATTTTAAATGAAGATCAAGAATCTTTGTAAAATGATGAAAATTTACTATGTAAATGCTTGATGGAT

CTTCTTGCTAGTGTAGCTTCTAGAAGGTGCTTTCTCCATTTATTTAAAACTACCCTTGCAATTAAAAAAAAAGCA

ACACAGCGTCCTGTTCTGTGATTTCTAGGGCTGTTGTAATTTCTCTTTATTGTTGGCTAAAGGAGTAATTTATCC

AACTAAAGTGAGCATACCACTTTTTAAAGTCAAAAAAAAAAAAAAAAAA

SEQ ID 594: Mouse X-box-binding protein 1 isoform XBP1(S)

MVVVAAAPSAATAAPKVLLLSGQPASGGRALPLMVPGPRAAGSEASGTPQARKRQRLTHLSPEEKALRRKLKNRV

AAQTARDRKKARMSELEQQVVDLEEENHKLQLENQLLREKTHGLVVENQELRTRLGMDTLDPDEVPEVEAKGSG

VRLVAGSAESAAGAGPVVTSPEHLPMDSDTVASSDSESDILLGILDKLDPVMFFKCPSPESASLEELPEVYPEGPSSL

PASLSLSVGTSSAKLEAINELIRFDHVYTKPLVLEIPSETESQTNVVVKIEEAPLSSSEEDHPEFIVSVKKEPLEDDFIPEL

GISNLLSSSHCLRPPSCLLDAHSDCGYEGSPSPFSDMSSPLGTDHSWEDTFANELFPQLISV

SEQ ID 595: Mouse XBP1 delta 4 mRNA

CTAGGGTAAAACCGTGAGACTCGGTCTGGAAATCTGGCCTGAGAGGACAGCCTGGCAATCCTCAGCCGGGGT

GGGGACGTCTGCCGAAGATCCTTGGACTCCAGCAACCAGTGGTCGCCACCGTCCATCCACCCTAAGGCCCAGT

TTGCACGGCGGAGAACAGCTGTGCAGCCACGCTGGACACTCACCCCGCCCGAGTTGAGCCCGCCCCCGGGAC

TACAGGACCAATAAGTGATGAATATACCCGCGCGTCACGGAGCACCGGCCAATCGCGGACGGCCACGACCCT

AGAAAGGCTGGGCGCGGCAGGAGGCCACGGGGCGGTGGCGGCGCTGGCGTAGACGTTTCCTGGCTATGGT

GGTGGTGGCAGCGGCGCCGAGCGCGGCCACGGCGGCCCCCAAAGTGCTACTCTTATCTGGCCAGCCCGCCTC

CGGCGGCCGGGCGCTGCCGCTCATGGTACCCGGTCCGCGGGCAGCAGGGTCGGAGGCGAGCGGGACACCG

CAGGCTCGCAAGCGGCAGCGGCTCACGCACCTGAGCCCGGAGGAGAAAGCGCTGCGGAGGAAACTGAAAAA

CAGAGTAGCAGCGCAGACTGCTCGAGATAGAAAGAAAGCCCGGATGAGCGAGCTGGAGCAGCAAGTGGTG

GATTTGGAAGAAGAGAACCACAAACTCCAGCTAGAAAATCAGCTTTTACGGGAGAAAACTCACGGCCTTGTG

GTTGAGAACCAGGAGTTAAGAACACGCTTGGGAATGGACACGCTGGATCCTGACGAGGTTCCAGAGGTGGA

GGCCAAGTCTGATATCCTTTTGGGCATTCTGGACAAGTTGGACCCTGTCATGTTTTTCAAATGTCCTTCCCCAG

AGTCTGCTAGTCTGGAGGAACTCCCAGAGGTCTACCCAGAAGGACCTAGTTCCTTACCAGCCTCCCTTTCTCTG

TCAGTGGGGACCTCATCAGCCAAGCTGGAAGCCATTAATGAACTCATTCGTTTTGACCATGTATACACCAAGC

CTCTAGTTTTAGAGATCCCCTCTGAGACAGAGAGTCAAACTAACGTGGTAGTGAAAATTGAGGAAGCACCTCT

AAGCTCTTCAGAAGAGGATCACCCTGAATTCATTGTCTCAGTGAAGAAAGAGCCTTTGGAAGATGACTTCATC

CCAGAGCTGGGCATCTCAAACCTGCTTTCATCCAGCCATTGTCTGAGACCACCTTCTTGCCTGCTGGACGCTCA

CAGTGACTGTGGATATGAGGGCTCCCCTTCTCCCTTCAGTGACATGTCTTCTCCACTTGGTACAGACCACTCCT

GGGAGGATACTTTTGCCAATGAACTTTTCCCCCAGCTGATTAGTGTCTAAAGAGCCACATAACACTGGGCCCCT

TTCCCTGACCATCACATTGCCTAGAGGATAGCATAGGCCTGTCTCTTTCGTTAAAAGCCAAAGTAGAGGCTGTC

TGGCCTTAGAAGAATTCCTCTAAAGTATTTCAAATCTCATAGATGACTTCCAAGTATTGTCGTTTGACACTCAGC

TGTCTAAGGTATTCAAAGGTATTCCAGTACTACAGCTTTTGAGATTCTAGTTTATCTTAAAGGTGGTAGTATAC

TCTAAATCGCAGGGAGGGTCATTTGACAGTTTTTTCCCAGCCTGGCTTCAAACTATGTAGCCGAGGCTAGGCA

GAAACTTCTGACCCTCTTGACCCCACCTCCCAAGTGCTGGGCTTCACCAGGTGTGCACCTCCACACCTGCCCCC

CCGACATGTCAGGTGGACATGGGATTCATGAATGGCCCTTAGCATTTCTTTCTCCACTCTCTGCTTCCCAGGTTT

CGTAACCTGAGGGGGCTTGTTTTCCCTTATGTGCATTTTAAATGAAGATCAAGAATCTTTGTAAAATGATGAAA

ATTTACTATGTAAATGCTTGATGGATCTTCTTGCTAGTGTAGCTTCTAGAAGGTGCTTTCTCCATTTATTTAAAA

CTACCCTTGCAATTAAAAAAAAAGCAACACAGCGTCCTGTTCTGTGATTTCTAGGGCTGTTGTAATTTCTCTTTA

TTGTTGGCTAAAGGAGTAATTTATCCAACTAAAGTGAGCATACCACTTTTTAAAGTCAAAAAAAAAAAAAAAAAA

SEQ ID 596: protein predicted to be produced by the XBP1 delta 4 mRNA

MVVVAAAPSAATAAPKVLLLSGQPASGGRALPLMVPGPRAAGSEASGTPQARKRQRLTHLSPEEKALRRKLKNRV

AAQTARDRKKARMSELEQQVVDLEEENHKLQLENQLLREKTHGLVVENQELRTRLGMDTLDPDEVPEVEAKSDIL

LGILDKLDPVMFFKCPSPESASLEELPEVYPEGPSSLPASLSLSVGTSSAKLEAINELIRFDHVYTKPLVLEIPSETESQTN

VVVKIEEAPLSSSEEDHPEFIVSVKKEPLEDDFIPELGISNLLSSSHCLRPPSCLLDAHSDCGYEGSPSPFSDMSSPLGTD

HSWEDTFANELFPQLISV

HUMAN

SEQ ID 801: Human XBP1 gene

GCTGGGCGGCTGCGGCGCGCGGTGCGCGGTGCGTAGTCTGGAGCTATGGTGGTGGTGGCAGCCGCGCCGAA

CCCGGCCGACGGGACCCCTAAAGTTCTGCTTCTGTCGGGGCAGCCCGCCTCCGCCGCCGGAGCCCCGGCCGG

CCAGGCCCTGCCGCTCATGGTGCCAGCCCAGAGAGGGGCCAGCCCGGAGGCAGCGAGCGGGGGGCTGCCCC

AGGCGCGCAAGCGACAGCGCCTCACGCACCTGAGCCCCGAGGAGAAGGCGCTGAGGAGGTGGGCGAGGGG

CCGGGGTCTGGGGCCAGATCTGAAGCCGGGACTAGGGACAGGGGCAGGGGCAGGGGCTGGGAGCGGGGA

CCCAGCACTGGCCGCCCCGCAGGGCTCCGTCGCCTTTGGCCTGGCGGGTCGGTGCCAGCGTGGCGCGGGGC

GGGGCAGGAAGCCCGGACTGACCGGATCCGCCACGCTGGGAACCTAGGGCGGCCCAGGGCTCTTTTCTGTAC

TTTTTAACTCTCTCGTTAGAGATGACCAGAGCTGGGGATGCGGGCACCTGTCTTCCAGGCCCTCTTGCTGTGTG

GCCGCAGACTGGTGGTTCAGCCTCTTAACTCGGACATGAGGTCGAATAATCTGTTTTGGTTTACTGCTATTTCT

GGAGAGGCGCGGAGCTGAAATAACAGAGCTGTTGAAAGGGCTGGGAATTCTGCGAGGCTCACTGGTCTAGC

TCAGTATCTGCGTTCTTAAAATGGAACCTACTTCATGAGGTCTTTGGGGAGATTGAGACTTGGATATAATGTG

CCTAGCACTTAGTCCTCCGTAAATGTTCACTCTTTTGTGATCATTGTGCCTTCTGTGATTTATGAAGTGTCTCTTC

TGAGTTAATTCTTTTAAAAAAAAAAGTGTCTCCTCCAACAGACACGGACCCATCAGCAGGTCACTGCCTAGGA

TCTCAACACTAGAGATCAGGGAGTGGCATCAGCCTCTCCCTTTTCTAAATTGGACTGGGGGACGGAGGGTTGA

TGTCATAGCAAGATTGCAGCCTTCACTAGATTAATGAGGCCAGGTTGGATCCTGTTTAAGAGAACTGGAGACA

GGAAGCAGCGGGGGAATAGATGGGGAAAGAGGAAAGTTCCTTATGATGCAAGATGAATAGTGTGTGTGTCC

AGCCCCAGTGCTGTGACGGGGATGAGTCTGAGGTGGACGGATGATGCAATATAGGAGAGAATAAAGCAGGT

CTTCGAGCTAGATTGACAGAAGACTGTATTTTTTATTTTGTTTTATTGAGGGGAGGAGCCTGAAGTGTATTTTA

TCATTAGTCTGTCTTATACTGTAAATAAAAATGAAAGCACCAGCTGGTAAAGTTTTCAAATAAAGACATAAATA

AGGTTTGATATGACTCAGTGTGGTATGTTCCTTCTCTTCCTAGGAAACTGAAAAACAGAGTAGCAGCTCAGAC

TGCCAGAGATCGAAAGAAGGCTCGAATGAGTGAGCTGGAACAGCAAGTGGTAGATTTAGAAGAAGAGGTAA

AACTACTTAAGGTCAAACTCTTTTATCCATTGTATACCCTTCCTTGGTGAATGTTCTGATATTTGCTTCCCATCCC

AAGTTGTTTCAGCCCCTATTAGAATACAATTGAATATATGATTAAAAGTTAAACTAGGCTGGGCATGGTGGCT

CATGCCTGTAATCCCAGCACTTTGGGAGCCTGAGTTGGGCAGATCACTTGAAGCCAGCAGTTTGAGACCAGCC

TAGCCAACATGGTAAAATCCCGTCTCTACCCAAAAATATACCAAAAAAAAAAAAAAAAAAAAGGCCAAGCGT

GAGTGCCTGTAGTCCCAGCTACTCGGGAGGTTGAGGTGGGAGGATTGTTTGAACCTGGGAGAGGGAGGTTG

CAGTGAGCTGAGATCGCACCACTGCACTCCAGCCTGGGCAACAGAGTGAGACTCTGTCTCAAGAAAAAAAAA

AAAAGTTTGCTGGGCACCGGGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGTAGATAACT

TGAGATCAGGAGTTCGAGACCAGCCTGACCAACGTGGTGAAACCCCATCTCTATTAAAAATACAAAAATTAGC

CGGGTGTCGTGGCAGGCACCTGTAATCCCAGCTGCTCCGGAGGCTGACGCAGGAGAATCACTTGAACCCAGG

AGGCGGAGGTTGCAGTGAGCTGAGATCACGAGATCATGCCACTGCACTCCAGTCTGGGCGACAGAGCAAAA

ACCCTGTCTCAAAAAAAAAAAAAAAGTTAATCTAAGTTAGGACAGAGAGTTGGTGAAGTGGTGAAGCTTGTT

GAGGGCAGAAGTGATTGACTTTGTGGCATTTGGTGCTAGATGTATCTCAAAGTAGATGGATTTAACAATGTTT

ATTGAGTTTGTAGTAAGAAATTAGCAAGGGCTAATAGGAAATAATTGCTTAAACTTTACATTCTTCCTGGCATG

GCCAGAAATTCACTAAAGGTTCCTTTCCCCCTCTAGGGTCCACCTGTTAATCAATCTTAAATTGTTGCCAATTAC

ACATCTTGAATACATAGAGATTATTTATATTGTTTTTTTAACCCCTTGGTCAATTTGCATATATTGAGCTTTTTAA

AGTTTTAATCATTAGTTGGTTCTTCTAAGAATCATGAGTCAGGAGCAGGGATTTTTTTTAACTTATTTTGGATTT

ATAGTCACCACTACCACTTTTATTATTACCTGCCAGTTCAAGATAGTTATTTATTTTTATTTTATATTATTATTATT

ATTATTATCATCATCATTATTTTGAGATGGAGTCTCACTCTGTTGCCCAGGCTGGAGTGCAGTGGTGCAATCTC

GGCTCACTGCAACCTCTGCCTCCCAGGTTCAAGCAATTCTCCCTGCTTCAGCCTCCAGATTAGCTGGGATTACA

GGCACCCCTCACCACATCCAGCTAATTTTTGGATTTTTTAGTAGAGATGGGGGTTTGCCATGTTGGCCAGGCTG

GTTTTGAACTCTTGACCTCAGGTGATCCACCTGCCTTGGCCTCCCAAAGTGTTAGGATTACAAGTGTGAGCCAC

CGAGCCTGGCCAAGATAGTTTAAAAAAAAAATTATATCTACATTAAAGCCACAAGTCACCCTTTGCTGAAGTCA

GTATTAGTAGTTGGAAGCAGTGTGTTATTCTTGACCCCATGAAGTGGCACTTATTAAGTAGCTTGCTTTTCCAT

AATTATGGCCTAGCTTTTTAAAACCTACTATGAACACCACAAGCATAGAGTTTTCCAAAAGTTCAAGAAGGAAA

GGAAACCAATTATACTGAATCAGGTAGATTCTTAACTGAAATAATTAGATGTTTTAATAGCCTCTTATGAACTT

TCTTCCAGAACCAAAAACTTTTGCTAGAAAATCAGCTTTTACGAGAGAAAACTCATGGCCTTGTAGTTGAGAAC

CAGGAGTTAAGACAGCGCTTGGGGATGGATGCCCTGGTTGCTGAAGAGGAGGCGGAAGCCAAGGTAAATCA

TCTCCTTTATTTGGTGCCTCATGTGAGTACTGGTTCCAAGTGACATGACCCAGCGATTATGTTTACAGTCTGGA

CTTCTGATCAAGAGCGTTCTTGAAATTTTCCTTCAGTTTTAAGACATTTTCATGCAGGCAGAGTGTTCTTCCCCT

AAAGGCACTTGACACTCATTTTTTAAGTGTGTAGTGAACAGTACTAAGATCTAATAATGAAAACAAGTTACAT

GGCTCCCTAAGAACAAGTACTAACAAATGCAGTAGCCAACAAGATTACCATGCAATCATTAAGGAGAACCAAA

GTAAGAGAGCCACTCAAACCAGATTTTGAACGCTACTAAAATTAAAGTAGTTCTTTGATGAATATGAATGAGT

AGGGAAAGGATTCTTTGTAATAGTGATACCTCTGTGGTAAGAGAAGGGTGGTATGTGAGTTTTAGTCTACAG

ATTATGGCAAATTCAGTGACAACAATCAAATGGTCTAAGATTGACAGTAGCACAGTTTTACTCTGTGAAGGTA

ATGTTCAGGACAAATTTCAAGAAAACTAGAAAACCATTCTTTACAGCTGAAATCTTTCCCTAACCATTGTTATTT

CCACTTTTAAGTCCTCAAGAGATGAGAAAAGGGAGGTAAGGCTTCCTTATACATTTCCTGCACAATGAAACAT

TTTTCCTCCTCCAGGCAAAGATTCAAGCAGAACTGGCAAATATCTTATCTTGCTCTTCTCAATAATAATAATGTT

GTTAGATAATAAAGTTCTATAGCAATTTAACCCTAGAATCTTTTTGAAAAGTAATTCTTTAAAGTTGAGAATCA

CAGCTGTCTAGCAAGCATTTCCTTGGGCACTTGAAGCTGTTTATTCACTTTGGTCTTTCCTCCCAGGGGAATGA

AGTGAGGCCAGTGGCCGGGTCTGCTGAGTCCGCAGCACTCAGACTACGTGCACCTCTGCAGCAGGTGCAGGC

CCAGTTGTCACCCCTCCAGAACATCTCCCCATGGATTCTGGCGGTATTGACTCTTCAGATTCAGAGGTAGGGAT

CATTCTGACTTATTAAAGAGCTATATAACCAGTTAATTCCATCTGTTTGATGCTTGACATCCCTAACTAGACAGA

TGAGGGTTGAAGTTAGTTTTTGGTGGGGTTGGAGGTGAACATCAACTACCTTCCTAGTTCCAGGTAATATAGA

ACATGGAGTGAAGTGTAGATAAATGGGTCTGGTGGGTCCCGAGGTCATCTTATCACATAATGACTAATTTACA

TTATGGAACCCAGTACAAAGTGTTCCAGTTAGATTTTCCATTGTATTCTGACAGTTGTACTTCATTTAATTTTTG

CCTCTTACAGTCTGATATCCTGTTGGGCATTCTGGACAACTTGGACCCAGTCATGTTCTTCAAATGCCCTTCCCC

AGAGCCTGCCAGCCTGGAGGAGCTCCCAGAGGTCTACCCAGAAGGACCCAGTTCCTTACCAGCCTCCCTTTCT

CTGTCAGTGGGGACGTCATCAGCCAAGCTGGAAGCCATTAATGAACTAATTCGTTTTGACCACATATATACCA

AGCCCCTAGTCTTAGAGATACCCTCTGAGACAGAGAGCCAAGCTAATGTGGTAGTGAAAATCGAGGAAGCAC

CTCTCAGCCCCTCAGAGAATGATCACCCTGAATTCATTGTCTCAGTGAAGGAAGAACCTGTAGAAGATGACCT

CGTTCCGGAGCTGGGTATCTCAAATCTGCTTTCATCCAGCCACTGCCCAAAGCCATCTTCCTGCCTACTGGATG

CTTACAGTGACTGTGGATACGGGGGTTCCCTTTCCCCATTCAGTGACATGTCCTCTCTGCTTGGTGTAAACCAT

TCTTGGGAGGACACTTTTGCCAATGAACTCTTTCCCCAGCTGATTAGTGTCTAAGGAATGATCCAATACTGTTG

CCCTTTTCCTTGACTATTACACTGCCTGGAGGATAGCAGAGAAGCCTGTCTGTACTTCATTCAAAAAGCCAAAA

TAGAGAGTATACAGTCCTAGAGAATTCCTCTATTTGTTCAGATCTCATAGATGACCCCCAGGTATTGTCTTTTG

ACATCCAGCAGTCCAAGGTATTGAGACATATTACTGGAAGTAAGAAATATTACTATAATTGAGAACTACAGCT

TTTAAGATTGTACTTTTATCTTAAAAGGGTGGTAGTTTTCCCTAAAATACTTATTATGTAAGGGTCATTAGACAA

ATGTCTTGAAGTAGACATGGAATTTATGAATGGTTCTTTATCATTTCTCTTCCCCCTTTTTGGCATCCTGGCTTG

CCTCCAGTTTTAGGTCCTTTAGTTTGCTTCTGTAAGCAACGGGAACACCTGCTGAGGGGGCTCTTTCCCTCATG

TATACTTCAAGTAAGATCAAGAATCTTTTGTGAAATTATAGAAATTTACTATGTAAATGCTTGATGGAATTTTTT

CCTGCTAGTGTAGCTTCTGAAAGGTGCTTTCTCCATTTATTTAAAACTACCCATGCAATTAAAAGGTACAATGCA

SEQ ID 802: Human X-box binding protein 1 (XBP1), transcript variant 1, mRNA (not processed by IRE1)

GCTGGGCGGCTGCGGCGCGCGGTGCGCGGTGCGTAGTCTGGAGCTATGGTGGTGGTGGCAGCCGCGCCGAA

CCCGGCCGACGGGACCCCTAAAGTTCTGCTTCTGTCGGGGCAGCCCGCCTCCGCCGCCGGAGCCCCGGCCGG

CCAGGCCCTGCCGCTCATGGTGCCAGCCCAGAGAGGGGCCAGCCCGGAGGCAGCGAGCGGGGGGCTGCCCC

AGGCGCGCAAGCGACAGCGCCTCACGCACCTGAGCCCCGAGGAGAAGGCGCTGAGGAGGAAACTGAAAAAC

AGAGTAGCAGCTCAGACTGCCAGAGATCGAAAGAAGGCTCGAATGAGTGAGCTGGAACAGCAAGTGGTAGA

TTTAGAAGAAGAGAACCAAAAACTTTTGCTAGAAAATCAGCTTTTACGAGAGAAAACTCATGGCCTTGTAGTT

GAGAACCAGGAGTTAAGACAGCGCTTGGGGATGGATGCCCTGGTTGCTGAAGAGGAGGCGGAAGCCAAGG

GGAATGAAGTGAGGCCAGTGGCCGGGTCTGCTGAGTCCGCAGCACTCAGACTACGTGCACCTCTGCAGCAGG

TGCAGGCCCAGTTGTCACCCCTCCAGAACATCTCCCCATGGATTCTGGCGGTATTGACTCTTCAGATTCAGAGT

CTGATATCCTGTTGGGCATTCTGGACAACTTGGACCCAGTCATGTTCTTCAAATGCCCTTCCCCAGAGCCTGCC

AGCCTGGAGGAGCTCCCAGAGGTCTACCCAGAAGGACCCAGTTCCTTACCAGCCTCCCTTTCTCTGTCAGTGG

GGACGTCATCAGCCAAGCTGGAAGCCATTAATGAACTAATTCGTTTTGACCACATATATACCAAGCCCCTAGTC

TTAGAGATACCCTCTGAGACAGAGAGCCAAGCTAATGTGGTAGTGAAAATCGAGGAAGCACCTCTCAGCCCC

TCAGAGAATGATCACCCTGAATTCATTGTCTCAGTGAAGGAAGAACCTGTAGAAGATGACCTCGTTCCGGAGC

TGGGTATCTCAAATCTGCTTTCATCCAGCCACTGCCCAAAGCCATCTTCCTGCCTACTGGATGCTTACAGTGACT

GTGGATACGGGGGTTCCCTTTCCCCATTCAGTGACATGTCCTCTCTGCTTGGTGTAAACCATTCTTGGGAGGAC

ACTTTTGCCAATGAACTCTTTCCCCAGCTGATTAGTGTCTAAGGAATGATCCAATACTGTTGCCCTTTTCCTTGA

CTATTACACTGCCTGGAGGATAGCAGAGAAGCCTGTCTGTACTTCATTCAAAAAGCCAAAATAGAGAGTATAC

AGTCCTAGAGAATTCCTCTATTTGTTCAGATCTCATAGATGACCCCCAGGTATTGTCTTTTGACATCCAGCAGTC

CAAGGTATTGAGACATATTACTGGAAGTAAGAAATATTACTATAATTGAGAACTACAGCTTTTAAGATTGTACT

TTTATCTTAAAAGGGTGGTAGTTTTCCCTAAAATACTTATTATGTAAGGGTCATTAGACAAATGTCTTGAAGTA

GACATGGAATTTATGAATGGTTCTTTATCATTTCTCTTCCCCCTTTTTGGCATCCTGGCTTGCCTCCAGTTTTAGG

TCCTTTAGTTTGCTTCTGTAAGCAACGGGAACACCTGCTGAGGGGGCTCTTTCCCTCATGTATACTTCAAGTAA

GATCAAGAATCTTTTGTGAAATTATAGAAATTTACTATGTAAATGCTTGATGGAATTTTTTCCTGCTAGTGTAGC

TTCTGAAAGGTGCTTTCTCCATTTATTTAAAACTACCCATGCAATTAAAAGGTACAATGCA

SEQ ID 803: Human X-box-binding protein 1 isoform XBP1(U)

MVVVAAAPNPADGTPKVLLLSGQPASAAGAPAGQALPLMVPAQRGASPEAASGGLPQARKRQRLTHLSPEEKAL

RRKLKNRVAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVVENQELRQRLGMDALVAEEEAE

AKGNEVRPVAGSAESAALRLRAPLQQVQAQLSPLQNISPWILAVLTLQIQSLISCWAFWTTWTQSCSSNALPQSLP

AWRSSQRSTQKDPVPYQPPFLCQWGRHQPSWKPLMN

SEQ ID 804: Human X-box binding protein 1 (XBP1), transcript variant 2, mRNA (processed by IRE1)

GCTGGGCGGCTGCGGCGCGCGGTGCGCGGTGCGTAGTCTGGAGCTATGGTGGTGGTGGCAGCCGCGCCGAA

CCCGGCCGACGGGACCCCTAAAGTTCTGCTTCTGTCGGGGCAGCCCGCCTCCGCCGCCGGAGCCCCGGCCGG

CCAGGCCCTGCCGCTCATGGTGCCAGCCCAGAGAGGGGCCAGCCCGGAGGCAGCGAGCGGGGGGCTGCCCC

AGGCGCGCAAGCGACAGCGCCTCACGCACCTGAGCCCCGAGGAGAAGGCGCTGAGGAGGAAACTGAAAAAC

AGAGTAGCAGCTCAGACTGCCAGAGATCGAAAGAAGGCTCGAATGAGTGAGCTGGAACAGCAAGTGGTAGA

TTTAGAAGAAGAGAACCAAAAACTTTTGCTAGAAAATCAGCTTTTACGAGAGAAAACTCATGGCCTTGTAGTT

GAGAACCAGGAGTTAAGACAGCGCTTGGGGATGGATGCCCTGGTTGCTGAAGAGGAGGCGGAAGCCAAGG

GGAATGAAGTGAGGCCAGTGGCCGGGTCTGCTGAGTCCGCAGCAGGTGCAGGCCCAGTTGTCACCCCTCCAG

AACATCTCCCCATGGATTCTGGCGGTATTGACTCTTCAGATTCAGAGTCTGATATCCTGTTGGGCATTCTGGAC

AACTTGGACCCAGTCATGTTCTTCAAATGCCCTTCCCCAGAGCCTGCCAGCCTGGAGGAGCTCCCAGAGGTCT

ACCCAGAAGGACCCAGTTCCTTACCAGCCTCCCTTTCTCTGTCAGTGGGGACGTCATCAGCCAAGCTGGAAGC

CATTAATGAACTAATTCGTTTTGACCACATATATACCAAGCCCCTAGTCTTAGAGATACCCTCTGAGACAGAGA

GCCAAGCTAATGTGGTAGTGAAAATCGAGGAAGCACCTCTCAGCCCCTCAGAGAATGATCACCCTGAATTCAT

TGTCTCAGTGAAGGAAGAACCTGTAGAAGATGACCTCGTTCCGGAGCTGGGTATCTCAAATCTGCTTTCATCC

AGCCACTGCCCAAAGCCATCTTCCTGCCTACTGGATGCTTACAGTGACTGTGGATACGGGGGTTCCCTTTCCCC

ATTCAGTGACATGTCCTCTCTGCTTGGTGTAAACCATTCTTGGGAGGACACTTTTGCCAATGAACTCTTTCCCCA

GCTGATTAGTGTCTAAGGAATGATCCAATACTGTTGCCCTTTTCCTTGACTATTACACTGCCTGGAGGATAGCA

GAGAAGCCTGTCTGTACTTCATTCAAAAAGCCAAAATAGAGAGTATACAGTCCTAGAGAATTCCTCTATTTGTT

CAGATCTCATAGATGACCCCCAGGTATTGTCTTTTGACATCCAGCAGTCCAAGGTATTGAGACATATTACTGGA

AGTAAGAAATATTACTATAATTGAGAACTACAGCTTTTAAGATTGTACTTTTATCTTAAAAGGGTGGTAGTTTT

CCCTAAAATACTTATTATGTAAGGGTCATTAGACAAATGTCTTGAAGTAGACATGGAATTTATGAATGGTTCTT

TATCATTTCTCTTCCCCCTTTTTGGCATCCTGGCTTGCCTCCAGTTTTAGGTCCTTTAGTTTGCTTCTGTAAGCAA

CGGGAACACCTGCTGAGGGGGCTCTTTCCCTCATGTATACTTCAAGTAAGATCAAGAATCTTTTGTGAAATTAT

AGAAATTTACTATGTAAATGCTTGATGGAATTTTTTCCTGCTAGTGTAGCTTCTGAAAGGTGCTTTCTCCATTTA

TTTAAAACTACCCATGCAATTAAAAGGTACAATGCA

SEQ ID 805: Human X-box-binding protein 1 isoform XBP1(S)

MVVVAAAPNPADGTPKVLLLSGQPASAAGAPAGQALPLMVPAQRGASPEAASGGLPQARKRQRLTHLSPEEKAL

RRKLKNRVAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVVENQELRQRLGMDALVAEEEAE

AKGNEVRPVAGSAESAAGAGPVVTPPEHLPMDSGGIDSSDSESDILLGILDNLDPVMFFKCPSPEPASLEELPEVYP

EGPSSLPASLSLSVGTSSAKLEAINELIRFDHIYTKPLVLEIPSETESQANVVVKIEEAPLSPSENDHPEFIVSVKEEPVED

DLVPELGISNLLSSSHCPKPSSCLLDAYSDCGYGGSLSPFSDMSSLLGVNHSWEDTFANELFPQLISV

SEQ ID 806: Human X-box binding protein 1 (XBP1) delta 4 variant

GCTGGGCGGCTGCGGCGCGCGGTGCGCGGTGCGTAGTCTGGAGCTATGGTGGTGGTGGCAGCCGCGCCGAA

CCCGGCCGACGGGACCCCTAAAGTTCTGCTTCTGTCGGGGCAGCCCGCCTCCGCCGCCGGAGCCCCGGCCGG

CCAGGCCCTGCCGCTCATGGTGCCAGCCCAGAGAGGGGCCAGCCCGGAGGCAGCGAGCGGGGGGCTGCCCC

AGGCGCGCAAGCGACAGCGCCTCACGCACCTGAGCCCCGAGGAGAAGGCGCTGAGGAGGAAACTGAAAAAC

AGAGTAGCAGCTCAGACTGCCAGAGATCGAAAGAAGGCTCGAATGAGTGAGCTGGAACAGCAAGTGGTAGA

TTTAGAAGAAGAGAACCAAAAACTTTTGCTAGAAAATCAGCTTTTACGAGAGAAAACTCATGGCCTTGTAGTT

GAGAACCAGGAGTTAAGACAGCGCTTGGGGATGGATGCCCTGGTTGCTGAAGAGGAGGCGGAAGCCAAGTC

TGATATCCTGTTGGGCATTCTGGACAACTTGGACCCAGTCATGTTCTTCAAATGCCCTTCCCCAGAGCCTGCCA

GCCTGGAGGAGCTCCCAGAGGTCTACCCAGAAGGACCCAGTTCCTTACCAGCCTCCCTTTCTCTGTCAGTGGG

GACGTCATCAGCCAAGCTGGAAGCCATTAATGAACTAATTCGTTTTGACCACATATATACCAAGCCCCTAGTCT

TAGAGATACCCTCTGAGACAGAGAGCCAAGCTAATGTGGTAGTGAAAATCGAGGAAGCACCTCTCAGCCCCT

CAGAGAATGATCACCCTGAATTCATTGTCTCAGTGAAGGAAGAACCTGTAGAAGATGACCTCGTTCCGGAGCT

GGGTATCTCAAATCTGCTTTCATCCAGCCACTGCCCAAAGCCATCTTCCTGCCTACTGGATGCTTACAGTGACT

GTGGATACGGGGGTTCCCTTTCCCCATTCAGTGACATGTCCTCTCTGCTTGGTGTAAACCATTCTTGGGAGGAC

ACTTTTGCCAATGAACTCTTTCCCCAGCTGATTAGTGTCTAAGGAATGATCCAATACTGTTGCCCTTTTCCTTGA

CTATTACACTGCCTGGAGGATAGCAGAGAAGCCTGTCTGTACTTCATTCAAAAAGCCAAAATAGAGAGTATAC

AGTCCTAGAGAATTCCTCTATTTGTTCAGATCTCATAGATGACCCCCAGGTATTGTCTTTTGACATCCAGCAGTC

CAAGGTATTGAGACATATTACTGGAAGTAAGAAATATTACTATAATTGAGAACTACAGCTTTTAAGATTGTACT

TTTATCTTAAAAGGGTGGTAGTTTTCCCTAAAATACTTATTATGTAAGGGTCATTAGACAAATGTCTTGAAGTA

GACATGGAATTTATGAATGGTTCTTTATCATTTCTCTTCCCCCTTTTTGGCATCCTGGCTTGCCTCCAGTTTTAGG

TCCTTTAGTTTGCTTCTGTAAGCAACGGGAACACCTGCTGAGGGGGCTCTTTCCCTCATGTATACTTCAAGTAA

GATCAAGAATCTTTTGTGAAATTATAGAAATTTACTATGTAAATGCTTGATGGAATTTTTTCCTGCTAGTGTAGC

TTCTGAAAGGTGCTTTCTCCATTTATTTAAAACTACCCATGCAATTAAAAGGTACAATGCA

SEQ ID 807: Human predicted amino acid sequence from XBP1 delta 4 mRNA transcript (SEQ ID 806)

MVVVAAAPNPADGTPKVLLLSGQPASAAGAPAGQALPLMVPAQRGASPEAASGGLPQARKRQRLTHLSPEEKAL

RRKLKNRVAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVVENQELRQRLGMDALVAEEEAE

AKSDILLGILDNLDPVMFFKCPSPEPASLEELPEVYPEGPSSLPASLSLSVGTSSAKLEAINELIRFDHIYTKPLVLEIPSET

ESQANVVVKIEEAPLSPSENDHPEFIVSVKEEPVEDDLVPELGISNLLSSSHCPKPSSCLLDAYSDCGYGGSLSPFSDM

SSLLGVNHSWEDTFANELFPQLISV

	Number	Date	Country
Parent	PCT/EP2021/086382	Dec 2021	US
Child	18340016		US
