Mad nucleases

Information

  • Patent Grant
  • 11306298
  • Patent Number
    11,306,298
  • Date Filed
    Tuesday, August 31, 2021
  • Date Issued
    Tuesday, April 19, 2022
Abstract
The present disclosure provides new RNA-guided nuclease systems and engineered nickases for making rational, direct edits to nucleic acids in live cells.
Description
FIELD OF THE INVENTION

The present disclosure provides new RNA-guided nuclease systems and engineered nickases for making rational, direct edits to nucleic acids in live cells.


INCORPORATION BY REFERENCE

Submitted with the present application is an electronically filed sequence listing via EFS-Web as an ASCII formatted sequence listing, entitled “INSC083US_seqlist_20210812”, created Aug. 12, 2021, and 359,000 bytes in size. The sequence listing is part of the specification filed herewith and is incorporated by reference in its entirety.


BACKGROUND OF THE INVENTION

In the following discussion certain articles and methods will be described for background and introductory purposes. Nothing contained herein is to be construed as an “admission” of prior art. Applicant expressly reserves the right to demonstrate, where appropriate, that the methods referenced herein do not constitute prior art under the applicable statutory provisions.


The ability to make precise, targeted changes to the genome of living cells has been a long-standing goal in biomedical research and development. Recently, various nucleases have been identified that allow manipulation of gene sequence; hence, gene function. These nucleases include nucleic acid-guided nucleases. The range of target sequences that nucleic acid-guided nucleases can recognize, however, is constrained by the need for a specific PAM to be located near the desired target sequence. PAMs are short nucleotide sequences recognized by a gRNA/nuclease complex where this complex directs editing of the target sequence. The precise PAM sequence and PAM length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-7 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5′ or 3′ to the target sequence. Engineering nucleic acid-guided nucleases or mining for new nucleic acid-guided nucleases may provide nucleases with altered PAM preferences and/or altered activity or fidelity; all changes that may increase the versatility of a nucleic acid-guided nuclease for certain editing tasks.


There is thus a need in the art of nucleic acid-guided nuclease gene editing for novel nucleases with varied PAM preferences, varied activity in cells from different organisms such as mammals and/or altered enzyme fidelity. The novel MAD nucleases described herein satisfy this need.


SUMMARY OF THE INVENTION

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims.


The present disclosure provides Type II MAD nucleases (e.g., RNA-guided nucleases or RGNs) with varied PAM preferences, and/or varied activity in mammalian cells.


Thus, in one embodiment there are provided MAD nuclease systems that perform nucleic acid-guided nuclease editing, including a MAD2015 system comprising SEQ ID Nos. 1 (MAD2015 nuclease), 2 (CRISPR RNA) and 3 (trans-activating CRISPR RNA); a MAD2016 system comprising SEQ ID Nos. 4 (MAD2016 nuclease), 5 (CRISPR RNA) and 6 (trans-activating CRISPR RNA); a MAD2017 system comprising SEQ ID Nos. 7 (MAD2017 nuclease), 8 (CRISPR RNA) and 9 (trans-activating CRISPR RNA); a MAD2019 system comprising SEQ ID Nos. 10 (MAD2019 nuclease), 11 (CRISPR RNA) and 12 (trans-activating CRISPR RNA); a MAD2020 system comprising SEQ ID Nos. 13 (MAD2020 nuclease), 14 (CRISPR RNA) and 15 (trans-activating CRISPR RNA); a MAD2021 system comprising SEQ ID Nos. 16 (MAD2021 nuclease), 17 (CRISPR RNA) and 18 (trans-activating CRISPR RNA); a MAD2022 system comprising SEQ ID Nos. 19 (MAD2022 nuclease), 20 (CRISPR RNA) and 21 (trans-activating CRISPR RNA); a MAD2023 system comprising SEQ ID Nos. 22 (MAD2023 nuclease), 23 (CRISPR RNA) and 24 (trans-activating CRISPR RNA); a MAD2024 system comprising SEQ ID Nos. 25 (MAD2024 nuclease), 26 (CRISPR RNA) and 27 (trans-activating CRISPR RNA); a MAD2025 system comprising SEQ ID Nos. 28 (MAD2025 nuclease), 29 (CRISPR RNA) and 30 (trans-activating CRISPR RNA); a MAD2026 system comprising SEQ ID Nos. 31 (MAD2026 nuclease), 32 (CRISPR RNA) and 33 (trans-activating CRISPR RNA); a MAD2027 system comprising SEQ ID Nos. 34 (MAD2027 nuclease), 35 (CRISPR RNA) and 36 (trans-activating CRISPR RNA); a MAD2028 system comprising SEQ ID Nos. 37 (MAD2028 nuclease), 38 (CRISPR RNA) and 39 (trans-activating CRISPR RNA); a MAD2029 system comprising SEQ ID Nos. 40 (MAD2029 nuclease), 41 (CRISPR RNA) and 42 (trans-activating CRISPR RNA); a MAD2030 system comprising SEQ ID Nos. 43 (MAD2030 nuclease), 44 (CRISPR RNA) and 45 (trans-activating CRISPR RNA); a MAD2031 system comprising SEQ ID Nos. 46 (MAD2031 nuclease), 47 (CRISPR RNA) and 48 (trans-activating CRISPR RNA); a MAD2032 system comprising SEQ ID Nos. 49 (MAD2032 nuclease), 50 (CRISPR RNA) and 51 (trans-activating CRISPR RNA); a MAD2033 system comprising SEQ ID Nos. 52 (MAD2033 nuclease), 53 (CRISPR RNA) and 54 (trans-activating CRISPR RNA); a MAD2034 system comprising SEQ ID Nos. 55 (MAD2034 nuclease), 56 (CRISPR RNA) and 57 (trans-activating CRISPR RNA); a MAD2035 system comprising SEQ ID Nos. 58 (MAD2035 nuclease), 59 (CRISPR RNA) and 60 (trans-activating CRISPR RNA); a MAD2036 system comprising SEQ ID Nos. 61 (MAD2036 nuclease), 62 (CRISPR RNA) and 63 (trans-activating CRISPR RNA); a MAD2037 system comprising SEQ ID Nos. 64 (MAD2037 nuclease), 65 (CRISPR RNA) and 66 (trans-activating CRISPR RNA); a MAD2038 system comprising SEQ ID Nos. 67 (MAD2038 nuclease), 68 (CRISPR RNA) and 69 (trans-activating CRISPR RNA); a MAD2039 system comprising SEQ ID Nos. 70 (MAD2039 nuclease), 71 (CRISPR RNA) and 72 (trans-activating CRISPR RNA); and a MAD2040 system comprising SEQ ID Nos. 73 (MAD2040 nuclease), 74 (CRISPR RNA) and 75 (trans-activating CRISPR RNA). In some aspects, the MAD system components are delivered as sequences to be transcribed (in the case of the gRNA components) and transcribed and translated (in the case of the MAD nuclease), and in some aspects, the coding sequence for the MAD nuclease and the gRNA component sequences are on the same vector. In other aspects, the coding sequence for the MAD nuclease and the gRNA component sequences are on different vectors, and in some aspects, the gRNA component sequences are located in an editing cassette which also comprises a donor DNA (e.g., homology arm). In other aspects, the MAD nuclease is delivered to the cells as a peptide, or the MAD nuclease and gRNA components are delivered to the cells as a ribonucleoprotein complex.


Additionally, there are provided engineered nickases derived from the nucleases of the above-referenced systems, including MAD2016-H851A (SEQ ID NO: 177); MAD2016-N874A (SEQ ID NO: 178); MAD2032-H590A (SEQ ID NO: 179); MAD2039-H587A (SEQ ID NO: 180); and MAD2039-N610A (SEQ ID NO: 181).


These aspects and other features and advantages of the invention are described below in more detail.





BRIEF DESCRIPTION OF THE FIGURES


FIG. 1 is an exemplary workflow for creating and screening mined MAD nucleases or RGNs.



FIG. 2 is a simplified depiction of an in vitro test conducted on candidate enzymes.



FIG. 3 is a list of novel Type II MADzymes that have been identified.



FIG. 4 is a map of Type II MADzymes in cluster 59.



FIG. 5 is a map of Type II MADzymes in clusters 55, 56, 57 and 58.



FIG. 6 is a map of Type II MADzymes in cluster 141.



FIG. 7 is a reproduction of a gel showing nicked plasmid formation with different MADzyme nickases compared to corresponding MADzyme nucleases.





It should be understood that the drawings are not necessarily to scale.


DETAILED DESCRIPTION

The description set forth below in connection with the appended drawings is intended to be a description of various, illustrative embodiments of the disclosed subject matter. Specific features and functionalities are described in connection with each illustrative embodiment; however, it will be apparent to those skilled in the art that the disclosed embodiments may be practiced without each of those specific features and functionalities. Moreover, all of the functionalities described in connection with one embodiment are intended to be applicable to the additional embodiments described herein except where expressly stated or where the feature or function is incompatible with the additional embodiments. For example, where a given feature or function is expressly described in connection with one embodiment but not expressly mentioned in connection with an alternative embodiment, it should be understood that the feature or function may be deployed, utilized, or implemented in connection with the alternative embodiment unless the feature or function is incompatible with the alternative embodiment.


The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, biological emulsion generation, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polymer array synthesis, hybridization and ligation of polynucleotides, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds. (1999), Genome Analysis: A Laboratory Manual Series (Vols. I-IV); Weiner, Gabriel, Stephens, Eds. (2007), Genetic Variation: A Laboratory Manual; Dieffenbach, Dveksler, Eds. (2003), PCR Primer: A Laboratory Manual; Bowtell and Sambrook (2003), DNA Microarrays: A Molecular Cloning Manual; Mount (2004), Bioinformatics: Sequence and Genome Analysis; Sambrook and Russell (2006), Condensed Protocols from Molecular Cloning: A Laboratory Manual; and Sambrook and Russell (2002), Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) W.H. Freeman, New York N.Y.; Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y.; Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, eds., John Wiley & Sons 1998), all of which are herein incorporated in their entirety by reference for all purposes. 
Nuclease-specific techniques can be found in, e.g., Genome Editing and Engineering: From TALENs and CRISPRs to Molecular Surgery, Appasani and Church, 2018; and CRISPR: Methods and Protocols, Lindgren and Charpentier, 2015; both of which are herein incorporated in their entirety by reference for all purposes. Basic methods for enzyme engineering may be found in Enzyme Engineering Methods and Protocols, Samuelson, ed., 2013; Protein Engineering, Kaumaya, ed., (2012); and Kaur and Sharma, “Directed Evolution: An Approach to Engineer Enzymes”, Crit. Rev. Biotechnology, 26:165-69 (2006).


Note that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “an oligonucleotide” refers to one or more oligonucleotides. Terms such as “first,” “second,” “third,” etc., merely identify one of a number of portions, components, steps, operations, functions, and/or points of reference as disclosed herein, and likewise do not necessarily limit embodiments of the present disclosure to any particular configuration or orientation.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference for the purpose of describing and disclosing devices, methods and cell populations that may be used in connection with the presently described invention.


Where a range of values is provided, it is understood that each intervening value between the upper and lower limit of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.


In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.


The term “complementary” as used herein refers to Watson-Crick base pairing between nucleotides and specifically refers to nucleotides hydrogen bonded to one another with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds. In general, a nucleic acid includes a nucleotide sequence described as having a “percent complementarity” or “percent homology” to a specified second nucleotide sequence. For example, a nucleotide sequence may have 80%, 90%, or 100% complementarity to a specified second nucleotide sequence, indicating that 8 of 10, 9 of 10 or 10 of 10 nucleotides of a sequence are complementary to the specified second nucleotide sequence. For instance, the nucleotide sequence 3′-TCGA-5′ is 100% complementary to the nucleotide sequence 5′-AGCT-3′; and the nucleotide sequence 3′-TCGA-5′ is 100% complementary to a region of the nucleotide sequence 5′-TAGCTG-3′.
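The percent-complementarity arithmetic above can be illustrated with a minimal Python sketch. The function names and the equal-length, position-by-position comparison are assumptions made for illustration only; they are not part of the disclosure, and a practical tool would first align the sequences with a suitable alignment algorithm.

```python
# Watson-Crick partners; U is treated like T so RNA and DNA strands can be mixed.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(strand_5to3: str, other_3to5: str) -> float:
    """Percent of positions that form a Watson-Crick pair.

    strand_5to3 is written 5'->3' and other_3to5 is written 3'->5', so
    position i of one strand faces position i of the other.
    Equal, non-zero lengths are assumed (illustrative simplification).
    """
    if not strand_5to3 or len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        (a, b) in PAIRS for a, b in zip(strand_5to3.upper(), other_3to5.upper())
    )
    return 100.0 * paired / len(strand_5to3)


def best_window_complementarity(long_5to3: str, short_3to5: str) -> float:
    """Best percent complementarity of short_3to5 against any region of long_5to3."""
    k = len(short_3to5)
    if k == 0 or k > len(long_5to3):
        raise ValueError("short strand must be non-empty and fit in the long strand")
    return max(
        percent_complementarity(long_5to3[i:i + k], short_3to5)
        for i in range(len(long_5to3) - k + 1)
    )
```

With these definitions, `percent_complementarity("AGCT", "TCGA")` corresponds to the 5′-AGCT-3′ / 3′-TCGA-5′ example, and `best_window_complementarity("TAGCTG", "TCGA")` corresponds to the example of complementarity to a region of 5′-TAGCTG-3′.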


The term DNA “control sequences” refers collectively to promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites, nuclear localization sequences, enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these types of control sequences need to be present so long as a selected coding sequence is capable of being replicated, transcribed and—for some components—translated in an appropriate host cell.


As used herein the term “donor DNA” or “donor nucleic acid” refers to nucleic acid that is designed to introduce a DNA sequence modification (insertion, deletion, substitution) into a locus by homologous recombination using nucleic acid-guided nucleases. For homology-directed repair, the donor DNA must have sufficient homology to the regions flanking the “cut site” or site to be edited in the genomic target sequence. The length of the homology arm(s) will depend on, e.g., the type and size of the modification being made. In many instances and preferably, the donor DNA will have two regions of sequence homology (e.g., two homology arms) to the genomic target locus. Preferably, an “insert” region or “DNA sequence modification” region—the nucleic acid modification that one desires to be introduced into a genome target locus in a cell—will be located between two regions of homology. The DNA sequence modification may change one or more bases of the target genomic DNA sequence at one specific site or multiple specific sites. A change may include changing 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the target sequence. A deletion or insertion may be a deletion or insertion of 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the target sequence.


The terms “guide nucleic acid” or “guide RNA” or “gRNA” refer to a polynucleotide comprising 1) a guide sequence capable of hybridizing to a genomic target locus, and 2) a scaffold sequence capable of interacting or complexing with a nucleic acid-guided nuclease.


“Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or, more often in the context of the present disclosure, between two nucleic acid molecules. The term “homologous region” or “homology arm” refers to a region on the donor DNA with a certain degree of homology with the target genomic DNA sequence. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences.
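As an illustration of homology as a function of matching positions, the following Python sketch computes percent identity over two pre-aligned sequences. The gap character, the choice of denominator (all aligned columns except those that are gaps in both sequences), and the function name are assumptions for illustration; the disclosure does not prescribe a particular formula.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned positions occupied by the same base or amino acid.

    Both inputs must already be aligned to the same length, with `gap`
    marking insertions or deletions. Columns that are gaps in both
    sequences are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matching = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        compared += 1
        if a == b:
            matching += 1  # same residue at this position
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matching / compared
```

For example, two eight-base sequences differing at a single position share seven of eight positions under this definition.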


“Operably linked” refers to an arrangement of elements where the components so described are configured so as to perform their usual function. Thus, control sequences operably linked to a coding sequence are capable of effecting the transcription, and in some cases, the translation, of a coding sequence. The control sequences need not be contiguous with the coding sequence so long as they function to direct the expression of the coding sequence. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence. In fact, such sequences need not reside on the same contiguous DNA molecule (i.e. chromosome) and may still have interactions resulting in altered regulation.


A “promoter” or “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a polynucleotide or polypeptide coding sequence such as messenger RNA, ribosomal RNA, small nuclear or nucleolar RNA, guide RNA, or any kind of RNA transcribed by any class of RNA polymerase (I, II or III). Promoters may be constitutive or inducible and, in some embodiments (particularly many embodiments in which selection is employed), the transcription of at least one component of the nucleic acid-guided nuclease editing system is under the control of an inducible promoter.


As used herein the term “selectable marker” refers to a gene introduced into a cell, which confers a trait suitable for artificial selection. General use selectable markers are well-known to those of ordinary skill in the art. Drug selectable markers such as ampicillin/carbenicillin, kanamycin, chloramphenicol, erythromycin, tetracycline, gentamicin, bleomycin, streptomycin, rhamnose, puromycin, hygromycin, blasticidin, and G418 may be employed. In other embodiments, selectable markers include, but are not limited to, human nerve growth factor receptor (detected with a MAb, such as described in U.S. Pat. No. 6,365,373); truncated human growth factor receptor (detected with MAb); mutant human dihydrofolate reductase (DHFR; fluorescent MTX substrate available); secreted alkaline phosphatase (SEAP; fluorescent substrate available); human thymidylate synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine); human glutathione S-transferase alpha (GSTA1; conjugates glutathione to the stem cell selective alkylator busulfan; chemoprotective selectable marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem cells; human CAD gene to confer resistance to N-phosphonacetyl-L-aspartate (PALA); human multi-drug resistance-1 (MDR-1; P-glycoprotein surface protein selectable by increased drug resistance or enriched by FACS); human CD25 (IL-2α; detectable by MAb-FITC); methylguanine-DNA methyltransferase (MGMT; selectable by carmustine); and cytidine deaminase (CD; selectable by Ara-C). “Selective medium” as used herein refers to cell growth medium to which has been added a chemical compound or biological moiety that selects for or against selectable markers.


The terms “target genomic DNA sequence”, “target sequence”, or “genomic target locus” refer to any locus in vitro or in vivo, or in a nucleic acid (e.g., genome) of a cell or population of cells, in which a change of at least one nucleotide is desired using a nucleic acid-guided nuclease editing system. The target sequence can be a genomic locus or extrachromosomal locus.


A “vector” is any of a variety of nucleic acids that comprise a desired sequence or sequences to be delivered to and/or expressed in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to, plasmids, fosmids, phagemids, virus genomes, synthetic chromosomes, and the like. As used herein, the phrase “engine vector” comprises a coding sequence for a nuclease to be used in the nucleic acid-guided nuclease systems and methods of the present disclosure. The engine vector may also comprise, in a bacterial system, the λ Red recombineering system or an equivalent thereto. Engine vectors also typically comprise a selectable marker. As used herein the phrase “editing vector” comprises a donor nucleic acid, optionally including an alteration to the target sequence that prevents nuclease binding at a PAM or spacer in the target sequence after editing has taken place, and a coding sequence for a gRNA. The editing vector may also comprise a selectable marker and/or a barcode. In some embodiments, the engine vector and editing vector may be combined; that is, the contents of the engine vector may be found on the editing vector. Further, the engine and editing vectors comprise control sequences operably linked to, e.g., the nuclease coding sequence, recombineering system coding sequences (if present), donor nucleic acid, guide nucleic acid, and selectable marker(s).


Editing in Nucleic Acid-Guided Nuclease Genome Systems


RNA-guided nucleases (RGNs) have rapidly become the foundational tools for genome engineering of prokaryotes and eukaryotes. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) systems are adaptive immunity systems that protect prokaryotes against mobile genetic elements (MGEs). RGNs are a major part of this defense system because they identify and destroy MGEs. RGNs can be repurposed for genome editing in various organisms by reprogramming the CRISPR RNA (crRNA) that guides the RGN to a specific target DNA. A number of different RGNs have been identified to date for various applications; however, there are various properties that make some RGNs more desirable than others for specific applications. RGNs can be used to create specific double-strand breaks (DSBs), to nick one strand of DNA at a specific site, or to guide another moiety to a specific DNA sequence.


The ability of an RGN to specifically target any genomic sequence is perhaps the most desirable feature of RGNs; however, an RGN can only access its desired target if the target DNA also contains a short motif called a PAM (protospacer adjacent motif) that is specific to that RGN. Type V RGNs such as MAD7, AsCas12a and LbCas12a tend to access DNA targets that contain YTTN/TTTN on the 5′ end, whereas type II RGNs, such as the MADzymes disclosed herein, target DNA sequences containing a specific short motif on the 3′ end. An example well known in the art for a type II RGN is SpCas9, which requires an NGG on the 3′ end of the target DNA. Type II RGNs, unlike type V RGNs, require a transactivating RNA (tracrRNA) in addition to a crRNA for optimal function. Compared to type V RGNs, the type II RGNs create a double-strand break closer to the PAM sequence, which is highly desirable for precise genome editing applications.
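The PAM-side distinction described above (3′ for type II RGNs such as SpCas9 with NGG, 5′ for type V RGNs with TTTN) can be sketched as a simple site scan in Python. This is an illustrative sketch only: the function names, the default 20-nucleotide spacer, and the restriction to one strand are assumptions, and the code does not model any MADzyme-specific PAM.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_targets(seq: str, pam: str, spacer_len: int = 20, pam_side: str = "3prime"):
    """Return (spacer_start, spacer, pam_found) for each PAM hit on the given strand.

    pam_side="3prime": the PAM lies immediately 3' of the target (type II style).
    pam_side="5prime": the PAM lies immediately 5' of the target (type V style).
    Only the strand as written is scanned (illustrative simplification).
    """
    if pam_side not in ("3prime", "5prime"):
        raise ValueError("pam_side must be '3prime' or '5prime'")
    seq = seq.upper()
    pam_pattern = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead lets overlapping PAM occurrences all be reported.
    scanner = re.compile("(?=(" + pam_pattern + "))")
    hits = []
    for match in scanner.finditer(seq):
        pam_start = match.start()
        if pam_side == "3prime":
            start, end = pam_start - spacer_len, pam_start
        else:
            start = pam_start + len(pam)
            end = start + spacer_len
        if start < 0 or end > len(seq):
            continue  # not enough flanking sequence for a full-length target
        hits.append((start, seq[start:end], match.group(1)))
    return hits
```

Scanning the reverse-complement strand as well would be needed to enumerate all candidate targets at a locus.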


A number of type II RGNs have been discovered so far; however, their use in widespread applications is limited by restrictive PAMs. For example, the PAM of SpCas9 occurs less frequently in AT-rich regions of the genome. New type II RGNs with new and less restrictive PAMs are beneficial for the field. Further, not all type II nucleases are active in multiple organisms. For example, a number of RGNs have been discussed in the scientific literature but only a few have been demonstrated to be active in vitro and fewer still are active in cells, particularly in mammalian cells. The present disclosure identifies multiple type II RGNs that have novel PAMs and are active in mammalian cells.


In performing nucleic acid-guided nuclease editing, the type II RGNs or MADzymes may be delivered to cells to be edited as a polypeptide; alternatively, a polynucleotide sequence encoding the MADzyme is transformed or transfected into the cells to be edited. The polynucleotide sequence encoding the MADzyme may be codon optimized for expression in particular cells, such as archaeal, prokaryotic or eukaryotic cells. Eukaryotic cells can be yeast, fungi, algae, plant, animal, or human cells. Eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human mammals including non-human primates. The choice of the MADzyme to be employed depends on many factors, such as what type of edit is to be made in the target sequence and whether an appropriate PAM is located close to the desired target sequence. The MADzyme may be encoded by a DNA sequence on a vector (e.g., the engine vector) and be under the control of a constitutive or inducible promoter. In some embodiments, the sequence encoding the nuclease is under the control of an inducible promoter, and the inducible promoter may be separate from, but of the same type as, an inducible promoter controlling transcription of the guide nucleic acid; that is, separate inducible promoters may drive the transcription of the nuclease and guide nucleic acid sequences, but the two inducible promoters may be the same type of inducible promoter (e.g., both are pL promoters). Alternatively, the inducible promoter controlling expression of the nuclease may be different from the inducible promoter controlling transcription of the guide nucleic acid; that is, e.g., the nuclease may be under the control of the pBAD inducible promoter, and the guide nucleic acid may be under the control of the pL inducible promoter.


In general, a guide nucleic acid (e.g., gRNA) complexes with a compatible nucleic acid-guided nuclease and can then hybridize with a target sequence, thereby directing the nuclease to the target sequence. With the type II MADzymes described herein, the nucleic acid-guided nuclease editing system uses two separate guide nucleic acid components that combine and function as a guide nucleic acid; that is, a CRISPR RNA (crRNA) and a transactivating CRISPR RNA (tracrRNA). The gRNA may be encoded by a DNA sequence on a polynucleotide molecule such as a plasmid or linear construct, or the coding sequence may reside within an editing cassette, and is under the control of a constitutive promoter or, in some embodiments, an inducible promoter as described below.


A guide nucleic acid comprises a guide polynucleotide sequence having sufficient complementarity with a target sequence to hybridize with the target sequence and direct sequence-specific binding of a complexed nucleic acid-guided nuclease to the target sequence. The degree of complementarity between a guide sequence and the corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences. In some embodiments, a guide sequence is about or more than about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, or 20 nucleotides in length. Preferably the guide sequence is 10-30 or 15-20 nucleotides long, or 15, 16, 17, 18, 19, or 20 nucleotides in length.


In the present methods and compositions, the components of the guide nucleic acid are provided as a sequence to be expressed from a plasmid or vector, comprising both the guide sequence and the scaffold sequence as a single transcript under the control of a promoter and, in some embodiments, an inducible promoter. In general, to generate an edit in a target sequence, the gRNA/nuclease complex binds to a target sequence as determined by the guide RNA, and the nuclease recognizes a protospacer adjacent motif (PAM) sequence adjacent to the target sequence. The target sequence can be any polynucleotide endogenous or exogenous to a prokaryotic or eukaryotic cell, or in vitro. For example, the target sequence can be a polynucleotide residing in the nucleus of a eukaryotic cell. A target sequence can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide, an intron, a PAM, or “junk” DNA).


The guide nucleic acid may be part of an editing cassette that encodes the donor nucleic acid. Alternatively, the guide nucleic acid may not be part of the editing cassette and instead may be encoded on the engine or editing vector backbone. For example, a sequence coding for a guide nucleic acid can be assembled or inserted into a vector backbone first, followed by insertion of the donor nucleic acid in, e.g., the editing cassette. In other cases, the donor nucleic acid in, e.g., an editing cassette can be inserted or assembled into a vector backbone first, followed by insertion of the sequence coding for the guide nucleic acid. In yet other cases, the sequence encoding the guide nucleic acid and the donor nucleic acid (inserted, for example, in an editing cassette) are simultaneously but separately inserted or assembled into a vector. In yet other embodiments, the sequence encoding the guide nucleic acid and the sequence encoding the donor nucleic acid are both included in the editing cassette.


The target sequence is associated with a PAM, which is a short nucleotide sequence recognized by the gRNA/nuclease complex. The precise PAM sequence and length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-7 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5′ or 3′ to the target sequence. Engineering of the PAM-interacting domain of a nucleic acid-guided nuclease may alter PAM specificity, improve fidelity, or decrease fidelity. In certain embodiments, the genome editing of a target sequence both introduces a desired DNA change to a target sequence, e.g., the genomic DNA of a cell, and removes, mutates, or renders inactive a protospacer adjacent motif (PAM) region in the target sequence. Rendering the PAM at the target sequence inactive precludes additional editing of the cell genome at that target sequence, e.g., upon subsequent exposure to a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid in later rounds of editing. Thus, cells having the desired target sequence edit and an altered PAM can be selected using a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid complementary to the target sequence. Cells that did not undergo the first editing event will be cut, producing a double-stranded DNA break, and thus will not continue to be viable. The cells containing the desired target sequence edit and PAM alteration will not be cut, as these edited cells no longer contain the necessary PAM site, and will continue to grow and propagate.
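The selection logic described above, in which an inactivated PAM protects edited cells from re-cutting, can be sketched in Python as follows. This is a toy model under stated assumptions: the target is matched exactly on one strand, the PAM is taken to lie immediately 3′ of the target (type II style), only the `N` wildcard is supported, and the function names and the example sequences are hypothetical.

```python
def pam_matches(site: str, pam: str) -> bool:
    """True if `site` satisfies `pam`, where N in the PAM matches any base."""
    return len(site) == len(pam) and all(
        p == "N" or p == s for s, p in zip(site.upper(), pam.upper())
    )


def is_cleavable(locus: str, spacer: str, pam: str = "NGG") -> bool:
    """True if the locus contains the spacer-matched target followed by a valid 3' PAM."""
    locus, spacer = locus.upper(), spacer.upper()
    start = locus.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        if pam_matches(locus[pam_start:pam_start + len(pam)], pam):
            return True
        start = locus.find(spacer, start + 1)
    return False


def select_survivors(cells: dict, spacer: str, pam: str = "NGG") -> dict:
    """Keep only cells whose locus would not be cut by the gRNA/nuclease complex."""
    return {
        name: locus
        for name, locus in cells.items()
        if not is_cleavable(locus, spacer, pam)
    }
```

As a usage illustration, a population holding an unedited locus (`...TGG` after the target) and an edited locus in which the PAM has been changed (for example to `TGA`) would yield only the edited cell from `select_survivors`.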


As mentioned previously, the range of target sequences that nucleic acid-guided nucleases can recognize is constrained by the need for a specific PAM to be located near the desired target sequence. As a result, it often can be difficult to target edits with the precision that is necessary for genome editing. It has been found that nucleases can recognize some PAMs very well (e.g., canonical PAMs), and other PAMs less well or poorly (e.g., non-canonical PAMs). Because the mined MAD nucleases disclosed herein may recognize different PAMs, the mined MAD nucleases increase the number of target sequences that can be targeted for editing; that is, mined MAD nucleases decrease the regions of “PAM deserts” in the genome. Thus, the mined MAD nucleases expand the scope of target sequences that may be edited by increasing the number (variety) of PAM sequences recognized. Moreover, cocktails of mined MAD nucleases may be delivered to cells such that target sequences adjacent to several different PAMs may be edited in a single editing run.
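The effect of recognizing additional PAMs on the number of targetable sites can be illustrated with a minimal sketch. The function names, the 20-nucleotide spacer length, the 3′ PAM orientation, and the example motifs "NGG" and "NAA" are assumptions chosen for illustration only; they are not PAMs or parameters reported herein for any mined MAD nuclease, and only one strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def pam_regex(pam):
    """Compile an IUPAC PAM motif; the lookahead allows overlapping matches."""
    body = "".join(IUPAC[base] for base in pam.upper())
    return re.compile("(?=(" + body + "))")

def find_targets(seq, pam, spacer_len=20):
    """Return (spacer_start, spacer, pam_seq) for every PAM found 3' of a
    full-length spacer on the given strand (hypothetical helper)."""
    seq = seq.upper()
    hits = []
    for match in pam_regex(pam).finditer(seq):
        pam_start = match.start()
        spacer_start = pam_start - spacer_len
        if spacer_start >= 0:  # skip PAMs too close to the 5' end
            hits.append((spacer_start, seq[spacer_start:pam_start], match.group(1)))
    return hits

def targetable_positions(seq, pams, spacer_len=20):
    """Union of spacer start positions reachable with any PAM in the set."""
    return {hit[0] for pam in pams for hit in find_targets(seq, pam, spacer_len)}

if __name__ == "__main__":
    demo = "ACGTACGTACGTACGTACGTTGGAAAA"
    print(sorted(targetable_positions(demo, ["NGG"])))
    print(sorted(targetable_positions(demo, ["NGG", "NAA"])))
```

In this sketch a cocktail of nucleases is modeled simply as a list of PAM motifs; the union of positions grows as motifs are added, which is the sense in which "PAM deserts" shrink.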


Another component of the nucleic acid-guided nuclease system is the donor nucleic acid. In some embodiments, the donor nucleic acid is on the same polynucleotide (e.g., editing vector or editing cassette) as the guide nucleic acid and may be (but not necessarily) under the control of the same promoter as the guide nucleic acid (e.g., a single promoter driving the transcription of both the guide nucleic acid and the donor nucleic acid). For cassettes of this type, see U.S. Pat. Nos. 10,240,167; 10,266,849; 9,982,278; 10,351,877; 10,364,442; 10,435,715; and 10,465,207. The donor nucleic acid is designed to serve as a template for homologous recombination with a target sequence nicked or cleaved by the nucleic acid-guided nuclease as a part of the gRNA/nuclease complex. A donor nucleic acid polynucleotide may be of any suitable length, such as about or more than about 20, 25, 50, 75, 100, 150, 200, 500, or 1000 nucleotides in length. In certain preferred aspects, the donor nucleic acid can be provided as an oligonucleotide of between 20-300 nucleotides, more preferably between 50-250 nucleotides. The donor nucleic acid comprises a region that is complementary to a portion of the target sequence (e.g., a homology arm). When optimally aligned, the donor nucleic acid overlaps with (is complementary to) the target sequence by, e.g., about 20, 25, 30, 35, 40, 50, 60, 70, 80, 90 or more nucleotides. In many embodiments, the donor nucleic acid comprises two homology arms (regions complementary to the target sequence) flanking the mutation or difference between the donor nucleic acid and the target template. The donor nucleic acid comprises at least one mutation or alteration compared to the target sequence, such as an insertion, deletion, modification, or any combination thereof compared to the target sequence.
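The donor layout described above (two homology arms flanking the intended alteration) can be sketched as follows. The function name, its parameters, and the default arm length of 30 nucleotides are hypothetical choices for illustration, and the donor is written in the same strand sense as the target rather than as its complement.

```python
def build_donor(target, edit_start, edit_end, replacement, arm_len=30):
    """Assemble left arm + replacement + right arm (hypothetical helper).

    target[edit_start:edit_end] is the span to be altered; an empty
    replacement models a deletion, and edit_start == edit_end models
    an insertion.
    """
    if not 0 <= edit_start <= edit_end <= len(target):
        raise ValueError("edit coordinates fall outside the target sequence")
    if edit_start < arm_len or len(target) - edit_end < arm_len:
        raise ValueError("target too short for the requested homology arms")
    left_arm = target[edit_start - arm_len:edit_start]
    right_arm = target[edit_end:edit_end + arm_len]
    return left_arm + replacement + right_arm

if __name__ == "__main__":
    target = "A" * 30 + "GGG" + "C" * 30
    donor = build_donor(target, 30, 33, "TTT")
    print(len(donor), donor)
```

With 30-nucleotide arms and a three-nucleotide substitution the resulting donor is 63 nucleotides long, which falls within the 50-250 nucleotide range given above as more preferable.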


Often the donor nucleic acid is provided as an editing cassette, which is inserted into a vector backbone where the vector backbone may comprise a promoter driving transcription of the gRNA and the coding sequence of the gRNA, or the vector backbone may comprise a promoter driving the transcription of the gRNA but not the gRNA itself. Moreover, there may be more than one, e.g., two, three, four, or more guide nucleic acid/donor nucleic acid cassettes inserted into an engine vector, where each guide nucleic acid is under the control of separate different promoters, separate like promoters, or where all guide nucleic acid/donor nucleic acid pairs are under the control of a single promoter. In some embodiments the promoter driving transcription of the gRNA and the donor nucleic acid (or driving more than one gRNA/donor nucleic acid pair) is an inducible promoter. Inducible editing is advantageous in that isolated cells can be grown for several to many cell doublings to establish colonies before editing is initiated, which increases the likelihood that cells with edits will survive, as the double-strand cuts caused by active editing are largely toxic to the cells. This toxicity results both in cell death in the edited colonies, as well as a lag in growth for the edited cells that do survive but must repair and recover following editing. However, once the edited cells have a chance to recover, the size of the colonies of the edited cells will eventually catch up to the size of the colonies of unedited cells. See, e.g., U.S. Pat. Nos. 10,533,152; 10,550,363; 10,532,324; 10,633,626; 10,633,627; 10,647,958; 10,760,043; 10,723,995; 10,801,008; and 10,851,339. Further, a guide nucleic acid may be efficacious in directing the edit of more than one donor nucleic acid in an editing cassette; e.g., if the desired edits are close to one another in a target sequence.


In addition to the donor nucleic acid, an editing cassette may comprise one or more primer sites. The primer sites can be used to amplify the editing cassette by using oligonucleotide primers; for example, if the primer sites flank one or more of the other components of the editing cassette.


In addition, the editing cassette may comprise a barcode. A barcode is a unique DNA sequence that corresponds to the donor DNA sequence such that the barcode can identify the edit made to the corresponding target sequence. The barcode typically comprises four or more nucleotides. In some embodiments, the editing cassettes comprise a collection of donor nucleic acids representing, e.g., gene-wide or genome-wide libraries of donor nucleic acids. The library of editing cassettes is cloned into vector backbones where, e.g., each different donor nucleic acid is associated with a different barcode.
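The one-to-one association between barcodes and donor nucleic acids in a cassette library can be sketched as a simple lookup table. The function name and the enumeration of barcodes in alphabetical order are assumptions for illustration; a practical barcode set would likely also be filtered for sequence properties that this sketch ignores.

```python
import itertools

def assign_barcodes(donors, barcode_len=4):
    """Pair each distinct donor sequence with a unique barcode.

    Returns a dict mapping barcode -> donor (hypothetical helper).
    Barcodes are enumerated over A, C, G, T at the requested length.
    """
    unique_donors = list(dict.fromkeys(donors))  # de-duplicate, keep order
    if len(unique_donors) > 4 ** barcode_len:
        raise ValueError("barcode length too short for this library size")
    barcodes = ("".join(p) for p in itertools.product("ACGT", repeat=barcode_len))
    return dict(zip(barcodes, unique_donors))

if __name__ == "__main__":
    library = assign_barcodes(["AAA", "CCC", "AAA"])
    for barcode, donor in library.items():
        print(barcode, donor)
```

Sequencing a barcode recovered from an edited cell and looking it up in such a table identifies which donor, and therefore which edit, the cell received.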


Additionally, in some embodiments, an expression vector or cassette encoding components of the nucleic acid-guided nuclease system further encodes one or more nuclear localization sequences (NLSs), such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the nuclease comprises NLSs at or near the amino-terminus of the MADzyme, NLSs at or near the carboxy-terminus of the MADzyme, or a combination.


The engine and editing vectors comprise control sequences operably linked to the component sequences to be transcribed. As stated above, the promoters driving transcription of one or more components of the mined MAD nuclease editing system may be inducible, and an inducible system is likely employed if selection is to be performed. A number of gene regulation control systems have been developed for the controlled expression of genes in plant, microbe, and animal cells, including mammalian cells. These include the pL promoter (induced by heat inactivation of the CI857 repressor), the pBAD promoter (induced by the addition of arabinose to the cell growth medium), and the rhamnose inducible promoter (induced by the addition of rhamnose to the cell growth medium). Other systems include the tetracycline-controlled transcriptional activation system (Tet-On/Tet-Off, Clontech, Inc. (Palo Alto, Calif.); Bujard and Gossen, PNAS, 89(12):5547-5551 (1992)), the Lac Switch Inducible system (Wyborski et al., Environ Mol Mutagen, 28(4):447-58 (1996); DuCoeur et al., Strategies 5(3):70-72 (1992); U.S. Pat. No. 4,833,080), the ecdysone-inducible gene expression system (No et al., PNAS, 93(8):3346-3351 (1996)), the cumate gene-switch system (Mullick et al., BMC Biotechnology, 6:43 (2006)), and the tamoxifen-inducible gene expression system (Zhang et al., Nucleic Acids Research, 24:543-548 (1996)), as well as others.


Typically, performing genome editing in live cells entails transforming cells with the components necessary to perform nucleic acid-guided nuclease editing. For example, the cells may be transformed simultaneously with separate engine and editing vectors; the cells may already be expressing the mined MAD nuclease (e.g., the cells may have already been transformed with an engine vector or the coding sequence for the mined MAD nuclease may be stably integrated into the cellular genome) such that only the editing vector needs to be transformed into the cells; or the cells may be transformed with a single vector comprising all components required to perform nucleic acid-guided nuclease genome editing.


A variety of delivery systems can be used to introduce (e.g., transform or transfect) nucleic acid-guided nuclease editing system components into a host cell. These delivery systems include the use of yeast systems, lipofection systems, microinjection systems, biolistic systems, virosomes, liposomes, immunoliposomes, polycations, lipid:nucleic acid conjugates, virions, artificial virions, viral vectors, electroporation, cell permeable peptides, nanoparticles, nanowires, and exosomes. Alternatively, molecular trojan horse liposomes may be used to deliver nucleic acid-guided nuclease components across the blood brain barrier. Of particular interest is the use of electroporation, particularly flow-through electroporation (either as a stand-alone instrument or as a module in an automated multi-module system) as described in, e.g., U.S. Pat. Nos. 10,435,713; 10,443,074; 10,323,258; and 10,415,058.


After the cells are transformed with the components necessary to perform nucleic acid-guided nuclease editing, the cells are cultured under conditions that promote editing. For example, if constitutive promoters are used to drive transcription of the mined MAD nucleases and/or gRNA, the transformed cells need only be cultured in a typical culture medium under typical conditions (e.g., temperature, CO2 atmosphere, etc.). Alternatively, if editing is inducible (by, e.g., activating inducible promoters that control transcription of one or more of the components needed for nucleic acid-guided nuclease editing, such as, e.g., transcription of the gRNA, donor DNA, nuclease, or, in the case of bacteria, a recombineering system), the cells are subjected to inducing conditions.


EXAMPLES

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent or imply that the experiments below are all of or the only experiments performed. It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific aspects without departing from the spirit or scope of the invention as broadly described. The present aspects are, therefore, to be considered in all respects as illustrative and not restrictive.


Example 1: Exemplary Workflow Overview

The disclosed MADzyme Type II CRISPR enzymes were identified by the method depicted in FIG. 1. FIG. 1 shows an exemplary workflow for creating and for in vitro screening of MADzymes, including those in untapped clusters. In a first step, metagenome mining was performed to identify putative RGNs of interest based on, e.g., sequence (HMMER profile) and a search for CRISPR arrays. Once putative RGNs of interest were identified in silico, candidate pools were created and each MADzyme was identified by cluster, the tracrRNA was identified, and the sgRNA structure was predicted. Final candidates were identified, then the genes were synthesized. An in vitro depletion test was performed (see FIG. 2), where a synthetic target library was constructed in which to test target depletion for each of the candidate MADzymes. After target depletion, amplicons were produced for analysis for in vivo analysis. FIG. 2 depicts the in vitro depletion test in more detail.


Example 2: Metagenome Mining

The NCBI Metagenome database was used to search for novel, putative CRISPR nucleases using HMMER hidden Markov model searches. Hundreds of potential nucleases were identified. For each potential nuclease candidate, putative CRISPR arrays were identified, along with the CRISPR repeats and anti-repeats. Thirteen nucleases (FIG. 3) were chosen for in vitro validation and 11 active MADzymes were identified and assigned to clusters. There was less than 40% sequence identity between clusters. Cluster 59 shown in FIG. 4 presents two unique subclusters with distinct sgRNA architecture. Clusters 55-57 are shown in FIG. 5. These new MADzymes have diverse PAM preferences and distinct sgRNA structure. Cluster 141 (FIG. 6) is a distant cluster from 55, 56, 57 and 59 and shows diverse Cas protein structure and smaller-sized enzymes (e.g., approximately 200 amino acids shorter than the counterparts from the 55, 56, 57 and 59 clusters). Table 1 lists the identified MADzymes, including amino acid sequences, origin, and nucleic acid sequences of the CRISPR RNA and the trans-activating CRISPR RNA.

TABLE 1

MAD name | Cluster | Contig_id | Organism (metagenome) | Source | aa_seq | CRISPR repeat | tracrRNA

MAD name: MAD2015
Cluster: 59
Contig_id: DPZI01000013.1
Organism (metagenome): Vagococcus sp.
aa_seq [SEQ ID NO. 1]: MGKNYTIGLDIGTNSVGWSVVTENQQLVKKRMKIRGDSEKKQVKKNFWGVRLFDEGETAEATRLKRTTRRRYTRRRNRVVDLQNIFKDEINQKDSNFFNRLNESFLVVEDKKQPKQMIFGTVEEEASYHESFPTIYHLRKELVDNKDQADIRLVYLAMAHMIKYRGHFLIEGQLSTENTSVEEKFHLFLKEYNSTFCKQEDGSLVNPVNEDINGEEILMGTLSRSKKAEQIMKSFEGEKSNGVFSQFLKMIVGNQGNFKKAFNLEEDAKIQFAKEEYDEDLTTLLSNIGDEYANVFSLAKETYEAIELSGILSTKDKETYAKLSSSMTERYEDHEKDLASLKSFFREHLPEKYAVMFKDVSKNGYAGYIENSNKISQEEFYKYTKKLIGQIEGADYFIKKMEQEAFLRKQRTYDNGVIPYQVHLSELTHIINNQKKYYPFLLEKEEEIKSILTFKIPYYIGPLAKGNSDFAWLIRNSNDKITPSNFNEVLDIENSASQFIERMTNNDVYLPEEKVLPKNSMLYQKYIVFNELTKVRYINDRGTECNFSGEEKLQIFERFFKDSSTKVKKVSLENYLNKEYMIESPTIKGIEDDFNASFRTYHDFIKLGVSREMLDDIDNEEMFEDIVKILTIFEDRQMIKKQLEKYKDVFDSDILKKMVRRHYTGWGRLSKKLLHEMKDDNSGKTILDYLIEDDRLPKHINRNFMQLINDSNLSFKEKIEKAQLTDGTEDIDSVVKNLIGSPAIKKGISQSLKIVEELVSIMGYQPTSIVVEMARENQTTSKGKRQSIQRYKRLEAAINELGSDLLKVCPTDNHALKDDRLYLYYLQNGRDMYTGLELDIHNLSQYDIDHIVPRSFITDNSIDNRVLVSSKKNRGKLDNVPSKEIVQKNKLLWMNLKKSKLMSEKKYANLIKGETGGLTEDDKAKFLNRQLVETRQITKNVAQILDQRFNTQKDEKGNIIREVKVITLKSALVSQFRQNFEFYKVREVNDFHHAHDAYLNAVVANTLLKVYPKLTPDFVYGEYRKGNPFKNTKATAKKHYYSNIMENLCHETTIIDDETGEILWDKKCIGTIKQVLNYHQVNVVKKVETQTGRFSEETLVPRGSTKNPIALKSHLDPQKYGGFKSPTIAYTIVIEYKKGKKDILIKELLGISIMNRGAFEKNNKEYLEKLNYKEPRVLMVLPKYSLFELENGRRRLLASDKESQKGNQMAVPSYLNNLLYHTNKSLSKNAKSLEYVNEHRQQFEELLEEIIDFANQFTLAEKNTLLIADLYESNKEADIELLASSFINLLRFNQMGAPAEFSFFEKPIPRKRYSSTFELLKGKVIHQSITGLYETHQKV
CRISPR repeat [SEQ ID NO. 2]: GTTTTAGAGCTATGCTGTTTTGAATGCTTCCAAAAC
tracrRNA [SEQ ID NO. 3]: TGTTGGTAGCATTCAAAACAACATAGCAAGTTAAAATAAGGCTTTGTCCGTTCTCAACTTTTAGTGACGCTGTTTCGGCG

MAD name: MAD2016
Cluster: 59
Contig_id: DGLK01000042.1
Organism (metagenome): Enterococcus faecalis
Source: New York City MTA subway
aa_seq [SEQ ID NO. 4]: MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNTEKKKIKKNFWGVRLFEEGHTAEDRRLKRTARRIISRRRNRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWHRHPIFAKLEDEVAYHETYPTIYHLRKKLADSSEQADLRLIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFMIIYNQTFVNGESRLVSAPLPESVLIEEELTEKASRTKKSEKVLQQFPQEKANGLFGQFLKLMVGNKADFKKVFGLEEEAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYDAVELSTILADSDKKSHAKLSSSMIVRFTEHQEDLKKFKRFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQLKFYQYVKKIIQDIAGAEYFLEKIAQENFLRKQRTFDNGVIPHQIHLAELQAIIHRQAAYYPFLKENQEKIEQLVTFRIPYYVGPLSKGDASTFAWLKRQSEEPIRPWNLQETVDLDQSATAFIERMTNFDTYLPSEKVLPKHSLLYEKFMVFNELTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKVKKKDIIQFYRNEYNTEIVTLSGLEEDQFNASFSTYQDLLKCGLTRAELDHPDNAEKLEDIIKILTIFEDRQRIRTQLSTFKGQFSAEVLKKLERKHYTGWGRLSKKLINGIYDKESGKTILGYLIKDDGVSKHYNRNFMQLINDSQLSFKNAIQKAQSSEHEETLSETVNELAGSPAIKKGIYQSLKIVDELVAIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKAMAEIGSNLLKEQPTTNEQLRDTRLFLYYMQNGKDMYTGDELSLHRLSHYDIDHIIPQSFMKDDSLDNLVLVGSTENRGKSDDVPSKEVVKDMKAYWEKLYAAGLISQRKFQRLTKGEQGGLTLEDKAHFIQRQLVETRQITKNVAGILDQRYNANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYHHGQDAYLNCVVATTLLKVYPNLAPEFVYGEYPKFQTFKENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNSYLKTIKKELNYHQMNIVKKVEVQKGGFSKESIKPKGPSNKLIPVKNGLDPQKYGGFDSPIVAYTVLFTHEKGKKPLIKQEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPKYTLYEFPEGRRRLLASAKEAQKGNQMVLPEHLLTLLYHAKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHTLAKSKVQQIVKLFEANQTADVKEIAASFIQLMQFNAMGAPSTFKFFQKDIERARYTSIKEIFDATIIYQSTTGLYETRRKVVD
CRISPR repeat [SEQ ID NO. 5]: GTTTTAGAGTCATGTTGTTTAGAATGGTACCAAAAC
tracrRNA [SEQ ID NO. 6]: TCTTTTGGGACTATTCTAAACAACATAGCAAGTTAAAATAAGGTTTTAACCGTAATCAACTGTAAAGTGGCGCTGTTTCGGCGC

MAD name: MAD2017
Cluster: 59
Contig_id: DMKA01000006.1
Organism (metagenome): Streptococcus sp. (firmicutes)
aa_seq [SEQ ID NO. 7]: MKKPYSIGLDIGTNSVGWAVITDDYKVPAKKMKVLGNTDKKYIKKNLLGALLFDSGETAEVTRLKRTARRRYTRRKNRLRYLQEIFAKEMTKVDESFFQRLEESFLTDDDKTFDSHPIFGNKAEEDAYHQKFPTIYHLRKYLADSQEKADLRLVYLALAHMIKYRGHFLIEGELNAENTDVQKLFNVFVETYDKIVDESHLSEIEVDASSILTEKVSKSRRLENLIKQYPTEKKNTLFGNLIALALGLQPNFKTNFKLSEDAKLQFSKDTYEEDLEELLGKVGDDYADLFISAKNLYDAILLSGILTVDDNSTKAPLSASMIKRYVEHHEDLEKLKEFIKINKLKLYHDIFKDKTKNGYAGYIDNGVKQDEFYKYLKTILTKIDDSDYFLDKIERDDFLRKQRTFDNGSIPHQIHLQEMHSILRRQGEYYPFLKENQAKIEKILTFRIPYYVGPLARKDSRFAWANYHSDEPITPWNFDEVVDKEKSAEKFITRMTLNDLYLPEEKVLPKHSHVYETFTVYNELTKIKYVNEQGESFFFDANMKQEIFDHVFKENRKVTKAKLLSYLNNEFEEFRINDLIGLDKDSKSFNASLGTYHDLKKILDKSFLDDKTNEQIIEDIVLTLTLFEDRDMIHERLQKYSDFFTSQQLKKLERRHYTGWGRLSYKLINGIRNKENNKTILDFLIDDGHANRNFMQLINDESLSFKTIIQEAQVVGDVDDIEAVVHDLPGSPAIKKGILQSVKIVDELVKVMGDNPDNIVIEMARENQTTGYGRNKSNQRLKRLQDSLKEFGSDILSKKKPSYVDSKVENSHLQNDRLFLYYIQNGKDMYTGEELDIDRLSDYDIDHIIPQAFIKDNSIDNKVLTSSAKNRGKSDDVPSIEIVRNRRSYWYKLYKSGLISKRKFDNLTKAERGGLTEADKAGFIKRQLVETRQITKHVAQILDARFNTKRDENDKVIRDVKVITLKSNLVSQFRKEFKFYKVREINDYHHANDAYLNAVVGTALLKKYPKLTPEFVYGEYKKYDVRKLIAKSSDDYSEMGKATAKYFFYSNLMNFFKTEVKYADGRVFERPDIETNADGEVVWNKQKDFDIVRKVLSYPQVNIVKKVEAQTGGFSKESILSKGDSDKLIPRKTKKVYWNTKKYGGFDSPTVAYSVLVVADIEKGKAKKLKTVKELVGISIMERSFFEENPVSFLEKKGYHNVQEDKLIKLPKYSLFEFEGGRRRLLASATELQKGNEVMLPAHLVELLYHAHRIDSFNSTEHLKYVSEHKKEFEKVLSCVENFSNLYVDVEKNLSKVRAAAESMTNFSLEEISASFINLLTLTALGAPADFNFLGEKIPRKRYTSTKECLSATLIHQSVTGLYETRIDLSKLGEE
CRISPR repeat [SEQ ID NO. 8]: GTTTTAGAGCTGTGCTGTTTCGAATGGTTCCAAAAC
tracrRNA [SEQ ID NO. 9]: TGTTGGAACTATTCGAAACAACACAGCGAGTTAAAATAAGGCTTTGTCCGTACACAACTTGTAAAAGGGGCACCCGATTCGGGTGCA

MAD name: MAD2019
Cluster: 59
Contig_id: DOTL01000042.1
Organism (metagenome): Streptococcus sp. (firmicutes)
aa_seq [SEQ ID NO. 10]: MTKPYSIGLDIGTNSVGWAVITDDYKVPSKKMKVLGNTSKKYIKKNLLGALLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQRLDDSFLVPDDKRDSKYPIFGNLVEEKAYHDEFPTIYHLRKYLADSTKKADLRLVYLALAHMIKYRGHFLIEGEFNSKNNDIQKNFQDFLDTYNAIFESDLSLENSKQLEEIVKDKISKLEKKDRILKLFPGEKNSGIFSEFLKLIVGNQADFKKYFNLDEKASLHFSKESYDEDLETLLGYIGDDYSDVFLKAKKLYDAILLSGILTVTDNGTETPLSSAMIMRYKEHEEDLGLLKAYIRNISLKTYNEVFNDDTKNGYAGYIDGKTNQEDFYVYLKKLLAKFEGADYFLEKIDREDFLRKQRTFDNGSIPYQIHLQEMRAILDKQAKFYPFLAKNKERIEKILTFRIPYYVGPLARGNSDFAWSIRKRNEKITPWNFEDVIDKESSAEAFINRMTSFDLYLPEEKVLPKHSLLYETFTVYNELTKVRFIAEGMSDYQFLDSKQKKDIVRLYFKGKRKVKVTDKDIIEYLHAIDGYDGIELKGIEKQFNSSLSTYHDLLNIINDKEFLDDSSNEAIIEEIIHTLTIFEDREMIKQRLSKFENIFDKSVLKKLSRRHYTGWGKLSAKLINGIRDEKSGNTILDYLIDDGISNRNFMQLIHDDALSFKKKIQKAQIIGDKDKDNIKEVVKSLPGSPAIKKGILQSIKIVDELVKVMGRKPESIVVEMARENQYTNQGKSNSQQRLKRLEESLEELGSKILKENIPAKLSKIDNNSLQNDRLYLYYLQNGKDMYTGDDLDIDRLSNYDIDHIIPQAFLKDNSIDNKVLVSSASNRGKSDDVPSLEVVKKRKTLWYQLLKSKLISQRKFDNLTKAERGGLSPEDKAGFIQRQLVETRQITKHVARLLDEKFNNKKDENNRAVRTVKIITLKSTLVSQFRKDFELYKVREINDFHHAHDAYLNAVVASALLKKYPKLEPEFVYGDYPKYNSFRERKSATEKVYFYSNIMNIFKKSISLADGRVIERPLIEVNEETGESVWNKESDLATVRRVLSYPQVNVVKKVEVQSGGFSKELVQPHGNSDKLIPRKTKKMIWDTKKYGGFDSPIVAYSVLVMAEREKGKSKKLKPVKELVRITIMEKESFKENTIDFLERRGLRNIQDENIILLPKFSLFELENGRRRLLASAKELQKGNEFILPNKLVKLLYHAKNIHNTLEPEHLEYVESHRADFGKILDVVSVFSEKYILAEAKLEKIKEIYRKNMNTEIHEMATAFINLLTFTSIGAPATFKFFGHNIERKRYSSVAEILNATLIHQSVTGLYETRIDLGKLGED
CRISPR repeat [SEQ ID NO. 11]: GTTTTAGAGCTGTGTTGTTTCGAATGGTTCCAAAAC
tracrRNA [SEQ ID NO. 12]: GGTTTGAAACCATTCGAAACAATACAGCAAAGTTAAAATAAGGCTAGTCCGTATACAACGTGAAAACACGTGGCACCGATTCGGTGC

MAD name: MAD2020
Cluster: 55
Contig_id: DQFW01000027.1
Organism (metagenome): Acholeplasmatales bacterium
Source: human gut
aa_seq [SEQ ID NO. 13]: MKNNEETLKKLRLGLDIGTNSVGYALLDENNKLIKKNGHTFWGVRMFDEAETAKDRGSYRKSRRRLLRRKERMEILRSFFTKEICDIDPTFFERLDDSFYYKEDKKNKNTYNLFTSEYTDKDFYLEYPTIYHLRKAMQEEDKKFDIRMVYLAIAHIIKYRGNFLYPGEEFSTSEYTSIKQFFLDFNDILDELSNELEDNEDYSAEYFDKIENINDDFLEKLKVILMEIKGISNKKKELLDLFNVNKKSIYNELVIPFISGSAKVNISSLSVIKNSKYPKTEISLGSEELEGQVEEAISVAPEIKSVLEMIIKIKEISDFYFINKILSDSKTISESMVKMYDEHNEDLKKLKGFFKKYAEDQYNEIFKIRDEKLANYVAYVGFNKLRKNKVERFKHASREEFYGYLKQKLNNIKYAEAQEEIKYFIDKIDNNEFLLKQNSNQNGAFPMQLHLKELKTILNNQEKYYPFLSEGNDGYSIKEKIILTFKYKIPYYVGPLNKESKYSWVVREDEKIYPWNFDKVVKLDETAEKFILRMQNKCTYLKGDNDYCLPKNSLIFSEYSCLSYLNKLSINGKPIDPIMKSKIFNEVFLIKKQPTKKDIIEFIKTNYNADALTTTEKELPEATCNMASYIKMKEIFGKDFNDNKEMIENIIKDITIFEDKSILGNRLKELYKLNNDRIKQIKGLNYKGYSRLSKNLLVGLQIVDNQTGEIKGNVIEVMRKTNLNLQEILYLDGYRLIDAIDEYNRKNSLNDSYLCARDYIAENLVISPSFKRALIQTCSIIQEIERIFHKKIDEFYVEVTRTNKDKNKGKTTSSRYDKIKKIYSSCQELAMAYNFDMKRLKNELESNKDNLKSDILYFYFTQLGKCMYSLEDIDISDLTNNYHYDIDHIYPQSIIKDDSLSNRVLVDKKKNAAKTDKFLFEAKVLNPKAQQFYKKLLSLELISKEKYRRLTQKEISKDELEGFVNRQLVSTNQSVMGLIKLLKEYYKVDEKNIIYSKGENVSDFRHTFDLVKSRTANNFHHANDAYLNVVVGGILNKYYTSRRFYQFSDIARIENEGESLNPSRIFTKRDILKANGKVIWDKKEDIKRIEKDLYHRFDITETIRTYNPNKMYSKVTILPKGEGESAVPFQTTTPRVDVEKYGGITSNKFSRYVIIEAHGKKGLDTILEAIPKTACGDNNKIEKDIDNYIASLDEYQKYTSYKVVNYNIKANVVIQEGSFKYIITGKSGNQYVLQNVQDRFFSKKAMITIKNIDKYLNNKKLGIIMAKDNEKIIVSPARGKNNEEIFFEKTELVNLLKEIKTMYSKDIYSFSAIQNIVNNIDCSIDYSIDDFIIICNNLLQILKTNERKNADLRLIHLSGNSGTLYLGKKLKSGMKFIWQSITGYYEEILYEVK
CRISPR repeat [SEQ ID NO. 14]: GTTTGCTAGTTATGTTATTTATAGTATTAAGCAAAC
tracrRNA [SEQ ID NO. 15]: TGTAAATAACATAACGAGTGCAAATAAGCGTTTCGCGAAAATTTACAGTGGCCCTGCTGTGGGGCCTTTTTTATTTATCAAA

MAD name: MAD2021
Cluster: 57
Contig_id: DEED01000018.1
Organism (metagenome): Lachnospiraceae bacterium
aa_seq [SEQ ID NO. 16]: MSEKYFVGLDMGTSSVGWAVTDEHYHLLRRKGKDLWGARLFDEAETAAGRRTNRVSRRRLARQRARIGWLKELFRPYLEEKDAGFLQRLEESRFFLEDKTVKQPYALFSDKEFTDKDYYQKYPTIFHLRKELLESKAPHDVRLVFLAVLNMYAHRGHFLNPELQEGTLGDIHDLLSRLDAYIQDLFEDQGWSILENVEEQQKVLAEKNISNTVRLEKILSAIGTSPKDKEKKPLIEIYKLICGLKGSLSLAFSGVEMNETDAQMKFSFSDSNLEENEPEIERILGERYFEMYSILKEIHAWGLLSEIMSDDSGKTYPYISYAKVDLYQKHHEQLRMLKKIIRTYAPDEYHRMFRSMEDNTYSAYVGSVNSKNKKQRRGAKSTDFFKEVKRIIEKIEKEHGELPECEEILDLIARDSFLPKQLTTANGVIPNQVYATELRQIVTNAAAYLPFLNDKDDTGLTNAEKIVEMFKFHIPYYIGPLKNDGNGTAWVVRKQQGTVYPWNIDEKVDMAKTRDQFILNLVRKCSYLNDETVLPASSLLYEKFKVLNELNNLTINGQKISVELKQDIFRDLFRATGKRVTTRKLMGYLRRKAVIDADADETCLEGFDKTQGGFVSTLSSYHKFMEIFSTDVLTDRQREIAEGAIYFATVYGEDKSFLKKVLRDKFSPAELSQAQIDRLSGIRFKDWSHLSREFLLLEEADHSTGEIMTIIDRLWNTNENLMQIIHSDEYTYKQAIEERTARLEKSLSEVSFEDIEDSYMSAPVRRMVWQTIRILQEIEEVMGSEPARVFVEMTRSEGEKGDKGRKDSRKKKLKELYKKCKDDDQGLLSDIEGRDERDFRIRKLYLYYMQKGLCMYSGHPIDFGKLFDDSYYDIDHIYPRHYVKDDSIENNLVLVESKLNRDKKDTLLCPDIQERMHPVWEMLHRQGFMNDEKFKRLMRKEPFSEEEFAHFIERQLVETGQGTKEIARILNDVLGNKDENNKVIYVKAGNVSSFRNDNKKNPEFVKCRVINDHHHAKDAYLNIVVGNTYYTKFTLHPANFIRELRNKSHPTLEDQYNMDKLFARRVERNGYTAWNPDTDFQTVKQVLRKNSVLISRRSFIEHGQIADLQLVSGRKISEVNGKGYLPIKASDIRLSGPSGTMKYGGYNKASGAYFFLVEHELKGKLVRTIEPVYVYMMASIHGKEDLEKYCQEELGYIHPRICLKKIPMYSHIRINGFDYYLTGRSNDRLFICNAVQLTLSSEWSAYIKALSKAVDEKWDAAYIEQQASRIQDSLKSEEVFISKERNDQLYKVLLQKHLEGFFNNRINSIGTIMKEGYDSFRALPVNEQAETLMEILKISQLVNIGANLVSIGGKSRSGVATVSKKISDSKSFQLISDSVTGIFQRATDLLTI
CRISPR repeat [SEQ ID NO. 17]: GTTTGAGAGCCTTGTAAAACCGTATATCTCTCAAGC
tracrRNA [SEQ ID NO. 18]: GATAATGTTTTACAAGGCGAGTTCAAATAAGGATTTATCCGAAATCGCTTGCGTGCATTGGCACCATCTATCTTTTAAGACTTTCTTTGAAAGTCTT

MAD name: MAD2022
Cluster: 57
Contig_id: CACYWR010000004.1
Organism (metagenome): uncultured Lachnospiraceae bacterium
Source: Cattle rumen
aa_seq [SEQ ID NO. 19]: MEKEYYLGLDMGTSSVGWAVTDKEYRLLRAKGKDMWGIREFEEAQTAVERRTHRLSKRRRARQLVRIGLLKDYFHDEIMKIDPNFYIRLENSKYYLEDKDVRLASSNGIFDDKNYTDKDYYEQYKTIFHLRSELIHNSQKHDVRLVYLALLNMFKHRGHFLFEGDAYVQGNIGDIYKEFIQLLKNEYYEDENVKLTDQIDYFKLKEILSNSEFSRTAKAEKINSLVHIDKKNKLENTYIRLLCGLEIELKILFPEIDEKIKICFAKGYDEKLVEITEILTDNQLQILENLKKIHDIAALDKIRKGKEYLSDARVAEYEKHREDLALLKKIYREYMTKQDYDRMFREGEDGSYSAYVNSYNTSKKQRRNMKHRKIDEFYGTIRKDLKLLLKQGIQDDNIERILEEIDGNNDNKFMPKQLSFANGVIPNSLHKAEMKAILRNAETYLPFLLETDESGLTVSERILQLFSFHIPYYIGPVSVNSEKNNGNGWVVRREDGEVLPWNIEQKIDYGETSKRFIEKMVRRCTYISGEQVLPKNSFIYEKYCVLNEINNIKIDGERITVELKQNIYNDLYLHGKRVTKKQLINYLNNRGMIEDENQVSGIDINLNNYLGSYGKFLPIFEEKLKEDNYIKIAEDIIYLASIYGDSKKMLKSQIKSKYGDILDDKQIKRILGLKFKDWGRISRRFLELEGLDKETGEITTIIKAMWDYNLNFMEIIHSDAFDFKDKIEELHANSIKPLAEIEVEDLDDMYFSAPVKRMIWQTFKVIKEIEKVMGCPPKKVFIEMTRINDKKSKGKRTNSRKEKFLSLYKNIHDELVDWKQLIISSDESGKLNSKKMYLYLTQQGICMYTGRRINLEELFDDNKYDIDHIYPRHFVKDDNLENNLVLVEKQSNSRKSDTYPIDKSIRNNSQVYKHWKSLREGNFISKEKYDRLTGKNEFTDEQKAGFIARQMVETSQGTKGVADIIKQALPQSRIIYSKASNVSEFRRKYDILKSRTVNEFHHAHDAYLNIVVGNVYDTKFTSNPLNFIKKQYNVDRKANNYNLDKMFVYDVKRGNEIAWIGWNPKKSEDSSEMSKRGTIVTVKKMLSKNTPLMTRMSFVGHGGIAEDNLSSHFVAKNKGYMPNGKESDVTKYGGYKKAKTAYFFVVEHGQTNNRIRTIETLPIYRRREVEKYEDGLIKYCEQSLSLLNPIIIYKKIKIQSLMKINGYYAYISGKSNEVYTFRNGVNMCLSQEWINYVKKLENYIEKDRQDRMITYEKNIELYEIILRKYSTTILNKRLSKMDKKLINAKDRFCILNVKEQSQVLINVFVLSRIGDNQTDLSKIGIGKQSGQITQNKKITGCKEFKLVNQSVTGLYENEIDLLTV
CRISPR repeat [SEQ ID NO. 20]: GTTTGAGAGTCTTGTTAATTCTTAAAGGTGTAAAAC
tracrRNA [SEQ ID NO. 21]: GAGAATTAACAAGACGAGTGCAAATAAGGTTTATCCGGAATCGTCAATATGACCTGCATTGTGCAGAATCTTTAAAATCATATGATTTCATATGGTTTTA

MAD name: MAD2023
Cluster: 56
Contig_id: DCGJ01000048.1
Organism (metagenome): Lachnoclostridium sp.
Source: Feces of six-years old elephant
aa_seq [SEQ ID NO. 22]: MEKNNYLLGLDIGTDSVGYAVTNDKYDILKFHGEPAWGVTIFDEASLSTEKRSFRVSRRRLDRRQQRVLLVQELFASEVAKVDKDFFKRIQESNLYRSDAENQAGLFIGEDYCDREYYGQYPTIHHLISDLMNGTSPHDVRLVYLACAWLVAHRGHFLSNIDKDNLSGLKDFSSVYEGLMQYFSDNGYERPWNANVDVKALGDALKKKQGVTAKTKELLALLLDSAKAEKLPREEFPFSQDGIIKLLAGGTYKLSELFGNEEYKDFGSVKLSMDDEKLGEIMSNIGEDYELIASLRIVSDWAVLVDVLGESATISEAKVGIYNQHKADLEVLKKIIRKYTGKEGYKKVFRQVDSKENYVAYSQHESDGKAPKEKGIDIATFSKFILNIVRLLDVEPEDKEVYEDMVARLELNSFLPKQVNTDNRVIPYQLYWFELHKILENASIYLPMLTEKDSNGISVMEKLESVFMFRIPYFVGPLNKHSKYAWLERKEGKIYPWNFENMVDLDASEANFIKRMTNTCTYLPGQNVLPKDSLRYHRFMVLNEINNLRINNERISVELKQKIYSELFLNVKKVTRKRLVDFLISNGELRKGEESSLTGIDVEIKANLAPQIAFKKLMESGQLTEEDVESIIERASYAEDKARLAHWLEAKYSKLSEIDRKYICGIKIKDFGRLSKMFLSELEGVDKTTGEMTTILGAMWNSQLNLMELINSELYSFREAICAYQTDYYSTHSSSLEERMNEMYLSNAVKRPVYRTLDIVKDVKKAFGEPKKIFVEMTRGASEEQKGKRTKSRKEQILELYKQCKDEDVRILQQQLEEMGDLADNKLQGDKLFLYYMQKGKCMYTGTPIVLEQLGSKAYDIDHIYPQAYVKDDSILNNRVLVLSEANGKKKDIYPIEKETRDKMHGFWTYLNDKGMITEEKYKRLTRTTGFTEEEKWSFINRQLTETSQATKAVATLLGELFPNAEIVYSKARLTSEFRQEFNLLKCRSYNDLHHAVDAYLNIVCGNVYNMKFTKRWFNINKDYSIKTKTVFTHPVVCGGQVVWDGQEMLNKVIRNAKKNTAHFTKYAYIRKGGFFDQMPVKAAEGLTPLKKDMPTAVYGGYNKPSVAFLIPTRYKAGKKTEIIILSVEHLFGERFLRDEAYAKEYAAERLKKILGKQVDEVSFPMGMRPWKINTVLSLDGFLICISGIGSGGKCLRAQSIMQFSSDYRWTIYLKRLERLVEKITVNAKYVYSEEFDKVSTIENIELYDLYIEKYKATIFSKRVNSPEEIIESGRDKFVKLDVLSQARALLCIHQTFGRIVGGCDLGLIGGKKNSAATGNFSSTISNWAKYYKDVRIIDQSTSGLWVRKSENLLELV
CRISPR repeat [SEQ ID NO. 23]: GTTTGAGAGTAGTGTAAATCCATAGGGGTCTCAAAC
tracrRNA [SEQ ID NO. 24]: AGACCCCTATGGATTTACATTGCGAGTTCAAATAAAAGTTTACTCAAATCGTTGGCTTGACCAACCGCACAGCGTGTGCTTAAAGATCTCTTCAGTGAGGTC

MAD name: MAD2024
Cluster: 56
Contig_id: CADAKQ010000027.1
Organism (metagenome): uncultured Lachnospiraceae bacterium
Source: Cattle rumen
aa_seq [SEQ ID NO. 25]: MNFDGEYFLGLDIGTDSVGYAVTDQRYNLVKFKGEPMWGSHLFDAANQCAERRGFRTARRRLDRRQQRVKLVDEIFAPEVAKVDPNFYIRKMESALYPEDKSNKGDLYLYFNKQEYDEKHYYKDYPTIHHLICALMNDEKTKFDIRLINIAIDWLVAHRGHFLSEVGTDSVDKVLDFRKIYDEFMALFSDEDDAVSSKPWENINPDELGKVLKIHGKNAKRNELKKLLYGGKIPTDEDSFIDRKLLIDFIAGTSVQCNKLFRNSEYEDDLKITISNSDEREVVLPQLEDFHADIIAKLSSMYDWSVLSDILSGSTYISESKVKVYEQHKKDLKELKEFVRKYAPEKYNDIFRLASKETYNYTAYSYNLKSVKDEKDLPKGKASKEDFYSYLKKTLKLDKAENYNFVNDADTRFFDDMVERISSGTFLPKQVNSDNRVIPYQVYYIELKKILENAKKHYAFFEEKDEDGYSNVEKIMSVFTFRIPYYVGPLRNDDKSPYAWIRRKADGKIYPWDFEEKVDLDASENAFIDRMTNSCTYIPGADVLPKWSLLYTKYMVLNEINNIKVNNIGISVEAKQGIYNELFCKKAKVSLKAIREYLISNGFMQKDDEMSGIDITVKSSLKSRYDFRHLLEKNELTTDDVEAIISRSTYAEDKARFKKWLKKEFPQLSDEDYKYVSKLKYKDFGRLSRSLLNGLEGASKETGEIGTIMHFLWETNDNLMQLLSDRYTFMEEINKKRQDYYIEHKLTLNEQMEELGISNAVKRPVTRTLAVVKDVVSAIGYAPQKIFVEMARQEDEKKKRSVTRKEQILELYKNVEEDTKELERQLKKMGDTANNELQSDALFLYYLQLGKCMYSGKPIDLTQIKTTKKYDIDHIWPQSMVKDDSLLNNRVLVLSEINGDKKDVYPIDESIRSKMHSYWKMLLDKNLITKEKYSRLTRPTPFTESEKLGFINRQLVETRQSMKAVTQLLNNMYPDSEIVYVKAKLAADFKQDFKLAPKSRIINDLHHAKDTYLNVVAGNVYNERFTKKWFNVNEKYSMKTKVLFGHDVKIGDRLIWDSKKDLQTVKNTYEKNNIHLTRYAYCQKGGLFDQMPVKKGQGQIQLKKGMDIDRYGGYNKATASFFIIARYLRGGKKEVSFVPVELMVSEKFLNDDNFAIEYITNVLTGMNTKKIENVELPLGKRVIKIKTVLLLDGYKVWVNGKASGGTRVMLTSAESLRMPKEYVEYLKKMENYSEKKKSNRNFMHDSENDGLSEEKNILLYDKLLEKLDENHFKKMPGNQCETMKSGRVKFIELDFDVQISTLLNCIDLLKSGRTGGCDLKNIGGKSASGVVYISANLSACKYNDVHIIDISPAGLHENISCNLMELFE
CRISPR repeat [SEQ ID NO. 26]: GTTTGAGAGTAGTGTAAATCCAGAGGGCTCCAAAAC
tracrRNA [SEQ ID NO. 27]: GAGCCCTCTGGATTTACACTACGAGTTCAAATAAAAATTATTTCAAATCGCCGCTATGTCGGCCGCACAGTGTGTGCATTAAGAAAAGTCCGAAAGGGC

MAD name: MAD2025
Cluster: 56
Contig_id: DOQG01000053.1
Organism (metagenome): Ruminococcaceae bacterium
Source: human gut
aa_seq [SEQ ID NO. 28]: MSFKENSKFYFGLDIGTDSVGWAVTDNLYKLYKYKNNLMWGVSLFEAASPAEDRRNHRTARRRLDRRQQRVALLRELFAKEILKTDPDFFLRLKESSLYPEDRTNKNVNTYFDDADFKDSDYFKMYPTVHHLIKELSESDKPHDVRLVYLACAFIVAHRGHFLNGADENNVQEVLDFNSSYCEFTDWFKSNDIEDNPFSESTENEFSVILRKKIGITAKEKEIKNLLFGTTKTPDCYKDEEYPIDIDVLIKFISGGKTNLAKLFRNPAYDELDIQTVEVGKADFADTIDLLASSMEDTDVPLLSAVKAMYDWSLLIDVLKGQKTISDAKVCEYEQHKSDLKALKHIVRKYLDKAQYDEIFRTAGEKPNYVSYSYNVTDVKLKQLPSNFKKKYSEEFCKYINSKLEKIKPEPDDEAVYNELIEKCNSKTLCPKQVTDENRVIPYQLYYHELSMILDKASAYLDFLNETEDGISVKQKILTLMKFRIPYFVGPSVKRNETDNVWIVRKAEGRIYPWNFENMVDYDKSEDGFIRRMTCKCTYLAGEDVLPKYSLLYSRYTVLNEINNIKVKDVKISPELKQDIFNELFMKTSRVTVKKITELLKRKGAFSEENGDSLSGVDINIKSSLKSYLDFRRLLENGSLSESDVERIIERITVTTDKPRLISWLKTEYPALPAEDIRYISRLSYKDYGRLSAKMLTGCYELDMDTGEIGGRSIIDLMWAENINLMQIMSDSYGYKSFIEEENKKYYAINPTGSIAQTLREMYVSPSASRAIIRTMDIVKELRKIIKRDPDKIFVEMARGSKPEDKGKRTSSRREQIEKLFASAKEFVSDEEISHLRSQLGSLSDEQLRSEKYYLYFTQFGKCVYSGEAIDFSRLGDNHCYDIDHIYPQSKVKDDSLHNKVLVKSQLNGEKSDDYPIKEQIRNKMHPIWKNLFYRDPKNPTDKIKYERLTRSTPFTEDELAGFIERQLVETRQSTKAVATLLKEMFPDSKIVYVKAGQVSKFRHDFDMLKCREINDLHHAKDAYLNVVVGNVHDVKFTSNPLNFVKNADKHYTIKIKETLKHKVARNGETAWNPETDFDTVKRMMSKNSVRYVRYCYKRKGELFKQQPKKAGNPDLAWLKKNLDPVKYGGYNSKSISCFSLIKCTGVGVVIIPVELLCEKRYFSDDSFASEYAYSVLKNALPAKNIAKISIDDISFPLKRRPIKINTLFEFDGYRVNIRSKDSYSVFRISSAMAAIYSKDTSDYIKAISSYIDKSDKGSKFKPGEAFDVLSNLKAYDEIAKKCISEPFCKISKLAEAGKKMEEGRNKFAELSIIEQMKTLLLLVDVLKTGRVDKCNLKPVGGVDNFHTERMSAILKNTKYSDIRIIDQSPTGLYENKSDNLLEL
CRISPR repeat [SEQ ID NO. 29]: GTTTGAGAGTAGTGTAAATTTATAGGGTAGTAAAAC
tracrRNA [SEQ ID NO. 30]: TTTTACTACCCTATAAATTTACACTACGAGTTCAAATAAAAATTATTTCAAATCGTACTTTTTAGTACCTTCACAAGTGTTGTGAATATTAACTCACCTTCGGGTGAG

MAD
65
CADB
uncultured
Cattle
MEQKDYYIGLDIGTNSVGWAVVDEGYQLCRFKKYDMW
GTTTG
GACTACC


2026

QN01

Firmicutes

rumen
GVRLFDSAETAAERRMNRVNRRRNRRKKQRIDLLQGLFA
AGAGT
ATATGAG




0000053.1

bacterium


EEIAKIDRTFFVRLNESRLHPEDKSTAFRHPLFNDPNYTDV
AGTGT
ATTACAC







DYYKEYPTIYHLRKELMDSAEPHDIRLVYLALHHILKNRGH
AATTT
TACACGG







FLIEGGFEDSKKFEPTFRQLLEVLTEELGLKMDGADAALAE
CATAT
TTCAAAT







SVLKDRGMKKTEKVKRLKNVFTLNTTDMDQESQKKQKA
GGTAG
AAAGAA







QIDAVCKFLAGSKGDFKKLVADEALNELKLDTFALGTSKAE
TCAAA
TGTTCGA







DIGLEIEKSAPQYCVVFESVKSVFDWKIMTQILGDESTFSS
C
AACCGCC







AKVKEYEKHHENLIILRELIRKYCDKETYRHFFNNVNGGYS
[SEQ ID
CTTTGGG







RYIGSLKKNGKKYYVAGCTQEEFYKELKGLLKSIDQRVDPE
NO. 32]
GCCCGCT







DRPVYQRVLAETEDETFLPLLRSKANSAIPRQIHQKELDDIL

TGTTGCG







QNASVYLPFLNDVDEDGLSAAEKIRSIFTFRIPYYVGPLSLR

GATTTAC







HKDKGAHVWIKRKEEGYIYPWNYEKKIDREKSNEEFIRRLI

AGACTTG







NQCTYLKDEKVLPKKSLLYSEFMVLNELNNLRIRGKRLSEE

ATATCAA







QVELKQRIYRDLFMTKTRVTKKTLLNYLRKEDSDLTEEDLS

GTCTG







GFDNDFKASLSSCLELKNKVFGDRIEEDRVRKIAEDLIRWL

[SEQ ID







TIYDDDKKMIKEVIRAEYPNEFTNEQLDVICRLKFSGWGN

NO. 33]







LSEAFLCGVEGADKDTGEVFTIIEALRNTNHNLMELLSGNY









TFTEKIREHNAALSSEIKAKDYESLVRDLYVSPACKRGIWQ









TIRITEEIKKIMGHEPKKIFVEMTREHRDSGRTTSRKDQLLA









LYQKCEEDARDWVKEIEDREERDFSSIKLFLYYLQQGKCM









YSGEAIDLDELMSKNSRWDRDHIYPQSKIKDDSLDNLVLV









KKELNAVKDNGEIAPDIQKRMKGFWLSLLRQGFLSKKKFD









RLTRTGPFTSEELAGFISRQLVETSQMSKAVAELLNQLYED









SRVVYVKAGLVSQFRQKDLGVLKSRSVNDYHHAKDAYLN









VVVGDMFDRKFTSDPARWFKKNKKVNYSINQVFRRDYE









ENGKLIWKGIDRGEDGKPLFRDGLIHGGTIDLVRAIAKRNT









NIRYTEYTYCETGQLYNLTLLPKTDTAITIPLKKELPAAKYGG









FKGAGTSYFSLIEFDDKKGHHHKQIVGVPIYVANMLEHNE









NAFIEYLETVCSFRNITVLCEKIKKNALISVNGYPMRIRGEN









EILNMLKNNLQLVLSQEGEETLRHIEKYFNKKPGFEPDKEH









DGIDRDAMAALYDEMTEKLCTVYKKRPTNQGELLKNNR









GLFLNLEKRSEMAKVLSETAKMFGTTAQTTADLSLIKGSKY









AGKIVINKNTLGAAKLILIHQSVTGLFETRVEL









[SEQ ID NO. 31]




MAD
65
CACW
uncultured
Cattle
MSKKFAGEYYLGLDIGTDSVGWAVTDNQYNVLKFNGKS
GTTTG
TTTACCA


2027

RN01

Succinic-

rumen
MWGIRLFDAAQTAAERRMFRTARRRVERRRWRLELLQE
AGAGT
TCCAGTG




0000001.1

lasticum


LFQNEIEKKDPDFFQRMKDSALYPEDSKTGKPFALFCDKD
AATGT
AGTTTAC





sp.

LNDKLYYKQYPTIYHLRKALLTENSKFDIRLVYLAIHHILKHR
AAATT
ATTACAA







GHFLFNGDFSNVTRFSFAFEQLQTCLCNELDMDFECNNV
CATAG
GTTCAAA







QKLSEILKDTHMSKNDKVKASVALFENSGDKKQLQAVIGL
GATGG
TAAAAAT







FCGAKKKLADVFLDETLNDTEMPSISIADKPYEELRPELESI
TAAAA
TTATTCA







LAEKCCVIDYIKAVYDWAILADMLDGGEYGNRTYISVARV
C
ACCCGTT







RQYEKHHDDLKKLKKLVRRYCKSEYKSFFSVAGTDNYCAYI
[SEQ ID
CTTCGGA







GDDIETDDRKSVKKCKQEDFYKRIKGLLKKAIENGCPKDEV
NO. 35]
ACCTCCA







VEIIKDIDAQVFLPLQVTKDNGVIPHQVHEMELKQILKNAE

CCGTGTG







KYYPFLCKKDEEGIVTSNKILQLFKFRIPYYVGPLNSRIGKNS

GAACATT







WIVRRAEGKIYPWNFEEKVDFDKSEEGFIRRMTNPCTYM

AAGGTCT







AGADVLPKYSLLYSEFMVLNELNNVRICGDKLSVEIKQTIIK

GCTTTGC







DLFQRTRRVTVRKLCDKLKAEGVISRNSNQKDIDIKGIDQD

AGGCC







LKSSMVSYVDFKNIFGKEIEKYSVQQMCERIIFLLTIHHDDK

[SEQ ID







RRLQKRIRAEFTEAQITDDQLQKVLRLNYQGWGRFSAEFL

NO. 36]







KELKGVDTETGEVFSIINALRETDDNLMQLLSNRYTFAEEL









EKYNSNKRKKIEALTYDNIMEGIVASPAIKRSAWQAISIVM









ELSKIMGREPKRIFVEMARGPEEKKHTISRKNQLLELYKSV









KDESRDWKTELETKTESDFRSIKLFLYYTQMGRCMYTGEPI









DLDQLANTTIYDRDHIYPQSLTKDDSLNNLVLVKKVENAN









KGNGLISADIQKKMRGFWAELKKKGLISDEKFSRLTRTTPL









SDDELAGFINRQLVETRQSSKIVADLFHQLYPTTQVVYVKA









KIVSDFRHETLDMVKVRSLNDLHHAKDAYLNIVTGNVYYE









KFSGNPLTWLRKNPDRNYSLNQMFNYDIVKKTKEGTSYV









WKKGKDGSIAVVRRTMERNDILYTRQATENKNGGLFDQ









NIVSSKNKPFIPVKKGLDVNKYGGYKGITPAYFALIEFTDKK









GSRQRLLEAVPLYLRADIDNDSNVLRDFYKNVLGLENPVVI









LNRIKKNSLLKINGFLIHLRGTTGFSASQLKVQNAVEFSLPH









HMEDYVKKLENYEKHIIAERGSTKNSQIKITEWDGISKEKN









LQLYDMFINKMENTIYKFRPANQVSNLKENREVFNSLAVE









DQCSVLNQVLMLFVCKPVTANLSLIKGSKNAGNMALSKII









SNMRSAYLIHQSVTGLFEQKIDLLKVSSQKD









[SEQ ID NO. 34]




MAD
66
DHKP

Bacillales

gut
MANKLFIGLDVGSDSVGWAATDENFHLYRLKGKTAWGA
GTTTG
GCATTGT


2028

01000031.1

bacterium

meta-
RIFSEASDAKGRRGFRVAGRRLARRKERIRLLNTLFDPLLKE
AGAGC
AAGACA






genome
KDPTFLLRLENSAIQNDDPNKPAQAVTDCLLFANKQEEKG
AGTGT
ACACTGC







FYKRYPTIWHLRKALMDNEDCAFSDIRFLYLAIHHIIKYRG
TGTCT
TACGTTC







NFLRDGEIKIGQFDYSVFDKLNETLSVLFDLQSEDEDSQEG
TATAT
AAATAA







HFVGLPKSQYEAFITTANDRNLPKQTKKTKLLSMFEKDEES
AGCTC
GCATATT







KSFLEMFCTLCAGGEFSTKKLNKKGEETFDDTKISFNASYD
GAAAA
GCTACAA







QNEPNYQEILGDAFDLVDIAKAVFDYCDLSDILNGNDNLS
C
GGTTCTC







NAFVELYDSHKSQLSALKAICKQIDNQSNLKGDASVYVKLF
[SEQ ID
CCTCGGA







NDPNDKSNYPAFTHNKTLVDKRCDIHTFDKYVIDTVLPYE
NO. 38]
GAATGA







PLLMGQDATNWQMLKSLAEQDRLLQTIALRSTSVIPMQL

CCATTAG







HQKELKIILKNAISRNVKGIAEIEEKILKLFQYKIPYYCGPLTT

GTCACTT







KSAYSNVVFKNNEYRPLKPWDYEEAIDWDETKKKFMEGL

AGATAG







TNKCTYLKDKNVLPKQSILYQDFDAWNKLNNLKVNGSKP

CCGGTTC







SLKELKDLFSFVSQRPKTTMKDIQRHFKSDTNSKDKDVVV

TTCTGGC







SGWNPEDYICCSSRASFGKNGVFDLNNPDSSDPKDLSKCE

TA







RMIFLKTIYADSPKDADVAILKEFPDLTNDQKSLLKTIKCKE

[SEQ ID







WSPLSKEFLELRYADKYGEIRESIINLLRSGEGNLMQILAKY

NO. 39]







DYQERIDAYNADSFQTKSKSQIVSDLIEEMPPKMRRPVIQ









AVRIVHEVVKVAKKEPDQISIEVTRENNNKEKKQQLTKKA









KSRSAQIQTFLKNLVKIDTFEEKRVDEVLEELKKYSDRSING









KHLYLYFLQNGKDAYTGKPINIDDVLSGNKYDTDHVIPQS









KMKDDSIDNLVLVERSINQHRSNEYPLPESIRKNPANVAF









WSKLKKAGMMSEKKFNNLTRANPLTEEELSAFVAAQINV









VNRSNIVIRDVLKVLYPNAKLIFSKAQYPSQIRKELNIPKLR









DLNDTHHAVDAYLNIVSGVSLTERYGNLSFIKAAQKNENQ









TDYSLNMERYISSLIQTKEGEKTSLGKLIDQTSRRHDFLLTY









RFSYQDSAFYNQTIYKKNAGLIPVHEKLPPERYGGYNSMS









TEVNCVVTIKGKKERRYLVGVPHLLLEKGNKVADINKEIAN









SVPHKENETIAVSLKDIIQLDSMVKKDGLVYLCTTQNKDLV









KLKPFGPIFLSRESEVYLSNLNKFVEKYPNIADGNENYSLKT









NRYGEKSIDFLQEKTGNVLKELVDLSNQKRFDYCPMICKL









RTIDYRKGVEGKTLTEQLILIRSFVGVFTRKSEALSNGSNFR









KARGLVLQDGLVLCSDSITGLYHTERKL [SEQ ID NO. 37]




MAD
66
DBKT

Bacillales

gut
MADKLFIGLDVGSESVGWAATDENFHLYRLKGKTAWGA
GTTTG
GCATTGT


2029

01000013.1

bacterium

meta-
RIFSEANDAKTRRGFRVAGRRLARRKERIRLLNTLFDPLLKK
AGAGC
AAGACA






genome
DPAFLLRLENSAIQNDDPNKPIQAIADCPLLVNKQEEKDYY
AGTGT
ACACTGC







KRYPTIWHLRKALMENDDHAFSDIRFLYLAIHHIIKYRGNF
TGTCT
TACGTTC







LREGDIKIGQFDYSIFDKLNETLAVLFDLQNEDGENEEGRFI
TATAT
AAATAA







GLPKSQYEAFITCANDRNLPKQPKKAKLLSMFEKTEESKAF
AGCTC
GCATATT







LEMFCTLCSGGEFSTKKLNAKGEETYQDAKISFNSSYDENE
GAAAA
GCTACAA







GAYQEILGDFFDLVDIAKAVFDYCDLSDILNGNDNLSSAFV
C
GGTTCTC







ELYDSHKSQLSALKSICKRIDNQNGFIGEKSIYVKLFNDPND
[SEQ ID
CATTGGA







KSNYPAFTNNKTLVDKRCDIHTFDKYVKETILPYESSLTGR
NO. 41]
GAATGA







DAVNWQMLKSLAEQDRLLQTIALRSTSVIPMQLHQKELKI

CCATTAG







ILKNAVSRNIKGVAEIEEKILKLFQYKIPYYCGPLTTKSDYSN

GTCGCTT







VVFKNNEYRPLKPWDYEEAIDWDGTKQKFMEGLTNKCT

AGATAG







YLKDKNVLPKQSVLYQDFDTWNKLNNLKVNGNKPSLEDL

CCAGTTC







NDLFSFVSQRSKTTMRDIQRYLKSKTNSKENDVVVSGWN

TTCTGGC







SEDYICCSSRASFNKNGIFNLNNSEVLKECERIIFLKTIYTDS

TA







PKDADAAVLKEFPDLTNNQKTLLKTIKCKEWSPLSKEFLEL

[SEQ ID







RYSDKYGEIRQSIIDLLRNGEGNLMQILAKYDYQEVIDACN

NO. 42]







AASFQTKSKSQIVSDLIEEMPPKMRRPVIQAVRIVQEVAK









VAKKEPDEISIEVTRENNDKEKKQQLTKKAKSRSTQIQNFL









KNLVKIDASEKKQANEVLEELKKYSDQSINGKHLYLYFLQN









GKDAYTGKPINIDDVLSGNKYDTDHIIPQSKMKDDSIDNL









VLVEREINQHRSNEYPLPESIRKNPANVAFWRKLKKAGM









MSEKKFNNLTRSNPLTEEELGAFVAAQINVVNRSNVVIRD









VLKILYPNAKLIFSKAQYPSQIRKELNIPKLRDLNDTHHAVD









AYLNIVSGVTLTDRYGNMRFIKASQDEEKHSLNMERYISSL









IQTKEGQRTELGELIDQTSRRHDFLLTYRFSYQDSAFYKQTI









YKKNAGLIPAHDNLPPERYGGYDSMSTEVNCVATIIGKKT









TRYLVGVPHLLIKKAKDGIDVNDELIKLVPHKENEVVKVDL









NTTLQLDCTVKKDGFMYLCTSNNIALVKLKPFSPIFLSRESE









IYLSNLMKYVEKYPNISDENSEYEFKINRENVDPIKFTEKQSI









EVVQDLIIKAKQDRFSYCSMISKLRDINAEEMIHSKSLTEQL









KIIKSLIGVFTRKSEILSDKNNFRKSRGAILQEDLFLCSDSITG









LYHTERKL









[SEQ ID NO. 40]




MAD
66
DBLD

Bacillales

gut
MEQNTKKLFIGLDVGTDSVGWAATDEYFNLYRLKGKTA
GTTTG
GCATTGT


2030

01000015.1

bacterium

meta-
WGARLFLDAANAKDRRQHRVSGRRLARRKERIRLLNALF
AGAGC
AAGACA






genome
DPLLKKVDPTFLLRLESSTLQNDDPNKDQRAVSDALLFGN
AGTGT
ACACTGC







KKHEKAYYAAFPTIWHLRKALIENDDKAFSDIRYLYLAIHHII
TGTCT
ACGTTCA







KYRGNFLRQGEIKIGEFDFSCFDKLNQFFDIYFSKEDEEEVE
TAAAT
AATAAGC







FIGLPNENYQRFIDCAADKNLGKGKKKGDLLKLMSFSEDE
AGCTC
AGATTGC







KPFCEMFCSLCAGLAFSTKKLNKKDETVFEDIKVEFNGKFD
GAAAA
TACAAG







DKQEEIKSVLGDAYDLVELAKFIFDYCDLKDILGASTNRLSE
C
GTTCCCG







AFAGIYDSHKEELKALKGICREIDRSLGNESKNSLYREVFND
[SEQ ID
TAAGGG







KGIPNNYAAFIHHETNSSRCGIADFNNYVLQKIEPLENLLS
NO. 44]
AATGACC







KQNYKNWIQLKQLASQGRLLQTIAIRSTSIIPMQLHLKDLK

ATCTGGT







LILANAEKRDIPGIKDIKEKILLLFQFKVPYYCGPLTDRSQYS

CACATGA







NVVLKAGTREKITPWNFADQVDLEETKKKFMEGLTNKCT

ATAGCCC







YLKDCNVLPRQSLMFQEYDAWNKLNNLSINGNKPSPEE

CCGGCA







MNALFDFASKRRKTTMSDIKKFEKRATMSKENDVTVSG

ACGGTG







WNENDFIDLSSFVSLSGFFDLGEIHSADYMACEEAILLKTIF

GCTG







TDAPQDADPIIAEKFPNLKPNQLAALKKMSCKGWATLSR

[SEQ ID







EFLTLKAVDADGEVMNETLLGLMKEGKGNLMQLLHSSLY

NO. 45]







NFQDVIDSHNRAVFGDKSPKQIANDLIEEMPPQMRRPVI









QALRIVREVSKVAKKQPDVISIEVTRESNDKKKKEEWSKKA









TDRKKQIDLFLKNLKKTEDVKQTESELDGQAINDIDSIRGK









HLYLYFLQNGKDAYTGLPIDINDVLNGTKYDTDHIIPQSLM









KDDSIDNLVLVNREKNQHKSNEFPLPRDIQTKANIERWRA









LKKAGGMSEKKFNNLTRTTPLTEEELSAFVAAQINVVNRS









NVVIRDVLKILYPNAKLIFSKAQYPSQIRRDLEIPKLRDLNDT









HHAVDAFLNIVSGVELTKQFGRMDVIKAAAKGDKDHSLN









MTRYLERLLKKVDENKNETMTELGNHVFVTSQRHDFLLT









YRFDYQDSAFYNATIYSPDKNLIPMHDGMDPERYGGYSS









LNIEYNCIATIKGKKKTTRYLLGVPHLLALKFKNDGIDITSDLI









KLVPHKGDEEVSIDWKNPIPLRITVKKDGVEYLLAPFNAQ









VMELKPVSPVFLPREAAEYLARLKKAVDQKKQFIYQNSAEI









FQSKDKNNALQFGPEQSKNVALKIYALADAKKYDYCAMIS









KLRDAALRAEMLDSLSSEALFKQYNDLISLLSQLTRRSKKIS









SKYFSKSRGALLQDGLKIVSKSITGLYETERNL









[SEQ ID NO. 43]




MAD
141
CACV
uncultured
Cattle
MNYILGLDIGIASVGWAAVALDANDEPCKILDLNARIFEA
ATTGT
TTGTAAT


2031

OG01

Seleno-

rumen
AEQPKTGASLAAPRREARGSRRRTRRRRHRMERLRHLFA
ACCAT
AACCTAT




0000001.1

monadaceae


REELISAENIAALFEAPADVYRLRAEGLSRRLDEGEWARVL
AGCGA
TTTACCT






bacterium


YHIAKRRGFKSNRKGAASDADEGKVLEAVKENEALLKNYK
GTTAA
CGCTATG







TVGEMMFRDEKFQTAKRNKGGSYTFCVSRGMLAEEIGEL
ATTAG
GCACAAT







FAAQREQGNPHASETFETAYSKIFADQRSFDDGPDANSR
GGAAT
TTGTTAT







SPYAGNQIEKMIGTCSLETDPPEKRAAKASYSFMRFSLLQK
TACAA
TACATGG







INHLRLKDAKGEERPLTDEERAAVEALAWKSPSLTYGAIRK
C
ACATTAT







ALPLPDELRFTDLYYRWDKKPEEIEKKKLPFAAPYHEIRKAL
[SEQ ID
ACTAAAC







DKREKGRIQSLTPDALDAVGYAFTVFKNDAKIEAALSAAGI
NO. 47]
ATTTCCT







DGEDAVALMAAGLTFRGFGHISVKACRKLIPHLEKGMTY

AAAAAA







DKACKEAGYDLQKTGGEKTKLLSGNLDEIREIPNPVVRRAI

GCAACG







AQTVKVVNAVIRRYGSPVAVNVELAREMGRTFQERRDM

AAAAAC







MKSMEDNNAENEKRKEELKGYGVVHPSGLDIVKLKLYKE

GTGCTG







QGGVCAYSLAAMPIEKVLKDHDYAEVDHILPYSRSFDDSY

GCAGCA







ANKVLVLSKENRDKGNRTPMEYMANMPGRRHDFITWV

A







KSAVRNPRKRDNLLLEKFGEDKEAAWKERHLTDTKYIGSFI

[SEQ ID







ANLLRDHLEFAPWLNGKKKQHVLAVNGAVTDYTRKRLGI

NO. 48]







RKIREDGDLHHAVDAAVIATVTQGNIQKLTDYSKQIERAF









VKNRDGRYVNPDTGEVLKKDEWIVQRSRHFPEPWPGFR









HELEARVSDHPKEMIESLRLPTYTPEEIDGLKPPFVSRMPT









RKVRGAAHLETVVSPRLKDEGMIVKKVSLDALKLTKDKDA









IENYYAPESDHLLYEALLHRLQAFGGDGEKAFAESFHKPKA









DGTPGPVVKKVKIAEKSTLSVPVHHGRGLAANGGMVRV









DVFFIPEGKDRGYYLVPVYTSDVVRGELPMRAVVQGKSY









AEWKLMREEDFIFSLYPNDLVYIEHEKGVKVKIQKKLREIST









LPREKTMTSGLFYYRTMGIAVASIHIYAPDGVYVQESLGV









KTLKEFKKWTIDILGGEPHPVQKEKRQDFASVKRDPHAAK









STSSG









[SEQ ID NO. 46]




MAD
141
CACV
uncultured
Cattle
MKYIIGLDMGITSVGFATMMLDDKDEPCRIIRMGSRIFEA
GTTGT
ATTGTAT


2032

WE01

Rumino-

rumen
AEHPKDGSSLAAPRRINRGMRRRLRRKSHRKERIKDLIIKN
AGTTC
CATACCA




0000020.1

coccus sp.


ELMTADEISAIYSTGKQLSDIYQIRAEALDRKLNTEEFVRLLI
CCTAA
AGAACA







HLSQRRGFKSNRKVDAKEKGSDAGKLLSAVNSNKELMIEK
TTATTC
ATTAGGT







NYRTIGEMLYKDEKFSEYKRNKADDYSNTFARSEYEDEIRQ
TTGGT
TACTATG







IFSAQQEHGNPYATDELKESYLDIYLSQRSFDEGPGGSSPY
ATGGT
ATAAGGT







GGNQIEKMIGNCTLEPEEKRAAKATFSFEYFNLLSKVNSIKI
ATAAT
AGTATAC







VSSSGKRALNNDERQSVIRLAFAKNAISYTSLRKELNMEYS
[SEQ ID
CGCAAA







ERFNISYSQSDKSIEEIEKKTKFTYLTAYHTFKKAYGSVFVE
NO. 50]
GCTCTAA







WSADKKNSLAYALTAYKNDTKIIEYLTQKGFDAAETDIALT

CACCTCA







LPSFSKWGNLSEKALNNIIPYLEQGMLYHDACTAAGYNFK

TCTTCGG







ADDTDKRMYLPAHEKEAPELDDITNPVVRRAISQTIKVIN

ATGAGG







ALIREMGESPCFVNIELARELSKNKAERSKIEKGQKENQVR

TGTTATC







NDRIMERLRNEFGLLSPTGQDLIKLKLWEEQDGICPYSLKP

T







IKIEKLFDVGYTDIDHIIPYSLSFDDTYNNKVLVMSSENRQK

[SEQ ID







GNRIPMQYLEGKRQDDFWLWVDNSNLSRRKKQNLTKET

NO. 51]







LSEDDLSGFKKRNLQDTQYLSRFMMNYLKKYLALAPNTT









GRKNTIQAVNGAVTSYLRKRWGIQKVRENGDTHHAVDA









VVISCVTAGMTKRVSEYAKYKETEFQNPQTGEFFDVDIRT









GEVINRFPLPYARFRNELLMRCSENPSRILHEMPLPTYAAD









EKVAPIFVSRMPKHKVKGSAHKETIRRAFEEDGKKYTVSK









VPLTDLKLKNGEIENYYNPESDGLLYNALKEQLIAFGGDAA









KAFEQPFYKPKSDGSEGPLVKKVKLINKATLTVPVLNNTAV









ADNGSMVRVDVFFVEGEGYYLVPIYVADTVKKELPNKAII









ANKPYEEWKEMREENFVFSLYPNDLIKISSRKDMKFNLVN









KESTLAPNCQSKEALVYYKGSDISTAAVTAINHDNTYKLRG









LGVKTLLKIEKYQVDVLGNVFKVGKEKRVRFK









[SEQ ID NO. 49]




MAD
141
DCJP0
unculti-
Feces
MKNTLYGIGLDIGVASVGWAVVGLNGTGEPVGLHRLGV
GTTGT
TTATACC


2033

1000021.1
vated
of
RIFDKAEQPKTGESLAAPRRMARGMRRRLRRKALRRADV
AGTTC
ATACCAA






Faecali-

three-
YALLERSGLSTREALAQMFEAGGLEDIYALRTRALDEPVGK
CCTAA
GAACTGT






bacterium

weeks
AEFSRILLHLAQRRGFKSNRRTASDGEDGRLLAAVNENRR
CAGTT
TATGGTT





sp.
old
RMAQGGWRTVGEMLYRHEAFALRKRNKADEYLSTVGR
CTTGG
GCTATGA






ele-
DMVAEEASLLFQRQRELGCAWATPELQAEYLSILLRQRSF
TATGG
TAAGGTC






phant
DEGPGGNSPYGGNQVEKMVGRCTFEPDEPRAAKAAYSF
TATAA
TTAGCAC







EYFSLLQKLNHIRLAENGETRPLTQPQRQQLLSLAHKTPDV
T
CGTAAA







SLARIRKELALPETVQFNGVRCRANETLEESEKKEKFACLP
[SEQ ID
GCTCTGA







AYHKMRKALDGVVKGRISSLSISQRDAAATALSLYKNEDT
NO. 53]
CGCCTCG







LRAKLTEAGFQAPEIDALAGLTGFSKFGHLSLKACRKLIPHL

CTTTCAG







EQGLTYDQACSAAGYDFKGHGAGERAFTLPAAAPEMEQI

CGGGGC







TSPVVRRAVAQTIKVVNGIIREMDASPAWVRIELARELSKT

GTCATCT







FGERQEMDRSMRENAAQNERLMQELRDTFHLLSPTGQ

TTTTTGC







DLVKYRLWKEQDGVCAYSLRRLDVERLFEPGYVDVDHIVP

CCAAAA







YSLSFDDRRSNKVLVLSSENRQKGNRLPLQYLQGKRREDFI

GACACG







VWTNSSVRDYRKRQNLLREKFSGDEAEGFRQRNLQDTQ

GATATTT







HMARFLYNYISDHLAFAQSEALGKKRVFAVSGAVTSHLRK

TT







RWGLSKVRADGDLHHALDAAVIACTTDGMIRRISGYYGH

[SEQ ID







IEGEYLQDADGAGSQHARTKERFPAPWPRFRDELIVRLSE

NO. 54]







QPGEHLLDINPAFYCEYGTEHICPVFVSRMPRRKVTGPGH









KETIKGAAAADEGLLTVRKALTELKLDKDGEIKDYYMPSSD









TLLYEALKAQLRRFGGDGKKAFAEPFYKPKADGTPGPLVR









KVKTIEKATLTVPVHGGAASNDTMVRVDVFLVPGDGYY









WVPVYVADTLKPELPNRAVVAFKPYSEWKEMREEDFIFSL









YPNDLVYVEHKSGLKFTLQNADSTLEKTWVPKASFAYFVG









GDISTAAISLRTHDNAYGLRGLGIKTLKVLKKYQVDVLGNIS









PVHRETRQRFR









[SEQ ID NO. 52]




MAD
141
CACX
uncultured
Cattle
MAYGIGLDIGIASVGFATVALNEQDEPCGILRMGSRIFDA
GTTGT
TTATACC


2034

AV01

Clostri-

rumen
AEHPKNGASLAAPRREARSARRRLRRHRHRLERIRNLLVE
AGTTC
ATACCAA




0000001.1

diales


SCLISQDGLGSLFEGRLEDIYALRTRALDERLTDAELCRVLIH
CCTAA
GAACTGT






bacterium


LAQRRGFRSNRKADAADKEAGKLLKAVSENDRRMEENG
CGGTT
TGGGTTA







YRTVGEMLYKDPLFAEHRRNKGEAYLSTVTRTAVEQEARL
CTTGG
CTACAAT







VLSTQREKGNAAITEDFVEKYLDILLSQRPFDVGPGGNSPY
TATGG
AAGGTA







GGNMIEKMIGRCTFEPDELRAPKASYSFEYFQLLQKVNHI
TATAA
GTAAACC







RLLRDGRSEPLSEEQRRAIIDLALASADVTFAKIRKALSLPDS
T
GAAAAG







VRFNDVYYRESAEEAEKKKKLGCMDAYHEMRKALDKVAK
[SEQ ID
CTCTGAC







GRICAIPVEQRNAIAYVLTVHKTDERILTELQNINLERSDID
NO. 56]
GTCTTGT







QLMQMKGFSKFGHLSIKACDRIIPYLEQGMTYSDACTAA

TTGCGCA







GYAFRGHEGGEHSLYLPAQTPEMDEITSPVVRRAVSQTIK

GGACGT







VVNALIREQGESPTFVNIELAREMSKDFAERNDIRRENEK

CATCTTT







NAKANEAVMNELRRTFGLVNPSGQDLVKYKLFLEQGGVC

ATATCAG







PYTQRPMEPGRLFEAGYADVDHIVPYSISFDDRYCNKVLT

ACGGAT







FASVNRKEKGNRLPLQFLKGERRESFIVYVKANVRDYRKQ

G







RLLLKETVTEEDRKGFRDRNLQDTKHMAAFLHSYINDHLQ

[SEQ ID







FAPFQTDRKRHVTAVNGAVTAYLRKRWGIRKVRAEGDLH

NO. 57]







HASDALVIACTTPGMIQRLSRYAELREAEYMQTEDGAVRF









DPATGEVLEKFPYPWPCFRQEWTARVSDDPQAMLQDM









KLTDYRGLPLEQVKPVFVSRMPKHKVTGAAHKDTVKSAK









ALDRGVVLVKRALTDLKLKDGEIENYYDPASDRLLYEALKE









RLIAFGGDAQKAFAEPFHKPKRDGTPGPLVKKVKLMEKSS









LTVPVHDGKGVADNDSMVRIDVFFVAGEGYYFVPIYVAD









TVKPELPNRAVVANKPYAEWKEMKDEDFLFSLYPSDLMR









VTQKKGIKLSLINKESTLKKEEMAQSILLYYVKGSISTGSITA









ENHDRTYAINSLGIKTLEKLEKYQVDVLGNVSPVGKEKRLT









FC









[SEQ ID NO. 55]




MAD
141
CADA
uncultured
Cattle
MLPYAIGLDIGIASVGWAVVGLDTNERPFCILGMGSRIFD
GTTGT
TTATACC


2035

TZ010

Chloro-

rumen
KAEQPKTGASLALPRREARSLRRRLRRHRHRNERIRNLLLR
AGTCC
ATTCCAG




000012.1

flexi


EKIISESELQDLFSGTLSDIYQLRVEALDRKLDDKEFSRVLIHI
CCTGA
AAACTAT






bacterium


AQRRGFKSNRKNAAASQEDGKLLSAVTENQQRMNDKG
TGGTT
TATGGTC







YRTVSEMLLRDDKFKDHKRNKGGEYLTTVTRTMVEDEVH
TCTGG
ACTACAA







KIFSAQRTHGNLKADNQLESEYLEILLSQRSFDEGPGGDSP
AATGG
TAAGGTA







YGGSQIEKMIGKCTFFPEEKRAAKATYTFEYFNLLEKINHIR
TATAA
TTAGACC







LVSKDNLPEPLSDFQRRSLIELAYKVENLTYDRIRKELHISPE
T
GTAGAG







LKFNTIRYESDDLPENEKKQKLNCLKAYHEIRKALDKLGKG
[SEQ ID
CACTAAC







TINTLSKEQLNTIGTVLSMYKTSEIIKNKMEQIPAEIVDKLD
NO. 59]
ACCCCAT







EEGINFSKFGHLSIKACELIIPGLEKGLNYNDACEEAGLNFK

TTGGGGT







AHNNEEKSFLLHPTEDDYADITSPVVKRAASQTIKVINAIIR

GTTATCT







KQGCSPTYINIEVARELSKDFYERDKINKRNEANRAENERS

CTTTAAA







LEQIRKEYGKSNASGLDLVKFKLYQKQDGVCAYSQKQISFE

CTGTCCA







RLFEPNYVEVDHIIPYSKCFDDRESNKVLVFAKENREKGNR

AAATTTA







LPLEYLDGKKRESFIVWVNSKVKDYRKKQNLLKESLSEEEE

GTATTGC







KQFKERNLQDTKTVSKFLMNYINDNLIFSSSNKRKKHVTA

AATTATT







VSGGVTSYMRKRWGISKVREDGDQHHAVDALVIVCTTD

GA







GMIQQVSKYVEYKECQYIQTDAGSLAVDPYTGEVLRSFPY

[SEQ ID







PWARFHEDAVTWTEKIFVSRMPMRKVTGPAHKETIKSPK

NO. 60]







ALGEGLLIVRKPLTELKLKNGEIENYYKPEADLLLYNGLKERL









MEFGGDAKKAFAEPFPKPGNPQKIVKKVRLTEKSTLNVPV









LKGEGRADNDSMVRVDVFLKDGKYYLVPIYVADTLKPELP









NKACIAHKPYDEWATMDDGDFLFSLYPNDLIYIKHKKGIKL









TKINKNSTLADSIEGKEFFLFYKTMGISSAVLTCTNHDNTYY









IESLGVKTLESLEKCVVGVLGEIHKVRKEKRTGFSGN









[SEQ ID NO. 58]




MAD
141
CADA

Rumino-

Cattle
MLPYAIGLDIGISSVGWASVALDEEDKPCGIIGMGSRIFDA
GTTAT
TTATACC


2036

WQ01

coccaceae

rumen
AEQPKTGDSLAAPRRAARSARRRLRRRRHRNERIRALML
AGTTC
ATACCAA




0000026.1

bacterium


REGLLSEAELAALFDGRLEDICALRVRALDEAVTNDELARIL
CCTGT
GAACGA







LHLSQRRGFRSNRKTAATQEDGELLAAVSANRALMQERG
TCGTT
AGCAGG







YRTVAEMLLRDERYRDHRRNKGGAYIATVGRDMVEDEV
CTTGG
TTACTAT







RQIFAAQRALGSTAASETLETAYLEILLSQRSFDAGPGEPSP
TATGG
GATAAG







YAGGQIERMIGRCTFEPDEPRAARATYSFEYFSLLEAVNHI
TATAA
GTAGTAT







RLTEAGESVPLTKEQREKLIALAHRTADLSYAKIRKELGVPE
T
ACCGCA







SQRFNMVTYGKTDSADEAEKKTKLKQLRAYHQMRAAFE
[SEQ ID
GAGCTCC







KAAKGSFVLLTKEQRNAVGQTLSIYKTSDNIRPRLREAGLT
NO. 62]
AACGCCT







EAEIDVAEGLSFSKFGHLSVKACDKIIPFLEQGMKYSEACV

CGCTTTT







AAGYAFRGHEGQDKQRLLPPLDNDAKDTITSPVVLRAVS

GCGGGG







QTIKVVNAIIRERGGSPTFINIELAREMAKDFSERSQIKREQ

CGTTGTC







DSNRARNERMMERIKTEYGKSSPTGLDLVKLKLYEEQAG

TCT







VCAYSLKQMSLEHLFDPNYAEIDHIIPYSISFDDGYKNKVLV

[SEQ ID







LAKENRDKGNRLPLEYLNGKRREDFIVWVNSSVRDWRKK

NO. 63]







QNLLKEHVTPEDEAKFKERNLQDTKTASRFLLNYIADNLAF









APFQTERKKRVTAVNGSVTAYLRKRWGIAKVRANGDLHH









AVDALVIACTTDGLIQKVSRYACYQENRYSEAGGVIVDSA









TGEVVAQFPEPWPRFRHELEARLSDDPARAVLGLGLAHY









MTGEIRPRPLFVSRMPRRKVTGAAHKETVKSPRALDEGQ









LVTKTPLSALKLGKDGEIPGYYKPESDRLLYEALKARLRQFG









GDGKKAFAEPFHKPKHDGTPGPVVTKVKLCEPATLSVPV









HGGLGAANNDSMVRIDVFHVEGDGYYFVPIYIADTLKLEL









PNKACVKIKKISEWKHMKPQDFMFSLYPNDLFRIVSKKGI









TLNLVSKESTLPTSVNVSDTLLYFVSAGIASACLTCRNHDN









TYQIESLGIKTLEKLEKYTVDVLGNVHRVEKEPRMSFSQKG









D









[SEQ ID NO. 61]




MAD
141
DGSQ

Clostri-

low
MLPYGIGLDIGITSVGWATVALDENDRPYGIIGMGSRIFD
GTTAT
TTATACC


2037

01000028.1

diales

meth-
AAEQPKTGESLAAPRRAARSARRRLRRHRHRNERIRALILR
AGTTC
ATACCAA






bacterium

ane
ENLLSEGQLLHLYDGQLSDVYSLRVKALDERVSNEEFARILI
CCTGA
GAACTAT






produ-
HISQRRGFKSNRKGASSKEDSELLAAISANQVRMQQQGY
TAGTT
GAGGTT






cing
RTVAEMYLKDPIYQEHRRNKGGNYIATVSRAMVEDEVH
CTTGG
GCTATAA






sheep
QIFTGQRACGNPAATKELEEAYVEILLSQRSFDDGPGDGS
TATGG
TAAGGTA







PYAGSQIERMIGKCQLEKEAGEPRAAKATYSFEYFSLLAAI
TATAA
GTAAACC







NNISIISNGQLSPLTKEQREMLIALAHKTSELNYARIRKELG
T
GCAGAG







LSEAQRFNTVSYGKMEIAEAEKKTKFEHLKAYHKMRREFE
[SEQ ID
CTCTAAC







RIAKGHFASITIEQRNAIGDVLSKYKTDAKIRPALREAGLTE
NO. 65]
GCCTCAC







LDIDAAEALNFSKFGHISIKACKKIIPWLEQGMKYSEACNA

ATTTGTG







AGYNFKGHDGQEKSHLLPPLDEESRNVITSPVALRAISQTI

GGGCGT







KVVNAIIRERGCSPTFINIELAREMSKDFYERIEIKKEQDGN

TATCTCT







RAKNERMMERIRTEYGKASPTGQDLVKFKLYEEQGGVCA

[SEQ ID







YSLKQMSLAHLFEPDYAEVDHIVPYSISFDDGYKNKVLVLA

NO. 66]







KENRDKGNRLPLQYLQGKRREDFIAWVNSCVRDYKKRQR









LLKESISEDDLRAFKERNLQDTKTASRFLLNYISDHLEFTQF









ATERKKHVTAVNGSVTAYLRKRWGITKIRENGDLHHAVD









ALVIACTTDGMIQQVSRFAQHRENQYSLAEDSRFIIDPET









GEVIKEFPYPWPRFRQELEARLSSNPGLAVRDRGFLLYMA









ESIPVHPLFVSRMPRRKVTGAAHKETIKSGKAQKDGLLIVK









KPLTDLKLDKEGEIANYYNPMSDRLLYEALKKRLTAFNGD









GKKAFADPFYKPKSDGTQGPLVNKVKLCEPSTLNVSVIGG









KGVAENDSMVRIDVFRVEGDGYYFVPVYVADTVKPELPN









KACVANKPYTDWKEMRESDFLFSLYPNDLLKVTHKKALIL









TKAQKDSDLPDCKETKSEMLYFVSASISTASLACRTHDNSY









RINSLGIKTLEALEKYTVDVLGEYHPVRRETRQTFTGRESSG









HSGIS









[SEQ ID NO. 64]




MAD
141
CACW

Rumino-

Cattle
MRPYGIGLDIGISSVGWAAIALDHQDSPCGILDMGARIFD
GTTGT
TTATACC


2038

HR01

coccaceae

rumen
AAENPKDGASLAAPRREKRSQRRRLRRHRHRNERIRRML
AGTTC
ATACCAA




0000008.1

bacterium


LKEGLLTEAELTGLFDGALEDIYALRTRALDEALTKQEFARV
CCTGA
GAACGA







LLHLSQRRGFRSNRRATAAQEDGKLLDAVSENAKRMADC
TCGTT
TCAGGTT







GYRTVGEMLCRDATFAKHKRNKGGEYLTTVSRAMIEDEV
CTTGG
GCTACAA







KLVFASQRRLGSAFASEALEQGYLDILLSQRSFDEGPGGNS
TATGG
TAAGGTA







PYGGAQIERMIGKCTFYPEEPRAARACYSFEYFSLLQKVN
TATAA
GTAAACC







HIRLQKDGESTPLTSEQRLQLIELAHKTENLDYARIRRALQI
T
GAAGAG







PDAYRFNTVSYRIESDPAAAEKKEKFQYLRAYHTMRKAID
[SEQ ID
CTCTAAC







GASKGRFALLSQEQRDQIGTVLTLYKSQERISEKLTEAGIEP
NO. 68]
GCCCCGT







CDIAALESVSGFSKTGHISLRACKELIPYLEQGMNYNEACA

TTCTTTA







AAGIEFHGHSGTERTVVLHPTPDDLADITSPVVRRAVAQT

CGGGGC







VKVINAVIRRYGSPVFVNIELARELAKDFTERKKLEKDNKT

GTTATCT







NRAENERLMRRIREEYGKMNPTGLDLVKLRLYEEQAGVC

CT







PYSQKQMSLQRLFEPNYAEVDHIIPYSISFDDSRRNKVLVL

[SEQ ID







AEENRNKGNRLPLQYLTGERRDNFIVWVNSSVRDYRKKQ

NO. 69]







KLLKPTVTDEDKQQFKERNLQDTKTMSRFLMNYINDHLQ









FGVSAKERKKRVTAVNGIVTSYLRKRWGITKIRGDGDLHH









AVDALVIACATDGMIRQITRYAQYRECRYMQTDTGSAAI









DEATGEVLRIFPYPWEHFRKELEARLSSDPARAVNALRLPF









YLDSGEPLPKPLFVSRMPRRKVSGAAHKDTVKSPKAMAE









GKVIVRRALTDLKLKNGEIENYFDPGSDRLLYDALKARLAA









FGGDGAKAFREPFYKPRHDGTPGPLVKKVKLCEPTTLNVA









VHGGKGVADNDSMVRIDVFRVEGDGYYFVPIYIADTLKP









VLPNKACVAFKPYSEWRTMDDRDFIFSLYPNDLIRVTHKS









ALKLSRVSKESTLPESIESKTALLYYVSAGISGAAVSCRNHD









NSYEIKSMGIKTLEKLEKYTVDVLGEYHKVEKERRMPFTGK









RS









[SEQ ID NO. 67]




MAD
141
CACZL

Rumino-

Cattle
MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRIFDA
GTTAT
TTATACC


2039

L0100

coccaceae

rumen
AEQPKTGESLAAPRRAARGSRRRLRRHRHRNERIRSLMLE
AGTTC
ATACCAA




00017.1

bacterium


ERLISQDELETLFDGRLEDIYALRVKALDEIVSRTDFARILLHI
CCTGA
GAACTAT







SQRRGFKSNRKNPTTKEDGVLLAAVNENKQRMSEHGYR
TAGTT
TTAGGTT







TVGEMFLLDETFKDHKRNKGGNYITTVARDMVADEVRAI
CTTGG
ACTATGA







FSAQRELGASFASEEFEERYLEILLSQRSFDEGPGGNSPYG
TATGG
TAAGGTT







GSQIERMVGRCTFFPDEPRAAKATYSFEYFTLLQKVNHIRI
TATAA
TAGTACA







VENGVASKLTDEQRRIIIELAHTTKDVSYAKIRKVLKLSDKQ
T
CCTTAGA







LFNIRYSDNSPAEDSEKKEKLGIMKAYHQMRSAIDRVSKG
[SEQ ID
GCTCTGA







RFAMMPRAQRNAIGTALSLYKTSDKIRKYLTDAGLDEIDIN
NO. 71]
CGCCTCG







SADSIGSFSKFGHISVKACDMLIPFLEQGMNYNEACAAAG

CTTTTGC







LNFKGHDAGEKSKLLHPKEEDYEDITSPVVRRAIAQTIKVIN

GAGGCG







AIIRREGCSPTFINIELAREMAKDFRERNRIKKENDDNRAK

TTATCTC







NERLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMYSLK

TTTATAT







QMSLEKLFEPNYAEVDHIVPYSISFDDSRKNKVLVLTEENR

TGCCAAA







NKGNRLPLQYLKGRRREDFIVWVNNNVKDYRKRRLLLKE

AATGCAA







ELTAEDESGFKERNLQDTKTMSRFLLNYIADNLEFAESTRG

ATATATC







RKKKVTAVNGAVTAYMRKRWGITKIREDGDCHHAVDAV

GTACAAT







VIACTTDAMIRQVSRYAQFRECEYMQTESGSVAVDTGTG

GGTGGC







EVLRTFPYPWPDFRKELEARLANDPAKVINDLHLPFYMSA

[SEQ ID







GRPLPEPVFVSRMPRRKVTGAAHKDTIKSARELDNGYLIV

NO. 72]







KRPLTDLKLKNGEIENYYNPQSDKCLYDALKNALIEHGGD









AKKAFAGEFRKPKRDGTPGPIVKKVKLLEPTTMCVPVHGG









KGAADNDSMVRVDVFLSGGKYYLVPIYVADTLKPELPNK









AVTRGKKYSEWLEMADEDFIFSLYPNDLICATSKNGITLSV









CRKDSTLPPTVESKSFMLYYRGTDISTGSISCITHDNAYKLR









GLGVKTLEKLEKYTVDVLGEYHKVGKEVRQPFNIKRRKAC









PSEML









[SEQ ID NO. 70]




MAD
141
DHKF

Clostri-

Feces
MHRYAIGLDIGITSVGWAAIALDAEENPCGMLDFGSRIFT
GTTGT
TTATACC


2040

01000115.1

diales


GAEHPKTGASLAAPRREARGARRRLRRHRHRNERIRRLM
AGTTC
ATACCAA






bacterium


VSGGLISQEQLESLFAGQLEDIYALRTRALDEQVAREELARI
CCTGA
GAACTGC





UBA4701

MLHLSQRRGFRSNRKGGADAEDGKLLEAVGDNKRRMD
TGGTT
TCAGGTT







EKGYRTAGEMFFKDEAFAAHKRNKGGNYIATVTRAMTE
CTTGG
ACTATGA







DEVHRIFAAQRGFGAEYANEKLEAAYLDILLSQRSFDEGP
TATGG
TAAGGTA







GGDSPYGGSQIERMIGTCAFEPDQPRAAKAAYSFEYFSLL
TATAA
GTAAACC







EKLNHIRLVSGGKSEPLTDAQRKKLIELAHKQDTLSYAKIRK
T
GAAGAG







ELELNEAVRFNSVRYTDDATFEEQEKKEKIVCMKAYHAM
[SEQ ID
CTCTAAT







RKAVDKNAKGRFAYLTIPQRNEIGRVLSTYKTSAKIEPALA
NO. 74]
GCCCCGT







AAGIEPCDIAALEGLSFSKFGHLSIKACDKLIPFLEKAMNYN

CTCGCAC







DACAAAGYDFRGHSRDGRQMYLPPLGGDCTEITSPVVRR

GGGGCA







AVSQTIKVINAIIRRYGTSPVYVNIELAREMSKDFAERNKIK

TTATCTC







KQNDDNRSKNEKIKEQVAEYKHGAATGLDIVKMKLFNEQ

TAACAGC







GGICAYSQRQMSLERLFDPNYAEVDHIVPYSISFDDRYKN

GAAAAG







KVLVLTEENRNKGNRLPLQYLTGERRDRFIVWVNNSVRD

GCAAA







FQKRKLLLKEALTPEEENDWKERNLQDTKFVSSFLLNYIND

[SEQ ID







NLLFAPSVRRKKRVTAVNGAVTDYMRKRWGISKVREDG

NO. 75]







DRHHAVDAVVIACTNDALIQKVSRYESWHERHYMPTEN









GSILVDPATGEIKQTFPYPWAMFRKELEARLSNDPSRAVA









DLKLPFYMDADAPPVKPLFVSRMPTRKVTGAAHKDTVKS









ARALADGLAIVRRPLTALKLDKDGEIAGYYNKDSDRLLYDA









LKARLTEYGGNAAKAFAEPFYKPKSDGTPGPVVNKVKLTE









PTTLSVPVQDGTGIADNDSMVRIDVFRVVGDGYYFVPVY









VADTLKQELPDRAVVAFKAHSEWKVMSDGDFVFSLYPN









DLVKVTRKKDVILKRSFDNSTLPETIASNECLLYYAGADIST









GAISCVTNDNAYSIRGLGIKTLVSMEKYTVDILGEYHPVRK









EERQRFNTKR









[SEQ ID NO. 73]









Example 3: Vector Cloning, MADZYME Library Construction and PCR

The MADzyme coding sequences were cloned into a pUC57 vector with a T7-promoter sequence attached to the 5′-end of the coding sequence and a T7-terminator sequence attached to the 3′-end of the coding sequence.


First, Q5 Hot Start 2× master mix reagent (NEB, Ipswich, Mass.) was used to amplify the MADzyme sequences cloned in the pUC57 vector. The forward primer 5′-TTGGGTAACGCCAGGGTTTT [SEQ ID No. 172] and reverse primer 5′-TGTGTGGAATTGTGAGCGGA [SEQ ID No. 173] anneal to the pUC57 sequences flanking each MADzyme, so that the amplicon includes the T7-promoter and T7-terminator components at the 5′- and 3′-ends of the MADzyme, respectively. Primers at 1 μM were used in 10 μL PCR reactions, with 3.3 μL of boiled cell sample as template, in 96-well PCR plates. The PCR conditions shown in Table 2 were used:

















TABLE 2

STEP               TEMPERATURE    TIME
DENATURATION       98° C.         30 SEC
30 CYCLES          98° C.         10 SEC
                   66° C.         30 SEC
                   72° C.         3 MIN
FINAL EXTENSION    72° C.         2 MIN
HOLD               12° C.
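The primer sequences and cycling conditions of Example 3 can be sanity-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the described method: the function names and the `PROGRAM` layout are hypothetical, the HOLD step is left out because no duration is given for it, and ramp times between temperatures are ignored, so the figure obtained is a lower bound on the actual run time.

```python
# Primers quoted in Example 3, written 5'->3'.
FORWARD = "TTGGGTAACGCCAGGGTTTT"  # SEQ ID No. 172
REVERSE = "TGTGTGGAATTGTGAGCGGA"  # SEQ ID No. 173


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Table 2 as (label, seconds per step, number of repeats).
# Hypothetical layout; the HOLD step is omitted (no duration stated).
PROGRAM = [
    ("initial denaturation, 98 C", 30, 1),
    ("cycle denaturation, 98 C", 10, 30),
    ("cycle annealing, 66 C", 30, 30),
    ("cycle extension, 72 C", 180, 30),
    ("final extension, 72 C", 120, 1),
]


def programmed_seconds(program) -> int:
    """Sum the programmed step times, ignoring ramping between steps."""
    return sum(seconds * repeats for _label, seconds, repeats in program)


total_s = programmed_seconds(PROGRAM)

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(f"{name}: {len(primer)} nt, GC fraction {gc_fraction(primer):.2f}")
print(f"programmed time: {total_s} s ({total_s / 60:.1f} min)")
```

Under these assumptions both primers are 20 nt long with a GC fraction of 0.50, and the programmed steps sum to 6750 seconds (112.5 minutes), the 3-minute extension accounting for most of that time.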










Example 4: gRNA Construction

Several functional gRNAs associated with each MADzyme were designed by truncating the 5′ region, the 3′ region and the repeat/anti-repeat duplex (see Table 3).
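One way to see how two truncation variants relate is to compare their shared 5′ and 3′ ends. The Python sketch below is illustrative only: it uses the sgRNAv2 and sgRNAv3 scaffold sequences listed for sgM 2016 in Table 3 (SEQ ID NO. 81 and SEQ ID NO. 82), joined here from the wrapped table cells, and the helper names are hypothetical. It makes no claim about which structural region the removed bases belong to.

```python
# sgM 2016 scaffold variants from Table 3, joined from the wrapped cells.
SGRNA_V2 = (  # SEQ ID NO. 81
    "GTTTTAGAGTCATGT" "TGTAAAAACAACATA" "GCAAGTTAAAATAA"
    "GGTTTTAACCGTAAT" "CAACTGTAAAGTGG" "CGCTGTTTCGGCGC"
)
SGRNA_V3 = (  # SEQ ID NO. 82
    "GTTTTAGAGTCATGT" "TGTAAAAACAACATA" "GCAAGTTAAAATAA"
    "GCGTAATCAACTGTA" "AAGTGGCGCTGTTTC" "GGCGC"
)


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters shared by a and b."""
    return common_prefix_len(a[::-1], b[::-1])


prefix = common_prefix_len(SGRNA_V2, SGRNA_V3)
suffix = common_suffix_len(SGRNA_V2, SGRNA_V3)
removed = len(SGRNA_V2) - len(SGRNA_V3)

# If the shared ends together span the whole shorter variant, the shorter
# variant equals the longer one with a single internal stretch removed.
is_single_deletion = prefix + suffix >= len(SGRNA_V3)

print(f"v2: {len(SGRNA_V2)} nt, v3: {len(SGRNA_V3)} nt")
print(f"shared 5' end: {prefix} nt, shared 3' end: {suffix} nt")
print(f"single internal deletion of {removed} nt: {is_single_deletion}")
```

For these two entries the comparison indicates that sgRNAv3 (79 nt) matches sgRNAv2 (87 nt) with a single internal stretch of 8 nucleotides removed. The same comparison can be applied to other variant pairs in Table 3.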














TABLE 3





gRNA







name
sgRNAv1
sgRNAv2
sgRNAv3
sgRNAv4
sgRNAv5







sgM
GTTTTAGAGCTATGC
GTTTTAGAGCTATGC
GTTTTAGAGCTATGC
GTTTTAGAGCT
NONE


2015
TGTTTTGAATGCTTC
TGTTTTGAATGCTTC
TGTTAACAACATAGC
ATGCAAACAT




CAAAACGAAATGTT
GTAGCATTCAAAAC
AAGTTAAAATAAGG
AGCAAGTTAA




GGTAGCATTCAAAA
AACATAGCAAGTTA
CTTTGTCCGTTCTCA
AATAAGGCTTT




CAACATAGCAAGTT
AAATAAGGCTTTGTC
ACTTTTAGTGACGCT
GTCCGTTCTCA




AAAATAAGGCTTTG
CGTTCTCAACTTTTA
GTTTCGGCG
ACTTTTAGTGA




TCCGTTCTCAACTTT
GTGACGCTGTTTCG
[SEQ ID NO. 78]
CGCTGTTTCGG




TAGTGACGCTGTTTC
GCG

CG




GGCG
[SEQ ID NO. 77]

[SEQ ID NO.




[SEQ ID NO. 76]


79]



sgM
GTTTTAGAGTCATGT
GTTTTAGAGTCATGT
GTTTTAGAGTCATGT
NONE
NONE


2016
TGTTTAGAATGGTA
TGTAAAAACAACATA
TGTAAAAACAACATA





CCAAAACATCTTTTG
GCAAGTTAAAATAA
GCAAGTTAAAATAA





GGACTATTCTAAAC
GGTTTTAACCGTAAT
GCGTAATCAACTGTA





AACATAGCAAGTTA
CAACTGTAAAGTGG
AAGTGGCGCTGTTTC





AAATAAGGTTTTAA
CGCTGTTTCGGCGC
GGCGC





CCGTAATCAACTGTA
[SEQ ID NO. 81]
[SEQ ID NO. 82]





AAGTGGCGCTGTTT







CGGCGC







[SEQ ID NO. 80]






sgM
GTTTTAGAGCTGTG
GTTTTAGAGCTGTGC
GTTTTAGAGCTGTGC
GTTTTAGAGCT
NONE


2017
CTGTTTCGAATGGTT
TGTTTCGAAAAATCG
TGTAAAAACAACAC
GTGCAAACAC




CCAAAACGAAATGT
AAACAACACAGCGA
AGCGAGTTAAAATA
AGCGAGTTAA




TGGAACTATTCGAA
GTTAAAATAAGGCTT
AGGCTTTGTCCGTAC
AATAAGGCTTT




ACAACACAGCGAGT
TGTCCGTACACAACT
ACAACTTGTAAAAG
GTCCGTACACA




TAAAATAAGGCTTT
TGTAAAAGGGGCAC
GGGCACCCGATTCG
ACTTGTAAAA




GTCCGTACACAACTT
CCGATTCGGGTGC
GGTGC
GGGGCACCCG




GTAAAAGGGGCACC
[SEQ ID NO. 84]
[SEQ ID NO. 85]
ATTCGGGTGC




CGATTCGGGTGCA


[SEQ ID NO.




[SEQ ID NO. 83]


86]



sgM
GTTTTAGAGCTGTGT
GTTTTAGAGCTGTGT
GTTTTAGAGCTGTGT
NONE
NONE


2019
TGTTTCGAATGGTTC
TGTAAAAACAATACA
TGTAAAAACAATACA





CAAAACGGTTTGAA
GCAAAGTTAAAATA
GCAAGTTAAAATAA





ACCATTCGAAACAA
AGGCTAGTCCGTAT
GGCTAGTCCGTATAC





TACAGCAAAGTTAA
ACAACGTGAAAACA
AACGTGAAAACACG





AATAAGGCTAGTCC
CGTGGCACCGATTC
TGGCACCGATTCGG





GTATACAACGTGAA
GGTGC
TGC





AACACGTGGCACCG
[SEQ ID NO. 88]
[SEQ ID NO. 89]





ATTCGGTGC







[SEQ ID NO. 87]






sgM
GTTTGCTAGTTATGT
GTTTGCTAGTTATGT
GTTTGCTAGTTATGT
NONE
NONE


2020
TATTTATAGTATTAA
TATAAAAATAACATA
TATAAAAATAACATA





GCAAACTGTAAATA
ACGAGTGCAAATAA
ACGAGTGCAAATAA





ACATAACGAGTGCA
GCGTTTCGCGAAAA
GCGTTTCGCGAAAA





AATAAGCGTTTCGC
TTTACAGTGGCCCTG
TTTACAGTGGCCCTG





GAAAATTTACAGTG
CTGTGGGGCCTTTTT
CTGTGGGGCC





GCCCTGCTGTGGGG
TATTTATCAAA
[SEQ ID NO. 92]





CCTTTTTTATTTATCA
[SEQ ID NO. 91]






AA







[SEQ ID NO. 90]






sgM2021
GTTTGAGAGCCTTGTAAAACCGTATATCTCTCAAGCGAAAGATAATGTTTTACAAGGCGAGTTCAAATAAGGATTTATCCGAAATCGCTTGCGTGCATTGGCACCATCTATCTTTTAAGACTTTCTTTGAAAGTCTT [SEQ ID NO. 93]
NONE
NONE
NONE
NONE

sgM2022
GTTTGAGAGTCTTGTTAATTCTTAAAGGTGTAAAACGAGAATTAACAAGACGAGTGCAAATAAGGTTTATCCGGAATCGTCAATATGACCTGCATTGTGCAGAATCTTTAAAATCATATGATTTCATATGGTTTTA [SEQ ID NO. 94]
GTTTGAGAGTCTTGTAAAAACAAGACGAGTGCAAATAAGGTTTATCCGGAATCGTCAATATGACCTGCATTGTGCAGAATCTTTAAAATCATATGATTTCATATGGTTTTA [SEQ ID NO. 95]
GTTTGAGAGTCTTGTAAAAACAAGACGAGTGCAAATAAGGTTTATCCGGAATCGTCAATATGACCTGCATTGTGCAG [SEQ ID NO. 96]
GTTTGAGAGTCTTGTTAATTCAAAAGAATTAACAAGACGAGTGCAAATAAGGTTTATCCGGAATCGTCAATATGACCTGCATTGTGCAGAATCTTTAAAATCATATGATTTCATATGGTTTTA [SEQ ID NO. 97]
NONE

sgM2023
GTTTGAGAGTAGTGTAAATCCATAGGGGTCTCAAACGAAAAGACCCCTATGGATTTACATTGCGAGTTCAAATAAAAGTTTACTCAAATCGTTGGCTTGACCAACCGCACAGCGTGTGCTTAAAGATCTCTTCAGTGAGGTC [SEQ ID NO. 98]
NONE
NONE
NONE
NONE

sgM2024
GTTTGAGAGTAGTGTAAATCCAGAGGGCTCCAAAACGAGCCCTCTGGATTTACACTACGAGTTCAAATAAAAATTATTTCAAATCGCCGCTATGTCGGCCGCACAGTGTGTGCATTAAGAAAAGTCCGAAAGGGC [SEQ ID NO. 99]
NONE
NONE
NONE
NONE

sgM2025
GTTTGAGAGTAGTGTAAATTTATAGGGTAGTAAAACAAATTTTACTACCCTATAAATTTACACTACGAGTTCAAATAAAAATTATTTCAAATCGTACTTTTTAGTACCTTCACAAGTGTTGTGAATATTAACTCACCTTCGGGTGAG [SEQ ID NO. 100]
GTTTGAGAGTAGTGTAAAAATACACTACGAGTTCAAATAAAAATTATTTCAAATCGTACTTTTTAGTACCTTCACAAGTGTTGTGAATATTAACTCACCTTCGGGTGAG [SEQ ID NO. 101]
GTTTGAGAGTAGTGTAAAAATACACTACGAGTTCAAATAAAAATTATTTCAAATCGTACTTTTTAGTACCTTCACAAGTGTTGTGAA [SEQ ID NO. 102]
GTTTGAGAGTAGTGTAAATTTATAGGAAAACCTATAAATTTACACTACGAGTTCAAATAAAAATTATTTCAAATCGTACTTTTTAGTACCTTCACAAGTGTTGTGAATATTAACTCACCTTCGGGTGAG [SEQ ID NO. 103]
NONE

sgM2026
GTTTGAGAGTAGTGTAATTTCATATGGTAGTCAAACGACTACCATATGAGATTACACTACACGGTTCAAATAAAGAATGTTCGAAACCGCCCTTTGGGGCCCGCTTGTTGCGGATTTACAGACTTGATATCAAGTCTG [SEQ ID NO. 104]
NONE
NONE
NONE
NONE

sgM2027
GTTTGAGAGTAATGTAAATTCATAGGATGGTAAAACGAAATTTACCATCCAGTGAGTTTACATTACAAGTTCAAATAAAAATTTATTCAACCCGTTCTTCGGAACCTCCACCGTGTGGAAC [SEQ ID NO. 105]
GTTTGAGAGTAATGTAAAAATACATTACAAGTTCAAATAAAAATTTATTCAACCCGTTCTTCGGAACCTCCACCGTGTGGAACATTAAGGTCTGCTTTGCAGGCC [SEQ ID NO. 106]
GTTTGAGAGTAATGTAAAAATACATTACAAGTTCAAATAAAAATTTATTCAACCCGTTCTTCGGAACCTCCACCGTGTGGA [SEQ ID NO. 107]
GTTTGAGAGTAATGTAAATTCATAAAAGTGAGTTTACATTACAAGTTCAAATAAAAATTTATTCAACCCGTTCTTCGGAACCTCCACCGTGTGGAACATTAAG [SEQ ID NO. 108]
NONE

sgM2028
GTTTGAGAGCAGTGTTGTCTTATATAGCTCGAAAACGCATTGTAAGACAACACTGCTACGTTCAAATAAGCATATTGCTACAAGGTTCTCCCTCGGAGAATGACCATTAGGTCACTTAGATAGCCGGTTCTTCTGGCTA [SEQ ID NO. 109]
NONE
NONE
NONE
NONE

sgM2029
GTTTGAGAGCAGTGTTGTCTTATATAGCTCGAAAACGCATTGTAAGACAACACTGCTACGTTCAAATAAGCATATTGCTACAAGGTTCTCCATTGGAGAATGACCATTAGGTCGCTTAGATAGCCAGTTCTTCTGGCTA [SEQ ID NO. 110]
GTTTGAGAGCAGTGTAAAAACACTGCTACGTTCAAATAAGCATATTGCTACAAGGTTCTCCATTGGAGAATGACCATTAGGTCGCTTAGATAGCCAGTTCTTCTGGCTA [SEQ ID NO. 111]
GTTTGAGAGCAGTGTAAAAACACTGCTACGTTCAAATAAGCATATTGCTACAAGGTTCTCCATTGGAGAATGACCATTAGGTC [SEQ ID NO. 112]
GTTTGAGAGCAGTGTTGTCAAAAGACAACACTGCTACGTTCAAATAAGCATATTGCTACAAGGTTCTCCATTGGAGAATGACCATTAGGTCGCTTAGATAGCCAGTTCTTCTGGCTA [SEQ ID NO. 113]
NONE

sgM2030
GTTTGAGAGCAGTGTTGTCTTAAATAGCTCGAAAACGCATTGTAAGACAACACTGCACGTTCAAATAAGCAGATTGCTACAAGGTTCCCGTAAGGGAATGACCATCTGGTCACATGAATAGCCCCCGGCAACGGTGGCTG [SEQ ID NO. 114]
NONE
NONE
NONE
NONE

sgM2031
ATTGTACCATAGCGAGTTAAATTAGGGAATTACAACGAAATTGTAATAACCTATTTTACCTCGCTATGGCACAATTTGTTATTACATGGACATTATACTAAACATTTCCTAAAAAAGCAACGAAAAACGTGCT [SEQ ID NO. 115]
NONE
NONE
NONE
NONE

sgM2032
GTTGTAGTTCCCTAATTATTCTTGGTATGGTATAATGAAAATTGTATCATACCAAGAACAATTAGGTTACTATGATAAGGTAGTATACCGCAAAGCTCTAACACCTCATCTTCGGATGAGGTGTTATCT [SEQ ID NO. 116]
GTTGTAGTTCCCTAATTATTCTTGGTAAAAACCAAGAACAATTAGGTTACTATGATAAGGTAGTATACCGCAAAGCTCTAACACCTCATCTTCGGATGAGGTGTTA [SEQ ID NO. 117]
GTTGTAGTTCCCTAATTATTCTTGGTAAAAACCAAGAACAATTAGGTTACTATGATAAGGTAGTATACCGCAAAGCTCTAACACCTCATCTTCGGATGAG [SEQ ID NO. 118]
GTTGTAGTTCCCTAATTATTCTTGGTATGGTAAAAATATCATACCAAGAACAATAGGTTACTATGATAAGGTAGTATACCGCAAAGCTCTAACACCTCATCTTCGGATGAGGTGTTATCT [SEQ ID NO. 119]
NONE

sgM2033
GTTGTAGTTCCCTAACAGTTCTTGGTATGGTATAATAAAAATTATACCATACCAAGAACTGTTATGGTTGCTATGATAAGGTCTTAGCACCGTAAAGCTCTGACGCCTCGCTTTCAGCGGGGCGTCATCTTTTTTGCCCAAAAGACACGGATATTTTT [SEQ ID NO. 120]
GTTGTAGTTCCCTAACAGTTCTAAAAAGAACTGTTATGGTTGCTATGATAAGGTCTTAGCACCGTAAAGCTCTGACGCCTCGCTTTCAGCGGGGCGTCA [SEQ ID NO. 121]
GTTGTAGTTCCCTAACAGTTCTAAAAAGAACTGTTATGGTTGCTATGATAAGGTCTTAGCACCGTAAAGCTCTGACGCCTCGCTTTCAGCGGGG [SEQ ID NO. 122]
GTTGTAGTTCCCTAACAGTAAAAACTGTTATGGTTGCTATGATAAGGTCTTAGCACCGTAAAGCTCTGACGCCTCGCTTTCAGCGGGGCGTCA [SEQ ID NO. 123]
NONE

sgM2034
GTTGTAGTTCCCTAACGGTTCTTGGTATGGTATAATGAATTATACCATACCAAGAACTGTTGGGTTACTACAATAAGGTAGTAAACCGAAAAGCTCTGACGTCTTGTTTGCGCAGGACGTCATCTTTATATCAGACGGATG [SEQ ID NO. 124]
GTTGTAGTTCCCTAACGGTACTGTTGGGTTACTACAATAAGGTAGTAAACCGAAAAGCTCTGACGTCTTGTTTGCGCAGGACGTCATCTTTATATCAGACGGATG [SEQ ID NO. 125]
GTTGTAGTTCCCTAACGGTACTGTTGGGTTACTACAATAAGGTAGTAAACCGAAAAGCTCTGACGTCTTGTTTGCGCAGGACGTCATCTTT [SEQ ID NO. 126]
GTTGTAGTTCCCTAACGGTTCTTGAAAACAAGAACTGTTGGGTTACTACAATAAGGTAGTAAACCGAAAAGCTCTGACGTCTTGTTTGCGCAGGACGTCATCTTTATATCAGACGGATG [SEQ ID NO. 127]
NONE

sgM2035
GTTGTAGTCCCCTGATGGTTTCTGGAATGGTATAATGAAATTATACCATTCCAGAAACTATTATGGTCACTACAATAAGGTATTAGACCGTAGAGCACTAACACCCCATTTGGGGTGTTATCTCTTTAAACTGTCCAAAATTTAGTATTGCAATTATTGA [SEQ ID NO. 128]
NONE
NONE
NONE
NONE

sgM2036
GTTATAGTTCCCTGTTCGTTCTTGGTATGGTATAATGAAATTATACCATACCAAGAACGAAGCAGGTTACTATGATAAGGTAGTATACCGCAGAGCTCCAACGCCTCGCTTTTGCGGGGCGTTGTCTCT [SEQ ID NO. 128]
NONE
NONE
NONE
NONE

sgM2037
GTTATAGTTCCCTGATAGTTCTTGGTATGGTATAATGAAATTATACCATACCAAGAACTATGAGGTTGCTATAATAAGGTAGTAAACCGCAGAGCTCTAACGCCTCACATTTGTGGGGCGTTATCTCT [SEQ ID NO. 129]
NONE
NONE
NONE
NONE

sgM2038
GTTGTAGTTCCCTGATCGTTCTTGGTATGGTATAATGAAATTATACCATACCAAGAACGATCAGGTTGCTACAATAAGGTAGTAAACCGAAGAGCTCTAACGCCCCGTTTCTTTACGGGGCGTTATCTCT [SEQ ID NO. 130]
NONE
NONE
NONE
NONE

sgM2039
GTTATAGTTCCCTGATAGTTCTTGGTATGGTATAATGAATTATACCATACCAAGAACTATTTAGGTTACTATGATAAGGTTTAGTACACCTTAGAGCTCTGACGCCTCGCTTTTGCGAGGCGTTATCTCTTTATATTGCCAAAAATGCAAATATATCGTACAATGGTGGC [SEQ ID NO. 131]
GTTATAGTTCCCTGATAGTTCTTGGTATGGTATAATGAATTATACCATACCAAGAACTATTTAGGTTACTATGATAAGGTTTAGTACACCTTAGAGCTCTGACGCCTCGCTTTTGCGAGGCGTTATCTCT [SEQ ID NO. 132]
GTTATAGTTCCCTGATAGTTCTTAACCAAGAACTATTTAGGTTACTATGATAAGGTTTAGTACACCTTAGAGCTCTGACGCCTCGCTTTTGCGAGGCGTTATCTCT [SEQ ID NO. 133]
GTTATAGTTCCCTGATAGTTCTTGCAAGAACTATTTAGGTTACTATGATAAGGTTTAGTACACCTTAGAGCTCTGACGCCTCGCTTTTGCGAGGCGTTATCTCT [SEQ ID NO. 134]
GTTATAGTTCCCTGATAGTTCTGCAAGAACTATTTAGGTTACTATGATAAGGTTTAGTACACCTTAGAGCTCTGACGCCAAAAGGCGTTATCTCT [SEQ ID NO. 135]

sgM2040
GTTGTAGTTCCCTGATGGTTCTTGGTATGGTATAATAAATTATACCATACCAAGAACTGCTCAGGTTACTATGATAAGGTAGTAAACCGAAGAGCTCTAATGCCCCGTCTCGCACGGGGCATTATCTCT [SEQ ID NO. 136]
NONE
GTTGTAGTTCCCTGATGGTTCTTGAAAAAGAACTGCTCAGGTTACTATGATAAGGTAGTAAACCGAAGAGCTCTAATGCCCCGTCTCGCACGGGGCATTATCTCT [SEQ ID NO. 137]
GTTGTAGTTCCCTGATGGTTCTTGAAAAAGAACTGCTCAGGTTACTATGATAAGGTAGTAAACCGAAGAGCTCTAATGCCAAAGGGCATTATCTCT [SEQ ID NO. 138]
NONE









To find the optimal gRNA length, different lengths of the spacer, the repeat:anti-repeat duplex and the 3′ end of the tracrRNA were included in the gRNA designs. These gRNAs were then synthesized as single-stranded DNA downstream of the T7 promoter (see Table 4). The sgRNA templates were amplified using two primers (5′-AAACCCCTCCGTTTAGAGAG [SEQ ID NO. 174] and 5′-AAGCTAATACGACTCACTATAGGCCAGTC [SEQ ID NO. 175]) with 1 μL of single-stranded DNA diluted to 10 μM as the template in a 25 μL PCR reaction for each sgRNA, according to the conditions of Table 5.
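As an illustrative cross-check (not part of the original disclosure), the relationship between the two primers and a synthesized template can be sketched in Python. The helper names `reverse_complement` and `annealing_length` are hypothetical; the template is the sgM2022v3 oligo [SEQ ID NO. 150] as listed in Table 4. The sketch assumes that one primer matches the 5′ end of the synthesized strand and that the 3′ portion of the other primer matches the 5′ end of the complementary strand.

```python
# Illustrative sketch only; helper names are hypothetical, sequences are copied from the text.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def annealing_length(primer: str, strand: str) -> int:
    """Length of the longest 3' portion of `primer` that equals the 5' start of `strand`."""
    for start in range(len(primer)):
        if strand.startswith(primer[start:]):
            return len(primer) - start
    return 0


PRIMER_A = "AAACCCCTCCGTTTAGAGAG"           # SEQ ID NO. 174
PRIMER_B = "AAGCTAATACGACTCACTATAGGCCAGTC"  # SEQ ID NO. 175

# sgM2022v3 template oligo (SEQ ID NO. 150), joined from its three lines in Table 4.
TEMPLATE = (
    "AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTGCACAATGCAGGTCATATTGACGATTCCGGATAA"
    "ACCTTATTTGCACTCGTCTTGTTTTTACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGT"
    "CGTATTA"
)

opposite_strand = reverse_complement(TEMPLATE)

# Primer A is identical to the 5' end of the synthesized oligo.
a_match = annealing_length(PRIMER_A, TEMPLATE)
# The 3' portion of primer B is identical to the 5' end of the opposite strand,
# so it can prime directly on the synthesized oligo.
b_match = annealing_length(PRIMER_B, opposite_strand)

print(a_match, b_match)
```

Under these assumptions the opposite strand begins with the 3′ portion of the second primer and contains the 20-nucleotide target sequence 5′-CCAGTCAGTAATGTTACTGG, which suggests the Table 4 oligos are written as the strand complementary to the transcribed sgRNA.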










TABLE 4

Name
Sequence

sgM2016v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATTACGGTTAAAACCTTATTTTAACTTGCTATGTTGTTTAGAATAGTCCCAAAAGATGTTTTGGTACCATTCTAAACAACATGACTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 139]

sgM2016v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATTACGGTTAAAACCTTATTTTAACTTGCTATGTTGTTTTTACAACATGACTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 140]

sgM2016v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATTACGCTTATTTTAACTTGCTATGTTGTTTTTACAACATGACTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 141]

sgM2019v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTTGTATACGGACTAGCCTTATTTTAACTTTGCTGTATTGTTTCGAATGGTTTCAAACCGTTTTGGAACCATTCGAAACAACACAGCTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 142]

sgM2019v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTTGTATACGGACTAGCCTTATTTTAACTTTGCTGTATTGTTTTTACAACACAGCTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 143]

sgM2019v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTTGTATACGGACTAGCCTTATTTTAACTTGCTGTATTGTTTTTACAACACAGCTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 144]

sgM2020v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTTGATAAATAAAAAAGGCCCCACAGCAGGGCCACTGTAAATTTTCGCGAAACGCTTATTTGCACTCGTTATGTTATTTACAGTTTGCTTAATACTATAAATAACATAACTAGCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 145]

sgM2020v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTTGATAAATAAAAAAGGCCCCACAGCAGGGCCACTGTAAATTTTCGCGAAACGCTTATTTGCACTCGTTATGTTATTTTTATAACATAACTAGCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 146]

sgM2020v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCCCACAGCAGGGCCACTGTAAATTTTCGCGAAACGCTTATTTGCACTCGTTATGTTATTTTTATAACATAACTAGCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 147]

sgM2022v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAAAACCATATGAAATCATATGATTTTAAAGATTCTGCACAATGCAGGTCATATTGACGATTCCGGATAAACCTTATTTGCACTCGTCTTGTTAATTCTTTTGAATTAACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 148]

sgM2022v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAAAACCATATGAAATCATATGATTTTAAAGATTCTGCACAATGCAGGTCATATTGACGATTCCGGATAAACCTTATTTGCACTCGTCTTGTTTTTACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 149]

sgM2022v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTGCACAATGCAGGTCATATTGACGATTCCGGATAAACCTTATTTGCACTCGTCTTGTTTTTACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 150]

sgM2025v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGCTCACCCGAAGGTGAGTTAATATTCACAACACTTGTGAAGGTACTAAAAAGTACGATTTGAAATAATTTTTATTTGAACTCGTAGTGTAAATTTATAGGTTTTCCTATAAATTTACACTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 151]

sgM2025v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTCACCCGAAGGTGAGTTAATATTCACAACACTTGTGAAGGTACTAAAAAGTACGATTTGAAATAATTTTTATTTGAACTCGTAGTGTATTTTTACACTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 152]

sgM2025v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTCACAACACTTGTGAAGGTACTAAAAAGTACGATTTGAAATAATTTTTATTTGAACTCGTAGTGTATTTTTACACTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 153]

sgM2027v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCTGCAAAGCAGACCTTAATGTTCCACACGGTGGAGGTTCCGAAGAACGGGTTGAATAAATTTTTATTTGAACTTGTAATGTAAACTCACTTTTATGAATTTACATTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 154]

sgM2027v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCTGCAAAGCAGACCTTAATGTTCCACACGGTGGAGGTTCCGAAGAACGGGTTGAATAAATTTTTATTTGAACTTGTAATGTATTTTTACATTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 155]

sgM2027v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATCCACACGGTGGAGGTTCCGAAGAACGGGTTGAATAAATTTTTATTTGAACTTGTAATGTATTTTTACATTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 156]

sgM2029v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAGCCAGAAGAACTGGCTATCTAAGCGACCTAATGGTCATTCTCCAATGGAGAACCTTGTAGCAATATGCTTATTTGAACGTAGCAGTGTTGTCTTTTGACAACACTGCTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 157]

sgM2029v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAGCCAGAAGAACTGGCTATCTAAGCGACCTAATGGTCATTCTCCAATGGAGAACCTTGTAGCAATATGCTTATTTGAACGTAGCAGTGTTTTTACACTGCTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 158]

sgM2029v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGACCTAATGGTCATTCTCCAATGGAGAACCTTGTAGCAATATGCTTATTTGAACGTAGCAGTGTTTTTACACTGCTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 159]

sgM2032v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGATAACACCTCATCCGAAGATGAGGTGTTAGAGCTTTGCGGTATACTACCTTATCATAGTAACCTAATTGTTCTTGGTATGATATTTTTACCATACCAAGAATAATTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 160]

sgM2032v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAACACCTCATCCGAAGATGAGGTGTTAGAGCTTTGCGGTATACTACCTTATCATAGTAACCTAATTGTTCTTGGTTTTTACCAAGAATAATTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 161]

sgM2032v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTCATCCGAAGATGAGGTGTTAGAGCTTTGCGGTATACTACCTTATCATAGTAACCTAATTGTTCTTGGTTTTTACCAAGAATAATTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 162]

sgM2033v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATGACGCCCCGCTGAAAGCGAGGCGTCAGAGCTTTACGGTGCTAAGACCTTATCATAGCAACCATAACAGTTTTTACTGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 163]

sgM2033v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATGACGCCCCGCTGAAAGCGAGGCGTCAGAGCTTTACGGTGCTAAGACCTTATCATAGCAACCATAACAGTTCTTTTTAGAACTGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 164]

sgM2033v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACCCCGCTGAAAGCGAGGCGTCAGAGCTTTACGGTGCTAAGACCTTATCATAGCAACCATAACAGTTCTTTTTAGAACTGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 165]

sgM2034v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACATCCGTCTGATATAAAGATGACGTCCTGCGCAAACAAGACGTCAGAGCTTTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTTCTTGTTTTCAAGAACCGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 166]

sgM2034v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACATCCGTCTGATATAAAGATGACGTCCTGCGCAAACAAGACGTCAGAGCTTTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTACCGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 167]

sgM2034v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAAAGATGACGTCCTGCGCAAACAAGACGTCAGAGCTTTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTACCGTTAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 168]

sgM2039v1
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTCGCAAAAGCGAGGCGTCAGAGCTCTAAGGTGTACTAAACCTTATCATAGTAACCTAAATAGTTCTTGCAAGAACTATCAGGGAACTATAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 169]

sgM2039v2
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTTTTGGCGTCAGAGCTCTAAGGTGTACTAAACCTTATCATAGTAACCTAAATAGTTCTTGCAAGAACTATCAGGGAACTATAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 170]

sgM2039v3
AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTCGCAAAAGCGAGGCGTCAGAGCTCTAAGGTGTACTAAACCTTATCATAGTAACCTAAATAGTTCTTGGTTAAGAACTATCAGGGAACTATAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 171]




















TABLE 5

STEP               TEMPERATURE    TIME
DENATURATION       98° C.         30 SEC
12 CYCLES          98° C.         10 SEC
                   66° C.         30 SEC
                   72° C.         2 MIN
FINAL EXTENSION    72° C.         2 MIN
HOLD               12° C.
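For planning purposes, the programmed time of the Table 5 cycling protocol can be estimated with a short Python sketch. The data layout and the function name `programmed_seconds` are illustrative choices, not part of the original protocol, and the estimate ignores temperature ramping and the open-ended 12° C. hold.

```python
# Table 5 as (step label, number of repeats, [(temperature in deg C, seconds), ...]).
# The final 12 deg C hold has no stated duration and is left out.
TABLE_5 = [
    ("denaturation", 1, [(98, 30)]),
    ("cycling", 12, [(98, 10), (66, 30), (72, 120)]),
    ("final extension", 1, [(72, 120)]),
]


def programmed_seconds(program):
    """Sum the hold times of every segment, multiplied by its repeat count."""
    return sum(
        repeats * sum(seconds for _temperature, seconds in segments)
        for _label, repeats, segments in program
    )


total = programmed_seconds(TABLE_5)
print(f"{total} s = {total / 60:.1f} min")  # 2070 s = 34.5 min
```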










The target library was designed on the assumption that the randomized eight-nucleotide PAMs (NNNNNNNN [SEQ ID NO. 176]) of these nucleases reside at the 3′ end of the target sequence (5′-CCAGTCAGTAATGTTACTGG [SEQ ID NO. 177]).
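To make the library size concrete, the design described above can be sketched in Python as a fixed target followed by every possible eight-nucleotide PAM. This is a minimal illustration; the function name `pam_library` is hypothetical, and flanking plasmid or adapter sequence is not modeled because the text does not specify it.

```python
from itertools import product

TARGET = "CCAGTCAGTAATGTTACTGG"  # target sequence as given in the text
PAM_LENGTH = 8                   # eight randomized positions (NNNNNNNN)


def pam_library(target, pam_length):
    """Yield the target followed by each possible PAM on its 3' side."""
    for pam in product("ACGT", repeat=pam_length):
        yield target + "".join(pam)


library = list(pam_library(TARGET, PAM_LENGTH))
print(len(library))  # 4**8 = 65536 distinct target/PAM combinations
print(library[0])
print(library[-1])
```

A fully randomized eight-nucleotide PAM therefore corresponds to 65,536 distinct library members.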


Example 5: In Vitro Transcription and Translation for Production of MAD Nucleases and gRNAs

The MADZYMEs were tested for activity by in vitro transcription and translation (txtl). Both the gRNA plasmid and nuclease plasmid were included in each txtl reaction. A PURExpress® In Vitro Protein Synthesis Kit (NEB, Ipswich, Mass.) was used to produce MADzymes from the PCR-amplified MADZYME library and also to produce the gRNA libraries. In each well in a 96-well plate, the reagents listed in Table 6 were mixed to start the production of MADzymes and gRNAs:













TABLE 6

     REAGENTS                         VOLUME (μl)
1    SolA (NEB kit)                   10
2    SolB (NEB kit)                   7.5
3    PCR amplified gRNA               0.4
4    Murine RNase inhibitor (NEB)     0.5
5    Water                            3.0
6    PCR amplified T7 MADZYMEs        3.6










A master mix of all reagents except the PCR-amplified T7-MADZYMEs was prepared on ice in a volume sufficient for the 96-well plates used in the assay. After 21 μL of the master mix was distributed into each well of the 96-well plates, 4 μL of the mixture of PCR-amplified MADZYMEs and gRNA, each under the control of a T7 promoter, was added. The 96-well plates were sealed and incubated for 4 hrs at 37° C. in a thermal cycler. The plates were then kept at room temperature until the target pool was added to perform the target depletion reaction.
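The volumes above can be scaled to a plate with a small Python sketch. The per-well values are copied from Table 6; the 10% pipetting overage and the function name `master_mix` are assumptions made for illustration. The sketch treats both PCR products (gRNA and MADZYME templates) as added per well, since that split reproduces the 21 μL and 4 μL figures in the text.

```python
from decimal import Decimal

# Per-well volumes in microliters, copied from Table 6.
TABLE_6 = {
    "SolA (NEB kit)": Decimal("10"),
    "SolB (NEB kit)": Decimal("7.5"),
    "PCR amplified gRNA": Decimal("0.4"),
    "Murine RNase inhibitor (NEB)": Decimal("0.5"),
    "Water": Decimal("3.0"),
    "PCR amplified T7 MADZYMEs": Decimal("3.6"),
}
# Assumption: these two components are the 4 uL added to each well separately.
ADDED_PER_WELL = {"PCR amplified gRNA", "PCR amplified T7 MADZYMEs"}


def master_mix(wells, overage=Decimal("0.10")):
    """Scale the shared reagents to `wells` reactions plus a hypothetical overage."""
    factor = wells * (1 + overage)
    return {
        name: volume * factor
        for name, volume in TABLE_6.items()
        if name not in ADDED_PER_WELL
    }


shared_per_well = sum(v for n, v in TABLE_6.items() if n not in ADDED_PER_WELL)
added_per_well = sum(v for n, v in TABLE_6.items() if n in ADDED_PER_WELL)
print(shared_per_well, added_per_well)  # 21.0 4.0

for name, volume in master_mix(96).items():
    print(f"{name}: {volume} uL")
```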


After 4 hours of incubation to allow production of the MADzymes and gRNAs, 4 μL of the target library pool (10 ng/μL) was added to the 10 μL aliquots of in vitro transcription/translation reaction mixture and allowed to deplete for 30 min, 3 hrs or overnight at 37° C. and 48° C. The target depletion reaction mixtures were diluted into PCR-grade water containing RNase A and incubated for 5 min at room temperature. Proteinase K was then added and the mixtures were incubated for 5 min at 55° C. The RNase A/Proteinase K-treated samples were purified with DNA purification kits, and the purified DNA samples were then amplified and sequenced. The PCR conditions are shown in Table 7:













TABLE 7

STEP               TEMPERATURE    TIME
DENATURATION       98° C.         30 SEC
4 CYCLES           98° C.         10 SEC
                   66° C.         30 SEC
                   72° C.         20 SEC
12 CYCLES          98° C.         10 SEC
                   72° C.         20 SEC
FINAL EXTENSION    72° C.         2 MINUTES
HOLD               12° C.
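The text does not describe how the sequencing reads from the depletion reactions were analyzed. One common way to score such data, shown here purely as a hedged Python sketch, is to count the eight nucleotides following the target in each read and compare their frequencies with an untreated control. The control sample, the pseudocount, and the function names `count_pams` and `log2_depletion` are all assumptions, and the reads are toy data.

```python
import math
from collections import Counter

TARGET = "CCAGTCAGTAATGTTACTGG"  # target sequence as given in the text
PAM_LENGTH = 8


def count_pams(reads):
    """Count the PAM_LENGTH bases found immediately 3' of TARGET in each read."""
    counts = Counter()
    for read in reads:
        index = read.find(TARGET)
        if index == -1:
            continue  # read does not contain the target
        start = index + len(TARGET)
        pam = read[start:start + PAM_LENGTH]
        if len(pam) == PAM_LENGTH and set(pam) <= set("ACGT"):
            counts[pam] += 1
    return counts


def log2_depletion(control, treated, pseudocount=1.0):
    """Positive scores mean the PAM is rarer in the treated sample than in the control."""
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    scores = {}
    for pam in control:
        control_freq = (control[pam] + pseudocount) / control_total
        treated_freq = (treated[pam] + pseudocount) / treated_total
        scores[pam] = math.log2(control_freq / treated_freq)
    return scores


# Toy reads: one PAM is depleted after treatment, the other is not.
control_reads = ["AA" + TARGET + "TTGGAAAC" + "GG"] * 4 + ["AA" + TARGET + "ACACACAC" + "GG"] * 4
treated_reads = ["AA" + TARGET + "TTGGAAAC" + "GG"] * 1 + ["AA" + TARGET + "ACACACAC" + "GG"] * 4

scores = log2_depletion(count_pams(control_reads), count_pams(treated_reads))
for pam, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pam, round(score, 2))
```

PAMs with the highest scores would be candidates for sequences recognized by the nuclease under test, subject to the replicate and threshold choices of the actual analysis.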










Example 6: Measurement of Nicked Plasmid with Nickase RNP Complexes

Proteins were produced in vitro using a PURExpress® In Vitro Protein Synthesis Kit (NEB, Ipswich, Mass.). Guide RNAs that target the target plasmid were also produced under a T7 promoter in the same mixture. The MADzyme nickase or nuclease and guide complexes (RNP complexes) formed as they were produced in the in vitro transcription and translation reagent. Supercoiled plasmid target was diluted into the digestion buffer, then the RNP complex was added to the same digestion buffer to initiate the plasmid digestion. After incubation at 37° C. to allow digestion of the plasmid, the resulting mixtures were treated with RNase and Proteinase K, then the target plasmid was purified with a PCR cleanup kit and run on a TAE-agarose gel to observe the formation of nicked or double-strand-cut plasmid. The results are shown in FIG. 7. Table 8 lists the identified MADzyme nickases, including the variations from the nuclease sequence in Table 1 and the amino acid sequence.











TABLE 8





MADzyme
SEQ



Nickase
ID



Name
NO
Amino Acid Sequence







MAD2016-
177
MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNTEKKKIKKNFWGVRLFEEGHTAEDRR


H851A

LKRTARRIISRRRNRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWHRHPIFAKLEDEVAYH




ETYPTIYHLRKKLADSSEQADLRLIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFMIlYNQTFVN




GESRLVSAPLPESVLIEEELTEKASRTKKSEKVLQQFPQEKANGLFGQFLKLMVGNKADFKKVFGL




EEEAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYDAVELSTILADSDKKSHAKLSSSMIVRFT




EHQEDLKKFKRFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQLKFYQYVKKIIQDIAGAEYFL




EKIAQENFLRKQRTFDNGVIPHQIHLAELQAIIHRQAAYYPFLKENQEKIEQLVTFRIPYYVGPLSKG




DASTFAWLKRQSEEPIRPWNLQETVDLDQSATAFIERMTNFDTYLPSEKVLPKHSLLYEKFMVFN




ELTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKVKKKDIIQFYRNEYNTEIVTLSGLEEDQFNASFS




TYQDLLKCGLTRAELDHPDNAEKLEDIIKILTIFEDRQRIRTQLSTFKGQFSAEVLKKLERKHYTGW




GRLSKKLINGIYDKESGKTILGYLIKDDGVSKHYNRNFMQLINDSQLSFKNAIQKAQSSEHEETLSE




TVNELAGSPAIKKGIYQSLKIVDELVAIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKAMA




EIGSNLLKEQPTTNEQLRDTRLFLYYMQNGKDMYTGDELSLHRLSHYDIDAIIPQSFMKDDSLDN




LVLVGSTENRGKSDDVPSKEVVKDMKAYWEKLYAAGLISQRKFQRLTKGEQGGLTLEDKAHFIQ




RQLVETRQITKNVAGILDQRYNANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYHHGQDAYL




NCVVATTLLKVYPNLAPEFVYGEYPKFQTFKENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNS




YLKTIKKELNYHQMNIVKKVEVQKGGFSKESIKPKGPSNKLIPVKNGLDPQKYGGFDSPIVAYTVLF




THEKGKKPLIKQEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPKYTLYEFPEGRRRLLASAKEA




QKGNQMVLPEHLLTLLYHAKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHTLAKSKVQQI




VKLFEANQTADVKEIAASFIQLMQFNAMGAPSTFKFFQKDIERARYTSIKEIFDATIIYQSTTGLYET




RRKVVD


MAD2016-
178
MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNTEKKKIKKNFWGVRLFEEGHTAEDRR


N874A

LKRTARRIISRRRNRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWHRHPIFAKLEDEVAYH




ETYPTIYHLRKKLADSSEQADLRLIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFMIlYNQTFVN




GESRLVSAPLPESVLIEEELTEKASRTKKSEKVLQQFPQEKANGLFGQFLKLMVGNKADFKKVFGL




EEEAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYDAVELSTILADSDKKSHAKLSSSMIVRFT




EHQEDLKKFKRFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQLKFYQYVKKIIQDIAGAEYFL




EKIAQENFLRKQRTFDNGVIPHQIHLAELQAIIHRQAAYYPFLKENQEKIEQLVTFRIPYYVGPLSKG




DASTFAWLKRQSEEPIRPWNLQETVDLDQSATAFIERMTNFDTYLPSEKVLPKHSLLYEKFMVFN




ELTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKVKKKDIIQFYRNEYNTEIVTLSGLEEDQFNASFS




TYQDLLKCGLTRAELDHPDNAEKLEDIIKILTIFEDRQRIRTQLSTFKGQFSAEVLKKLERKHYTGW




GRLSKKLINGIYDKESGKTILGYLIKDDGVSKHYNRNFMQLINDSQLSFKNAIQKAQSSEHEETLSE




TVNELAGSPAIKKGIYQSLKIVDELVAIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKAMA




EIGSNLLKEQPTTNEQLRDTRLFLYYMQNGKDMYTGDELSLHRLSHYDIDHIIPQSFMKDDSLDN




LVLVGSTEARGKSDDVPSKEVVKDMKAYWEKLYAAGLISQRKFQRLTKGEQGGLTLEDKAHFIQ




RQLVETRQITKNVAGILDQRYNANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYHHGQDAYL




NCVVATTLLKVYPNLAPEFVYGEYPKFQTFKENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNS




YLKTIKKELNYHQMNIVKKVEVQKGGFSKESIKPKGPSNKLIPVKNGLDPQKYGGFDSPIVAYTVLF




THEKGKKPLIKQEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPKYTLYEFPEGRRRLLASAKEA




QKGNQMVLPEHLLTLLYHAKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHTLAKSKVQQI




VKLFEANQTADVKEIAASFIQLMQFNAMGAPSTFKFFQKDIERARYTSIKEIFDATIIYQSTTGLYET




RRKVVD


MAD2032-
179
MKYIIGLDMGITSVGFATMMLDDKDEPCRIIRMGSRIFEAAEHPKDGSSLAAPRRINRGMRRRL


H590A

RRKSHRKERIKDLIIKNELMTADEISAIYSTGKQLSDIYQIRAEALDRKLNTEEFVRLLIHLSQRRGFK




SNRKVDAKEKGSDAGKLLSAVNSNKELMIEKNYRTIGEMLYKDEKFSEYKRNKADDYSNTFARSE




YEDEIRQIFSAQQEHGNPYATDELKESYLDIYLSQRSFDEGPGGSSPYGGNQIEKMIGNCTLEPEE




KRAAKATFSFEYFNLLSKVNSIKIVSSSGKRALNNDERQSVIRLAFAKNAISYTSLRKELNMEYSERF




NISYSQSDKSIEEIEKKTKFTYLTAYHTFKKAYGSVFVEWSADKKNSLAYALTAYKNDTKIIEYLTQK




GFDAAETDIALTLPSFSKWGNLSEKALNNIIPYLEQGMLYHDACTAAGYNFKADDTDKRMYLPA




HEKEAPELDDITNPVVRRAISQTIKVINALIREMGESPCFVNIELARELSKNKAERSKIEKGQKENQ




VRNDRIMERLRNEFGLLSPTGQDLIKLKLWEEQDGICPYSLKPIKIEKLFDVGYTDIDAIIPYSLSFDD




TYNNKVLVMSSENRQKGNRIPMQYLEGKRQDDFWLWVDNSNLSRRKKQNLTKETLSEDDLSG




FKKRNLQDTQYLSRFMMNYLKKYLALAPNTTGRKNTIQAVNGAVTSYLRKRWGIQKVRENGDT




HHAVDAVVISCVTAGMTKRVSEYAKYKETEFQNPQTGEFFDVDIRTGEVINRFPLPYARFRNELL




MRCSENPSRILHEMPLPTYAADEKVAPIFVSRMPKHKVKGSAHKETIRRAFEEDGKKYTVSKVPLT




DLKLKNGEIENYYNPESDGLLYNALKEQUAFGGDAAKAFEQPFYKPKSDGSEGPLVKKVKLINKA




TLTVPVLNNTAVADNGSMVRVDVFFVEGEGYYLVPIYVADTVKKELPNKAIIANKPYEEWKEMR




EENFVFSLYPNDLIKISSRKDMKFNLVNKESTLAPNCQSKEALVYYKGSDISTAAVTAINHDNTYKL




RGLGVKTLLKIEKYQVDVLGNVFKVGKEKRVRFK


MAD2039-
180
MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRIFDAAEQPKTGESLAAPRRAARGSRRRLR


H587A

RHRHRNERIRSLMLEERLISQDELETLFDGRLEDIYALRVKALDEIVSRTDFARILLHISQRRGFKSN




RKNPTTKEDGVLLAAVNENKQRMSEHGYRTVGEMFLLDETFKDHKRNKGGNYITTVARDMVA




DEVRAIFSAQRELGASFASEEFEERYLEILLSQRSFDEGPGGNSPYGGSQIERMVGRCTFFPDEPR




AAKATYSFEYFTLLQKVNHIRIVENGVASKLTDEQRRIIIELAHTTKDVSYAKIRKVLKLSDKQLFNIR




YSDNSPAEDSEKKEKLGIMKAYHQMRSAIDRVSKGRFAMMPRAQRNAIGTALSLYKTSDKIRKYL




TDAGLDEIDINSADSIGSFSKFGHISVKACDMLIPFLEQGMNYNEACAAAGLNFKGHDAGEKSKL




LHPKEEDYEDITSPVVRRAIAQTIKVINAIIRREGCSPTFINIELAREMAKDFRERNRIKKENDDNRA




KNERLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMYSLKQMSLEKLFEPNYAEVDAIVPYSISFD




DSRKNKVLVLTEENRNKGNRLPLQYLKGRRREDFIVWVNNNVKDYRKRRLLLKEELTAEDESGFK




ERNLQDTKTMSRFLLNYIADNLEFAESTRGRKKKVTAVNGAVTAYMRKRWGITKIREDGDCHHA




VDAVVIACTTDAMIRQVSRYAQFRECEYMQTESGSVAVDTGTGEVLRTFPYPWPDFRKELEARL




ANDPAKVINDLHLPFYMSAGRPLPEPVFVSRMPRRKVTGAAHKDTIKSARELDNGYLIVKRPLTD




LKLKNGEIENYYNPQSDKCLYDALKNALIEHGGDAKKAFAGEFRKPKRDGTPGPIVKKVKLLEPTT




MCVPVHGGKGAADNDSMVRVDVFLSGGKYYLVPIYVADTLKPELPNKAVTRGKKYSEWLEMA




DEDFIFSLYPNDLICATSKNGITLSVCRKDSTLPPTVESKSFMLYYRGTDISTGSISCITHDNAYKLRG




LGVKTLEKLEKYTVDVLGEYHKVGKEVRQPFNIKRRKACPSEML


MAD2039-
181
MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRIFDAAEQPKTGESLAAPRRAARGSRRRLR


N610A

RHRHRNERIRSLMLEERLISQDELETLFDGRLEDIYALRVKALDEIVSRTDFARILLHISQRRGFKSN




RKNPTTKEDGVLLAAVNENKQRMSEHGYRTVGEMFLLDETFKDHKRNKGGNYITTVARDMVA




DEVRAIFSAQRELGASFASEEFEERYLEILLSQRSFDEGPGGNSPYGGSQIERMVGRCTFFPDEPR




AAKATYSFEYFTLLQKVNHIRIVENGVASKLTDEQRRIIIELAHTTKDVSYAKIRKVLKLSDKQLFNIR




YSDNSPAEDSEKKEKLGIMKAYHQMRSAIDRVSKGRFAMMPRAQRNAIGTALSLYKTSDKIRKYL




TDAGLDEIDINSADSIGSFSKFGHISVKACDMLIPFLEQGMNYNEACAAAGLNFKGHDAGEKSKL




LHPKEEDYEDITSPVVRRAIAQTIKVINAIIRREGCSPTFINIELAREMAKDFRERNRIKKENDDNRA




KNERLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMYSLKQMSLEKLFEPNYAEVDHIVPYSISFD




DSRKNKVLVLTEENRNKGNRLPLQYLKGRRREDFIVWVNNNVKDYRKRRLLLKEELTAEDESGFK




ERNLQDTKTMSRFLLNYIADNLEFAESTRGRKKKVTAVNGAVTAYMRKRWGITKIREDGDCHHA




VDAVVIACTTDAMIRQVSRYAQFRECEYMQTESGSVAVDTGTGEVLRTFPYPWPDFRKELEARL




ANDPAKVINDLHLPFYMSAGRPLPEPVFVSRMPRRKVTGAAHKDTIKSARELDNGYLIVKRPLTD




LKLKNGEIENYYNPQSDKCLYDALKNALIEHGGDAKKAFAGEFRKPKRDGTPGPIVKKVKLLEPTT




MCVPVHGGKGAADNDSMVRVDVFLSGGKYYLVPIYVADTLKPELPAKAVTRGKKYSEWLEMA




DEDFIFSLYPNDLICATSKNGITLSVCRKDSTLPPTVESKSFMLYYRGTDISTGSISCITHDNAYKLRG




LGVKTLEKLEKYTVDVLGEYHKVGKEVRQPFNIKRRKACPSEML
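The nickase names in Table 8 follow the usual substitution notation: original residue, position, replacement residue (for example, H851A denotes histidine at position 851 replaced by alanine). A minimal Python sketch of applying such a substitution is given below. The function name `apply_substitution` is hypothetical, and the example uses a short fragment taken from the MAD2016-N874A row with fragment-local numbering (H8A) rather than full-length coordinates.

```python
import re


def apply_substitution(sequence, mutation):
    """Apply a substitution such as 'H851A' (1-based position) to a protein sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]


# Fragment from the MAD2016-N874A row, where this histidine is unchanged.
fragment = "LSHYDIDHIIPQ"
# 'H8A' uses fragment-local numbering for illustration only.
print(apply_substitution(fragment, "H8A"))  # LSHYDIDAIIPQ
```

The result matches the corresponding fragment in the MAD2016-H851A row, where the same histidine reads as alanine.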









While this invention is satisfied by embodiments in many different forms, as described in detail in connection with preferred embodiments of the invention, it is understood that the present disclosure is to be considered as exemplary of the principles of the invention and is not intended to limit the invention to the specific embodiments illustrated and described herein. Numerous variations may be made by persons skilled in the art without departure from the spirit of the invention. The scope of the invention will be measured by the appended claims and their equivalents. The abstract and the title are not to be construed as limiting the scope of the present invention, as their purpose is to enable the appropriate authorities, as well as the general public, to quickly determine the general nature of the invention. In the claims that follow, unless the term “means” is used, none of the features or elements recited therein should be construed as means-plus-function limitations pursuant to 35 U.S.C. § 112, ¶6.

Claims
  • 1. A nickase selected from the following nickases: MAD2016-H851A [SEQ ID NO: 177]; MAD2016-N874A [SEQ ID NO: 178]; MAD2032-H590A [SEQ ID NO: 179]; MAD2039-H587A [SEQ ID NO: 180]; MAD2039-N610A [SEQ ID NO: 181].
  • 2. The nickase of claim 1 having the amino acid sequence of SEQ ID NO: 177.
  • 3. The nickase of claim 1 having the amino acid sequence of SEQ ID NO: 178.
  • 4. The nickase of claim 1 having the amino acid sequence of SEQ ID NO: 179.
  • 5. The nickase of claim 1 having the amino acid sequence of SEQ ID NO: 180.
  • 6. The nickase of claim 1 having the amino acid sequence of SEQ ID NO: 181.
RELATED CASES

This application claims priority to U.S. Ser. No. 63/133,502, filed 4 Jan. 2021, entitled “MAD NUCLEASES”, which is incorporated herein in its entirety.

US Referenced Citations (90)
Number Name Date Kind
6391582 Ying et al. May 2002 B2
6837995 Vassarotti et al. Jan 2005 B1
7166443 Walker et al. Jan 2007 B2
8332160 Platt et al. Dec 2012 B1
8697359 Zhang et al. Apr 2014 B1
8926977 Miller et al. Jan 2015 B2
9260505 Weir et al. Feb 2016 B2
9361427 Hillson Jun 2016 B2
9499855 Hyde et al. Nov 2016 B2
9776138 Innings et al. Oct 2017 B2
9790490 Zhang et al. Oct 2017 B2
9896696 Begemann et al. Feb 2018 B2
9982279 Gill et al. May 2018 B1
9988624 Serber et al. Jun 2018 B2
10011849 Gill et al. Jul 2018 B1
10017760 Gill et al. Jul 2018 B2
10227576 Cameron et al. Mar 2019 B1
10266851 Chen Apr 2019 B2
10704033 Kim et al. Jul 2020 B1
10724021 Kim et al. Jul 2020 B1
10745678 Kim et al. Aug 2020 B1
10767169 Kim et al. Sep 2020 B1
10837021 Tian et al. Nov 2020 B1
10883095 Mijts Jan 2021 B1
10927385 Kannan et al. Feb 2021 B2
11053485 Mijts Jul 2021 B2
11085030 Mijts Aug 2021 B2
11174471 Mijts Nov 2021 B2
11193115 Mijts Dec 2021 B2
20020139741 Kopf Oct 2002 A1
20040110253 Kappler et al. Jun 2004 A1
20060014137 Ghosh et al. Jan 2006 A1
20070020761 Yu et al. Jan 2007 A1
20100076057 Sontheimer et al. Mar 2010 A1
20110294217 Mcconnell-Smith et al. Dec 2011 A1
20130236970 Anneren et al. Sep 2013 A1
20140068797 Doudna et al. Mar 2014 A1
20140199767 Barrangou et al. Jul 2014 A1
20140242033 Gruber et al. Aug 2014 A1
20140273226 Wu et al. Sep 2014 A1
20150024464 Lippow et al. Jan 2015 A1
20150071898 Liu et al. Mar 2015 A1
20150098954 Hyde et al. Apr 2015 A1
20150159174 Frendewey et al. Jun 2015 A1
20150176013 Musunuru et al. Jun 2015 A1
20150191719 Hudson et al. Jul 2015 A1
20150225732 Williams et al. Aug 2015 A1
20150344549 Muir et al. Dec 2015 A1
20160024529 Carstens et al. Jan 2016 A1
20160053272 Wurzel et al. Feb 2016 A1
20160053304 Wurzel et al. Feb 2016 A1
20160076093 Shendure et al. Mar 2016 A1
20160102322 Ravinder et al. Apr 2016 A1
20160130608 Doudna et al. May 2016 A1
20160168592 Church et al. Jun 2016 A1
20160264981 Yang et al. Sep 2016 A1
20160281053 Sorek et al. Sep 2016 A1
20160289673 Huang et al. Oct 2016 A1
20160298134 Chen et al. Oct 2016 A1
20160354487 Zhang et al. Dec 2016 A1
20170002339 Barrngou et al. Jan 2017 A1
20170022499 Lu et al. Jan 2017 A1
20170044525 Kaper et al. Feb 2017 A1
20170051310 Doudna et al. Feb 2017 A1
20170073705 Chen et al. Mar 2017 A1
20170191123 Kim et al. Jul 2017 A1
20170211078 Kamineni et al. Jul 2017 A1
20170240922 Gill et al. Aug 2017 A1
20170369870 Gill et al. Dec 2017 A1
20180028567 Li et al. Feb 2018 A1
20180052176 Holt et al. Feb 2018 A1
20180073013 Lorenz et al. Mar 2018 A1
20180112235 Li et al. Apr 2018 A1
20180187149 Ma et al. Jul 2018 A1
20180200342 Bikard et al. Jul 2018 A1
20180230460 Gill et al. Aug 2018 A1
20180230461 Gill et al. Aug 2018 A1
20180284125 Gordon et al. Oct 2018 A1
20190017072 Ditommaso et al. Jan 2019 A1
20190085324 Regev et al. Mar 2019 A1
20190136230 Sather et al. May 2019 A1
20190169605 Masquelier et al. Jun 2019 A1
20190194650 Gill et al. Jun 2019 A1
20190225928 Masquelier et al. Jul 2019 A1
20190270987 Masquelier et al. Sep 2019 A1
20200071660 Spindler et al. Mar 2020 A1
20200095533 Garst et al. Mar 2020 A1
20200216794 Belgrader et al. Jul 2020 A1
20200263197 Cheng et al. Aug 2020 A1
20200270632 Roy et al. Aug 2020 A1
Foreign Referenced Citations (42)
Number Date Country
2395087 Dec 2011 EP
3199632 Aug 2017 EP
WO2002010183 Feb 2002 WO
WO 2003087341 Oct 2003 WO
WO 2010079430 Jul 2010 WO
WO 2011072246 Jun 2011 WO
WO 2011143124 Nov 2011 WO
WO 2013142578 Sep 2013 WO
WO 2013176772 Nov 2013 WO
WO 2014018423 Jan 2014 WO
WO2014143381 Sep 2014 WO
WO 2014144495 Sep 2014 WO
WO 2016110453 Jul 2016 WO
WO2016110453 Jul 2016 WO
WO 2017053902 Mar 2017 WO
WO2017075265 May 2017 WO
WO 2017078631 May 2017 WO
WO 2017083722 May 2017 WO
WO 2017106414 Jun 2017 WO
WO2017106414 Jun 2017 WO
WO 2017161371 Sep 2017 WO
WO 2017174329 Oct 2017 WO
WO 2017186718 Nov 2017 WO
WO2017212400 Dec 2017 WO
WO2017216392 Dec 2017 WO
WO 2017216392 Dec 2017 WO
WO 2017223330 Dec 2017 WO
WO2017223330 Dec 2017 WO
WO 2018031950 Feb 2018 WO
WO 2018071672 Apr 2018 WO
WO 2018083339 May 2018 WO
WO2018152325 Aug 2018 WO
WO2018172556 Sep 2018 WO
WO 2018172556 Sep 2018 WO
WO 2018191715 Oct 2018 WO
WO2019006436 Jan 2019 WO
WO2019055878 Mar 2019 WO
WO2019200004 Oct 2019 WO
WO2019209926 Oct 2019 WO
WO2020005383 Jan 2020 WO
WO2020021045 Jan 2020 WO
WO2020074906 Apr 2020 WO
Non-Patent Literature Citations (105)
Entry
Studer. Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem. J. (2013) 449, 581-594.
International Search Report and Written Opinion for International Application No. PCT/US21/48566, dated Dec. 10, 2021, p. 1-10.
UniProtKB/TrEMBL, “A0A1G4WF58_9FIRM”, Nov. 22, 2017, rerieved from Internet: https://www.uniprot.org/uniprot/A0A_1G4WF58.txt, pp. 1-3.
International Search Report and Written Opinion for International Application No. PCT/US20/19379, dated Jul. 22, 2020, p. 1-10.
International Search Report and Written Opinion for International Application No. PCT/US20/36064, dated Sep. 18, 2020, p. 1-16.
International Search Report and Written Opinion for International Application No. PCT/US20/40389, dated Oct. 13, 2020, p. 1-12.
Arnak, et al., “Yeast Artificial Chromosomes”, John Wiley & Sons, Ltd., doi:10.1002/9780470015902.a0000379.pub3, pp. 1-10 (2012).
Woo, et al., “Dual roles of yeast Rad51 N-terminal domain in repairing DNA double-strand breaks”, Nucleic Acids Research, doi:10.1093/nar/gkaa.587, vol. 48, No. 15, pp. 8474-8489 (2020).
International Search Report and Written Opinion for International Application No. PCT/US2021/012868, dated Mar. 26, 2021, p. 1-15.
Anzalone et al., “Search-and-replace genome editing without double-strand breaks or donor DNA,” Nature, Oct. 21, 2019, vol. 576, No. 7785, pp. 149-157.
Alvarez, et al., “In vivo diversification of target genomic sites using processive T7 RNA polymerase-base deaminase fusions blocked by RNA-guided dCas9”, Dept. of Microbial Biotechnology and Systems Biology Program, Madrid, Spain, Jan. 1, 2019, p. 1-33.
International Search Report and Written Opinion for International Application No. PCT/US20/65168, dated Mar. 17, 2021, p. 1-15.
International Search Report and Written Opinion for International Application No. PCT/US2020/038345, dated Nov. 23, 2020, p. 1-13.
International Search Report and Written Opinion for International Application No. PCT/US21/12867, dated May 12, 2021, p. 1-17.
International Search Report and Written Opinion for International Application No. PCT/US2020/064727, dated Apr. 28, 2021, p. 1-13.
International Search Report and Written Opinion for International Application No. PCT/US21/29008, dated Aug. 24, 2021, p. 1-19.
International Search Report and Written Opinion for International Application No. PCT/US21/29011, dated Aug. 24, 2021, p. 1-20.
Bauer, et al., “Cell-microcarrier Adhesion to Gas-Liquid Interfaces and Foam”, Biotechnol. Prog. 2000, 16, 125-132, Oct. 19, 1999.
Datlinger, et al., “Pooled CRISPR screening with single-cell transcriptome readout”, Nature Methods, Jan. 10, 2017; p. 1-10, doi:10.1038/nmeth.4177.
Dixit, et al., “Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens”, Cell 167, p. 1853-1866, Dec. 15, 2016.
GE Healthcare Life Sciences, “Microcarrier Cell Culture Principles and Methods”, 18-1140-62 AC, p. 1-23, Nov. 2013.
Jacobi, et al., “Simplified CRISPR tools for efficient genome editing and streamlined protocols for their delivery into mammalian cells and mouse zygotes”, Methods 121-122, p. 16-28, Mar. 23, 2017.
Jaitin, et al., “Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq”, Cell 167, p. 1883-1896, Dec. 15, 2016.
Kim, et al., “Formation of Thermoresponsive Poly(N-isopropylacrylamide)/Dextran Particles by Atom Transfer Radical Polymerization”, Macromol. Rapid Commun., 24, p. 517-521, 2003.
Kimple, et al., “Overview of Affinity Tags for Protein Purification”, Curr Protoc Protein Sci.; 73: Unit-9-9. Doi:10.1002/0471140864.ps0909s73, p. 1-26, Aug. 6, 2015.
Nienow, et al., “A potentially scalable method for the harvesting of hMSCs from microcarriers”, Biochemical Engineering Journal 85, p. 79-88, Feb. 4, 2014.
Replogle, et al., “Direct capture of CRISPR guides enables scalable, multiplexed, and multi-omic Perturb-Seq”, bioRxiv; doi:http://dx.doi.org/10.1101/503367, p. 1-26, Dec. 21, 2018.
Sivalingam, et al., “Superior Red Blood Cell Generation from Human Pluripotent Stem Cells Through a Novel Microcarrier-Based Embryoid Body Platform”, Tissue Engineering: Part C, vol. 22, No. 8, p. 765-780, Jun. 9, 2016.
International Search Report and Written Opinion for International Application No. PCT/US21/35807, dated Nov. 24, 2021, p. 1-21.
International Search Report and Written Opinion for International Application No. PCT/US21/50338, dated Dec. 10, 2021, p. 1-17.
International Search Report and Written Opinion for International Application No. PCT/US21/43097, dated Nov. 19, 2021, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US21/39872, dated Oct. 27, 2021, p. 1-14.
Filsinger, et al., “Characterizing the portability of RecT-mediated oligonucleotide recombination”, bioRxiv, Apr. 15, 2020, doi:org/10.1101/2020.04.14.041095, p. 1-25.
Nelson, et al., “Engineered pegRNAs improve prime editing efficiency”, Nature Biotechnology, Jul. 25, 2021, doi.org/10.1038/s41587-021-01039-7, p. 1-14.
Yu, et al., “Improved delivery of Cas9 protein/gRNA complexes using lipofectamine CRISPRMAX”, Biotechnol Lett, Feb. 18, 2016, doi 10.1007/s10529-016-2064-9, p. 919-929.
Bengali, et al., “Gene Delivery Through Cell Culture Substrate Adsorbed DNA Complexes”, Biotechnol Bioeng., May 5, 2005, doi:10.1002/bit.20393, p. 1-23.
Segura, et al., “Substrate-mediated DNA delivery: role of the cationic polymer structure and extent of modification”, Journal of Controlled Release, Aug. 9, 2003, doi:10.1016/j.jconrel.2003.08.003, p. 69-84.
Takahashi, et al., “Integration of CpG-free DNA induces de novo methylation of CpG islands in pluripotent stem cells,” Science, May 5, 2017, vol. 356, No. 6337, pp. 1-7.
Chen, et al., “Human Pluripotent Stem Cell Culture: Considerations for Maintenance, Expansion, and Therapeutics”, Cell Stem Cell, Jan. 2, 2014, doi.org/10.1016/j.stem.2013.12.005, p. 13-26.
Fayazpour, F., “Exploring New Applications for Photophysically Encoded Microcarriers”, Ghent University Faculty of Pharmaceutical Sciences, Thesis Submission, Sep. 2008, 169 pages.
Cheung, et al., “Unlinking the methylome pattern from nucleotide sequence, revealed by large-scale in vivo genome engineering and methylome editing in medaka fish,” PLoS Genetics, Dec. 21, 2017, vol. 13, No. 12, pp. 1-25.
Elvin, et al., “Modified bacteriophage lambda promoter vectors for overproduction of proteins in Escherichia coli”, Gene, 87, Sep. 15, 1989, p. 123-126.
Segall-Shapiro, et al., “Engineered promoters enable constant gene expression at any copy number in bacteria”, Nature Biotechnology, vol. 36, No. 4, Mar. 19, 2018, p. 352-363.
Xing, et al., “A CRISPR/Cas9 toolkit for multiplex genome editing in plants”, BMC Plant Biology, 2014, p. 1-12.
Sun, et al., “A Single Multiplex crRNA Array for FnCpf1-Mediated Human Genome Editing,” Molecular Therapy, Aug. 1, 2018, vol. 26, No. 8, pp. 2070-2076.
Kurata, et al., “Highly multiplexed genome engineering using CRISPR/Cas9 gRNA arrays,” PLoS ONE, Sep. 17, 2018, vol. 13, No. 9, pp. 1-17.
Hubmann, et al., “Natural and Modified Promoters for Tailored Metabolic Engineering of the Yeast Saccharomyces cerevisiae”, Methods in Molecular Biology, vol. 1152, doi:10.1007/978-1-4939-0563-8_2, p. 17-42.
Unciti-Broceta, et al., “Combining Nebulization-Mediated Transfection and Polymer Microarrays for the Rapid Determination of Optimal Transfection Substrates”, Journal of Combinatorial Chemistry, vol. 10, No. 2, Feb. 5, 2008, p. 179-184.
Fayazpour, et al., “Evaluation of Digitally Encoded Layer-by-layer Coated Microparticles as Cell Carriers”, Advanced Functional Materials, Sep. 1, 2008, p. 2716-2723.
UniProtKB/TrEMBL, “A0A1G4WF58_9FIRM”, Nov. 22, 2017, retrieved from Internet: https://www.uniprot.org/uniprot/A0A_164WF58.txt, pp. 1-3.
Natsume, et al., “Conditional Degrons for Controlling Protein Expression at the Protein Level”, Annual Review of Genetics, vol. 51, 2017, doi.org/10.1146/annurev-genet-120116-024656, p. 83-104.
Chen, et al., “Enhancing the copy number of episomal plasmids in Saccharomyces cerevisiae for improved protein production”, FEMS Yeast Research, Apr. 25, 2012, doi:10.1111/j.1567-1364.2012.00809.x; p. 598-607.
Price, et al., “Expanding and understanding the CRISPR toolbox for Bacillus subtilis with MAD7 and dMAD7”, Biotechnology and Bioengineering, Feb. 19, 2020, doi:10.1002/bit.27312 p. 1805-1816.
International Search Report and Written Opinion for International Application No. PCT/US21/43534, dated Nov. 10, 2021, p. 1-16.
International Search Report and Written Opinion for International Application No. PCT/US20/26095, dated Jul. 17, 2020, p. 1-10.
Bao, et al., “Genome-scale engineering of Saccharomyces cerevisiae with single-nucleotide precision”, Nature Biotechnology, doi:10.1038/nbt.4132, pp. 1-6 (May 7, 2018).
Dicarlo, et al., “Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems”, Nucleic Acids Research, 41(7):4336-43 (2013).
Garst, et al., “Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering”, Nature Biotechnology, 35(1):48-59 (2017).
Hsu, et al., “DNA targeting specificity of RNA-guided Cas9 nucleases”, Nature Biotechnology, 31(9):827-32 (2013).
Jiang, et al., “RNA-guided editing of bacterial genomes using CRISPR-Cas systems”, Nature Biotechnology, 31(3):233-41 (2013).
Jinek, et al., “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity”, Science, 337:816-20 (2012).
Verwaal, et al., “CRISPR/Cpf1 enables fast and simple genome editing of Saccharomyces cerevisiae”, Yeast, 35:201-11 (2018).
Lian, et al., “Combinatorial metabolic engineering using an orthogonal tri-functional CRISPR system”, Nature Communications, DOI:10.1038/s41467-017-01695-x, www.nature.com/naturecommunications, pp. 1-9 (2017).
Roy, et al., “Multiplexed precision genome editing with trackable genomic barcodes in yeast”, Nature Biotechnology, doi:10.1038/nbt.4137, pp. 1-16 (2018).
Dong, “Establishment of a highly efficient virus-inducible CRISPR/Cas9 system in insect cells,” Antiviral Res., 130:50-7 (2016).
Epinat et al., “A novel engineered meganuclease induces homologous recombination in eukaryotic cells, e.g., yeast and mammalian cells”, Nucleic Acids Research, 31(11): 2952-2962.
Farasat et al., “A Biophysical Model of CRISPR/Cas9 Activity for Rational Design of Genome Editing and Gene Regulation,” PLoS Comput Biol., 29:12(1):e1004724 (2016).
Liu et al., “A chemical-inducible CRISPR-Cas9 system for rapid control of genome editing”, Nature Chemical Biology, 12:980-987 (2016).
Eklund, et al., “Altered target site specificity variants of the I-PpoI His-Cys box homing endonuclease” Nucleic Acids Research, 35(17):5839-50 (2007).
Boles, et al., “Digital-to-biological converter for on-demand production of biologics”, Nature Biotechnology, doi:10.1038/nbt.3859 (May 29, 2017).
Pines, et al., “Codon Compression Algorithms for Saturation Mutagenesis”, ACS Synthetic Biology, 4:604-14 (2015).
Bessa et al., “Improved gap repair cloning in yeast: treatment of the gapped vector with Taq DNA polymerase avoids vector self-ligation,” Yeast, 29(10):419-23 (2012).
Boch, “TALEs of genome targeting,” Nature Biotechnology vol. 29, pp. 135-136 (2011).
Campbell et al., “Targeting protein function: the expanding toolkit for conditional disruption,” Biochem J., 473(17):2573-2589 (2016).
Casini et al., “Bricks and blueprints: methods and standards for DNA assembly,” Nat Rev Mol Cell Biol., (9):568-76 (2015).
Chica et al., “Semi-rational approaches to engineering enzyme activity: combining the benefits of directed evolution and rational design,” Current Opinion in Biotechnology, 16(4): 378-384 (2005).
Durai et al., “Zinc finger nucleases: custom-designed molecular scissors for genome engineering of plant and mammalian cells”, Nucleic Acids Res., 33(18):5978-90 (2005).
Kadonaga et al., “Regulation of RNA polymerase II transcription by sequence-specific DNA binding factors”, Cell, 116(2):247-57 (2004).
Lee et al., “Targeted chromosomal deletions in human cells using zinc finger nucleases”, Genome Res., 20(1): 81-9 (2009).
Miller et al., “A TALE nuclease architecture for efficient genome editing”, Nature Biotechnology, 29 (2): 143-8 (2011).
Mittelman et al., “Zinc-finger directed double-strand breaks within CAG repeat tracts promote repeat instability in human cells”, PNAS USA, 106 (24): 9607-12 (2009).
Shivange, “Advances in generating functional diversity for directed protein evolution”, Current Opinion in Chemical Biology, 13 (1): 19-25 (2009).
Udo, “An Alternative Method to Facilitate cDNA Cloning for Expression Studies in Mammalian Cells by Introducing Positive Blue White Selection in Vaccinia Topoisomerase I-Mediated Recombination,” PLoS One, 10(9):e0139349 (2015).
Urnov et al., “Genome editing with engineered zinc finger nucleases”, Nature Reviews Genetics, 11:636-646 (2010).
International Search Report and Written Opinion for International Application No. PCT/US2018/053608, dated Dec. 13, 2018, p. 1-9.
International Search Report and Written Opinion for International Application No. PCT/US2018/053670, dated Jan. 3, 2019, p. 1-13.
International Search Report and Written Opinion for International Application No. PCT/US2018/053671, dated Sep. 26, 2018, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US2018/040519, dated Sep. 26, 2018, p. 1-8.
International Search Report and Written Opinion for International Application No. PCT/US2019/026836, dated Jul. 2, 2019, p. 1-10.
International Search Report and Written Opinion for International Application No. PCT/US2019/023342, dated Jun. 6, 2019, p. 1-34.
International Search Report and Written Opinion for International Application No. PCT/US2019/030085, dated Jul. 23, 2019, p. 1-14.
International Search Report and Written Opinion for International Application No. PCT/US20/24341, dated Jun. 19, 2020, p. 1-9.
NonFinal Office Action for U.S. Appl. No. 16/399,988, dated Jul. 31, 2019, p. 1-20.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/024,831, dated Feb. 12, 2019, p. 1-37.
NonFinal Office Action for U.S. Appl. No. 16/024,816 dated Sep. 4, 2018, p. 1-10.
Final Office Action for U.S. Appl. No. 16/024,816 dated Nov. 26, 2018, p. 1-12.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/454,865 dated Aug. 16, 2019, p. 1-36.
Yoshioka, et al., “Development of a mono-promoter-driven CRISPR/Cas9 system in mammalian cells”, Scientific Reports, Jul. 3, 2015, p. 1-8.
Remaut, et al., “Plasmid vectors for high-efficiency expression controlled by the PL promoter of coliphage lambda,” Laboratory of Molecular Biology, Apr. 15, 1981, p. 81-93.
International Search Report and Written Opinion for International Application No. PCT/US2019/028821, dated Aug. 2, 2019, p. 1-14.
International Search Report and Written Opinion for International Application No. PCT/US2019/028883, dated Aug. 16, 2019, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US2019/46526, dated Dec. 18, 2019, p. 1-17.
International Search Report and Written Opinion for International Application No. PCT/US2018/34779, dated Nov. 26, 2018, p. 1-39.
International Search Report and Written Opinion for International Application No. PCT/US19/57250, dated Feb. 25, 2020, p. 1-16.
International Search Report and Written Opinion for International Application No. PCT/US19/47135, dated Jun. 11, 2020, p. 1-15.
Provisional Applications (1)
Number Date Country
63133502 Jan 2021 US