ISOLATED POLYNUCLEOTIDES AND POLYPEPTIDES AND METHODS OF USING SAME FOR EXPRESSING AN EXPRESSION PRODUCT OF INTEREST

FIELD AND BACKGROUND OF THE INVENTION

The present invention, in some embodiments thereof, relates to isolated polynucleotides and polypeptides and methods of using same for expressing an expression product of interest.

Recombinant DNA technologies, including inducible gene expression and genome editing technologies, have provided opportunities for control of gene expression and precise, targeted modifications to genome sequences in many types of organisms, including plants and animals. Rational gene expression in general, and genome editing in particular, have enormous potential across basic research, drug discovery and cell-based medicine by inserting or removing a specific genetic trait. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, the removal of deleterious genes or genome sequences and the alteration of plant genomes in order to generate improved crops. Existing methods for genome editing include, for example, the use of zinc finger nucleases, TALENs, and CRISPR-Cas9 systems.

SUMMARY OF THE INVENTION

According to an aspect of some embodiments of the present invention there is provided a method of expressing an expression product of interest, the method comprising:

(i) introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a nucleic acid sequence encoding the expression product of interest, wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain; and

(ii) contacting the cell with an AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein the AimP peptide is capable of binding the AimR polypeptide and dissociating the AimR polypeptide from the AimR responsive element,

thereby expressing the expression product of interest.

According to an aspect of some embodiments of the present invention there is provided a method of expressing an expression product of interest, the method comprising introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a heterologous nucleic acid sequence encoding the expression product of interest, wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain, thereby expressing the expression product of interest.

According to an aspect of some embodiments of the present invention there is provided a method of expressing an expression product of interest, the method comprising introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a nucleic acid sequence encoding the expression product of interest; and a nucleic acid construct comprising an AimR polynucleotide and a cis-acting regulatory element heterologous to the AimR for directing expression of the AimR polynucleotide,

wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain, thereby expressing the expression product of interest.

According to some embodiments of the invention, the method comprises introducing into the cell a polynucleotide encoding the AimR.

According to some embodiments of the invention, the method comprises contacting the cell with an AimP peptide or a nucleic acid sequence encoding same, the AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein the AimP peptide is capable of binding the AimR polypeptide and dissociating the AimR polypeptide from the AimR responsive element.

According to some embodiments of the invention, the method comprises contacting the cell with an agent capable of downregulating expression and/or activity of the AimR responsive element.

According to some embodiments of the invention, the expression product of interest is endogenous to the cell.

According to some embodiments of the invention, the expression product of interest is exogenous to the cell.

According to an aspect of some embodiments of the present invention there is provided an article of manufacture identified for expressing an expression product of interest comprising a packaging material packaging at least two of:

(i) a polynucleotide comprising an AimR responsive element, wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain;

(ii) a polynucleotide encoding the AimR;

(iii) an AimP peptide comprising an amino acid sequence of XXXXGG/A or a nucleic acid sequence encoding same, wherein the AimP peptide is capable of binding the AimR polypeptide and dissociating the AimR polypeptide from the AimR responsive element; and/or

(iv) an agent capable of downregulating expression and/or activity of the AimR responsive element.

According to some embodiments of the invention, the polynucleotide encoding the AimR is comprised in a nucleic acid construct comprising a cis-acting regulatory element heterologous to the AimR for directing expression of the AimR polynucleotide.

According to some embodiments of the invention, the article of manufacture comprises a multiple cloning site (MCS).

According to some embodiments of the invention, the article of manufacture comprises a polynucleotide encoding the expression product of interest.

According to some embodiments of the invention, the expression product of interest is a DNA editing agent.

According to some embodiments of the invention, the AimR responsive element comprises a nucleic acid sequence for binding said AimR and an AimX polynucleotide, wherein the AimX polynucleotide or an AimX polypeptide encoded by the AimX polynucleotide is capable of inhibiting lysogeny of a temperate phage expressing the AimR in a host bacteria.

According to some embodiments of the invention, the expression product of interest is an AimX dependent DNA editing agent.

According to some embodiments of the invention, the method comprises introducing into the cell a nucleic acid sequence to be integrated into a genome of the cell by the DNA editing agent.

According to some embodiments of the invention, the DNA editing agent is an integrase.

According to some embodiments of the invention, the DNA editing agent is selected from the group consisting of zinc finger nuclease, an effector protein of Class 2 CRISPR/Cas (e.g. Cas9, Cpf1, C2c1, C2c3) and TALEN.

According to an aspect of some embodiments of the present invention there is provided an isolated AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein the peptide is capable of binding an AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain; and dissociating the AimR polypeptide from an AimR responsive element.

According to some embodiments of the invention, there is provided an isolated polynucleotide encoding the peptide of the invention.

According to an aspect of some embodiments of the present invention there is provided an isolated AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

According to some embodiments of the invention, there is provided an isolated polynucleotide encoding the polypeptide of the invention.

According to an aspect of some embodiments of the present invention there is provided an isolated polynucleotide comprising an AimR responsive element, wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

According to some embodiments of the invention, the AimR responsive element comprises a nucleic acid sequence for binding said AimR and an AimX polynucleotide, wherein the AimX polynucleotide or an AimX polypeptide encoded by the AimX polynucleotide is capable of inhibiting lysogeny of a temperate phage expressing the AimR in a host bacteria.

According to some embodiments of the invention, there is provided an isolated polypeptide encoded by the polynucleotide of the present invention.

According to some embodiments of the invention, there is provided an isolated polynucleotide comprising the polynucleotides of the present invention.

According to some embodiments of the invention, there is provided an isolated polynucleotide encoding an arbitrium system comprising the polynucleotides of the present invention, wherein the arbitrium system is capable of regulating lysogeny of a phage expressing the arbitrium system in a host bacteria.

According to some embodiments of the invention, there is provided a nucleic acid construct comprising the polynucleotide of the present invention and a multiple cloning site (MCS).

According to some embodiments of the invention, there is provided a nucleic acid construct comprising the polynucleotide of the present invention or the nucleic acid construct of the present invention and a cis-acting regulatory element heterologous to the AimP, the AimR, the AimR responsive element and/or the AimX for directing expression of the polynucleotide.

According to some embodiments of the invention, the nucleic acid construct of the present invention is a nucleic acid construct system comprising at least two nucleic acid constructs each expressing at least one of the polynucleotides.

According to some embodiments of the invention, the construct encodes a polycistronic mRNA comprising the polynucleotide.

According to an aspect of some embodiments of the present invention there is provided an isolated nucleic acid agent capable of downregulating expression and/or activity of an AimR responsive element, wherein the AimR comprises a DNA binding domain for binding the AimR responsive element, the AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

According to some embodiments of the invention, there is provided an article of manufacture identified for expressing an expression product of interest comprising a packaging material packaging at least one of:

(a)(i) the isolated polynucleotide of the present invention,

(a)(ii) a nucleic acid construct comprising the polynucleotide of the present invention and a multiple cloning site (MCS), and/or

(a)(iii) a nucleic acid construct comprising the polynucleotide of the present invention and a cis-acting regulatory element heterologous to the AimR or the AimX for directing expression of the polynucleotide;

and at least one of:

(b)(i) the isolated peptide of the present invention,

(b)(ii) the isolated polynucleotide of the present invention,

(b)(iii) a nucleic acid construct comprising the polynucleotide of the present invention and a MCS,

(b)(iv) a nucleic acid construct comprising the polynucleotide of the present invention and a cis-acting regulatory element heterologous to the AimP for directing expression of the polynucleotide, and/or

(b)(v) the isolated agent of the present invention.

According to some embodiments of the invention, the isolated polynucleotide, the nucleic acid construct or the article of manufacture of the present invention comprises a nucleic acid sequence encoding an expression product of interest.

According to some embodiments of the invention, the expression product of interest is a DNA editing agent.

According to some embodiments of the invention, the AimP peptide is capable of leading to lysogeny of a temperate phage expressing the AimR in a host bacteria.

According to some embodiments of the invention, the AimP comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 269-283 and 285-286.

According to some embodiments of the invention, the AimR comprises a nucleic acid sequence having at least 80% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-113 and/or an amino acid sequence having at least 80% identity to a sequence selected from the group consisting of SEQ ID NOs: 114-226.

According to some embodiments of the invention, the AimR comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-113 and/or an amino acid sequence selected from the group consisting of SEQ ID NOs: 114-226.

According to some embodiments of the invention, the AimR responsive element comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 287-334.

According to some embodiments of the invention, the AimR responsive element comprises SEQ ID NO: 378.

According to some embodiments of the invention, the AimX comprises a nucleic acid sequence as set forth in SEQ ID NO: 336.

According to some embodiments of the invention, the AimR and the AimP and/or the AimR responsive element are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to some embodiments of the invention, the AimP and the AimR responsive element are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to some embodiments of the invention, the AimR and the AimP and/or the AimR responsive element are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

According to some embodiments of the invention, the AimP and the AimR responsive element are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.

In the drawings:

FIGS. 1A-E demonstrate the effect of conditioned media on infection dynamics of phage phi3T. FIG. 1A is a schematic representation of the preparation protocol of control and conditioned media. FIG. 1B is a graph showing the growth curves of B. subtilis 168 infected by phi3T at multiplicity of infection (MOI)=0.1, in control and conditioned media. FIG. 1C is a graph showing the growth curves of B. subtilis strain DS4979⁵(oppD::kan) infected by phi3T at MOI=0.1, in control and conditioned media. For FIGS. 1B-C, data represents average of 3 biological replicates, each with 3 technical replicates, and error bars represent standard error. FIG. 1D is a semi-quantitative PCR photograph demonstrating phage lysogeny during an infection time course of B. subtilis 168 with phi3T in control and conditioned media. FIG. 1E is a graph showing the growth curves of B. subtilis 168 infected by phi3T at MOI=0.1, in control and conditioned media, with and without pre-treatment with proteinase K. Data represents average of 3 technical replicates, and error bars represent standard error.

FIGS. 2A-G demonstrate the arbitrium peptide and its receptor. FIG. 2A is a schematic representation of the arbitrium locus in the phi3T phage genome. FIG. 2B demonstrates the signal for cleavage by extracellular proteases in the AimP pre-pro-peptide. Shown are the amino acid sequences of B. subtilis quorum sensing Phr genes divided into their domains¹. The recognition signal for B. subtilis extracellular proteases in each sequence is marked in green. The signal peptide in the phi3T AimP protein was predicted by the SignalP 4.1 web server⁸. SEQ ID NOs on the right represent the sequences of the signal peptide, the pro-peptide and the mature peptide, respectively. FIG. 2C demonstrates mass spectrometry histograms verifying the presence of the SAIRGA (SEQ ID NO: 269) peptide in conditioned medium. The left panel shows reference synthesized SAIRGA peptide at 100 nM concentration in LB. B2, Y3, Y4 and Y5 are the peptide fragment ions. The middle panel shows control medium and the right panel shows conditioned medium. In all panels, the arrowhead depicts the expected retention time of the SAIRGA peptide. FIG. 2D is a graph demonstrating the growth curves of B. subtilis 168 infected by phi3T at MOI=0.1, in LB media supplemented with synthesized SAIRGA (SEQ ID NO: 269) peptide. Numbers represent peptide concentrations. Shown is the average of 3 biological replicates, each with 3 technical replicates. FIG. 2E shows graphs demonstrating that 5-amino-acid versions of the 6-amino-acid arbitrium peptide do not guide lysogeny. Shown are growth curves of B. subtilis 168 infected by phi3T at MOI=0.1, in LB media supplemented with synthesized SAIRGA (SEQ ID NO: 269, left panel), AIRGA (SEQ ID NO: 335, middle panel) or SAIRG (SEQ ID NO: 355) peptide. Shown is the average of 3 biological replicates, each with 3 technical replicates. Error bars represent standard error. FIG. 2F demonstrates the sequencing-based quantitative determination of the fraction of lysogenized bacteria. Top: schematics of the analysis; total DNA was extracted from cells during infection and was subjected to Illumina whole genome sequencing. Percent lysogeny was calculated as the fraction of reads spanning the phage/bacteria integration junction out of total reads covering this junction. Integration junction is marked red; integrated phage DNA is green. Bottom: percent lysogenized bacteria at four time points during infection of B. subtilis 168 with phi3T at MOI=2. Shown is the average of three biological replicates with error bars denoting standard error. FIG. 2G is a graph demonstrating microscale thermophoresis (MST) analysis of the binding between purified AimR (C-terminal 6×His-tag, at a concentration of 200 nM) and synthesized SAIRGA (SEQ ID NO: 269) or GMPRGA (SEQ ID NO: 271) peptides (at a concentration range of 9-4000 nM). Average and standard error of three replicates are shown.
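By way of a non-limiting illustration, the percent lysogeny calculation described for FIG. 2F can be sketched as follows. The function name and the read counts used in the usage line are hypothetical and are not taken from the Examples; the sketch assumes that reads covering the integration site have already been classified as either spanning the phage/bacteria junction or not.

```python
def percent_lysogeny(junction_spanning_reads: int, total_reads_at_site: int) -> float:
    """Percent of reads covering the integration site that span the
    phage/bacteria junction (hypothetical helper, illustrative only)."""
    if total_reads_at_site <= 0:
        raise ValueError("no reads cover the integration site")
    if not 0 <= junction_spanning_reads <= total_reads_at_site:
        raise ValueError("junction-spanning reads must lie between 0 and the total")
    return 100.0 * junction_spanning_reads / total_reads_at_site


# Usage with made-up counts: 30 junction-spanning reads out of 120 reads at the site.
print(percent_lysogeny(30, 120))
```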

FIGS. 3A-F demonstrate a conserved peptide communication code guiding lysogeny in Bacillus phages. FIG. 3A is a schematic representation of selected instances of AimR homologs in sequenced genomes (marked in red). Locus tags are indicated for AimR homologs, along with the percent identity to the phi3T AimR. Mature arbitrium peptide (the last 6 amino acids of AimP ORF) is indicated below the AimP homolog (marked in orange). FIG. 3B is a pie chart demonstrating the distribution of arbitrium peptides among 112 homologs of AimP (see Table 3). The peptides shown are SAIRGA (SEQ ID NO: 269), GFTVGA (SEQ ID NO: 273), SASRGA (SEQ ID NO: 275), GFGRGA (SEQ ID NO: 272), GVVRGA (SEQ ID NO: 276), GMPRGA (SEQ ID NO: 271), AMGNGG (SEQ ID NO: 278), DPGRGG (SEQ ID NO: 274), GFGHGA (SEQ ID NO: 282), GFPRGA (SEQ ID NO: 281), GIVRGA (SEQ ID NO: 286), SIIRGA (SEQ ID NO: 270), TIGRGG (SEQ ID NO: 280), NPGRGA (SEQ ID NO: 285), SIGHGA (SEQ ID NO: 283), SPSRGA (SEQ ID NO: 277) and TIGRG (SEQ ID NO: 279). FIG. 3C shows the amino acid profile of arbitrium peptide types. FIGS. 3D-E are graphs demonstrating growth curves of B. subtilis BEST7003 infected by spBeta at MOI=0.1, in LB media supplemented with synthesized GMPRGA (SEQ ID NO: 271, FIG. 3D) and SAIRGA (SEQ ID NO: 269, FIG. 3E) peptides. FIG. 3F is a graph demonstrating the growth curves of B. subtilis 168 infected by phi3T at MOI=0.1, in LB media supplemented with synthesized GMPRGA (SEQ ID NO: 271) peptide. Data in FIGS. 3D-F represent average of 3 biological replicates, each with 3 technical replicates. Error bars represent standard error.

FIGS. 4A-I demonstrate DNA binding and transcription regulation in the arbitrium system. FIG. 4A is a graph demonstrating ChIP-seq of His-tagged AimR 15 minutes post-infection with or without 1 μM of SAIRGA (SEQ ID NO: 269) peptide. For each nucleotide in the phage genome, the ratio between the amount of sequenced pulled-down DNA during infection without the peptide and the amount of DNA pulled down when the peptide was present in the medium is shown. The ratio was normalized to the amount of sequenced reads in each library. FIG. 4B shows a zoomed-in region in the phage genome of the graph presented in FIG. 4A. FIG. 4C is a graph demonstrating gel-filtration results of purified AimR with or without the presence of either SAIRGA (SEQ ID NO: 269) or GMPRGA (SEQ ID NO: 271) peptide. Inset presents a calibration curve for the gel filtration using proteins of known sizes. FIG. 4D is a bar graph demonstrating expression of the AimX gene during infection. RNA-seq read counts were normalized to the number of reads hitting the phage genome in each RNA-seq library. Normalization was performed separately for the two time points. Data are presented for individual biological replicates (3 for each experiment with wild type bacteria and 2 for AimR knockdown strains). FIGS. 4E-G show RNA-seq coverage of the arbitrium locus at 5 min (FIG. 4E), 10 min (FIG. 4F) and 20 min (FIG. 4G) post infection. RNA-seq coverage was normalized to the number of reads hitting the phage genome in each RNA-seq library. FIG. 4H is a graph demonstrating growth curves of wild type and dCas9-silenced bacterial strains during phi3T infection. Strains were infected at t=0 at MOI=0.1. dCas9 was induced by 0.2% xylose. The aimX gene (purple line) was also cloned under a xylose promoter. Shown is the average of 3 biological replicates, each with 3 technical replicates. Error bars represent standard error. FIG. 4I demonstrates phage gene expression 20 minutes post infection. Each dot represents a single phage gene. Axes represent average RNA-seq read count per gene from 3 replicates, after normalization to control for RNA-seq library size. The X axis represents expression when phage infection was in the presence of 1 μM of SAIRGA (SEQ ID NO: 269) peptide; and the Y axis represents expression in the absence of peptide.
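By way of a non-limiting illustration, the per-nucleotide ChIP-seq ratio described for FIG. 4A can be sketched as follows. The function name, the argument names and the pseudocount are assumptions introduced here for illustration only; in particular, the pseudocount is added solely to avoid division by zero and is not described in the Examples.

```python
from typing import List, Sequence


def normalized_chip_ratio(
    cov_without_peptide: Sequence[float],
    cov_with_peptide: Sequence[float],
    library_reads_without: int,
    library_reads_with: int,
    pseudocount: float = 1.0,  # assumption: guards against zero coverage
) -> List[float]:
    """Per-position ratio of pulled-down DNA (no peptide / with peptide),
    each coverage value scaled by the read count of its own library."""
    if len(cov_without_peptide) != len(cov_with_peptide):
        raise ValueError("coverage tracks must have the same length")
    if library_reads_without <= 0 or library_reads_with <= 0:
        raise ValueError("library read counts must be positive")
    ratios = []
    for without, with_pep in zip(cov_without_peptide, cov_with_peptide):
        # (without / N_without) / (with / N_with), rearranged to one division
        numerator = (without + pseudocount) * library_reads_with
        denominator = (with_pep + pseudocount) * library_reads_without
        ratios.append(numerator / denominator)
    return ratios


# Usage with made-up coverage values over two genome positions.
print(normalized_chip_ratio([9, 1], [4, 1], 1000, 2000))
```

Positions at which the ratio is well above 1 would correspond to regions bound by AimR preferentially in the absence of the peptide.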

FIGS. 5A-C are schematic representations of the proposed mechanistic model for communication-based lysis-lysogeny decisions. FIG. 5A demonstrates the dynamics of arbitrium accumulation during infection of a bacterial culture by phage. FIG. 5B shows that at the first encounter of a phage with a bacterial population, there is no arbitrium in the medium. The early genes aimR and aimP are expressed immediately upon infection. AimR binds, as a dimer, the phage DNA upstream of aimX, and activates AimX expression. AimX is an inhibitor of lysogeny, directing the phage to a lytic cycle. At the same time AimP is expressed, secreted and processed extracellularly to produce the mature peptide. FIG. 5C shows that at later stages of the infection dynamics, the arbitrium peptide accumulates in the medium and is internalized into the bacteria by the OPP transporter. At this stage when the phage infects the bacterium, the expressed AimR receptor binds the arbitrium molecules and cannot activate the expression of AimX, leading to lysogeny preference.

FIG. 6 is a schematic representation of Construct #1: bacteriophage Phi3T AimR-AimP-AimX locus (red rectangle) was genetically fused to the fluorescent reporter gene (gfp) and inserted within the EcoRI/BamHI-cleavage sites of the pDR111 plasmid yielding pDR111-Construct #1.

FIG. 7 is a schematic representation of Construct #1 with an antibiotic-resistance gene inserted into the Bacillus subtilis BEST7003 chromosome within the amyE gene.

FIG. 8 is a schematic representation of Construct #2 with an antibiotic-resistance gene inserted into the Bacillus subtilis BEST7003 chromosome within the amyE gene.

FIGS. 9A-B are graphs demonstrating growth curves and GFP fluorescence levels of wild-type Bacillus subtilis BEST7003 (WT) and of Bacillus subtilis BEST7003 containing Construct #1 (FIG. 9A) or Construct #2 (FIG. 9B) following culture in LB growth medium supplemented with the indicated concentrations of SAIRGA (SEQ ID NO: 269) peptide. Shown is the mean value of technical replicates (n=3). Fluorescence is shown in arbitrary units (a.u.).

DESCRIPTION OF SPECIFIC EMBODIMENTS OF THE INVENTION

The present invention, in some embodiments thereof, relates to isolated polynucleotides and polypeptides and methods of using same for expressing an expression product of interest.

Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details set forth in the following description or exemplified by the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.

Recombinant DNA technologies, including inducible gene expression and genome editing technologies developed in recent years, allow control of gene expression and precise and targeted modifications to genome sequences in many types of organisms, including plants and animals.

Whilst conceiving embodiments of the invention and reducing to practice, the present inventors have devised a novel expression tool which is based on a newly uncovered communication system utilized by temperate viruses to coordinate lytic-lysogenic decisions.

As is illustrated hereinunder and in the Examples section, which follows, the present inventors show that during infection of its Bacillus host cell, the virus (spBeta family phage) produces a 6-amino-acid-long peptide that is released to the medium (Examples 1-2, FIGS. 1A-E and 2A-F). This peptide leads to lysogeny of the phage in its infected host in a concentration-dependent manner (Example 2, FIG. 2D). Subsequently, the inventors were able to identify a shared motif in the peptides encoded by different species of the phage family: a glycine residue at the 5th position, glycine or alanine at the 6th position, and optionally a positively charged residue at the 4th position (Example 3, FIGS. 2G and 3A-C). The inventors further demonstrate that this novel communication system, which the present inventors denoted as the “arbitrium” system, is encoded by 3 phage genes denoted herein as: AimP, producing the peptide; AimR, the intracellular peptide receptor; and AimX, a negative regulator of lysogeny (Examples 2-4, FIGS. 2A-G, 3A-C and 4A-I and Table 3). AimR, as a homodimer, is a transcriptional activator of AimX; when bound by the peptide, AimR becomes a monomer and the transcription of AimX is repressed, leading to lysogeny. Without being bound by theory it is suggested that the arbitrium system enables an offspring phage particle to communicate with its predecessors, i.e., to estimate the amount of recent prior infections and hence decide whether to employ the lytic or lysogenic cycle (Example 4, FIGS. 5A-C).
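By way of a non-limiting illustration, the shared motif described above (any residue at positions 1-4, glycine at position 5, glycine or alanine at position 6) can be checked with a short script. The helper names are hypothetical; the sketch assumes that the motif is read from the last six residues of the supplied sequence, that X is one of the 20 standard amino acids, and that lysine, arginine and histidine are treated as the positively charged residues.

```python
import re

# XXXXGG/A: four standard residues, then Gly, then Gly or Ala.
ARBITRIUM_MOTIF = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]{4}G[GA]")
POSITIVE_RESIDUES = set("KRH")  # assumption: His counted as positively charged


def matches_arbitrium_motif(peptide: str) -> bool:
    """True if the C-terminal six residues fit XXXXGG/A."""
    tail = peptide.upper()[-6:]
    return ARBITRIUM_MOTIF.fullmatch(tail) is not None


def has_positive_fourth_residue(peptide: str) -> bool:
    """True if position 4 of the six-residue tail is K, R or H (optional feature)."""
    tail = peptide.upper()[-6:]
    return len(tail) == 6 and tail[3] in POSITIVE_RESIDUES


# Peptides named in the drawings: three six-residue peptides and the two
# five-residue versions of SAIRGA tested in FIG. 2E.
for peptide in ["SAIRGA", "GMPRGA", "AMGNGG", "AIRGA", "SAIRG"]:
    print(peptide, matches_arbitrium_motif(peptide), has_positive_fourth_residue(peptide))
```

Under these assumptions the five-residue peptides do not fit the six-residue pattern, which is in line with the observation that they do not guide lysogeny.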

Taken together, the present teachings suggest that the arbitrium system and functional fragments thereof can be used for e.g. controlling expression of an expression product of interest in general and controlling the integration of DNA sequences into a target genome in particular. As is further described hereinbelow and in the Examples section that follows, AimR can bind an AimR responsive element thereby leading to activation of transcription of an expression product of interest operatively linked to the AimR responsive element (e.g. AimR binding site); while binding of AimP to AimR represses the transcription of the expression product of interest.

The present inventors further show that an AimR-AimP system can function to control the expression of a GFP reporter gene operatively linked to AimR binding site in Bacillus subtilis BEST7003 in a phage-independent context (Example 5, FIGS. 6-8 and 9A-B). Thus, according to a first aspect of the present invention, there is provided an isolated AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein said peptide is capable of binding an AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain; and dissociating said AimR polypeptide from said AimR responsive element.

According to another aspect of the present invention, there is provided an isolated AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

As used herein the term “isolated” refers to being at least partially separated from the natural environment or physiological environment, e.g., a microorganism or a phage.

As used herein, the term “AimR” refers to the polynucleotide and expression product e.g. polypeptide of the AimR gene. The product of the AimR gene contains a DNA binding domain for binding an AimR responsive element. The product of the AimR gene contains a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

As used herein, the phrase “DNA binding domain (DBD)” refers to a motif that can recognize and bind to a specific nucleic acid sequence (e.g. AimR responsive element). The DBD can recognize and bind to a double-stranded nucleic acid sequence (e.g. DNA) in a sequence specific manner. Typically, a DBD is a structural motif in a protein domain. According to specific embodiments the DBD comprises a helix-turn-helix (HTH) motif.

As used herein the term “helix-turn-helix (HTH) motif” refers to a well-known DNA binding motif which complements the shape of the DNA major groove, having the InterPro accession number IPR000047 (see e.g. Ann Rev. of Biochem. (1984) 53:293 and Brunelle and Schleif J. Mol. Biol. (1989) 209:607). The domain can be illustrated by the sequence XXXPhoAlaXXPhoGlyPhoXXXXPhoXXPhoXX (SEQ ID NO: 335), wherein X is any amino acid and Pho is a hydrophobic amino acid.

As used herein, the term “tetratricopeptide repeat (TPR) domain” refers to a degenerate 34 amino acid consensus sequence found in tandem arrays of 3-16 motifs that is believed to mediate protein-protein binding (Pfam family TPR_1, PF00515). The TPR domain forms two anti-parallel α-helices separated by a turn, to form a structure with an amphipathic groove [see e.g. Hirano et al. (1990), Cell, 60:319-328].

According to specific embodiments, the AimR polynucleotide has at least 80% identity to a nucleic acid selected from the group consisting of SEQ ID NOs: 1-113.

According to specific embodiments, the AimR polynucleotide has at least 85%, at least 90% or at least 95% identity to a nucleic acid selected from the group consisting of SEQ ID NOs: 1-113.

According to specific embodiments, the AimR polypeptide has at least 80% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 114-226.

According to specific embodiments, the AimR polypeptide has at least 85%, at least 90% or at least 95% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 114-226.

According to specific embodiments, the AimR polypeptide has at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90% or 100% identity to SEQ ID NO: 114.

Sequence identity can be determined using any protein alignment algorithm, or any nucleic acid sequence alignment algorithm based on the polynucleotide sequence encoding the polypeptide, such as BLAST, ClustalW, MUSCLE and HHpred.
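By way of a non-limiting illustration, percent identity between two sequences can be sketched with a minimal global (Needleman-Wunsch) alignment. This sketch is not the implementation used by any of the tools named above; the scoring values (match +1, mismatch -1, gap -1) and the choice to divide identical positions by the full alignment length are assumptions, and different tools may define identity differently.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        raise ValueError("both sequences are empty")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


# Usage with two arbitrium peptides named in the drawings.
print(percent_identity("SAIRGA", "GMPRGA"))
```

For full-length AimR sequences, an established alignment tool and a substitution matrix appropriate to proteins would normally be preferred over this simplified scoring.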

According to specific embodiments, the AimR amino acid sequence is selected from the group consisting of SEQ ID NOs: 114-226.

According to specific embodiments, the AimR nucleic acid sequence is selected from the group consisting of SEQ ID NOs: 1-113.

According to a specific embodiment, AimR comprises SEQ ID NO: 114.

The present inventors have uncovered that AimR binds the phage DNA as a dimer in the absence of the AimP peptide and functions as a transcriptional activator of e.g. AimX. Upon binding to the AimP peptide, the AimR changes its oligomeric state from the active dimer to the inactive monomer leading to a decrease in the transcription of e.g. AimX.

Hence, an AimR polypeptide according to some embodiments is capable of binding an AimP peptide comprising an amino acid sequence of XXXXGG/A (as further described hereinbelow) and, in the absence of AimP, of binding DNA (i.e. an AimR responsive element) and activating gene expression (i.e. functioning as a transcription factor).

Methods of determining binding of a transcription factor (e.g. AimR) to DNA (e.g. AimR responsive element) are well known in the art and include, but are not limited to, Chromatin Immunoprecipitation (ChIP) Assay, DNA Electrophoretic Mobility Shift Assay (EMSA), DNA Pull-down Assay and Microplate Capture and Detection Assay.

According to specific embodiments, the AimR binds DNA as a dimer. Methods of evaluating dimerization are well known in the art and are further described in the Examples section which follows and include migration on a gel filtration column.

According to specific embodiments, the term “AimR” refers to a full length AimR. According to other specific embodiments, the term “AimR” refers to a fragment of AimR which maintains the activity as described herein.

The term “AimR” also refers to functional AimR homologues which exhibit the desired activity as described herein. Such homologues can be, for example, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical or homologous to the polypeptide SEQ ID NOs: 114-226, or at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to the polynucleotide encoding same (as further described hereinabove and below). The homolog may also refer to an ortholog, a deletion, insertion, or substitution variant, including an amino acid substitution.

Sequence identity or homology can be determined using any protein or nucleic acid sequence alignment algorithm such as Blast, ClustalW, MUSCLE, and HHpred.

As used herein, the term “operatively linked” refers to a nucleic acid sequence having a functional relationship with another nucleic acid sequence. According to specific embodiments, a nucleic acid sequence is “operatively linked” to a regulatory sequence when the regulatory sequence (e.g. AimR binding site) controls and regulates the transcription and/or translation of that nucleic acid sequence. According to one embodiment, the regulatory element acts in cis on the nucleic acid sequence it regulates. According to another embodiment, the regulatory element acts in trans on the nucleic acid sequence it regulates. According to specific embodiments, the regulatory sequence is placed upstream to the nucleic acid sequence it regulates. According to specific embodiments, the term “operatively linked” includes having an appropriate start signal (e.g., ATG) upstream of the nucleic acid sequence to be expressed and maintaining the correct reading frame to allow expression of the nucleic acid sequence under the control of the regulatory sequence and expression of the desired product encoded by the nucleic acid sequence.

According to specific embodiments, the term “AimR responsive element” refers to a full length AimR responsive element. According to other specific embodiments, the term “AimR responsive element” refers to a fragment of AimR responsive element which maintains the activity as described herein. According to specific embodiments, the AimR responsive element is 4-100, 4-50, 4-30 or 4-10 nucleic acids long.

According to specific embodiments, the AimR responsive element comprises a nucleic acid sequence for binding AimR (i.e. AimR binding site).

According to specific embodiments, the AimR responsive element comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 287-334.

According to specific embodiments, the AimR responsive element is comprised in a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 287-334.

According to specific embodiments, the AimR responsive element comprises SEQ ID NO: 378.

According to specific embodiments, the AimR responsive element is comprised in SEQ ID NO: 378.

According to specific embodiments, the AimR responsive element comprises a fragment of SEQ ID NOs: 287-334 or 378 which maintains SEQ ID NOs: 287-334 or 378 activity as a nucleic acid sequence for binding AimR (i.e. AimR binding site), as determined by e.g. Chromatin Immunoprecipitation (ChIP) Assay, DNA Electrophoretic Mobility Shift Assay (EMSA), Microplate Capture and Detection Assay, or DNA Pull-down Assay.

According to specific embodiments, the AimR responsive element comprises a nucleic acid sequence for binding AimR (i.e. AimR binding site) and AimX. Hence, specific embodiments of the present invention disclose that binding of AimR to its binding site controls expression of AimX, which in turn controls expression of an expression product of interest.

The term “AimR responsive element” also refers to functional AimR responsive element homologues which exhibit the desired activity (i.e., binding an AimR). Such homologues can be, for example, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical or homologous to the polynucleotide SEQ ID NOs: 287-334 and 378. The homolog may also refer to an ortholog, a deletion, insertion, or substitution variant, including an amino acid substitution.

AimRs and their cognate sequences comprising AimR responsive elements which can be used in accordance with some embodiments are listed in Table 3 hereinbelow.

As used herein, the term “AimP” refers to a peptide having an amino acid sequence of XXXXGG/A, wherein X is any amino acid, or to a nucleic acid encoding same. According to specific embodiments, the AimP has an amino acid sequence of XXXX₁GG/A, wherein X is any amino acid and wherein X₁ is a positively charged amino acid. According to a specific embodiment, the AimP is the product of the AimP gene.

The peptide can be long e.g., more than 50 amino acids (e.g., 51-80 amino acids, 51-100 amino acids, 100-200 amino acids) or short e.g., 6-50 amino acids long. According to specific embodiments, the peptide is 6 amino acids long.

According to specific embodiments, AimP comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 269-283 and 285-286.

According to specific embodiments, the AimP peptide comprises a SAIRGA (SEQ ID NO: 269) or GMPRGA (SEQ ID NO: 271) amino acid sequence.

According to specific embodiments, the AimP peptide comprises a SAIRGA (SEQ ID NO: 269) amino acid sequence.
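By way of a non-limiting sketch, the XXXXGG/A motif, and the variant in which the fourth residue (X₁) is positively charged, can be checked against a six amino acid peptide as follows. The set of residues treated as positively charged (K, R and H) is an assumption, as is the restriction of X to the 20 naturally occurring amino acids; the function name is hypothetical.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: X is limited to the 20 natural residues
POSITIVE = "KRH"                       # assumption: residues treated as positively charged

# XXXXGG/A: four arbitrary residues, then Gly, then Gly or Ala.
AIMP_GENERAL = re.compile(rf"[{AMINO_ACIDS}]{{4}}G[GA]")
# XXXX1GG/A: as above, but the fourth residue (X1) must be positively charged.
AIMP_POSITIVE_X1 = re.compile(rf"[{AMINO_ACIDS}]{{3}}[{POSITIVE}]G[GA]")


def matches_aimp_motif(hexapeptide, require_positive_x1=False):
    """True if a six-residue peptide fits the XXXXGG/A (or XXXX1GG/A) motif."""
    pattern = AIMP_POSITIVE_X1 if require_positive_x1 else AIMP_GENERAL
    return pattern.fullmatch(hexapeptide.upper()) is not None
```

Both SAIRGA and GMPRGA satisfy the general motif as well as the stricter variant, since each carries arginine at the fourth position.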

A functional AimP peptide can bind an AimR polypeptide and dissociate the AimR polypeptide from DNA and specifically from an AimR responsive element.

As used herein, the terms “dissociating” and “dissociate” refer to at least 30% reduction in complexes comprising the AimR and AimR responsive element, as evidenced by an assay known in the art e.g., Chromatin Immunoprecipitation (ChIP) Assay, DNA Electrophoretic Mobility Shift Assay (EMSA), DNA Pull-down Assay and Microplate Capture and Detection Assay.
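For illustration, the 30% criterion can be applied to any quantitative readout of AimR-DNA complexes, for example band intensity in an EMSA. The following Python sketch assumes that the two measurements are on the same scale and were obtained under otherwise identical conditions; the function and parameter names are hypothetical.

```python
def percent_reduction(baseline, treated):
    """Percent reduction of a signal relative to its baseline value."""
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return 100.0 * (baseline - treated) / baseline


def is_dissociated(bound_without_aimp, bound_with_aimp, min_reduction=30.0):
    """True if the AimR-DNA complex signal drops by at least min_reduction percent."""
    return percent_reduction(bound_without_aimp, bound_with_aimp) >= min_reduction
```

For instance, a complex signal of 100 arbitrary units without AimP and 65 units with AimP corresponds to a 35% reduction and would qualify as dissociation under this definition, whereas a drop to 80 units (20%) would not.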

According to specific embodiments, a functional AimP can lead to lysogeny of a temperate phage expressing AimR in a host bacteria. Methods of analyzing phage lysogeny are described hereinbelow.

The term “AimP” also refers to functional AimP homologues which exhibit the desired activity as described herein. Such homologues can be, for example, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical or homologous to the peptide SEQ ID NOs: 269-283 and 285-286. The homolog may also refer to an ortholog, a deletion, insertion, or substitution variant, including an amino acid substitution.

AimPs and their cognate AimRs which can be used in accordance with some embodiments are listed in Table 3 hereinbelow.

Binding assays for qualifying peptide binding to AimR are well known in the art and include e.g. western blot, BiaCore, high-performance liquid chromatography (HPLC) or flow cytometry.

Binding assays for qualifying the ability of the peptide to dissociate AimR from DNA are well known in the art and include, but are not limited to, Chromatin Immunoprecipitation (ChIP) Assay, DNA Electrophoretic Mobility Shift Assay (EMSA), DNA Pull-down Assay and Microplate Capture and Detection Assay.

According to specific embodiments binding of the AimP peptide to AimR polypeptide inhibits dimerization of the AimR polypeptide.

As used herein, the terms “inhibit” and “inhibiting” refer to a decrease in activity (e.g. dimerization, binding, lysogeny). According to specific embodiments the decrease is at least 1.5 fold, at least 2 fold, at least 3 fold, at least 5 fold, at least 10 fold, or at least 20 fold as compared to same in the absence of the AimP peptide or AimX.

According to other specific embodiments the decrease is by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99% or 100% as compared to same in the absence of the AimP peptide or AimX.
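The fold and percent thresholds above describe the same change on two scales. As a hedged illustration, the following Python sketch converts between them, assuming that a fold decrease is defined as the control value divided by the treated value; this definition and the function names are assumptions.

```python
def fold_decrease(control, treated):
    """Fold decrease, taken here as control / treated (both must be positive)."""
    if control <= 0 or treated <= 0:
        raise ValueError("both values must be positive")
    return control / treated


def fold_to_percent_decrease(fold):
    """Percent decrease equivalent to a given fold decrease (fold >= 1)."""
    if fold < 1:
        raise ValueError("a fold decrease must be at least 1")
    return 100.0 * (1.0 - 1.0 / fold)
```

On this reading, a 1.5 fold decrease corresponds to a decrease of about 33%, a 2 fold decrease to 50% and a 20 fold decrease to 95%. A 100% decrease has no finite fold equivalent, which is why a zero treated value is rejected.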

A temperate phage is one capable of entering the lysogenic pathway, in which the phage becomes a dormant, passive part of the cell's genome prior to completion of its lytic cycle.

According to specific embodiments the phage is capable of infecting a Bacillus bacteria.

According to specific embodiments the phage is a prophage.

Table 3 below indicates phages and prophages that can be used according to specific embodiments of the invention. According to a specific embodiment, the phage is a spBeta phage. According to a specific embodiment, the phage is Phi3T.

For the same culture conditions the extent of lysogeny is generally expressed in comparison to the lysogeny in bacteria of the same species but not contacted with the indicated agent (e.g. AimP, AimX) or contacted with a vehicle control, also referred to as control.

Methods of analyzing phage lysogeny are well known in the art and include, but are not limited to, DNA sequencing and PCR analysis. As a temperate phage can employ the lysogenic pathway or the lytic pathway, the lysogenic activity of a phage can be assessed indirectly by determining reduction in the lytic activity of a phage by methods well known in the art including, but not limited to, optical density, plaque assay, and living dye indicators.

As used herein, the phrases “leading to lysogeny” and “lead to lysogeny” refer to an increase in lysogeny.

According to specific embodiments the increase is at least 1.5 fold, at least 2 fold, at least 3 fold, at least 5 fold, at least 10 fold, or at least 20 fold as compared to same in the absence of the AimP peptide or AimX downregulating agent.

According to other specific embodiments the increase is by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99% or 100% as compared to same in the absence of the AimP peptide or AimX downregulating agent.

According to specific embodiments, the bacteria can be infected with a temperate phage selected from the phages depicted in Table 3 below. According to specific embodiments, the bacteria can be infected with a spBeta phage. According to specific embodiments the bacteria is a Bacillus bacteria. According to specific embodiments the bacteria is selected from the group consisting of the bacteria listed in Table 3 below.

The present invention also contemplates isolated polynucleotides encoding the components of the arbitrium system and functional fragments thereof as described herein.

Thus, according to an aspect of the present invention, there is provided an isolated polynucleotide encoding an AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein said peptide is capable of binding an AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain; and dissociating said AimR polypeptide from an AimR responsive element.

According to another aspect of the present invention, there is provided an isolated polynucleotide encoding an AimR polypeptide comprising a DNA binding domain for binding an AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

According to specific embodiments, the polynucleotide encoding the AimR polypeptide comprises a sequence having at least 80%, at least 85%, at least 90%, at least 95% or 100% identity to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-113.

According to specific embodiments, the polynucleotide encoding the AimR polypeptide comprises a sequence selected from the group consisting of SEQ ID NOs: 1-113.

According to another aspect of the present invention, there is provided an isolated polynucleotide comprising an AimR responsive element, wherein said AimR comprises a DNA binding domain for binding said AimR responsive element, said AimR comprises a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain.

According to specific embodiments, the AimR responsive element comprises a nucleic acid sequence for binding AimR and an AimX polynucleotide, wherein said AimX polynucleotide or an AimX polypeptide encoded by said AimX polynucleotide is capable of inhibiting lysogeny of a temperate phage expressing said AimR in a host bacteria.

According to another aspect of the present invention, there is provided an isolated AimX polynucleotide.

According to another aspect of the present invention, there is provided an isolated polypeptide encoded by the AimX polynucleotide.

As used herein, the term “AimX” refers to the polynucleotide and expression product e.g. polypeptide of the AimX gene. According to specific embodiments, the “AimX” refers to the polynucleotide of the AimX gene. According to specific embodiments, an AimR binding site is operatively linked to the AimX polynucleotide. A functional AimX is capable of inhibiting lysogeny of a temperate phage expressing the respective AimR in a host bacteria. According to specific embodiments, AimX and AimR binding site comprise an AimR responsive element.

According to specific embodiments, AimX comprises a nucleic acid sequence as set forth in SEQ ID NO: 336.

According to specific embodiments, the term “AimX” refers to a full length AimX. According to other specific embodiments, the term “AimX” refers to a fragment of AimX which maintains the activity as described herein.

The term “AimX” also refers to functional AimX homologues which exhibit the desired activity as described herein. Such homologues can be, for example, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical or homologous to the polynucleotide SEQ ID NO: 336. The homolog may also refer to an ortholog, a deletion, insertion, or substitution variant, including an amino acid substitution.

AimRs and their cognate AimXs which can be used in accordance with some embodiments are listed in Table 3 hereinbelow.

According to another aspect of the present invention, there is provided an isolated polynucleotide comprising a polynucleotide encoding an AimP peptide, a polynucleotide encoding an AimR polypeptide, a polynucleotide comprising an AimR responsive element and/or an AimX polynucleotide and any combination of same.

According to another aspect of the present invention, there is provided an isolated polynucleotide encoding an arbitrium system comprising a nucleic acid sequence encoding an AimP peptide, a nucleic acid sequence encoding an AimR polypeptide and a nucleic acid sequence for binding AimR operatively linked to an AimX nucleic acid sequence, wherein said arbitrium system is capable of regulating lysogeny of a phage expressing said arbitrium system in a host bacteria.

As used herein, “arbitrium system” or a “functional arbitrium” refers to a multi-gene system which comprises AimP, AimR, and AimX as described herein and the activity of which controls phage lysogeny in its host genome.

The combinations of the arbitrium system components which can be used in accordance with some embodiments are listed in Table 3 hereinbelow.

As used herein, the term “components of the arbitrium system” refers to AimP, AimR, AimR responsive element, AimX and any combination of same or functional fragments thereof such as described hereinabove.

According to specific embodiments, the AimR is encoded by a gene positioned within 1-10 genes of a gene encoding AimP and/or AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) in a genome of a temperate phage.

According to specific embodiments, the AimP is encoded by a gene positioned within 1-10 genes of a gene encoding an AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) in a genome of a temperate phage. According to specific embodiments, the AimR and the AimP and/or the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same. According to specific embodiments, the AimR and the AimP and/or the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimR and the AimP and/or the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned contiguously 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to specific embodiments, the AimR and the AimP and/or the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned contiguously 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimP and the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to specific embodiments, the AimP and the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimP and the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned contiguously 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to specific embodiments, the AimP and the AimR responsive element (e.g. SEQ ID NOs: 287-334, 378) are positioned contiguously 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimR is encoded by a gene positioned within 1-10 genes of a gene encoding AimX in a genome of a temperate phage.

According to specific embodiments, the AimP is encoded by a gene positioned within 1-10 genes of a gene encoding AimX in a genome of a temperate phage.
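As a non-limiting illustration of the "within 1-10 genes" criterion, the following Python sketch operates on an ordered list of gene identifiers taken from an annotated phage genome. It assumes that distance is counted as the difference in position within that ordered list, so that adjacent genes are at a distance of 1; this reading, the gene identifiers and the function names are all hypothetical.

```python
def gene_distance(gene_order, gene_a, gene_b):
    """Difference in position between two genes in an ordered gene list."""
    return abs(gene_order.index(gene_a) - gene_order.index(gene_b))


def within_n_genes(gene_order, gene_a, gene_b, max_distance=10):
    """True if the two genes are 1 to max_distance positions apart."""
    return 1 <= gene_distance(gene_order, gene_a, gene_b) <= max_distance


# Hypothetical gene order, listed 5' to 3'.
example_order = ["orfA", "aimR", "aimP", "aimX", "orfB"]
```

With the hypothetical order above, `within_n_genes(example_order, "aimR", "aimX")` evaluates to True, since the two genes are two positions apart.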

According to specific embodiments, the AimR and the AimP and/or the AimX are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same. According to specific embodiments, the AimR and the AimP and/or the AimX are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimR and the AimP and/or the AimX are positioned contiguously 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same.

According to specific embodiments, the AimR and the AimP and/or the AimX are positioned contiguously 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimP and the AimX are positioned sequentially 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same. According to specific embodiments, the AimP and the AimX are positioned sequentially 5′ to 3′ in a genome of a temperate phage.

According to specific embodiments, the AimP and the AimX are positioned contiguously 5′ to 3′ on a nucleic acid molecule of a temperate phage expressing same. According to specific embodiments, the AimP and the AimX are positioned contiguously 5′ to 3′ in a genome of a temperate phage.

Thus, according to specific embodiments, the polynucleotide of the present invention comprises:

(1) a polynucleotide encoding an AimP peptide;

(2) a polynucleotide comprising a nucleic acid sequence encoding an AimR polypeptide;

(3) a polynucleotide comprising an AimR responsive element nucleic acid sequence;

(4) an AimX polynucleotide;

(5) a polynucleotide comprising (2) and (3);

(6) a polynucleotide comprising (2) and (4);

(7) a polynucleotide comprising (1), (2) and (3);

(8) a polynucleotide comprising (1), (2) and (4);

(9) a polynucleotide comprising (1) and (2);

(10) a polynucleotide comprising (1) and (3); or

(11) a polynucleotide comprising (1) and (4).

According to specific embodiments, the AimR polynucleotide is upstream to the AimP polynucleotide.

According to specific embodiments, the AimR polynucleotide is upstream to the AimR responsive element polynucleotide.

According to specific embodiments, the AimR polynucleotide is upstream to the AimX polynucleotide.

According to specific embodiments, the AimR responsive element is upstream to the AimX polynucleotide.

According to specific embodiments, the AimP polynucleotide is upstream to the AimR responsive element polynucleotide.

According to specific embodiments, the AimP polynucleotide is upstream to the AimX polynucleotide.

As used herein, the terms “peptide” and “polypeptide”, which are interchangeably used, encompass native peptides (either degradation products, synthetically synthesized peptides or recombinant peptides) and peptidomimetics (typically, synthetically synthesized peptides), as well as peptoids and semipeptoids which are peptide analogs, which may have, for example, modifications rendering the peptides more stable while in a body or more capable of penetrating into cells. Such modifications include, but are not limited to N terminus modification, C terminus modification, peptide bond modification, backbone modifications, and residue modification. Methods for preparing peptidomimetic compounds are well known in the art and are specified, for example, in Quantitative Drug Design, C. A. Ramsden Gd., Chapter 17.2, F. Choplin Pergamon Press (1992), which is incorporated by reference as if fully set forth herein. Further details in this respect are provided hereinunder.

Peptide bonds (—CO—NH—) within the peptide may be substituted, for example, by N-methylated amide bonds (—N(CH3)-CO—), ester bonds (—C(═O)—O—), ketomethylene bonds (—CO—CH2-), sulfinylmethylene bonds (—S(═O)—CH2-), α-aza bonds (—NH—N(R)—CO—), wherein R is any alkyl (e.g., methyl), amine bonds (—CH2-NH—), sulfide bonds (—CH2-S—), ethylene bonds (—CH2-CH2-), hydroxyethylene bonds (—CH(OH)—CH2-), thioamide bonds (—CS—NH—), olefinic double bonds (—CH═CH—), fluorinated olefinic double bonds (—CF═CH—), retro amide bonds (—NH—CO—), peptide derivatives (—N(R)—CH2-CO—), wherein R is the “normal” side chain, naturally present on the carbon atom.

These modifications can occur at any of the bonds along the peptide chain and even at several (2-3) bonds at the same time.

Natural aromatic amino acids, Trp, Tyr and Phe, may be substituted by non-natural aromatic amino acids such as 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic), naphthylalanine, ring-methylated derivatives of Phe, halogenated derivatives of Phe or O-methyl-Tyr.

The peptides of some embodiments of the invention may also include one or more modified amino acids or one or more non-amino acid monomers (e.g. fatty acids, complex carbohydrates etc).

The term “amino acid” or “amino acids” is understood to include the 20 naturally occurring amino acids; those amino acids often modified post-translationally in vivo, including, for example, hydroxyproline, phosphoserine and phosphothreonine; and other unusual amino acids including, but not limited to, 2-aminoadipic acid, hydroxylysine, isodesmosine, nor-valine, nor-leucine and ornithine. Furthermore, the term “amino acid” includes both D- and L-amino acids.

Tables 1 and 2 below list naturally occurring amino acids (Table 1) and non-conventional or modified amino acids (e.g., synthetic, Table 2), which can be used with some embodiments of the invention.

TABLE 1

Amino Acid | Three-Letter Abbreviation | One-letter Symbol
Alanine | Ala | A
Arginine | Arg | R
Asparagine | Asn | N
Aspartic acid | Asp | D
Cysteine | Cys | C
Glutamine | Gln | Q
Glutamic Acid | Glu | E
Glycine | Gly | G
Histidine | His | H
Isoleucine | Ile | I
Leucine | Leu | L
Lysine | Lys | K
Methionine | Met | M
Phenylalanine | Phe | F
Proline | Pro | P
Serine | Ser | S
Threonine | Thr | T
Tryptophan | Trp | W
Tyrosine | Tyr | Y
Valine | Val | V
Any amino acid as above | Xaa | X
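For convenience, the correspondence listed in Table 1 can be applied programmatically. The following Python sketch, whose names are hypothetical, converts a sequence written in one-letter symbols into three-letter abbreviations.

```python
# One-letter symbol to three-letter abbreviation, as listed in Table 1.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa",  # any amino acid
}


def to_three_letter(sequence, separator="-"):
    """Convert a one-letter amino acid sequence to three-letter abbreviations."""
    return separator.join(ONE_TO_THREE[residue] for residue in sequence.upper())
```

Applied to the SAIRGA peptide, `to_three_letter("SAIRGA")` yields Ser-Ala-Ile-Arg-Gly-Ala.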

TABLE 2

Non-conventional amino acid | Code | Non-conventional amino acid | Code
ornithine | Orn | hydroxyproline | Hyp
α-aminobutyric acid | Abu | aminonorbornyl-carboxylate | Norb
D-alanine | Dala | aminocyclopropane-carboxylate | Cpro
D-arginine | Darg | N-(3-guanidinopropyl)glycine | Narg
D-asparagine | Dasn | N-(carbamylmethyl)glycine | Nasn
D-aspartic acid | Dasp | N-(carboxymethyl)glycine | Nasp
D-cysteine | Dcys | N-(thiomethyl)glycine | Ncys
D-glutamine | Dgln | N-(2-carbamylethyl)glycine | Ngln
D-glutamic acid | Dglu | N-(2-carboxyethyl)glycine | Nglu
D-histidine | Dhis | N-(imidazolylethyl)glycine | Nhis
D-isoleucine | Dile | N-(1-methylpropyl)glycine | Nile
D-leucine | Dleu | N-(2-methylpropyl)glycine | Nleu
D-lysine | Dlys | N-(4-aminobutyl)glycine | Nlys
D-methionine | Dmet | N-(2-methylthioethyl)glycine | Nmet
D-ornithine | Dorn | N-(3-aminopropyl)glycine | Norn
D-phenylalanine | Dphe | N-benzylglycine | Nphe
D-proline | Dpro | N-(hydroxymethyl)glycine | Nser
D-serine | Dser | N-(1-hydroxyethyl)glycine | Nthr
D-threonine | Dthr | N-(3-indolylethyl)glycine | Nhtrp
D-tryptophan | Dtrp | N-(p-hydroxyphenyl)glycine | Ntyr
D-tyrosine | Dtyr | N-(1-methylethyl)glycine | Nval
D-valine | Dval | N-methylglycine | Nmgly
D-N-methylalanine | Dnmala | L-N-methylalanine | Nmala
D-N-methylarginine | Dnmarg | L-N-methylarginine | Nmarg
D-N-methylasparagine | Dnmasn | L-N-methylasparagine | Nmasn
D-N-methylaspartate | Dnmasp | L-N-methylaspartic acid | Nmasp
D-N-methylcysteine | Dnmcys | L-N-methylcysteine | Nmcys
D-N-methylglutamine | Dnmgln | L-N-methylglutamine | Nmgln
D-N-methylglutamate | Dnmglu | L-N-methylglutamic acid | Nmglu
D-N-methylhistidine | Dnmhis | L-N-methylhistidine | Nmhis
D-N-methylisoleucine | Dnmile | L-N-methylisoleucine | Nmile
D-N-methylleucine | Dnmleu | L-N-methylleucine | Nmleu
D-N-methyllysine | Dnmlys | L-N-methyllysine | Nmlys
D-N-methylmethionine | Dnmmet | L-N-methylmethionine | Nmmet
D-N-methylornithine | Dnmorn | L-N-methylornithine | Nmorn
D-N-methylphenylalanine | Dnmphe | L-N-methylphenylalanine | Nmphe
D-N-methylproline | Dnmpro | L-N-methylproline | Nmpro
D-N-methylserine | Dnmser | L-N-methylserine | Nmser
D-N-methylthreonine | Dnmthr | L-N-methylthreonine | Nmthr
D-N-methyltryptophan | Dnmtrp | L-N-methyltryptophan | Nmtrp
D-N-methyltyrosine | Dnmtyr | L-N-methyltyrosine | Nmtyr
D-N-methylvaline | Dnmval | L-N-methylvaline | Nmval
L-norleucine | Nle | L-N-methylnorleucine | Nmnle
L-norvaline | Nva | L-N-methylnorvaline | Nmnva
L-ethylglycine | Etg | L-N-methyl-ethylglycine | Nmetg
L-t-butylglycine | Tbug | L-N-methyl-t-butylglycine | Nmtbug
L-homophenylalanine | Hphe | L-N-methyl-homophenylalanine | Nmhphe
α-naphthylalanine | Anap | N-methyl-α-naphthylalanine | Nmanap
penicillamine | Pen | N-methylpenicillamine | Nmpen
γ-aminobutyric acid | Gabu | N-methyl-γ-aminobutyrate | Nmgabu
cyclohexylalanine | Chexa | N-methyl-cyclohexylalanine | Nmchexa
cyclopentylalanine | Cpen | N-methyl-cyclopentylalanine | Nmcpen
α-amino-α-methylbutyrate | Aabu | N-methyl-α-amino-α-methylbutyrate | Nmaabu
α-aminoisobutyric acid | Aib | N-methyl-α-aminoisobutyrate | Nmaib
D-α-methylarginine | Dmarg | L-α-methylarginine | Marg
D-α-methylasparagine | Dmasn | L-α-methylasparagine | Masn
D-α-methylaspartate | Dmasp | L-α-methylaspartate | Masp
D-α-methylcysteine | Dmcys | L-α-methylcysteine | Mcys
D-α-methylglutamine | Dmgln | L-α-methylglutamine | Mgln
D-α-methyl glutamic acid | Dmglu | L-α-methylglutamate | Mglu
D-α-methylhistidine | Dmhis | L-α-methylhistidine | Mhis
D-α-methylisoleucine | Dmile | L-α-methylisoleucine | Mile
D-α-methylleucine | Dmleu | L-α-methylleucine | Mleu
D-α-methyllysine | Dmlys | L-α-methyllysine | Mlys
D-α-methylmethionine | Dmmet | L-α-methylmethionine | Mmet
D-α-methylornithine | Dmorn | L-α-methylornithine | Morn
D-α-methylphenylalanine | Dmphe | L-α-methylphenylalanine | Mphe
D-α-methylproline | Dmpro | L-α-methylproline | Mpro
D-α-methylserine | Dmser | L-α-methylserine | Mser
D-α-methylthreonine | Dmthr | L-α-methylthreonine | Mthr
D-α-methyltryptophan | Dmtrp | L-α-methyltryptophan | Mtrp
D-α-methyltyrosine | Dmtyr | L-α-methyltyrosine | Mtyr
D-α-methylvaline | Dmval | L-α-methylvaline | Mval
N-cyclobutylglycine | Ncbut | L-α-methylnorvaline | Mnva
N-cycloheptylglycine | Nchep | L-α-methylethylglycine | Metg
N-cyclohexylglycine | Nchex | L-α-methyl-t-butylglycine | Mtbug
N-cyclodecylglycine | Ncdec | L-α-methyl-homophenylalanine | Mhphe
N-cyclododecylglycine | Ncdod | α-methyl-α-naphthylalanine | Manap
N-cyclooctylglycine | Ncoct | α-methylpenicillamine | Mpen
N-cyclopropylglycine | Ncpro | α-methyl-γ-aminobutyrate | Mgabu
N-cycloundecylglycine | Ncund | α-methyl-cyclohexylalanine | Mchexa
N-(2-aminoethyl)glycine | Naeg | α-methyl-cyclopentylalanine | Mcpen
N-(2,2-diphenylethyl)glycine | Nbhm | N-(N-(2,2-diphenylethyl)carbamylmethyl-glycine | Nnbhm
N-(3,3-diphenylpropyl)glycine | Nbhe | N-(N-(3,3-diphenylpropyl)carbamylmethyl-glycine | Nnbhe
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane | Nmbc | 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid | Tic
phosphoserine | pSer | phosphothreonine | pThr
phosphotyrosine | pTyr | O-methyl-tyrosine |
2-aminoadipic acid | | hydroxylysine |

The peptides of some embodiments of the invention are preferably utilized in a linear form, although it will be appreciated that in cases where cyclization does not severely interfere with peptide characteristics, cyclic forms of the peptide can also be utilized.

The peptides of some embodiments of the invention preferably include one or more non-natural or natural polar amino acids, including but not limited to serine and threonine which are capable of increasing peptide solubility due to their hydroxyl-containing side chain.

The amino acids of the peptides of the present invention may be substituted either conservatively or non-conservatively.

The term “conservative substitution” as used herein, refers to the replacement of an amino acid present in the native sequence in the peptide with a naturally or non-naturally occurring amino or a peptidomimetics having similar steric properties. Where the side-chain of the native amino acid to be replaced is either polar or hydrophobic, the conservative substitution should be with a naturally occurring amino acid, a non-naturally occurring amino acid or with a peptidomimetic moiety which is also polar or hydrophobic (in addition to having the same steric properties as the side-chain of the replaced amino acid).

As naturally occurring amino acids are typically grouped according to their properties, conservative substitutions by naturally occurring amino acids can be easily determined, bearing in mind that, in accordance with the invention, replacement of charged amino acids by sterically similar non-charged amino acids is considered a conservative substitution.

For producing conservative substitutions by non-naturally occurring amino acids it is also possible to use amino acid analogs (synthetic amino acids) well known in the art. A peptidomimetic of the naturally occurring amino acid is well documented in the literature known to the skilled practitioner.

When effecting conservative substitutions, the substituting amino acid should have the same or a similar functional group in the side chain as the original amino acid.
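The grouping logic described above can be illustrated with a minimal Python sketch restricted to naturally occurring amino acids. The groups in `SIMILAR_GROUPS`, the pairs in `CHARGED_TO_UNCHARGED` and the function name are simplifying assumptions chosen for illustration; they are not definitions taken from this disclosure, and a real assessment of steric similarity would be more nuanced.

```python
# Groups of residues (one-letter codes) treated here as sterically and
# chemically similar. These groupings are an illustrative assumption.
SIMILAR_GROUPS = [
    set("ST"),    # hydroxyl-containing
    set("NQ"),    # amides
    set("DE"),    # acidic
    set("KR"),    # basic
    set("ILVM"),  # aliphatic hydrophobic
    set("FYW"),   # aromatic
    set("AG"),    # small
]

# Charged residues paired with sterically similar non-charged residues;
# per the text above, such replacements also count as conservative.
CHARGED_TO_UNCHARGED = {frozenset("DN"), frozenset("EQ")}


def is_conservative(original: str, substitute: str) -> bool:
    """Return True if the substitution is conservative under the assumed groups."""
    if original == substitute:
        return True
    pair = frozenset((original, substitute))
    if pair in CHARGED_TO_UNCHARGED:
        return True
    # Conservative when both residues belong to one similarity group.
    return any(pair <= group for group in SIMILAR_GROUPS)
```

Under these assumed groups, `is_conservative("D", "N")` and `is_conservative("L", "I")` hold, whereas a glycine-to-tryptophan replacement is reported as non-conservative.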

The phrase “non-conservative substitutions” as used herein refers to replacement of the amino acid as present in the parent sequence by another naturally or non-naturally occurring amino acid, having different electrochemical and/or steric properties. Thus, the side chain of the substituting amino acid can be significantly larger (or smaller) than the side chain of the native amino acid being substituted and/or can have functional groups with significantly different electronic properties than the amino acid being substituted. Examples of non-conservative substitutions of this type include the substitution of phenylalanine or cyclohexylmethyl glycine for alanine, isoleucine for glycine, or —NH—CH [(—CH₂)₅—COOH]—CO— for aspartic acid. Those non-conservative substitutions which fall under the scope of the present invention are those which still constitute a peptide having anti-bacterial properties.

The N and C termini of the peptides of the present invention may be protected by functional groups. Suitable functional groups are described in Greene and Wuts, “Protective Groups in Organic Synthesis”, John Wiley and Sons, Chapters 5 and 7, 1991, the teachings of which are incorporated herein by reference. Preferred protecting groups are those that facilitate transport of the compound attached thereto into a cell, for example, by reducing the hydrophilicity and increasing the lipophilicity of the compounds.

The peptides of the present invention may be attached (either covalently or non-covalently) to a penetrating agent.

As used herein the phrase “penetrating agent” refers to a heterologous agent which enhances translocation of any of the attached peptide across a cell membrane.

According to one embodiment, the penetrating agent is a peptide and is attached to the peptide (either directly or indirectly) via a peptide bond.

Typically, peptide penetrating agents, also referred to as cell-penetrating peptides (CPPs), have an amino acid composition containing a high relative abundance of positively charged amino acids such as lysine or arginine, or have sequences that contain an alternating pattern of polar/charged amino acids and non-polar, hydrophobic amino acids. Non-limiting examples of CPPs that can penetrate cells in a non-toxic and efficient manner and may be suitable for use in accordance with some embodiments of the invention include TAT (transcription activator from HIV-1), pAntp (also named penetratin, Drosophila antennapedia homeodomain transcription factor) and VP22 (from Herpes Simplex virus). Protocols for producing CPP-cargo conjugates and for infecting cells with such conjugates can be found, for example, in Theodore L. et al. [The Journal of Neuroscience, (1995) 15(11): 7158-7167], Fawell S. et al. [Proc Natl Acad Sci USA, (1994) 91:664-668], and Jing Bian et al. [Circulation Research. (2007) 100: 1626-1633].
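The compositional criterion mentioned above (a high relative abundance of lysine and arginine) can be quantified with a short Python sketch. The function name is hypothetical, and the two example sequences are the commonly cited TAT and penetratin peptides; they are supplied here as assumptions for illustration and are not sequences recited in this disclosure.

```python
def cationic_fraction(peptide: str) -> float:
    """Fraction of residues that are lysine (K) or arginine (R)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide sequence must not be empty")
    positives = sum(1 for residue in peptide if residue in "KR")
    return positives / len(peptide)


# Commonly cited CPP sequences (assumed for illustration only).
EXAMPLE_CPPS = {
    "TAT": "YGRKKRRQRRR",
    "penetratin": "RQIKIWFQNRRMKWKK",
}

if __name__ == "__main__":
    for name, sequence in EXAMPLE_CPPS.items():
        print(f"{name}: {cationic_fraction(sequence):.2f}")
```

No cut-off value is implied; the fraction is merely one simple descriptor that could be compared across candidate penetrating peptides.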

The peptides of the present invention may also comprise non-amino acid moieties, such as, for example, hydrophobic moieties (various linear, branched, cyclic, polycyclic or heterocyclic hydrocarbons and hydrocarbon derivatives) attached to the peptides; non-peptide penetrating agents; and various protecting groups, especially where the compound is linear, which are attached to the compound's terminals to decrease degradation. Chemical (non-amino acid) groups present in the compound may be included in order to improve various physiological properties, such as decreased degradation or clearance; decreased repulsion by various cellular pumps; improved immunogenic activities; improved modes of administration (such as attachment of various sequences which allow penetration through various barriers, through the gut, etc.); increased specificity; increased affinity; decreased toxicity; and the like.

Attaching the amino acid sequence component of the peptides of the invention to other non-amino acid agents may be by covalent linking; by non-covalent complexation, for example, by complexation to a hydrophobic polymer, which can be degraded or cleaved producing a compound capable of sustained release; or by entrapping the amino acid part of the peptide in liposomes or micelles to produce the final peptide of the invention. The association may be by the entrapment of the amino acid sequence within the other component (liposome, micelle) or the impregnation of the amino acid sequence within a polymer to produce the final peptide of the invention.

The peptides of some embodiments of the invention may be synthesized and purified by any techniques known to those skilled in the art of peptide synthesis, such as, but not limited to, solid phase techniques and recombinant techniques such as further described hereinbelow.

As used herein, the term “polynucleotide” refers to a single or double stranded nucleic acid sequence which is isolated and provided in the form of an RNA sequence, a complementary polynucleotide sequence (cDNA), a genomic polynucleotide sequence and/or a composite polynucleotide sequence (e.g., a combination of the above).

As the novel “arbitrium” system is involved in controlling integration of DNA into a genome of a host, the present teachings suggest that components of the “arbitrium” system described herein can be used for controlling integration of an expression product of interest into a target genome.

Thus, according to specific embodiments, the polynucleotides of the present invention comprise a nucleic acid sequence encoding an expression product of interest.

As used herein the term “expression product of interest” refers to an RNA and/or protein of interest.

According to specific embodiments, expression and/or activity of the expression product of interest is dependent on AimX; i.e. expression of AimX controls expression and/or activity of the expression product of interest.

According to specific embodiments, the expression product of interest is a therapeutic expression product such as an antibody, a growth factor, a cytokine etc.

According to specific embodiments, the expression product of interest is a DNA editing agent.

According to specific embodiments, the DNA editing agent expression and/or activity is dependent on AimX; i.e. expression of AimX controls expression and/or activity of the DNA editing agent.

As used herein, the term “DNA editing agent” refers to an agent capable of introducing sequence alterations in the genome of a cell. These targeted sequence alterations may involve loss-of-function or gain-of-function alterations. Non-limiting examples of such alterations include a missense mutation, i.e., a mutation which changes an amino acid residue in the protein with another amino acid residue and thereby modulates the activity of the protein; a nonsense mutation, i.e., a mutation which introduces a stop codon in a protein, e.g., an early stop codon which results in a shorter protein devoid of the activity; a frame-shift mutation, i.e., a mutation, usually a deletion or insertion of nucleic acid(s), which changes the reading frame of the protein, and may result in an early termination by introducing a stop codon into a reading frame (e.g., a truncated protein, devoid of the activity), or in a longer amino acid sequence (e.g., a readthrough protein) which affects the secondary or tertiary structure of the protein and results in a protein with modulated activity; a readthrough mutation due to a frame-shift mutation or a modified stop codon mutation (i.e., when the stop codon is mutated into an amino acid codon), with a modulated activity; a deletion mutation, i.e., a mutation which deletes coding nucleic acids in a gene sequence and which may result in a frame-shift mutation or an in-frame mutation (within the coding sequence, deletion of one or more amino acid codons); an insertion mutation, i.e., a mutation which inserts coding or non-coding nucleic acids into a gene sequence, and which may result in a frame-shift mutation or an in-frame insertion of one or more amino acid codons; an inversion, i.e., a mutation which results in an inverted coding or non-coding sequence; a splice mutation, i.e., a mutation which results in abnormal splicing or poor splicing; and a duplication mutation, i.e., a mutation which results in a duplicated coding or non-coding sequence, which can be in-frame or can cause a frame-shift.
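Several of the alteration types listed above can be distinguished computationally by comparing a reference coding sequence with its altered counterpart. The following Python sketch is a simplified illustration only: the function names are hypothetical, both inputs are assumed to be DNA coding sequences starting in frame, the standard genetic code is assumed, and the extra "silent" and "no change" labels are added for completeness rather than taken from the text. Inversions, splice mutations and duplications are not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq: str) -> list:
    """Split a sequence into complete codons, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def classify_mutation(reference: str, mutant: str) -> str:
    """Label the first observable effect of a mutation on the reading frame."""
    reference, mutant = reference.upper(), mutant.upper()
    if reference == mutant:
        return "no change"
    length_change = len(mutant) - len(reference)
    if length_change % 3 != 0:
        return "frame-shift"
    if length_change > 0:
        return "in-frame insertion"
    if length_change < 0:
        return "in-frame deletion"
    # Same length: compare the encoded residues codon by codon.
    for ref_codon, mut_codon in zip(codons(reference), codons(mutant)):
        ref_aa, mut_aa = CODON_TABLE[ref_codon], CODON_TABLE[mut_codon]
        if ref_aa == mut_aa:
            continue
        if mut_aa == "*":
            return "nonsense"
        if ref_aa == "*":
            return "readthrough"
        return "missense"
    return "silent"
```

For a reference of `ATGGAAAAATAA`, replacing the second codon with `GAT` is reported as missense, replacing it with `TAA` as nonsense, and deleting a single base as a frame-shift.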

According to specific embodiments DNA sequence alteration of a gene may comprise at least one allele of the gene.

According to other specific embodiments alteration of a gene sequence comprises both alleles of the gene. In such instances the gene may be in a homozygous form or in a heterozygous form.

According to specific embodiments the DNA editing agent allows for the integration of exogenous nucleic acid sequences into the genome.

Thus, according to specific embodiments, the polynucleotides of some embodiments of the present invention comprise a nucleic acid sequence to be integrated into a genome of a cell by the DNA editing agent.

Methods of DNA genome editing are well known in the art [see for example Francisco Martin et al. “New Vectors for Stable and Safe Gene Modification” in Gene Therapy-Developments and Future Perspectives (2011) DOI: 10.5772/10622; Menke D. Genesis (2013) 51: —618; Capecchi, Science (1989) 244:1288-1292; Santiago et al. Proc Natl Acad Sci USA (2008) 105:5809-5814; International Patent Application Nos. WO 2014085593, WO 2009071334 and WO 2011146121; U.S. Pat. Nos. 8,771,945, 8,586,526, 6,774,279 and US Patent Application Publication Nos. 20030232410, 20050026157, US20060014264; the contents of which are incorporated by reference in their entireties] and include, but are not limited to, integrases and engineered nucleases. Agents for DNA genome editing can be designed using publicly available sources or obtained commercially from Transposagen, Addgene and Sangamo Biosciences.

According to specific embodiments, the DNA editing agent is an integrase.

As used herein, the term “integrase” refers to a recombinase that is capable of integrating a nucleic acid sequence into another nucleic acid sequence (e.g., a genome of a cell). According to specific embodiments, the integrase is a site-specific recombinase i.e. leads to integration of sequences between two nucleic acids, each nucleic acid comprising at least one recognition site for the recombinase. Such integrases are known in the art [see e.g. in Francisco Martin et al. Gene Therapy—Developments and Future Perspectives (2011) DOI: 10.5772/10622, Recchia A et al. Curr Gene Ther. (2011) 11(5): 399-405, Lim K I, BMB Rep. (2015) 48(1):6-12 and US Patent Application publication No. US20070190601; the contents of which are incorporated by reference in their entireties] and include for example a retroviral integrase (HIV integrase) and phage integrase (e.g. phi C31 integrase, lambda integrase).

According to specific embodiments the DNA editing agent is an engineered endonuclease.

Genome editing using engineered endonucleases such as meganucleases, zinc finger nucleases (ZFNs), transcription-activator like effector nucleases (TALENs) and the CRISPR/Cas system is well known in the art and described, for example, in International Patent Application Publication No. WO2015/033343, the contents of which are incorporated herein by reference in their entirety.

According to other specific embodiments, the DNA editing agent is selected from the group consisting of zinc finger nuclease, an effector protein of Class 2 CRISPR/Cas (e.g. Cas9, Cpf1, C2c1, C2c3) and TALEN.

According to specific embodiments, the polynucleotides of the present invention are part of a nucleic acid construct (also referred to herein as an “expression vector”).

As used herein, the terms “nucleic acid construct” and “expression vector” refer to a nucleic acid vector designed to introduce specific expression products of interest (i.e. genes) into a host cell. The expression can be transient or consistent, episomal or integrated into the chromosome of the host cell. According to specific embodiments, the expression is on a transmissible genetic element such as a plasmid.

Hence, the nucleic acid construct of some embodiments of the invention includes additional sequences which render this vector suitable for replication and integration in prokaryotes, eukaryotes, or preferably both (e.g., shuttle vectors). In addition, a typical cloning vector may contain regulatory elements, e.g. promoters, enhancers, transcription and translation initiation sequences, transcription and translation terminators, polyadenylation signals, transcription termination signals, etc., and at least one multiple cloning site (MCS) for cloning of expression products of interest. By way of example, such constructs will typically include a 5′ LTR, a tRNA binding site, a packaging signal, an origin of second-strand DNA synthesis, and a 3′ LTR or a portion thereof.

Thus, according to an aspect of the present invention, there is provided a nucleic acid construct comprising a polynucleotide encoding an AimP peptide, a polynucleotide encoding an AimR polypeptide, a polynucleotide comprising an AimR responsive element and/or an AimX polynucleotide and any combination of same; and a MCS.

According to another aspect of the present invention, there is provided a nucleic acid construct comprising a polynucleotide encoding an AimP peptide, a polynucleotide encoding an AimR polypeptide, a polynucleotide comprising an AimR responsive element and/or an AimX polynucleotide and any combination of same; and a cis-acting regulatory element heterologous to AimP, AimR, AimR responsive element and/or AimX for directing expression of the polynucleotide.

Thus, according to specific embodiments, the nucleic acid construct may comprise any of the polynucleotides (1)-(11) described hereinabove.

Teachings of the invention further contemplate that the polynucleotides are part of a nucleic acid construct system whereby the components of the arbitrium system are expressed from different constructs.

Thus, according to specific embodiments, the present invention provides for a nucleic acid construct system comprising at least two nucleic acid constructs each expressing at least one of the polynucleotides of the present invention.

Thus according to specific embodiments, the nucleic acid construct system comprises an individual nucleic acid construct for each polynucleotide of the present invention (i.e. AimP, AimR, AimR responsive element and AimX).

According to other specific embodiments a single construct comprises a number of polynucleotides of the present invention, as described hereinabove.

According to specific embodiments, the nucleic acid construct system comprises at least one construct allowing integration of the polynucleotide into the chromosome of the cell and at least one construct allowing episomal expression of the polynucleotide.

As used herein, the term “multiple cloning site (MCS)” refers to a nucleic acid sequence comprising at least one restriction site, and more typically a number of restriction sites, for the purpose of cloning nucleic acid sequences into an expression vector. The MCS is recognized and digested by a specific restriction enzyme, such that a target expression product of interest can be inserted into the digested MCS site. Any MCS known in the art can be used in the nucleic acid constructs of the present invention. It can be obtained from various commercially available vectors known in the art having MCS (e.g., pUC18, pUC19, etc.).
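As a minimal sketch of how restriction sites within an MCS might be located, the following Python example scans a sequence for the recognition sequences of a few common restriction enzymes. The function name and the short example sequence are hypothetical and chosen only for illustration; the example sequence is not the MCS of any particular vector.

```python
# Recognition sequences (5'->3') of a few widely used restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "XbaI": "TCTAGA",
}


def find_restriction_sites(sequence: str) -> dict:
    """Map each enzyme to the 0-based start positions of its sites.

    Enzymes with no site in the sequence are omitted from the result.
    """
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            # Advance by one base so overlapping sites are also found.
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical MCS built by concatenating three recognition sequences.
    example_mcs = "GAATTCGGATCCAAGCTT"
    print(find_restriction_sites(example_mcs))
```

Enzymes whose site occurs exactly once in both the MCS and the remainder of the vector are typically the most convenient for cloning, since digestion then opens the vector at a single position.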

Cis acting regulatory sequences include those that direct constitutive expression of a nucleotide sequence as well as those that direct inducible expression of the nucleotide sequence only under certain conditions.

The cis regulatory sequences of the present invention are heterologous to the polynucleotides of the present invention (i.e. AimP, AimR, AimR responsive element and AimX).

As used herein, the term “heterologous” means derived from a different genetic location. For example, a polynucleotide may be placed by genetic engineering techniques into a vector derived from a different source, and is a heterologous polynucleotide; a promoter removed from its native coding sequence and operatively linked to a coding sequence other than the native sequence is a heterologous promoter.

According to specific embodiments, the nucleic acid construct includes a promoter sequence for directing transcription of the polynucleotide sequence in the cell in a constitutive or inducible manner.

Eukaryotic promoters typically contain two types of recognition sequences, the TATA box and upstream promoter elements. The TATA box, located 25-30 base pairs upstream of the transcription initiation site, is thought to be involved in directing RNA polymerase to begin RNA synthesis. The other upstream promoter elements determine the rate at which transcription is initiated.

Preferably, the promoter utilized by the nucleic acid construct of some embodiments of the invention is active in the specific cell population transformed. Examples of cell type-specific and/or tissue-specific promoters include the albumin promoter, which is liver specific [Pinkert et al., (1987) Genes Dev. 1:268-277]; lymphoid-specific promoters [Calame et al., (1988) Adv. Immunol. 43:235-275], in particular promoters of T-cell receptors [Winoto et al., (1989) EMBO J. 8:729-733] and immunoglobulins [Banerji et al. (1983) Cell 33:729-740]; neuron-specific promoters such as the neurofilament promoter [Byrne et al. (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477]; pancreas-specific promoters [Edlund et al. (1985) Science 230:912-916]; and mammary gland-specific promoters such as the milk whey promoter (U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166).

Enhancer elements can stimulate transcription up to 1,000 fold from linked homologous or heterologous promoters. Enhancers are active when placed downstream or upstream from the transcription initiation site. Many enhancer elements derived from viruses have a broad host range and are active in a variety of tissues. For example, the SV40 early gene enhancer is suitable for many cell types. Other enhancer/promoter combinations that are suitable for some embodiments of the invention include those derived from polyoma virus, human or murine cytomegalovirus (CMV), and the long terminal repeat from various retroviruses such as murine leukemia virus, murine or Rous sarcoma virus and HIV. See, Enhancers and Eukaryotic Expression, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. 1983, which is incorporated herein by reference.

According to specific embodiments, the promoter is a viral (e.g. phage) promoter.

According to specific embodiments the promoter is a bacterial promoter.

In the construction of the expression vector, the promoter is preferably positioned approximately the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function.

Polyadenylation sequences can also be added to the expression vector in order to increase the efficiency of mRNA translation. Two distinct sequence elements are required for accurate and efficient polyadenylation: GU or U rich sequences located downstream from the polyadenylation site and a highly conserved sequence of six nucleotides, AAUAAA, located 11-30 nucleotides upstream. Termination and polyadenylation signals that are suitable for some embodiments of the invention include those derived from SV40.
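The positional relationship described above can be checked with a short Python sketch that looks for the AAUAAA hexamer within a window upstream of a given polyadenylation site. The function name and parameters are hypothetical, and the sketch assumes that the 11-30 nucleotide distance is measured from the 3′ end of the hexamer to the polyadenylation site; other conventions for measuring this distance are possible.

```python
POLY_A_SIGNAL = "AAUAAA"


def find_poly_a_signals(rna: str, site: int,
                        min_dist: int = 11, max_dist: int = 30) -> list:
    """Return 0-based start positions of AAUAAA hexamers lying
    min_dist..max_dist nucleotides upstream of the polyadenylation site.

    `site` is the 0-based index of the polyadenylation site. DNA input is
    accepted and converted to RNA (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    start = rna.find(POLY_A_SIGNAL)
    while start != -1:
        end = start + len(POLY_A_SIGNAL)  # index just past the hexamer
        distance = site - end             # nucleotides between hexamer and site
        if min_dist <= distance <= max_dist:
            hits.append(start)
        start = rna.find(POLY_A_SIGNAL, start + 1)
    return hits
```

A fuller check would also look for the GU- or U-rich element downstream of the site, which this sketch omits.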

In addition to the elements already described, the expression vector of some embodiments of the invention may typically contain other specialized elements intended to increase the level of expression of cloned nucleic acids or to facilitate the identification of cells that carry the recombinant DNA. For example, a number of animal viruses contain DNA sequences that promote the extra chromosomal replication of the viral genome in permissive cell types. Plasmids bearing these viral replicons are replicated episomally as long as the appropriate factors are provided by genes either carried on the plasmid or with the genome of the host cell. The vector may or may not include a eukaryotic replicon. If a eukaryotic replicon is present, then the vector is amplifiable in eukaryotic cells using the appropriate selectable marker. If the vector does not comprise a eukaryotic replicon, no episomal amplification is possible. Instead, the recombinant DNA integrates into the genome of the engineered cell, where the promoter directs expression of the desired nucleic acid.

The expression vector of some embodiments of the invention can further include additional polynucleotide sequences that allow, for example, the translation of several proteins from a single mRNA and sequences for genomic integration of the promoter-chimeric polypeptide, as described in detail herein above and below.

According to specific embodiments, the construct encodes a polycistronic mRNA comprising the polynucleotides of the present invention.

Various construct schemes can be utilized to express several genes from a single nucleic acid construct. According to specific embodiments, the construct encodes a polycistronic mRNA comprising the polynucleotides of the present invention; that is, the polynucleotides can be co-transcribed as a polycistronic message from a single promoter sequence of the nucleic acid construct. To enable co-translation of all the genes from a single polycistronic message, the different polynucleotide segments can be transcriptionally fused via a linker sequence including an internal ribosome entry site (IRES) sequence which enables the translation of the polynucleotide segment downstream of the IRES sequence. In this case, a transcribed polycistronic RNA molecule including the coding sequences of different combinations of the polynucleotides of the present invention will be translated from both the capped 5′ end and the internal IRES sequence of the polycistronic RNA molecule.

Alternatively, each two polynucleotide segments can be translationally fused via a protease recognition site cleavable by a protease expressed by the cell to be transformed with the nucleic acid construct. In this case, a chimeric polypeptide translated will be cleaved by the cell expressed protease.

Still alternatively, the nucleic acid construct of some embodiments of the invention can include at least two promoter sequences each being for separately expressing a distinct polynucleotide. These at least two promoters which can be identical or distinct can be constitutive, tissue specific or regulatable (e.g. inducible) promoters functional in one or more cell types.

When secretion of the polypeptides is desired the polynucleotides of the invention can be expressed as fusion polypeptides comprising the nucleic acid sequence encoding e.g. the components of the “arbitrium” system (e.g. AimP) ligated in frame to a nucleic acid sequence encoding a signal peptide that provides for secretion. According to specific embodiments, the signal sequence is an N-terminal signal sequence. According to specific embodiments, the signal peptide is cleaved upon peptide secretion.

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA) (Masui et al. (1983) FEBS Lett. 151(1):159-164; Ghrayeb et al. (1984) EMBO J. 3:2437-2442) and the E. coli alkaline phosphatase signal sequence (phoA) (Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212). Other prokaryotic signals include, for example, the signal sequence from penicillinase, lpp, or heat stable enterotoxin II leaders and signal sequences of the Phr family of quorum sensing systems [described e.g. in Pottathil, M. & Lazazzera, B. A. Front. Biosci. 8, d32-45 (2003)].

According to specific embodiments, the signal sequence is as set forth in SEQ ID NO: 342.

Selectable marker genes that ensure maintenance of the vector in the cell can also be included in the expression vector. Preferred selectable markers include those which confer resistance to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline (Davies et al. (1978) Annu. Rev. Microbiol. 32:469). Selectable markers can also allow a cell to grow on minimal medium or in the presence of toxic metabolite and can include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.

Other than containing the necessary elements for the transcription and translation of the inserted coding sequence, the expression construct of some embodiments of the invention can also include sequences engineered to enhance stability, production, purification, yield or toxicity of the expressed polypeptide. Thus, for example, the peptide of some embodiments of the invention (e.g. AimP) may be a pro-peptide containing a sequence processed by extracellular proteases to produce the mature peptide. According to specific embodiments the signal for extracellular processing is as set forth in SEQ ID NO: 348.

Or, for example, the expression of a fusion protein or a cleavable fusion protein comprising the protein of some embodiments of the invention (e.g. AimP) and a heterologous protein can be engineered. Such a fusion protein can be designed so that the fusion protein can be readily isolated by affinity chromatography; e.g., by immobilization on a column specific for the heterologous protein. Where a cleavage site is engineered between the protein of some embodiments of the invention and the heterologous protein, the protein of some embodiments of the invention can be released from the chromatographic column by treatment with an appropriate enzyme or agent that disrupts the cleavage site [e.g., see Booth et al. (1988) Immunol. Lett. 19:65-70; and Gardella et al., (1990) J. Biol. Chem. 265:15854-15859].

When needed, recovery of the recombinant polypeptide is effected following an appropriate time in culture. The phrase “recovering the recombinant polypeptide” refers to collecting the whole fermentation medium containing the polypeptide and need not imply additional steps of separation or purification. Notwithstanding the above, polypeptides of some embodiments of the invention can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential solubilization.

Where appropriate, the polynucleotides may be optimized for increased expression in the transformed organism. For example, the polynucleotides can be synthesized using preferred codons for improved expression.
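As a minimal sketch of synthesis using preferred codons, the following Python example back-translates a short amino acid sequence using one preferred codon per residue. The function name is hypothetical, and the `PREFERRED_CODON` table is a deliberately partial placeholder: each entry is a valid codon for its residue under the standard genetic code, but the choice of which synonymous codon is "preferred" is an assumption and does not represent measured codon usage for any organism. In practice the table would be derived from codon usage data for the intended host.

```python
# Placeholder preference table (assumption): one codon per residue.
# '*' denotes a stop codon.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "E": "GAA",
    "L": "CTG",
    "G": "GGC",
    "*": "TAA",
}


def back_translate(protein: str, preferred: dict = PREFERRED_CODON) -> str:
    """Build a DNA coding sequence using the preferred codon for each residue."""
    missing = sorted({aa for aa in protein if aa not in preferred})
    if missing:
        raise KeyError(f"no preferred codon defined for: {', '.join(missing)}")
    return "".join(preferred[aa] for aa in protein)


if __name__ == "__main__":
    print(back_translate("MKELG*"))
```

A production tool would usually weigh further factors such as mRNA secondary structure and avoidance of unwanted restriction sites, which are outside the scope of this sketch.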

It will be appreciated that the individual elements comprised in the expression vector can be arranged in a variety of configurations. For example, enhancer elements, promoters and the like, and even the polynucleotide sequence(s) encoding components of the “arbitrium” system can be arranged in a “head-to-tail” configuration, may be present as an inverted complement, or in a complementary configuration, as an anti-parallel strand. While such variety of configuration is more likely to occur with non-coding elements of the expression vector, alternative configurations of the coding sequence within the expression vector are also envisioned.

Examples of mammalian expression vectors include, but are not limited to, pcDNA3, pcDNA3.1(+/−), pGL3, pZeoSV2(+/−), pSecTag2, pDisplay, pEF/myc/cyto, pCMV/myc/cyto, pCR3.1, pSinRep5, DH26S, DHBB, pNMT1, pNMT41, pNMT81, which are available from Invitrogen, pCI which is available from Promega, pMbac, pPbac, pBK-RSV and pBK-CMV which are available from Stratagene, pTRES which is available from Clontech, and their derivatives.

Examples of bacterial constructs include the pET series of E. coli expression vectors [Studier et al. (1990) Methods in Enzymol. 185:60-89].

In yeast, a number of vectors containing constitutive or inducible promoters can be used, as disclosed in U.S. Pat. No. 5,932,447. Alternatively, vectors can be used which promote integration of foreign DNA sequences into the yeast chromosome.

In cases where plant expression vectors are used, the expression of the coding sequence can be driven by a number of promoters. For example, viral promoters such as the 35S RNA and 19S RNA promoters of CaMV [Brisson et al. (1984) Nature 310:511-514], or the coat protein promoter of TMV [Takamatsu et al. (1987) EMBO J. 6:307-311] can be used. Alternatively, plant promoters such as the small subunit of RUBISCO [Coruzzi et al. (1984) EMBO J. 3:1671-1680 and Broglie et al., (1984) Science 224:838-843] or heat shock promoters, e.g., soybean hsp17.5-E or hsp17.3-B [Gurley et al. (1986) Mol. Cell. Biol. 6:559-565] can be used. These constructs can be introduced into plant cells using Ti plasmid, Ri plasmid, plant viral vectors, direct DNA transformation, microinjection, electroporation and other techniques well known to the skilled artisan. See, for example, Weissbach & Weissbach, 1988, Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pp 421-463.

Other expression systems such as insects and mammalian host cell systems which are well known in the art can also be used by some embodiments of the invention.

Expression vectors containing regulatory elements from eukaryotic viruses such as retroviruses can also be used. SV40 vectors include pSVT7 and pMT2. Vectors derived from bovine papilloma virus include pBV-1MTHA, and vectors derived from Epstein-Barr virus include pHEBO, and p2O5. Other exemplary vectors include pMSG, pAV009/A⁺, pMTO10/A⁺, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV-40 early promoter, SV-40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.

As described above, viruses are very specialized infectious agents that have evolved, in many cases, to elude host defense mechanisms. Typically, viruses infect and propagate in specific cell types. Viral vectors exploit this natural specificity to target predetermined cell types and thereby introduce a recombinant gene into the infected cell. Thus, the type of vector used by some embodiments of the invention will depend on the cell type transformed. The ability to select suitable vectors according to the cell type transformed is well within the capabilities of the ordinary skilled artisan and as such no general description of selection considerations is provided herein. Recombinant viral vectors are useful for in vivo expression of the polynucleotides and polypeptides of some embodiments of the present invention since they offer advantages such as lateral infection and targeting specificity.

Various methods can be used to introduce the polynucleotides, expression vectors and polypeptides of some embodiments of the invention into cells. The polynucleotides and nucleic acid construct described herein can be introduced into cells by any method known in the art, as further described in detail hereinbelow. Alternatively or additionally, the polypeptides described herein can be contacted with the cell per se.

It will be appreciated that the cell may be comprised inside a particular organism, for example inside a mammalian body or inside a plant.

Thus, according to a specific embodiment, introducing and/or contacting is effected in vivo.

According to another specific embodiment, introducing and/or contacting is effected ex vivo or in vitro.

Various methods can be used to introduce the polynucleotides and expression vectors of some embodiments of the invention into cells. Such methods are generally described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (1989, 1992), in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (1989), Chang et al., Somatic Gene Therapy, CRC Press, Ann Arbor, Mich. (1995), Vega et al., Gene Targeting, CRC Press, Ann Arbor, Mich. (1995), Vectors: A Survey of Molecular Cloning Vectors and Their Uses, Butterworths, Boston, Mass. (1988) and Gilboa et al. [Biotechniques 4 (6): 504-512, 1986] and include, for example, stable or transient transfection, lipofection, electroporation and infection with recombinant viral vectors. In addition, see U.S. Pat. Nos. 5,464,764 and 5,487,992 for positive-negative selection methods.

Introduction of nucleic acids by viral infection offers several advantages over other methods such as lipofection and electroporation, since higher transfection efficiency can be obtained due to the infectious nature of viruses.

Currently preferred in vivo nucleic acid transfer techniques include transfection with viral or non-viral constructs, such as adenovirus, lentivirus, Herpes simplex I virus, or adeno-associated virus (AAV) and lipid-based systems. Useful lipids for lipid-mediated transfer of the gene are, for example, DOTMA, DOPE, and DC-Chol [Tonkinson et al., Cancer Investigation, 14(1): 54-65 (1996)]. The most preferred constructs for use in gene therapy are viruses, most preferably adenoviruses, AAV, lentiviruses, or retroviruses. A viral construct such as a retroviral construct includes at least one transcriptional promoter/enhancer or locus-defining element(s), or other elements that control gene expression by other means such as alternate splicing, nuclear RNA export, or post-translational modification of messenger. Such vector constructs also include a packaging signal, long terminal repeats (LTRs) or portions thereof, and positive and negative strand primer binding sites appropriate to the virus used, unless it is already present in the viral construct. Other vectors can be used that are non-viral, such as cationic lipids, polylysine, and dendrimers.

Regardless of the method of introduction, the present teachings provide for an isolated cell which comprises the polynucleotides, the nucleic acid constructs and/or the polypeptides, as described herein.

As used herein, the term “cell” refers to a prokaryotic or a eukaryotic cell. Non-limiting examples of cells that can be used in some embodiments of the present invention include, but are not limited to, microorganisms, such as bacteria transformed with a recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vector containing the coding sequence; yeast transformed with recombinant yeast expression vectors containing the coding sequence; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors, such as Ti plasmid, containing the coding sequence.

According to specific embodiments, the cell is a bacterial cell.

According to specific embodiments the cell is a mammalian cell.

According to specific embodiments, the mammalian cell is selected from the group consisting of a Chinese Hamster Ovary (CHO), HEK293, PER.C6, HT1080, NS0, Sp2/0, BHK, Namalwa, COS, HeLa and Vero cell.

According to another specific embodiment, the cell is a primary cell.

According to a specific embodiment, the cell is a cell line.

The cell may be derived from a suitable tissue including but not limited to blood, muscle, nerve, brain, heart, lung, liver, pancreas, spleen, thymus, esophagus, stomach, intestine, kidney, testis, ovary, hair, skin, bone, breast, uterus, bladder, spinal cord, or various kinds of body fluids. According to specific embodiments, the cell does not express the polynucleotides and/or the polypeptides of the present invention endogenously.

According to specific embodiments, the cell expresses AimR endogenously.

The term “endogenous” as used herein, refers to the expression of the native gene in its natural location and expression level in the genome of an organism.

As shown in the Examples section which follows, the arbitrium system is a genetic system that uses peptides to control the integration of specific DNA into a specific target site. Thus, the present teachings suggest the use of the polynucleotides, the nucleic acid construct and/or the polypeptides described hereinabove for inducible gene expression in general and to control genome editing in particular.

Thus, according to an aspect of the present invention, there is provided a method of expressing an expression product of interest, the method comprising:

(i) introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a nucleic acid sequence encoding the expression product of interest, wherein said AimR comprises a DNA binding domain for binding said AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain; and

(ii) contacting said cell with an AimP peptide comprising an amino acid sequence of XXXXGG/A, wherein said AimP peptide is capable of binding said AimR polypeptide and dissociating said AimR polypeptide from an AimR responsive element,

thereby expressing the expression product of interest.
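The AimP motif recited in step (ii), XXXXGG/A, can be read as four arbitrary residues followed by glycine and then either glycine or alanine. A minimal sketch of such a check is given below, assuming that X denotes any of the twenty standard amino acids and that the motif spans the whole six-residue peptide; the function name `matches_aimp_motif` is hypothetical and used only for illustration. The peptides SAIRGA, GMPRGA and DPGRGG are mature peptide sequences listed in Table 3; SAIRGS is an invented negative control.

```python
import re

# Assumption: X is any of the 20 standard amino acids (one-letter code),
# and "GG/A" means glycine followed by either glycine or alanine.
AIMP_MOTIF = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]{4}G[GA]")


def matches_aimp_motif(peptide: str) -> bool:
    """Return True if the whole peptide fits the XXXXGG/A pattern."""
    return AIMP_MOTIF.fullmatch(peptide.upper()) is not None


if __name__ == "__main__":
    # Three mature peptides from Table 3 and one invented non-matching control.
    for peptide in ("SAIRGA", "GMPRGA", "DPGRGG", "SAIRGS"):
        print(peptide, matches_aimp_motif(peptide))
```

Under these assumptions the three Table 3 peptides match and the control does not. Anchoring the pattern to the full peptide is a design choice of this sketch; a longer peptide that merely ends in the motif would need a different pattern.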

According to another aspect of the present invention, there is provided a method of expressing an expression product of interest, the method comprising introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a heterologous nucleic acid sequence encoding the expression product of interest, wherein said AimR comprises a DNA binding domain for binding said AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain, thereby expressing the expression product of interest.

According to specific embodiments, the method comprises introducing into the cell a polynucleotide encoding AimR.

According to another aspect of the present invention, there is provided a method of expressing an expression product of interest, the method comprising introducing into a cell a polynucleotide comprising an AimR responsive element operatively linked to a nucleic acid sequence encoding the expression product of interest; and a nucleic acid construct comprising an AimR polynucleotide and a cis-acting regulatory element heterologous to said AimR for directing expression of said AimR polynucleotide,

wherein said AimR comprises a DNA binding domain for binding said AimR responsive element, said AimR comprising a helix-turn-helix (HTH) motif and a tetratricopeptide repeat (TPR) domain, thereby expressing the expression product of interest.

According to specific embodiments, the method comprises contacting the cell with an AimP peptide or a nucleic acid sequence encoding same.

According to specific embodiments, the method comprises contacting the cell with an agent capable of downregulating expression and/or activity of said AimR responsive element.

As used herein the phrase “downregulating expression” refers to downregulating the expression at the genomic level (e.g., homologous recombination and site-specific endonucleases) and/or the transcript level using a variety of molecules which interfere with transcription and/or translation (e.g., RNA silencing agents, e.g., RNA interference (RNAi), transcriptional gene silencing (TGS), post-transcriptional gene silencing (PTGS), quelling, co-suppression, and translational repression) or on the protein level (e.g., aptamers, small molecules and inhibitory peptides, antagonists, enzymes that cleave the polypeptide, antibodies and the like).

According to specific embodiments, the downregulating agent is a nucleic acid agent.

According to specific embodiments, the downregulating agent is an effector protein of Class 2 CRISPR/Cas (e.g. Cas9, Cpf1, C2c1, C2c3).

According to specific embodiments, the downregulating agent is a crRNA or a sgRNA of a CRISPR/Cas system.

According to specific embodiments, the downregulating agent is an RNAi.

According to specific embodiments, downregulating expression refers to the absence of mRNA and/or protein, as detected by RT-PCR or Western blot, respectively.

According to other specific embodiments, downregulating expression refers to a decrease in the level of mRNA and/or protein, as detected by RT-PCR or Western blot, respectively, as compared to same in the absence of the downregulating agent. The reduction may be at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or at least 99%.
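The comparison against the level measured in the absence of the downregulating agent can be expressed as a simple percentage. The sketch below is illustrative only: the function names `percent_reduction` and `highest_threshold_met` are hypothetical, the input levels are made-up numbers in arbitrary units, and any real quantification from RT-PCR or Western blot data would require normalization steps that are not shown here.

```python
from typing import Optional

# Reduction thresholds (in percent) taken from the paragraph above.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)


def percent_reduction(control_level: float, treated_level: float) -> float:
    """Percent decrease of the treated level relative to the untreated control."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (control_level - treated_level) / control_level * 100.0


def highest_threshold_met(reduction: float) -> Optional[int]:
    """Largest listed threshold satisfied by the reduction, or None if none is met."""
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical signal levels: 200 units without the agent, 50 units with it.
    reduction = percent_reduction(control_level=200.0, treated_level=50.0)
    print(reduction, highest_threshold_met(reduction))
```

With these hypothetical inputs the reduction is 75%, which satisfies the "at least 70%" threshold but not "at least 80%".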

As the CRISPR/Cas system used by the present inventors for downregulating expression and/or activity of AimX, disclosed in the Examples section which follows, is novel, according to an aspect of the present invention there is provided an isolated nucleic acid agent capable of downregulating expression and/or activity of an AimR responsive element.

According to specific embodiments, the downregulating agent is capable of downregulating expression and/or activity of an AimX.

As used herein, “expressing” or “expression” refers to gene expression at the nucleic acid and/or protein level. Expression can be determined using methods known in the art, for example, but not limited to, selectable marker genes, Northern blot analysis, PCR analysis, DNA sequencing, RNA sequencing, Western blot analysis, and immunohistochemistry.

When expression of the expression product of interest results in modulated activity, qualifying efficacy of DNA integration can also be determined by determining activity.

In addition, one ordinarily skilled in the art can readily design a knock-in/knock-out construct including positive and/or negative selection markers for efficiently selecting transformed cells that underwent a DNA integration event with the polynucleotide or construct. Positive selection provides a means to enrich the population of clones that have taken up foreign DNA. Non-limiting examples of such positive markers include Human influenza hemagglutinin (HA) tag, glutamine synthetase, dihydrofolate reductase (DHFR), His-tag, FLAG peptide, and markers that confer antibiotic resistance, such as neomycin, hygromycin, puromycin, and blasticidin S resistance cassettes. Negative selection markers are necessary to select against random integrations and/or elimination of a marker sequence (e.g. positive marker). Non-limiting examples of such negative markers include the herpes simplex-thymidine kinase (HSV-TK) which converts ganciclovir (GCV) into a cytotoxic nucleoside analog, hypoxanthine phosphoribosyltransferase (HPRT) and adenine phosphoribosyltransferase (APRT).

According to specific embodiments, the expression product of interest is endogenous to the cell.

According to other specific embodiments, the expression product of interest is exogenous to the cell.

As mentioned above, the expression product of interest may be a DNA editing agent. Thus, according to specific embodiments, the method comprises introducing into the cell a nucleic acid sequence to be integrated into a genome of the cell by the DNA editing agent. Methods for qualifying efficacy and detecting integration of a nucleic acid sequence into the genome of the cell are well known in the art and include, but are not limited to, DNA sequencing, electrophoresis, an enzyme-based mismatch detection assay and a hybridization assay such as PCR, RT-PCR, RNase protection, in-situ hybridization, primer extension, Southern blot, Northern blot and dot blot analysis.

According to another aspect there is provided an article of manufacture identified for expressing an expression product of interest comprising a packaging material packaging at least two of:

(i) a polynucleotide comprising an AimR responsive element;

(ii) a polynucleotide encoding said AimR;

(iii) an AimP peptide; and/or

(iv) an agent capable of downregulating expression and/or activity of said AimR responsive element.

The article of manufacture may comprise two, three or all; i.e. (i)+(ii), (i)+(iii), (i)+(iv), (ii)+(iii), (ii)+(iv), (i)+(ii)+(iii), (ii)+(iii)+(iv), (i)+(ii)+(iv), (i)+(iii)+(iv) or (i)+(ii)+(iii)+(iv).

According to another aspect there is provided an article of manufacture identified for expressing an expression product of interest comprising a packaging material packaging:

- (a) at least one of the (2)-(6) polynucleotides of the present invention or functional fragments thereof or a nucleic acid construct comprising same as described hereinabove; and
- (b) at least one of an AimP peptide, AimP polynucleotide, functional fragments thereof, a nucleic acid construct comprising same or an AimR responsive element downregulating agent as described hereinabove.

According to specific embodiments, the polynucleotide encoding the AimR is comprised in a nucleic acid construct comprising a cis-acting regulatory element heterologous to the AimR for directing expression of the AimR polynucleotide.

According to specific embodiments, the article of manufacture comprises a multiple cloning site (MCS).

According to specific embodiments, the article of manufacture comprises a polynucleotide encoding the expression product of interest.

According to specific embodiments of these aspects of the present invention, the (i), (ii), (iii) and/or (iv) or the (a) and (b) are packaged in separate containers.

According to yet other specific embodiments of these aspects of the present invention, the (i), (ii), (iii) and/or (iv) or the (a) and (b) are in a co-formulation.

It is expected that during the life of a patent maturing from this application many relevant DNA editing agents will be developed and the scope of the term DNA editing agent is intended to include all such new technologies a priori.

As used herein the term “about” refers to ±10%.
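As a worked illustration of this definition, the bounds implied by "about" a given value can be computed directly. The helper name `about_range` is hypothetical, and the sketch assumes the ±10% is taken relative to the stated value itself.

```python
from typing import Tuple


def about_range(value: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Lower and upper bounds of 'about value', assuming +/-10% of the value."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


if __name__ == "__main__":
    # "about 50" would span 45 to 55 under this reading.
    print(about_range(50.0))
```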

The terms “comprises”, “comprising”, “includes”, “including”, “having” and their conjugates mean “including but not limited to”.

The term “consisting of” means “including and limited to”.

The term “consisting essentially of” means that the composition, method or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method or structure.

As used herein, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a compound” or “at least one compound” may include a plurality of compounds, including mixtures thereof.

Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.

Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases “ranging/ranges between” a first indicated number and a second indicated number and “ranging/ranges from” a first indicated number “to” a second indicated number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.

As used herein the term “method” refers to manners, means, techniques and procedures for accomplishing a given task including, but not limited to, those manners, means, techniques and procedures either known to, or readily developed from known manners, means, techniques and procedures by practitioners of the chemical, pharmacological, biological, biochemical and medical arts.

When reference is made to particular sequence listings, such reference is to be understood to also encompass sequences that substantially correspond to its complementary sequence as including minor sequence variations, resulting from, e.g., sequencing errors, cloning errors, or other alterations resulting in base substitution, base deletion or base addition, provided that the frequency of such variations is less than 1 in 50 nucleotides, alternatively, less than 1 in 100 nucleotides, alternatively, less than 1 in 200 nucleotides, alternatively, less than 1 in 500 nucleotides, alternatively, less than 1 in 1000 nucleotides, alternatively, less than 1 in 5,000 nucleotides, alternatively, less than 1 in 10,000 nucleotides.
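The variation-frequency limits above can be checked with simple counting. The following sketch assumes the two sequences have already been aligned to equal length, with "-" marking a deleted or added base, and takes the aligned length as the denominator; both choices, as well as the function names `count_variations` and `within_limit`, are assumptions made for illustration rather than a procedure defined in this document.

```python
# "1 in N" limits taken from the paragraph above.
FREQUENCY_LIMITS = (50, 100, 200, 500, 1000, 5000, 10000)


def count_variations(ref_aligned: str, cand_aligned: str) -> int:
    """Count aligned positions that differ (substitution, deletion or addition)."""
    if len(ref_aligned) != len(cand_aligned):
        raise ValueError("sequences must be pre-aligned to the same length")
    pairs = zip(ref_aligned.upper(), cand_aligned.upper())
    return sum(1 for r, c in pairs if r != c)


def within_limit(variations: int, length: int, one_in: int) -> bool:
    """True if the frequency is less than 1 in `one_in` (integer arithmetic)."""
    return variations * one_in < length


if __name__ == "__main__":
    # Invented 200-nucleotide reference with a single substitution at position 1.
    reference = "ACGT" * 50
    candidate = "T" + reference[1:]
    n_var = count_variations(reference, candidate)
    satisfied = [n for n in FREQUENCY_LIMITS
                 if within_limit(n_var, len(reference), n)]
    print(n_var, satisfied)
```

In this invented example one variation in 200 positions is below the 1-in-50 and 1-in-100 limits but not below 1 in 200, since the comparison is strict.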

TABLE 3

Homologs of AimR-AimP in sequenced genomes

Columns (left to right): Locus Tag of AimR | NA SEQ ID NO | AA SEQ ID NO | IMG_oid AimR | Genome Name | Scaffold Accession | Start Coord | End Coord | Strand | Amino Acid Length (aa) | E-value of Homology to phi3T AimR | Associated AimP Sequence | SEQ ID NO | Mature peptide sequence | SEQ ID NO | Genomic context | Phage type | DNA binding region SEQ ID NO

114

Phage
KY030782
68883
70019
+
378
0
MKKVFFGLVILTA
227
SAIRGA
269
Phage
SpBeta-
287

Phi3T

LAISFVAGQQSVST

like

ASASDEVTVASAIR

GA

Ga0069201_
2
115
2635513586

Bacillus

Ga0069201_
208226
209371
-
381
0
LKKTILGVAIIAAL
228
SIIRGA
270
Pro
SpBeta-
288

120285

amyloli-
120

ALSFVAGQKSVSTA

-
like

quefaciens

APNDEISVASIIRGA

plantarum

GR4-5

C379DRAFT_
3
116
2552910987

Bacillus

C379DRAFT_
266
1426
-
386
1.00E-
MKKLIMALVILGA
229
GMPRGA
271
un-
SpBeta-
289

03616

subtilis

ANIP01000036_

86
LGTSFISADSSIRQA

certain
like

S1-4
1.36

SGDYEVAGMPRGA

SPBc2p081
4
117
638282206

Bacillus

NC_001884
76295
77455
+
386
5.00E-
MKKLIMAILVILGA
230
GMPRGA
271
Phage
SpBeta-
290

phage

88
LGTSYISADSSIQQA

SpBeta
like

SPbeta

SGDYEVAGMPRGA

BsubsJ_010
5
118
643902925

Bacillus

NZ_
230462
231622
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

100011343

subtilis

ABQM01000008

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

JH642

BacJ22_001
6
119
2505840589

Bacillus

BacJ22_
6956
8116
+
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

1.00000140

subtilis

scaffold_

87
LGTSYISADSSIQQA

phage
like

J22
10

SGDYEVAGMPRGA

BacJ26_000
7
120
2505863705

Bacillus

BacJ22_
49692
50852
-
386
3.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

6.00000860

subtilis

scaffold_

86
LGTSYISADSSIQQA

phage
like

J22
5

SGDYEVAGMPRGA

B657_20860
8
121
2518770117

Bacillus

CP003783
2188487
2189647
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

subtilis

88
LGTSYISADSSIQQA

phage
like

QB928

SGDYEVAGMPRGA

BSU6051_
9
122
2540550687

Bacillus

CP003329
2208993
2210153
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

20860

subtilis

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

6051-HGW

A154DRAFT_
10
123
2551879023

Bacillus

A154DRAFT_
91519
92481
+
320
6.00E-
MKKVFIGLTIVAS
231
GFGRGA
272
Pro-
un-
290

00465

amyloli-
AMQ101000006_

57
LAVGFVAGQQTTIH

phage
certain

quefaciens

1.6

SASGEGTFHVAGFG

amyloli-

RGA

quefaciens

DC-12

Ga0054580_
11
124
2612412751

Bacillus

Ga0054580_
10029
1004108
+
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

1011017

subtilis

101

87
LGTSYISADSSIQQA

phage
like

GXA-28

SGDYEVAGMPRGA

Ga0098749_
12
125
2642060655

Bacillus

Ga0098749_
963601
964761
+
386
5.00E-
MKKLIMALVVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

1011083

murimartini

101

88
LGTSYISADSSIQQA

phage
like

LMG21005

SGDYEVAGMPRGA

Ga0098284_
13
126
2648115039

Bacillus

Ga0098284_
2227091
2228251
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112311

subtilis

11

88
LGTSYISADSSIQQA

phage
like

BS49

SGDYEVAGMPRGA

Ga0111823_
14
127
2661428278

Bacillus

Ga0111823_
2209094
2210254
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112291

subtilis

11

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

168

BSUA_
15
128
2585935209

Bacillus

CP007800
2181762
2182922
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

02243

subtilis

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

JH642

subAG174

Ga0098717_
16
129
2649478018

Bacillus

Ga0098717_
213507
214667
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

105270

subtilis

105

88
LGTSYISADSSIQQA

phage
like

MS1577

SGDYEVAGMPRGA

Bsubs1_010
17
130
643894000

Bacillus

NZ_
1189758
1190897
-
379
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

100011496

subtilis

ABQK01000005

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

168

BSUB_
18
131
2585930860

Bacillus

CP008698
2181762
2182922
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

02243

subtilis

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

AG1839

Ga0060198_
19
132
2624474520

Bacillus

Ga0060198_
72834
73994
+
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

10884

subtilis

108

87
LGTSYISADSSIQQA

phage
like

inaquosorum

SGDYEVAGMPRGA

BGSC3A28

Ga0077871_
20
133
2633570397

Bacillus

Ga0077871_
2188573
2189733
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112265

subtilis

11

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

3NA

Ga0112192_
21
134
2662146470

Bacillus

Ga0112192_
18734
19894
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

12634

subtilis

126

88
LGTSYISADSSIQQA

phage
like

B4146

SGDYEVAGMPRGA

Ga0112188_
22
135
2663822497

Bacillus

Ga0112188_
8284
9444
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

17017

subtilis

170

88
LGTSYISADSSIQQA

phage
like

B4071

SGDYEVAGMPRGA

Ga0111750_
23
136
2668370401

Bacillus

Gs0111750_
2208841
2210001
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112291

subtilis

11

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

168

Ga0112185_
24
137
2665375350

Bacillus

Ga0112185_
18751
19911
-
386
2.00E-
MKKLIMALVILGA
229
GMPRGA
271
Pro-
SpBeta-
290

101037

subtilis

1010

87
LGTSFISADSSIRQA

phage
like

B4068

SGDYEVAGMPRGA

Ga0112983_
25
138
2670038339

Bacillus

Ga0112983_
53996
55156
+
386
1.00E-
MKKLIMALVILGA
229
GMPRGA
271
Pro-
SpBeta-
290

104062

subtilis

1040

86
LGTSFISADSSIRQA

phage
like

subtilis

SGDYEVAGMPRGA

B4067

BsubsN3_
26
139
643898395

Bacillus

NZ_
1189759
1190919
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

0101000114

subtilis

ABQL01000005

88
LGTSYISADSSIQQA

phage
like

17

NCIB 3610

SGDYEVAGMPRGA

Ga0072477_
27
140
2637211300

Bacillus

Ga0072477_
2209087
2210247
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112290

subtilis

11

88
LGTSYISADSSIQQA

phage
like

KCTC 1028

SGDYEVAGMPRGA

AUSI98
28
141
2547773423

Bacillus

AUSI98DRAFT_
27978
29138
+
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
291

DRAFT_

subtilis

AFSF0100

88
LGTSYISADSSIQQA

phage
like

00510

subtilis

SGDYEVAGMPRGA

AUSI98

Ga0112186_
29
142
2662791384

Bacillus

Ga0112186_
72619
73779
-
386
1.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
292

158126

subtilis

158

85
LGTSYISADSSIQQA

phage
like

B4069

SGDYEVAGMPRGA

BsubsS_
30
143
643907317

Bacillus

NZ_
1024361
1025500
-
379
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
293

0100011472

subtilis

ABQN01000008

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

SMY

Ga0077513_
31
144
2640459274

Bacillus

Ga0077513_
120393
121595
-
400
9.00E-
MKKLLIGIFVSATL
232
GFTVGA
273
Pro-
SpBeta-
294

1097185

glycini-

1097

90
LAVGYVASQVNNS

phage
like

fermentans

GYSIAGFTVGA

TH008

Ga0100863_
32
145
2651529367

Bacillus

Ga0100863_
37279
38310
+
343
2.00E-
MKKITMSVIVLA
233
DPGRGG
274
Pro-
phage
295

10136

amyloli-
101

53
AIVTVVLGSVQHQE

phage
group D

quefaciens

AKSHTVNQLADPG

RGG

BMSB_
33
146
2525735558

Bacillus

BMSB_NODE_
193495
194652
+
385
8.00E-
MKKLVMALVLLA
234
SASRGA
275
Pro-
SpBeta-
296

03973

pumilus

203_3_len_

117
AVAGVFSGTQQSIA

phage
like

CCMA-560
235571_cov_

LDDEKVSTSSASRG

(Bio-
807_850708.

A

suractant)
114

V529_20920
34
147
2578426885

Bacillus

CP006890
2191528
2192682
-
384
2.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
297

amyloli-

154
LTVGFVSGQQSVQ

phage
like

quefaciens

TANGDITVASASRG

SQR9

A

Ga0077150_
35
148
2630228571

Bacillus

Ga0077150_
64995
66149
+
384
1.00E-
MKKFNCAIVILLA
236
SASRGA
275
Pro-
SpBeta-
297

10184

amyloli-
101

154
LAVGFVSGQQSVQ

phage
like

quefaciens

TANGDITVASASRG

JJC33M

A

Ga0100863_
36
149
2651532210

Bacillus

Ga0100863_
17241
18395
+
384
1.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
297

12531

amyloli-
125

154
LTVGFVSGQQSVQ

phage
like

quefaciens

TANGDITVASASRG

RHNK22

A

K667DRAFT_
37
150
554710456

Bacillus

K667DRAFT_
43047
44201
-
384
2.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
298

03728

subtilis

AQGM010000023_

155
LTVGFVSGQQSVQ

phage
like

SPZ1
1.23

TANGDITVASASRG

A

L145DRAFT_
38
151
2554737155

Paeni-
L145DRAFT_
2478
6632
+
384
2.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
298

03550

bacillus

ARYD01000023_

155
LTVGFVSGQQSVQ

phage
like

polymyxa

1.23

TANGDITVASASRG

ATCC

A

12321

BATRDET
39
152
2547896382

Bacillus

BATRD
96814
97707
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

2DRAFT_

atrophaeus

ET2DRAFT_

61
LMFSYASVKLASN

phage
like

00678

Detrick-2
AEFQ01000007_

EQTLGDYEVAGVV

1.7

RGA

BAC51E
40
153
2548050036

Bacillus

BAC51EDRAFT_
108728
109621
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

AEFX01000024_

61
LMFSYASVKLASN

phage
like

03601

BACI051-E
1.24

EQTLGDYEVAGVV

RGA

BATR8221
41
154
2547925085

Bacillus

BATR82
105859
109482
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

21DRAFT_

61
LMFSYASVKLASN

phage
like

04209

globigii

ARFV01000030_

EQTLGDYEVAGVV

ATCC
1.30

RGA

49822-1

BATR8222
42
155
2547920785

Bacillus

BATR82
107377
108270
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

22DRAFT_

61
LMFSYASVKLASN

phage
like

04125

globigii

AEFW01000028_

EQTLGDYEVAGVV

ATCC
1.28

RGA

49822-2

Ga0057413_
43
156
2598460075

Bacillus

Ga0057413_
2140405
2141298
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

02088

atrophaeus

gi673978761.1

61
LMFSYASVKLASN

phage
like

globigii

EQTLGDYEVAGVV

BSS

RGA

BATR132
44
157
2547912402

Bacillus

BATR132
47035
47928
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

04097

1013-2
AEFT01000028_

EQTLGDYEVAGVV

1.28

RGA

BAC51ND
45
158
2548024839

Bacillus

BAC51N
55449
56342
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

RAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

02086

BACI051-N
AEFY01000036_

EQTLGDYEVAGVV

1.36

RGA

BATR
46
159
2547985264

Bacillus

BATR
47819
48712
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

04181

globigii

AEFO01000042_

EQTLGDYEVAGVV

Dugway
1.42

RGA

RBAU_
47
160
2541780767

Bacillus

HG328253
2209877
2211025
-
382
1.00E-
MKNILGIVILLAM
238
SPSRGA
277
Pro-
SpBeta-
300

2086

amyloli-

155
AVGFVAGQQSIETA

phage
like

quefaciens

SVDHVDQPVKVAS

plantarum

PSRGA

UCMB5033

BAMTA208_
48
161
651181381

Bacillus

CP002627
2938440
2939402
-
320
2.00E-
MKKVFIGLTIVAS
239
GFGRGA
272
Pro-
phi105-
301

15460

amyloli-

57
LAVGFVAGQQTTIH

phage
like

quefaciens

TASGEETFHVAGFG

TA208

RGA

Ga0069498_
49
162
2628601775

Bacillus

Ga0069498_11
2941301
2942263
-
320
2.00E-
MKKVFIGLTIVAS
239
GFGRGA
272
Pro-
phi105-
301

113134

subtilis

57
LAVGFVAGQQTTIH

phage
like

ATCC

TASGEETFHVAGFG

13952

RGA

BAXH7_
50
163
2511913376

Bacillus

CP002927
2940486
2941448
-
320
2.00E-
MKKVFIGLTIVAS
239
GFGRGA
272
Pro-
phi105-
301

03158

amyloli-

57
LAVGFVAGQQTTIH

phage
like

quefaciens

TASGEETFHVAGFG

XH7

RGA

BAMF_
51
164
649669044

Bacillus

NC_014551
2971658
2972620
-
320
2.00E-
MKKVFIGLTIVAS
239
GFGRGA
272
Pro-
phi105-
301

2913

amyloli-

57
LAVGFVAGQQTTIH

phage
like

quefaciens

TASGEETFHVAGFG

Campbell F,

RGA

DSM 7

Ga0100815_
52
165
2646412539

Bacillus

Ga0100815_114
292702
293664
+
320
2.00E-
MKKVFIGLTIVAS
240
GFGRGA
272
Pro-
phi105-
302

114334

amyloli-

56
LAVGFVAGQQTTIH

phage
like

quefaciens

SASGEETFHVAGFG

XK-4-1

RGA

H008DRAFT_
53
166
2586341037

Bacillus

H008DRAFT_
256225
257187
+
320
6.00E-
MKKVFIGLTIVAS
231
GFGRGA
272
Pro-
Mu-like
303

01224

methylo-
AOFO01000003_

57
LAVGFVAGQQTTIH

phage
(virfam)

trophicus

1.3

SASGEGTFHVAGFG

SK19.001

RGA

EGDHOCAQ14_
54
167
2545558323

Bacillus

EGDHPCAQ14_
293215
294177
+
320
4.00E-
MKKVFIGLTIVAS
240
GFGRGA
272
un-
un-
304

02563

amyloli-
contig00004.4

56
LAVGFVAGQQTTIH

certain
certain

quefaciens

SASGEETFHVAGFG

EGD-AQ14

RGA

O205_
55
168
2578932748

Bacillus

AVQH010000344
293215
294177
+
320
4.00E-
MKKVFIGLTIVAS
240
GFGRGA
272
un-
un-
304

13015

amyloli-

56
LAVGFVAGQQTTIH

certain
certain

quefaciens

SASGEETFHVAGFG

EGD-AQ14

RGA

Ga0077150_
56
169
2630231491

Bacillus

GA0077150_
118370
119332
+
320
7.00E-
MKKVFIGLTIVAS
241
GFGRGA
272
Pro-
phi105-
305

109123

amyloli-
109

58
LAVGFVAGQQTTIH

phage
like

quefaciens

NAASGEETFHVAG

JJC33M

FGRGA

ba1_12134
57
170
2535064176
Bacillus
AMSH01000035
31377
32540
-
387
7.00E-
MKKTALFLIVAVT
242
AMGNG
278
Pro-
Mu-like
306

sp.

78
IFSVGFASGQTSEQ

G

phage
(virfam)

HYC-10

AIEFIKTAAMGNGG

FGRGA

Ga0111348_
58
171
2656454588

Bacillus

Ga0111348_12
3159949
3160911
-
320
3.00E-
MSMKIKLGLAAD
243
TIGRG
279
Pro-
phi105-
307

123252

amyloli-

46
AVALFVAGYATNQ

phage
like

quefaciens

AVKDVAAGQDTVF

plantarum

KVATIGRG

NAU-B3

null

replaces

81671

Ga0081671_
59
172
2638063346

Bacillus

Ga0081671_
3159949
3160911
-
320
3.00E-
MSMKIKLGLAAD
243
TIGRG
279
Pro-
phi105-
307

113251

amyloli-
11

46
AVALFVAGYATNQ

phage
like

quefaciens

AVKDVAAGQDTVF

plantarum

KVATIGRG

NAU-B3

C379DRAFT_
60
173
2552909043

Bacillus

C379DRAFT_
93495
94667
+
390
2.00E-
MKKVFIGLTIVAA
244
SASRGA
275
Pro-
Mu-like
308

01654

subtilis

ANIP01000010_

121
LAVAFVAGQHSQV

phage
(virfam)

S1-4
1.10

DTASGSVSVASASR

GA

BSSC8_
61
175
2521025224

Bacillus

AGFW01000004
67653
68813
+
386
4.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
309

21910

subtilis

86
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

SC-8

Ga0077901_
62
175
2645892320

Bacillus

Ga0077901_11
2230615
2231754
-
379
0
MKKIIFGTAILAAL
245
SAIRGA
269
Pro-
SpBeta-
310

112256

methylo-

AISFIAGQHSVNTA

phage
like

trophicus

SVSDEISVASAIRGA

JJ-D34

Ga0071349_
63
176
2616415682

Bacillus

Ga0071349_
2230136
2231275
-
379
0
MKKIIFGTAILAAL
245
SAIRGA
269
Pro-
SpBeta-
310

112240

methylo-

11

AISFIAGQHSVNTA

phage
like

trophicus

SVSDEISVASAIRGA

JJ-D34

BACAU_
64
177
2511700821

Bacillus

NC_016784
3053043
3054005
-
320
2.00E-
MKKVFIGLTIVAS
240
GFGRGA
272
Pro-
phi105-
311

2847

amyloli-

56
LAVGFVAGQQTTIH

phage
like

quefaciens

SASGEETFHVAGFG

CAU-B946

RGA

Ga0111348_
65
178
2656453156

Bacillus

Ga0111348_
1723410
1724549
+
379
0
MKKIIFGTAILASL
246
SAIRGA
269
Pro-
SpBeta-
312

121799

amyloli-
12

AISFIAGQSVNTA

phage
like

quefaciens

SASDEISVASAIRGA

plantarum

NAU-B3

null

replaces

81671

O205_
66
179
2578929474

Bacillus

AVQH01000001
220604
221743
-
379
0
MKKIIFGTAILASL
246
SAIRGA
269
Pro-
SpBeta-
312

01290

amyloli-

AISFIAGQSVNTA

phage
like

quefaciens

SASDEISVASAIRGA

EGD-AQ14

Ga0081671_
67
180
2638061915

Bacillus

Ga0081671_
1723410
1724549
+
379
0
MKKIIFGTAILASL
246
SAIRGA
269
Pro-
SpBeta-
312

111799

amyloli-
11

AISFIAGQSVNTA

phage
like

quefaciens

SASDEISVASAIRGA

plantarum

NAU-B3

LL3_01627
68
181
651184090

Bacillus

CP002634
1507610
1508575
+
321
4.00E-
MKNKLKIGLAVA
247
TIGRGG
280
Pro-
phage
312

amyloli-

58
VLSVSVIGFVANKA

phage
group D

quefaciens

MNAAADAKEPQFK

LL3

VATIGRGG

EGDHPCA
69
182
2545556050

Bacillus

EGDHP
220604
221743
-
379
0
MKKIIFGTAILASL
246
SAIRGA
269
Pro-
SpBeta-
312

Q14_00251

amyloli-
CAQ14_contig

AISFIAGQSVNTA

phage
like

quefaciens

0000.1.

SASDEISVASAIRGA

EGD-AQ14

M769_
70
183
2571030593

Bacillus

AZYP01000118
269
1435
-
388
9.00E-
MKKLFVGIVVSVS
248
GFTVGA
23
un-
un-
313

0124555

lichen-

93
LLAVGIAAAQINSG

certain
certain

iformis

FSVAGFTVGA

S 16

Ga0102445_
71
184
2645666845

Bacillus

Ga0102445_
161090
162247
-
385
2.00E-
MKKLVMALVLVA
249
GASRGA
275
Pro-
SpBeta-
314

1200198

sp. Leaf49
120

109
AVAGVFSGTQQSIA

phage
like

MDAEKSVTSSASR

GA

BSONL12_
72
185
2546447199

Bacillus

AOFM01000019
77077
78243
-
388
1.00E-
MKKWLFSIAVVA
250
GFPRGA
281
Pro-
SpBeta-
315

23325

sonorensis

94
ALLITGVAVAESTH

phage
like

L12

QAEGGYYIAGRPRG

A

Ga0077185_
73
186
2655559738

Bacillus

Ga0077185_
1026770
1027939
+
389
6.00E-
MKKLFVGIVVSVS
251
GFTVGA
273
Pro-
SpBeta-
316

1131171

lichen-
113

90
LLAVGIAAAQVNS

phage
like

iformis

GFSVAGFTVGA

GB2

C650DRAFT_
74
187
2553299155

Bacillus

C650DRAFT_
13723
14892
-
389
6.00E-
MKKLFVGIVVSVS
251
GFTVGA
273
Pro-
SpBeta-
316

02718

lichen-
AMWQ01000046_

90
LLAVGIAAAQVNS

phage
like

iformis

1.46

GFSVAGFTVGA

CGMCC

3963

M661DRAFT_
75
188
2555268673

Bacillus

M661DRAFT_
78786
79955
+
389
2.00E-
MKKLFVGIVVSVS
251
GFTVGA
273
Pro-
SpBeta-
316

01208

sp.
ATNR01000004_

91
LLAVGIAAAQVNS

phage
like

SB47
1.4

GFSVAGFTVGA

Ga0098704_
76
189
2647076482

Bacillus

Ga0098704_
40519
41574
-
351
1.00E-
MKKLFVGIVVSVS
251
GFTVGA
273
Pro-
SpBeta-
316

11068

lichen-
110

77
LLAVGIAAAQVNS

phage
like

iformis

GFSVAGFTVGA

S127

HMPREF1012_
77
190
650030994

Bacillus

NZ_
25867
27036
-
389
3.00E-
MKKLFVGIVVSVS
251
GFTVGA
273
un-
SpBeta-
316

01017

sp.
ACWC01000004

90
LLAVGIAAAQVNS

certain
like

BT1B_CT2

GFSVAGFTVGA

N399_
78
191
2576359374

Bacillus

AVEZ01000043
78729
79886
+
385
5.00E-
MKKLFVGIVVSVT
252
GFTVGA
273
Pro-
SpBeta-
317

24140

lichen-

92
LLAVGIAAAKINSG

phage
like

iformis

FSVAGFTVGA

CG-B52

W9SDRAFT_
79
192
2550811423

Bacillus

W9SDRAFT_
15375
16532
+
385
2.00E-
MKKLFVGIVVSVS
253
GFTVGA
273
Pro-
SpBeta-
318

00383

lichen-
AJL

91
LLAVGIAASQINSG

phage
like

iformis

V01000017_

FSVAGFTVGA

10-1-A
1.17

Ga0057513_
80
193
2598368491

Bacillus

Ga0057513_
2066016
2066909
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
319

02496

subtilis

gi674581281.3

61
LMFSYASVKLASN

phage
like

var niger

EQTLGDYEVAGVV

PCI246

RGA

BATR1942_
81
194
649707910

Bacillus

NC_014639
1594016
1594909
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
319

07720

atrophaeus

61
LMFSYASVKLASN

phage
like

1942

EQTLGDYEVAGVV

RGA

C379DRAFT_
82
195
2552911160

Bacillus

C379DRAFT_
145417
146379
-
320
2.00E-
MAKKMKLGLATA
254
TIGRG
279
Pro-
phi105-
320

03789

subtilis

ANIP01000036_

51
AVALFLAGYATNL

phage
like

S1-4
1.36

VVSDVAAGKGDVF

KVATIGRG

BAMTA208_
83
196
651179561

Bacillus

CP002627
1257136
1258296
+
386
5.00E-
MKKLFMGITIAAV
255
GVVRGA
276
Pro-
SpBeta-
321

06415

amyloli-

95
LMFSYASVKLVSN

phage
like

quefaciens

EQASGDYEVAGVV

TA208

RGA

Ga0069498_
84
197
2628600848

Bacillus

Ga0069498_
2117404
2118564
-
386
5.00E-
MKKLFMGITIAAV
255
GVVRGA
276
Pro-
SpBeta-
321

112187

subtilis

11

95
LMFSYASVKLVSN

phage
like

ATCC

EQASGDYEVAGVV

13952

RGA

BSI_
85
198
2536647131

Bacillus

AMXN01000009
83188
83904
+
238
3.00E-
MKKVTIGLTIVAA
256
GFGRGA
272
un-
un-
322

39170

subtilis

37
LAIGFVAGQQSGLH

certain
certain

inaquosorum

SASGNETFHVAGFG

KCTC

RGA

13429

Ga0055124_
86
199
2612266796

Bacillus

Ga0055124_
106622
107584
-
320
6.00E-
MKKVTIGLVTIVAA
257
GFGRGA
272
un-
un-
323

109116

subtilis

109

50
LTIGFVAGQQSGLH

certain
certain

E1

SASGNKTFHVAGF

GRGA

BAME_
87
200
2540180739

Bacillus

AJWW01000017
45331
46491
-
386
7.00E-
MKKVFALTIVAA
258
GFGHGA
282
Pro-
Mu-like
324

14140

sp. M 2-6

79
AIFFGGVVTGTQIN

phage
(virfam)

(*Bacillus

SASDFNTAGFGHG

aerophilus

A

KACC

16563 Gil)

Ga0102445_
88
201
2645667694

Bacillus

Ga0102445_
280606
281766
-
386
1.00E-
MKKVFALTIVVA
259
GFGRGA
272
Pro-
phi105-
325

121323

sp. Leaf49
121

87
AIFFGGVATGTQID

phage
like

TASDYSTAGFGRG

A

EGDHPCAQ14_
89
202
2545559244

Bacillus

EGDHPCAQ14_
2398
3576
-
392
8.00E-
MKKVLIGLTIVAA
260
SIGHGA
283
un-
un-
326

03492

amyloli-
contig00011.

67
LTVGFVGGQYSVN

certain
certain

quefaciens

11

NASGDVQVASIGH

EGD-AQ14

GA

O205_
90
203
2578930197

Bacillus

AVQH01000003
2398
3576
-
392
8.00E-
MKKVLIGLTIVAA
260
SIGHGA
283
un-
un-
326

97690

amyloli-

67
LTVGFVGGQYSVN

certain
certain

quefaciens

NASGDVQVASIGH

EGD-AQ14

GA

Ga0112190_
91
24
2668626179

Bacillus

Ga0112190_
69813
70967
-
384
8.00E-
MKKVLYSLIIVIAL
261
SPSRGA
277
Pro-
SpBeta-
327

1037117

subtilis

1037

1.72
AVGFVGGQKSMET

phage
like

B4073

ASVDQPIKVASPSR

GA

Bcell_
92
205
649826024

Bacillus

NC_014829
2492123
2493163
+
346
5.00E-
MKKFIKGLIIAVTL
262
?
284
not

328

2296

cellulo-

39
VAASTSIPTSSVAY

pro-

silyticus

DVDFRVRSIEVVDV

phage

N-4,

QPLYDVDFRVRSV

DSM 2522

EQFDVQPLYDVDF

RVR

BSU20860
93
206
646318642

Bacillus

NC_000964
2208994
2210154
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

subtilis

88
LGTSYISADSSIQQA

phage
like

subtilis

SGDYEVAGMPRGA

168

LL3_02315
94
207
651184775

Bacillus

CP002634
2201471
2202610
-
379
0
MKKIIFGTAILAAL
263
SAIRGA
269
Pro-
SpBeta-
329

amyloli-

AISFIAGQHSVNTA

phage
like

quefaciens

SASDEISVASAIRGA

LL3

BacJ24_
95
208
2505852091

Bacillus

BacJ24_
83769
84929
+
386
3.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

0008.

subtilis

scaffold_7

86
LGTSYISADSSIQQA

phage
like

00001040

J24

SGDYEVAGMPRGA

BAXH7_
96
209
2511911529

Bacillus

CP002927
1258713
1259873
+
386
5.00E-
MKKLFMGITIAAV
255
GVVRGA
276
Pro-
SpBeta-
321

01322

amyloli-

95
LMFSYASVKLVSN

phage
like

quefaciens

EQASGDYEVAGVV

XH7

RGA

BANAU_
97
210
2514130679

Bacillus

NC_017061
1159971
1160933
+
320
2.00E-
MKKVFICLTIVAS
264
GRGRGA
272
Pro-
Mu-like
330

1103

amyloli-

57
LAVGFIAGQQTTIH

phage
(virfam)

quefaciens

SASGEETFHVAGFG

planatarum

RGA

YAU

B9601-Y2

BANAU_
98
211
2514131006

Bacillus

NC_017061
1528667
1529629
+
320
5.00E-
MKKVLIGLAIVAA
265
NRGRGA
285
Pro-
phage
331

1431

amyloli-

56
LAVGFVGGQHFKT

phage
group D

quefaciens

ASGDIQMANPGRG

planatarum

A

YAU

B9601-Y2

BANAU_
99
212
2514131350

Bacillus

NC_017061
1939108
1940262
+
384
6.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
297

1775

amyloli-

155
LTVGFVSGQQSVQ

phage
like

quefaciens

TANGDITVASASRG

planatarum

A

YAU

B9601-Y2

BS732_
100
213
2531250071

Bacillus

AOTY01000004
37378
38517
+
379
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

3741

subtilis

88
LGTSYISADSSIQQA

phage
like

MB732

SGDYEVAGMPRGA

MUS_1241
101
214
2540720195

Bacillus

CP003332
1158120
1159082
+
320
2.00E-
MKKVFICLTIVAS
264
GRGRGA
272
Pro-
Mu-like
330

amyloli-

57
LAVGFIAGQQTTIH

phage
(virfam)

quefaciens

SASGEETFHVAGFG

Y2

RGA

MUS_1619
102
215
2540720547

Bacillus

CP003332
1524792
1525754
+
320
5.00E-
MKKVLIGLAIVAA
265
NRGRGA
285
Pro-
phage
331

amyloli-

56
LAVGFVGGQHFKT

phage
group D

quefaciens

ASGDIQMANPGRG

Y2

A

MUS_1994
103
216
2540720888

Bacillus

CP003332
1935234
1936334
+
384
6.00E-
MKKFNCAIVILLA
235
SASRGA
275
Pro-
SpBeta-
297

amyloli-

155
LTVGFVSGQQSVQ

phage
like

quefaciens

TANGDITVASASRG

Y2

A

GYSDRAFT_
104
217
2547506781

Bacillus

GYSDRAFT_
106
1116
+
336
6.00E-
MKKLIMALVILGA
266
GMPRGA
271
Pro-
SpBeta-
332

03015

vallis-
AFSH0100007

63
LGTSYISADSSNQQ

phage
like

mortis DV1-

ASGDYEVAGMPRG

F-131_.71

A

BATRMS
105
218
2547798787

Bacillus

BATRMDRAFT_
108435
109328
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_04055

atrophaeus

AEFM01000025_

61
LMFSYASVKLASN

phage
like

ATCC 9372
1.25

EQTLGDYEVAGVV

RGA

BATRDET3
106
219
2547887380

Bacillus

BATRDET3
47822
48715
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_04168

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

Detrick-3
AEFR01000030_

EQTLGDYEVAGVV

1.30

RGA

BATRDET1
107
220
2547902834

Bacillus

BATRDET1
123791
124684
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

02926

Detrick-1
AEFP01000015_

EQTLGDYEVAGVV

1.15

RGA

BATR722
108
221
2547907714

Bacillus

BATR272
123820
124713
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

03603

ATCC
AEFU01000031_

EQTLGDYEVAGVV

9372-2
1.31

RGA

BATR131
109
222
2547916178

Bacillus

BATR131
173190
174083
+
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

DRAFT_

atrophaeus

DRAFT_

61
LMFSYASVKLASN

phage
like

03725

1013-1
AEFS01000022_

EQTLGDYEVAGVV

1.22

RGA

EF83_
110
223
2579676294

Bacillus

JMEF01000083
1896
3056
-
386
5.00E-
MKGGDKMKKFIM
267
GIVRGA
286
Pro-
un-
333

22295

subtilis

92
AITIAAVLSISFVGA

phage
certain

KATMIRA

KASSNEQASGDYQ

1933

VAGIVRGA

EF83_
111
224
2579676381

Bacillus

JMEF01000107
1005
1721
-
238
8.00E-
MKKVFIGLAIVAA
268
SASRGA
275
un-
un-
334

22800

subtilis

78
LAVAFVAGQHSQT

certain
certain

KATMIRA

DNASGNVSVASAS

1933

RGA

Ga0077944_
112
225
2635815553

Bacillus

Ga0077944_
2140380
2141273
-
297
1.00E-
MKKIFMGITIAAV
237
GVVRGA
276
Pro-
SpBeta-
299

112105

atrophaeus

11

61
LMFSYASVKLASN

phage
like

NRS 1221A

EQTLGDYEVAGVV

RGA

Ga0098211_
113
226
2652254345

Bacillus

Ga0098211_
2227093
2228253
-
386
5.00E-
MKKLIMALVILGA
230
GMPRGA
271
Pro-
SpBeta-
290

112309

sp. BS34A
11

88
LGTSYISADSSIQQA

phage
like

SGDYEVAGMPRGA

It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements.

Various embodiments and aspects of the present invention as delineated hereinabove and as claimed in the claims section below find experimental support in the following examples.

EXAMPLES

Reference is now made to the following examples, which together with the above descriptions illustrate some embodiments of the invention in a non-limiting fashion.

Generally, the nomenclature used herein and the laboratory procedures utilized in the present invention include molecular, biochemical, microbiological and recombinant DNA techniques. Such techniques are thoroughly explained in the literature. See, for example, “Molecular Cloning: A laboratory Manual” Sambrook et al., (1989); “Current Protocols in Molecular Biology” Volumes I-III Ausubel, R. M., ed. (1994); Ausubel et al., “Current Protocols in Molecular Biology”, John Wiley and Sons, Baltimore, Md. (1989); Perbal, “A Practical Guide to Molecular Cloning”, John Wiley & Sons, New York (1988); Watson et al., “Recombinant DNA”, Scientific American Books, New York; Birren et al. (eds) “Genome Analysis: A Laboratory Manual Series”, Vols. 1-4, Cold Spring Harbor Laboratory Press, New York (1998); methodologies as set forth in U.S. Pat. Nos. 4,666,828; 4,683,202; 4,801,531; 5,192,659 and 5,272,057; “Cell Biology: A Laboratory Handbook”, Volumes I-III Cellis, J. E., ed. (1994); “Culture of Animal Cells— A Manual of Basic Technique” by Freshney, Wiley-Liss, N. Y. (1994), Third Edition; “Current Protocols in Immunology” Volumes I-III Coligan J. E., ed. (1994); Stites et al. (eds), “Basic and Clinical Immunology” (8th Edition), Appleton & Lange, Norwalk, Conn. (1994); Mishell and Shiigi (eds), “Selected Methods in Cellular Immunology”, W. H. Freeman and Co., New York (1980); available immunoassays are extensively described in the patent and scientific literature, see, for example, U.S. Pat. Nos. 3,791,932; 3,839,153; 3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074; 3,984,533; 3,996,345; 4,034,074; 4,098,876; 4,879,219; 5,011,771 and 5,281,521; “Oligonucleotide Synthesis” Gait, M. J., ed. (1984); “Nucleic Acid Hybridization” Hames, B. D., and Higgins S. J., eds. (1985); “Transcription and Translation” Hames, B. D., and Higgins S. J., eds. (1984); “Animal Cell Culture” Freshney, R. I., ed. 
(1986); “Immobilized Cells and Enzymes” IRL Press, (1986); “A Practical Guide to Molecular Cloning” Perbal, B., (1984) and “Methods in Enzymology” Vol. 1-317, Academic Press; “PCR Protocols: A Guide To Methods And Applications”, Academic Press, San Diego, Calif. (1990); Marshak et al., “Strategies for Protein Purification and Characterization—A Laboratory Course Manual” CSHL Press (1996); all of which are incorporated by reference as if fully set forth herein. Other general references are provided throughout this document. The procedures therein are believed to be well known in the art and are provided for the convenience of the reader. All the information contained therein is incorporated herein by reference.

Materials and Methods

Oligos and Reagents—

All oligos were purchased from Sigma (St. Louis, Mo.) or Integrated DNA Technologies (IDT, San Jose, Calif.). Synthetic peptides were purchased from Peptide 2.0 Inc. (Chantilly, Va.), at 98% purity, desalted.

Preparation of Conditioned Media—

A schematic representation of the procedure is shown in FIG. 1A. Specifically, overnight cultures of B. subtilis strain 168 were diluted 1:50 in LB media supplemented with 0.1 mM MnCl₂ and 5 mM MgCl₂, and incubated at 37° C. with shaking until reaching optical density (O.D) 600 nm=0.5. Phages (phi29, phi105, rho14 or phi3T) were added to the bacterial culture at MOI=1 and incubated for 3 hours. The media were centrifuged at 4000 rpm for 10 minutes at 4° C. and the supernatant was filtered with a 0.2 μm filter (GE Healthcare Life Sciences, Whatman, CAT #10462200). Phages and large molecules were further filtered out from the media by using Amicon Ultra centrifugal filters at a cutoff of 3,000 NMWL (3 kDa) (Millipore, CAT # UFC900324). A plaque assay was performed in order to verify that no phages were left in the medium.

Proteinase K Treatment—

7.5 mg (per reaction) of Proteinase K-Agarose from Tritirachium album (Sigma, CAT # P9290) were washed twice with 750 μl of sterile water and then resuspended with 750 μl of LB supplemented with 0.1 mM MnCl₂ and 5 mM MgCl₂. Following, the tubes were centrifuged again and the supernatant was discarded. 1.5 ml of phi3T-derived conditioned medium or control medium was added to a tube containing the washed proteinase K. The media were incubated for 2 hours at 37° C. with the proteinase K. The media were centrifuged and the supernatants were collected for the infection assay.

Growth Dynamics of Phage-Infected Cultures—

Overnight cultures of bacteria were diluted 1:100 in LB media and incubated at 37° C. with shaking until reaching O.D 600 nm=0.1. The bacterial culture was centrifuged at 4000 rpm for 10 minutes at room temperature. The supernatant was discarded and the pellet was resuspended in LB medium supplemented with 0.1 mM MnCl₂ and 5 mM MgCl₂ at 10% of the initial volume. The concentrated bacterial culture was added to conditioned medium or medium supplemented with synthesized arbitrium peptide in a ratio of 1:9 (bacteria to medium) and incubated for 1 hour at room temperature. Following, the culture was infected with phages at MOI=0.1. Optical density measurements at a wavelength of 600 nm were taken using a TECAN Infinite 200 plate reader in a 96-well plate. For infection experiments that did not include conditioned medium or addition of a synthesized peptide, the diluted overnight culture was grown to early-logarithmic phase and then infected as described above.

Semi Quantitative PCR Assay for Lysogeny—

An overnight culture of bacteria was diluted 1:100 and grown until reaching O.D 600 nm=0.1. Medium was replaced (with conditioned medium or control medium) as described above, and the culture was incubated for 1 hour at room temperature. Bacteria were infected by phi3T at MOI=5. Cell pellets were collected at 0, 15, 30, 40 and 60 minutes post infection in the presence of conditioned or control medium. DNA was extracted using the QIAGEN DNeasy blood and tissue kit (CAT #69504). Multiplex PCR assays to detect phage phi3T DNA, B. subtilis DNA, and the junction between integrated phage and bacterial genome were performed as previously described in Goldfarb et al.²⁹.

Mass Spec—

Conditioned media was filtered using 3 kDa MW cutoff filters (Millipore) and the low molecular weight fraction was desalted using the Oasis HLB μElution plates (Waters Corp.). Samples were dried and stored at −80° C. until analysis. ULC/MS grade solvents were used for all chromatographic steps. Each sample was loaded using split-less nano-Ultra Performance Liquid Chromatography (10 kpsi nanoAcquity; Waters, Milford, Mass., USA). The mobile phase was: A) H₂O+0.1% formic acid and B) acetonitrile+0.1% formic acid. Sample desalting was performed online using a reversed-phase C18 trapping column (180 μm internal diameter, 20 mm length, 5 μm particle size; Waters). Following, the peptides were separated using a T3 HSS nano-column (75 μm internal diameter, 250 mm length, 1.8 μm particle size; Waters) at 0.35 μL/min. The peptides were eluted from the column into the mass spectrometer using the following gradient: 4% to 35% B in 65 min, 35% to 90% B in 5 min, maintained at 90% for 5 min and then back to initial conditions. The nanoUPLC was coupled online through a nanoESI emitter (10 μm tip; New Objective; Woburn, Mass., USA) to a quadrupole orbitrap mass spectrometer (Q Exactive Plus, Thermo Scientific) using a FlexIon nanospray apparatus (Proxeon). Data was acquired in parallel reaction monitoring (PRM) mode, targeting precursor masses 574.33 and 287.67, the singly and doubly charged forms of peptide SAIRGA (SEQ ID NO: 269). MS2 resolution was set to 35,000, the maximum injection time was set to 200 msec, and automatic gain control was set to 2e5. Raw data was imported to Skyline software³⁰ version 3.5. Product ion intensities were extracted and the total area under the curve was calculated.
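The two targeted precursor masses can be reproduced from the peptide sequence. The following is a minimal sketch, assuming standard monoisotopic residue masses and protonated ions; the function name precursor_mz is illustrative and is not part of the described procedure.

```python
# Monoisotopic residue masses (Da) for the residues present in SAIRGA.
RESIDUE_MASS = {
    "S": 87.03203,
    "A": 71.03711,
    "I": 113.08406,
    "R": 156.10111,
    "G": 57.02146,
}
WATER = 18.01056   # one water per linear peptide (free N- and C-termini)
PROTON = 1.007276  # mass added per positive charge


def precursor_mz(peptide, charge):
    """Return the m/z of the [M + zH]z+ ion of an unmodified peptide."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(f"SAIRGA, charge {z}+: m/z = {precursor_mz('SAIRGA', z):.2f}")
```

Under these assumptions the singly and doubly charged values agree with the 574.33 and 287.67 targets stated above.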

AimR Purification—

AimR (SEQ ID NO: 1) was cloned into the expression vector pET28a (Novagen) using Transfer-PCR (TPCR)³¹, using the following primers:

SEQ ID NO: 356, TP28_aimR_F: TTTGTTTAACTTTAAGAAGGAGATATACCATGATTAAGAATGAATGCGAAAAGG

SEQ ID NO: 357, TP28_aimR_cHis_R: CTTTGTTAGCAGCCGGATCTTAGTGGTGGTGGTGGTGGTGAATGAGAGATAAGGTTTAATAAGTCAAG

Following, AimR was expressed in E. coli BL21(DE3) cells with a C-terminal 6× His-tag. Expression was performed at 15° C. for about 18 hours using 200 μM IPTG (Isopropyl β-D-1-thiogalactopyranoside) as an inducer. The cell pellet was resuspended in lysis buffer [50 mM Tris pH 8, 0.3 M NaCl, 20 mM imidazole, 2 mM DTT, 0.2 mg/ml lysozyme, 1 μg/ml DNase, protease inhibitor cocktail (Calbiochem)], disrupted by a cell disrupter at 4° C. and clarified at 15,000 g for 30 minutes. The clarified lysate was loaded onto a HisTrap FF 5 ml column (GE Healthcare) and washed with buffer containing 50 mM Tris pH 8, 0.3 M NaCl, 20 mM imidazole and 2 mM DTT. AimR was eluted from the column in one step with the same buffer containing 0.5 M imidazole. Fractions containing AimR were pooled and injected onto a size exclusion column (HiLoad 16/60 Superdex 200 prep grade, GE Healthcare) equilibrated with 20 mM Tris pH 8, 0.3 M NaCl, 2 mM TCEP. Fractions containing pure AimR were pooled and flash frozen in aliquots using liquid nitrogen.

Pure AimR was injected onto an analytical gel filtration column (Superdex 200 Increase 10/30 GL, GE Healthcare) equilibrated with buffer containing 20 mM Tris pH 8, 0.3 M NaCl, 2 mM TCEP. The migration position of pure AimR was compared to that of AimR-peptide mixtures at the following molar ratios: AimR and SAIRGA (SEQ ID NO: 269) peptide (1:2), AimR and GMPRGA (SEQ ID NO: 271) peptide (1:1). The column was calibrated (inset of FIG. 4C) by monitoring the migration positions of the following known proteins/polymers: blue dextran (2000 kDa), thyroglobulin (669 kDa), apoferritin (443 kDa), beta-amylase (200 kDa), alcohol dehydrogenase (150 kDa), albumin (66 kDa) and carbonic anhydrase (29 kDa).

Microscale Thermophoresis (MST)—

Two-step purified 6×His-tagged AimR stored in Tris/NaCl buffer (50 mM Tris pH 8.0, 150 mM NaCl, 2 mM TCEP) at −80° C. was thawed on ice and centrifuged at 21,000 g for 10 minutes at 4° C. prior to analysis. Peptides [SAIRGA (SEQ ID NO: 269) and GMPRGA (SEQ ID NO: 271)] were solubilized in 50 mM Tris-HCl pH 8.0, 150 mM NaCl to a final concentration of 100 μM. AimR was diluted to 200 nM and was incubated with 16 different peptide concentrations varying between 9-4000 nM, which were prepared in Tris/NaCl buffer containing 0.1% [v/v] Pluronic acid (NanoTemper). Roughly 3 μl were loaded into NT.LabelFree Zero-Background Premium Coated Capillaries (NanoTemper) and inserted into a Monolith NT.LabelFree device (NanoTemper). MST experiments were performed at 60% MST power (infra-red laser) and 20% LED power at 23° C. using the Monolith NT.LabelFree instrument (Nanotemper). Ratios between normalized initial fluorescence and post-temperature-jump and thermophoresis were calculated and averaged from 3 independent runs (runs were incubated for 20 minutes at room temperature before the measurement). A plot of fluorescence ratios versus peptide concentration was used to assess the binding capacity of the phage protein and its cognate peptide ligand.

ChIP-Seq—

For the ChIP-seq experiments, B. subtilis cell cultures were grown in 100 ml of LB at 37° C. to an O.D of 0.1. Following, 50 ml of culture was centrifuged at 4000 rpm for 10 minutes at room temperature. The pellets were then suspended in 25 ml LB either containing or lacking the peptide (SAIRGA, SEQ ID NO: 269) at a final concentration of 1 μM. The cultures were incubated for an additional hour at room temperature with shaking. The cultures were then infected with phage (MOI=0.5) at a final titer of 5×10⁷ plaque forming units (PFU)/ml. 15 minutes post phage infection the cultures were centrifuged at 3000 g for 5 minutes at 4° C. The supernatant was discarded and the pellets were resuspended in 1 ml of ice-cold 1×PBS (10 mM Phosphate, 137 mM NaCl, 2.7 mM KCl, pH of 7.4).
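The relation between multiplicity of infection, phage titer and cell density used in the infection step can be sketched as follows. This is an illustrative calculation only, assuming MOI is defined as plaque forming units per bacterial cell; the helper names are hypothetical, and the implied cell density is derived here rather than stated in the procedure.

```python
def implied_cells_per_ml(phage_pfu_per_ml, moi):
    """Cell density implied by a phage titer and an MOI (MOI = PFU per cell)."""
    if moi <= 0:
        raise ValueError("MOI must be positive")
    return phage_pfu_per_ml / moi


def required_pfu_per_ml(cells_per_ml, moi):
    """Phage titer needed to infect a culture of known density at a given MOI."""
    return cells_per_ml * moi


if __name__ == "__main__":
    # Values taken from the ChIP-seq infection step: 5e7 PFU/ml at MOI = 0.5.
    density = implied_cells_per_ml(5e7, 0.5)
    print(f"Implied cell density: {density:.1e} cells/ml")
```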

For the formaldehyde fixation of protein to DNA, the 1 ml PBS resuspended pellets were mixed with 62.5 μl formaldehyde (Thermo Scientific 16% formaldehyde solution (w/v), methanol-free ampule) yielding a final formaldehyde concentration of 1% (w/v) within the solution. The formaldehyde-containing cell suspension was incubated at room temperature for 10 minutes with mild agitation. Following, 75 μl 2 M glycine (f.c. 150 mM) was added to quench residual formaldehyde. Glycine-containing samples were kept on ice for an additional 10 minutes followed by centrifugation at 5500 g for 1 minute at 4° C. The supernatants were discarded and the pellets were washed with 1 ml of ice-cold 1×PBS. Centrifugation and wash were repeated 3 times for each sample.
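The final concentrations quoted for the fixation and quenching steps follow from a simple dilution calculation. The sketch below is illustrative; the function name is hypothetical, and it accounts for the volume of the added reagent, which gives values slightly below the nominal 1% and 150 mM figures (the nominal figures appear to be computed relative to the 1 ml starting volume).

```python
def final_concentration(stock_conc, added_volume_ul, starting_volume_ul):
    """Concentration of a reagent after adding it to an existing volume.

    The result is in the same unit as stock_conc; volumes are in microliters.
    """
    total_volume_ul = starting_volume_ul + added_volume_ul
    return stock_conc * added_volume_ul / total_volume_ul


if __name__ == "__main__":
    # 62.5 ul of 16% (w/v) formaldehyde added to 1 ml of cell suspension.
    formaldehyde_pct = final_concentration(16.0, 62.5, 1000.0)
    # 75 ul of 2 M (2000 mM) glycine added to the 1062.5 ul fixed suspension.
    glycine_mm = final_concentration(2000.0, 75.0, 1062.5)
    print(f"Formaldehyde: {formaldehyde_pct:.2f}% (w/v)")
    print(f"Glycine: {glycine_mm:.0f} mM")
```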

The cell pellets were suspended in 600 μl lysis buffer containing 50 mM Tris-HCl pH 7.5, 150 mM NaCl and a protease inhibitor mix (cOmplete ULTRA Tablets, Roche). The lysis-buffer-containing cells were applied on a lysing matrix B (MP Biomedicals) of 0.1 mm silica beads. The mixture and beads were placed in a FastPrep-24 (MP Biomedicals) apparatus and shaken aggressively for 20 seconds at 6 m/sec at 4° C. The beads were then separated from lysed cells by centrifugation at 10,000 g for 1 minute at 4° C. according to manufacturer's instructions. Following, 300 μl of the supernatant, containing the lysed cell mixture, were transferred into 1.5 ml Bioruptor® Plus TPX microtubes (Diagenode) and kept on ice for 10 minutes (according to manufacturer's instructions). The samples were sonicated at 4° C. with full power for 15 minutes (30 seconds off/on cycles) using the Bioruptor Plus (Diagenode) apparatus. Sonication sheared the DNA to an average size of ˜500 bp. Following sonication, samples were centrifuged at 20,000 g for 10 minutes at 4° C.

For the IP experiments, supernatant containing the lysis buffer and cellular content was mixed with Triton X-100 and deoxycholate yielding a final IP-buffer composition containing 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1% [vol/vol] Triton X-100, 0.1% [wt/vol] sodium deoxycholate supplemented with protease inhibitors (cOmplete ULTRA Tablets, Roche). Anti-6×His Tag® ChIP-grade antibody (Abcam, ab9108) was then added to the sonicated samples and gently mixed overnight at 4° C. In parallel, Protein G Dynabeads (100.04 D; Invitrogen) were washed three times with IP buffer.

DNA-protein-antibody complexes (300 μl) were captured with 100 μl of Protein G Dynabeads by mixing them for 1 hour at room temperature with rotation. 0.72 μl of 0.5 M EDTA (f.c. 1 mM) was added to that mixture to prevent DNase activity at room temperature. Beads were applied to a magnetic stand (Qiagen) and washed three times with IP buffer (200 μl) at room temperature. Two elution steps were applied with 100 μl and 50 μl of elution buffer (50 mM Tris-HCl, pH 7.5, 10 mM EDTA, 1% [wt/vol] SDS) for 15 minutes at 65° C. on a rocking platform. The eluate (100 μl) was incubated with 5 μl of proteinase K (20 mg ml⁻¹) for 1 hour at 50° C. followed by 6 hours at 65° C. Immunoprecipitated DNA was recovered using a QIAquick PCR Purification kit (Qiagen). Immunoprecipitated DNA was then converted into NGS libraries using an existing protocol³² and was sequenced on a NextSeq500 Illumina machine generating 75 nt-long reads.

Sequencing-Based Assay for Lysogeny—

An overnight culture of bacteria was diluted 1/100 and grown until reaching O.D 600 nm=0.1. Medium was replaced as described above [LB supplemented with 0.1 mM MnCl₂ and 5 mM MgCl₂ with or without SAIRGA (SEQ ID NO: 269)], and incubated for 1 hour at room temperature. Bacteria were infected by phi3T at multiplicity of infection (MOI)=2. Cell pellets were collected at 5, 10, 20 and 60 minutes post infection in the presence or absence of SAIRGA (SEQ ID NO: 269) peptide at 1 μM final concentration in the medium. DNA was extracted using QIAGEN DNeasy blood and tissue kit (CAT #69504) and subjected to Illumina-based whole genome sequencing on NextSeq500. The relative abundance of lysogens in each sample was estimated using the number of reads mapped to the uninterrupted integration site versus reads mapped to the integration junction spanning the prophage DNA on one end and the bacterial DNA on the other end.
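One way to turn the two read counts into a relative lysogen abundance is sketched below. This is an assumption for illustration: the text states only that the two read classes were compared, not the exact formula, and the function name and the read counts in the usage example are hypothetical.

```python
def lysogen_fraction(junction_reads, intact_site_reads):
    """Estimate the fraction of genomes carrying an integrated prophage.

    junction_reads: reads spanning the prophage/bacterial DNA junction.
    intact_site_reads: reads spanning the uninterrupted integration site.
    Assumes both read classes are sampled with equal efficiency.
    """
    total = junction_reads + intact_site_reads
    if total == 0:
        raise ValueError("no reads cover the integration site")
    return junction_reads / total


if __name__ == "__main__":
    # Hypothetical counts for a single time point.
    print(f"Lysogen fraction: {lysogen_fraction(30, 70):.2f}")
```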

CRISPRi Experiments—

Construction of strains silencing phage genes was done by inserting a dCas9 construct controlled by a xylose promoter²² into the lacA region in the B. subtilis 168 genome, and an sgRNA with spacers targeting the gene of choice under a constitutive promoter into the thrC region [Spacer targeting aimR: ACCATTACTTTTCATAAC (SEQ ID NO: 358), spacer targeting aimX: TTTCCGCTTCATCTCAAGA (SEQ ID NO: 359)]. Infection assays in CRISPRi strains were performed in LB supplemented with 0.1 mM MnCl₂, 5 mM MgCl₂ and 0.2% xylose.

For complementation assays of aimR on the background of aimR-silenced CRISPRi strain, aimR was amplified from the phi3T genome using the following primers:

SEQ ID NO: 360, fwd primer: AAGAATTCCTCATTGTGTTTAGGTAAAATAAGAATTC

SEQ ID NO: 361, rev primer: AACTGCAGTTAGTGGTGGTGGTGGTGGTGAATTAGAGATAAGGTTAATAATTCAAG*

*includes a 6×His tag
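The 6×His tag carried by the reverse primer can be checked by reverse-complementing the primer and looking for six histidine codons followed by a stop codon. The sketch below is illustrative; it assumes the primer sequence is the two wrapped fragments of SEQ ID NO: 361 joined without a gap, and the helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Reverse primer (SEQ ID NO: 361), joined from its two wrapped fragments.
REV_PRIMER = "AACTGCAGTTAGTGGTGGTGGTGGTGGTGAATTAGAGATAAGGTTAATAATTCAAG"

# Six CAC (histidine) codons followed by a TAA stop codon on the coding strand.
HIS6_STOP = "CAC" * 6 + "TAA"

if __name__ == "__main__":
    coding_strand = reverse_complement(REV_PRIMER)
    print("6xHis + stop present:", HIS6_STOP in coding_strand)
```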

These primers amplify aimR together with 158 bases from its upstream region, with a 6×His tag at its C-terminus. The amplified fragment was cloned into the pBS1C plasmid (received from BGSC). Following, the native protospacer adjacent motif in the complemented aimR gene was changed by a synonymous point mutation (C->A at codon #20 of the aimR gene) using a primer set containing the point mutation and Gibson assembly. The modified gene was then integrated into the amyE locus in the B. subtilis genome. The aimX complementation was constructed on the background of the aimR-silenced CRISPRi strain. For this, aimX was amplified from the phi3T genome using the following primers:

SEQ ID NO: 362, fwd primer: AAACTAGTTTTAAGGGAAAGTTCCAGAAATTC

SEQ ID NO: 363, rev primer: AACTGCAGTCCGTTGCCAATAGATTATGC

These primers amplify aimX together with 60 bases from its upstream region and 107 bases from its downstream region (containing the gene terminator). The amplified aimX was cloned into a pBS1C plasmid modified to contain a xylose promoter, and was then integrated into the amyE locus in the B. subtilis genome.

RNA-Seq—

To determine the difference in gene expression with and without the peptide, bacteria were incubated for 1 hour in LB medium supplemented with 0.1 mM MnCl₂ and 5 mM MgCl₂ in the presence or absence of 1 μM synthesized SAIRGA (SEQ ID NO: 269) peptide. Following, the bacteria were infected with phi3T (MOI=0.1), and cell pellets were collected at 0, 5, 10 and 20 minutes post infection.

RNA extraction and RNA-seq were performed as described in Dar et al.³³. Briefly, pellets were lysed using the Fastprep homogenizer (MP Biomedicals, Santa Ana, Calif.) and RNA was extracted with the FastRNA PRO blue kit (MP Biomedicals, 116025050) according to manufacturer's instructions. RNA samples were treated with TURBO deoxyribonuclease (DNase) (Life technologies, AM2238) and fragmented with fragmentation buffer (Ambion) at 72° C. for 1:45 minutes. The reactions were cleaned by adding ×2.5 SPRI beads. The beads were washed twice with 80% EtOH, and air dried for 5 minutes. The RNA was eluted using H₂O. rRNA was depleted by using the Ribo-Zero rRNA Removal Kit (Epicentre, MRZB12424). Strand-specific RNA-seq was performed using the NEBNext Ultra Directional RNA Library Prep Kit (NEB, E7420) with the following adjustments: all cleanup stages were performed using ×1.8 SPRI beads, and only one cleanup step was performed after the end repair step.

To determine the effect of CRISPRi silencing of the aimR gene, bacteria at early logarithmic stage were infected with phi3T (MOI=0.1) and cell pellets were collected 20 minutes post infection. RNA-seq libraries were prepped as described above.

RNA-seq libraries were sequenced using the Illumina NextSeq500 platform. Sequenced reads were demultiplexed and adapters were trimmed using fastx_clipper with default parameters. Reads were mapped to the reference genomes (gene annotation and sequences were downloaded from Genbank: NC_000964 for Bacillus subtilis str. 168, AP012496 for Bacillus subtilis BEST7003) using NovoAlign (Novocraft) V3.02.02 with default parameters. All downstream analyses and normalized genome-wide RNA-seq coverage maps were generated as described in Dar et al³³.

Differential Expression Analysis—

Reads per gene were calculated for each biological replicate at 20 minutes post infection with and without the synthetic peptide and normalized relative to the total mapped reads hitting the phage genome in each replicate. Log10 transformation of the average of 3 replicates per gene in each condition was used to plot FIG. 4I and calculate the fold change of the aimX gene.
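The normalization and fold-change calculation described above can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the read counts in the usage example are invented for demonstration and are not experimental data, and the direction of the fold change (without peptide relative to with peptide) is an assumption.

```python
import math


def log10_mean_normalized(gene_reads, phage_reads):
    """Log10 of the replicate-averaged, phage-normalized read count of a gene.

    gene_reads: reads mapped to the gene, one value per biological replicate.
    phage_reads: total reads mapped to the phage genome, per replicate.
    """
    if len(gene_reads) != len(phage_reads) or not gene_reads:
        raise ValueError("expected matching, non-empty replicate lists")
    normalized = [g / t for g, t in zip(gene_reads, phage_reads)]
    mean = sum(normalized) / len(normalized)
    if mean <= 0:
        raise ValueError("gene has no mapped reads; log10 is undefined")
    return math.log10(mean)


def fold_change(log10_without_peptide, log10_with_peptide):
    """Linear fold change between two conditions given their log10 values."""
    return 10 ** (log10_without_peptide - log10_with_peptide)


if __name__ == "__main__":
    # Hypothetical counts for one gene across 3 replicates per condition.
    without_peptide = log10_mean_normalized([900, 1100, 1000], [1e5, 1e5, 1e5])
    with_peptide = log10_mean_normalized([45, 55, 50], [1e5, 1e5, 1e5])
    print(f"Fold change: {fold_change(without_peptide, with_peptide):.1f}")
```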

Identification of AimR Homologs and the Arbitrium Peptide Code—

Homologs for the phi3T AimR receptor were searched for using the BLAST option in the Integrated Microbial Genomes (IMG) web server (img(dot)jgi(dot)doe(dot)gov/cgi-bin/mer/main(dot)cgi). The phi3T AimR (SEQ ID NO: 114) was provided as a query sequence and was searched against all isolated genomes with an e-value threshold of 1e-35. The gene neighborhood for each AimR homolog was visually inspected via the IMG “gene neighborhood” representation, and genes found located next to proteins annotated as phage proteins were considered as found in a prophage. The immediate downstream gene for each AimR homolog was considered the respective AimP gene if it contained a signal peptide as predicted by the IMG web server. If no immediate downstream gene was annotated, the intergenic region immediately downstream to the AimR homolog was translated using the Expasy Translate Tool (web(dot)expasy(dot)org/translate/), and short translated ORFs were inspected for the AimP signature. Results of this analysis are presented in Table 3 above.
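The translation step for unannotated intergenic regions can be approximated with a few lines of code. The sketch below is not the Expasy Translate Tool. It assumes the standard genetic code, considers only ATG start codons in the three forward frames, and uses a hypothetical length cutoff (`max_len`); the test sequence is a toy example and not a phage sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def short_orfs(dna, max_len=60):
    """Return (frame, peptide) for ATG-initiated ORFs that end at a stop codon."""
    dna = dna.upper()
    found = []
    for frame in range(3):
        peptide = None  # None means no ORF is currently open in this frame
        for i in range(frame, len(dna) - 2, 3):
            aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X for ambiguous codons
            if aa == "*":
                if peptide and len(peptide) <= max_len:
                    found.append((frame, peptide))
                peptide = None
            elif peptide is not None:
                peptide += aa
            elif aa == "M":
                peptide = "M"
    return found

# Toy sequence: two padding bases, then ATG ... TAA in frame 2.
print(short_orfs("CC" + "ATGTCTGCTATTCGTGGTGCTTAA"))
```

Each short ORF returned this way would still need to be inspected for a signal peptide and for the AimP signature, as described in the text.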

Plasmid Construction—

The vectors were constructed using the following primers and oligonucleotides:

SEQ ID NO:   primer   sequence 5′-3′

364          A        AATCGCCATTCGCCAGGGCTGCAGGAATTCCCCTCATTGTGTTTAGGTAAAATAAGAA
365          B        GTGTTTAAAATGTCTATTTTATTTAGTTTCAATATGCTCATG
366          C        GAAACTAAATAAAATAGACATTTTTACACTGATTAACTAATAAGGAGGACAAACATGTC
367          D        GGTAATGGTAGCGACCGGCGCTCAGGATCCTAAATACGCTTCACAGTTTCTTCTTCATT
368          E        TATTCTCACCTCCTTTCAAATTTGTCAAACC
369          F        GTTTGACAAATTTGAAAGGAGGTGAGAATATTAAATAATTGAATAGGTAATACATAATACTATCATAGACG
370          G        TCTAATAACCCCCATGTTCTTATTTTTTGATTTTTG
371          H        TCAAAAAATAAGAACATGGGGGTTATTAGAGCATATTGAAACTAAATAAAATAGACATTTTAAACAC

Construct #1 comprises 2 components: the bacteriophage Phi3T aimR-aimP-aimX locus and a fluorescent reporter gene (Superfolder Green Fluorescent Protein, sfGFP(sp)), denoted herein as GFP. Both components were inserted into a target shuttle plasmid (pDR111) that enabled propagation of the plasmid in E. coli followed by transformation into the Bacillus subtilis BEST7003 genome. pDR111 contains two sequences, each matching either the 5′-end or the 3′-end of the target gene (amyE), which through homologous recombination allow insertion of Construct #1. In addition, pDR111 includes a Spectinomycin antibiotic resistance (spec) gene that allows for selection of the desired insertion, namely Construct #1.

Specifically, the aimR-aimP-aimX locus (SEQ ID NO: 372), which contains the aimR, aimP and aimX coding genes including their intergenic spaces, was directly amplified from the bacteriophage Phi3T genome using primers A+B (Erez Z, et al., Nature. 2017; 541(7638):488-493).

The reporter GFP gene (SEQ ID NO: 373) was amplified, using primers C+D, from plasmid pDR111-sfGFP(sp) which contains a Bacillus subtilis-optimized superfolder-GFP (Overkamp W, et al., Appl Environ Microbiol. 2013 October; 79(20):6481-90).

The reporter gene GFP was genetically fused to the Phi3T aimR-aimP-aimX locus by inserting 3 STOP codons followed by a ribosome-binding-site (denoted herein as rbs) immediately downstream of the 37 bp long 3′UTR of aimX (see FIG. 6). A transcription-terminator element (rrnB) was placed downstream to the GFP gene. The genetic fusion of the aimR-aimP-aimX locus together with the GFP reporter gene yielded Construct #1 (FIG. 6, SEQ ID NO: 374).

The construct was inserted into the shuttle plasmid pDR111 between the restriction sites EcoRI and BamHI using the NEBuilder HiFi DNA Assembly reaction kit (NEB, MA, USA), resulting in the plasmid pDR111-Construct #1 (FIG. 6, SEQ ID NO: 375).

The pDR111-Construct #1 plasmid was propagated in Escherichia coli and then used as a shuttle vector to insert the Construct #1 operon into the Bacillus subtilis genome within the amyE gene together with a spec gene on the opposite strand upstream of the RPX operon (FIG. 7). Whole genome sequencing, using a High-throughput Illumina NextSeq sequencer, was applied to verify the sequence of the Construct #1 operon and its precise position within the bacterial genome.

In order to create a minimized expression system, the pDR111-Construct #1 plasmid was amplified using primers E+F followed by DNA assembly using the NEBuilder HiFi DNA Assembly reaction kit (NEB, MA, USA), resulting in the plasmid pDR111-Construct #1 with a deleted aimP gene. This construct was further amplified using primers G+H followed by DNA assembly, leading to deletion of the aimX gene as well. The end product, pDR111-Construct #2 (SEQ ID NO: 377), contained the double gene deletion ΔaimP/ΔaimX variant of Construct #1 [i.e. Construct #2 (SEQ ID NO: 376)]. pDR111-Construct #2 was used as a shuttle plasmid in a similar manner to pDR111-Construct #1. This led to insertion of Construct #2 into the bacterial genome as described above for Construct #1 (FIG. 8).
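The primer sequences listed above for the two deletion steps can be checked for the complementary ends that an overlap-based assembly of a linear amplicon relies on. The sketch below is an illustrative check only, not part of the described protocol; the function names are hypothetical, and the primer sequences are copied from the primer list (SEQ ID NOs: 368-371).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def assembly_overlap(primer_1, primer_2):
    """Longest suffix of reverse_complement(primer_1) that is a prefix of primer_2."""
    rc = reverse_complement(primer_1)
    for k in range(min(len(rc), len(primer_2)), 0, -1):
        if rc[-k:] == primer_2[:k]:
            return rc[-k:]
    return ""

PRIMER_E = "TATTCTCACCTCCTTTCAAATTTGTCAAACC"
PRIMER_F = ("GTTTGACAAATTTGAAAGGAGGTGAGAATATTAAATAA"
            "TTGAATAGGTAATACATAATACTATCATAGACG")
PRIMER_G = "TCTAATAACCCCCATGTTCTTATTTTTTGATTTTTG"
PRIMER_H = ("TCAAAAAATAAGAACATGGGGGTTATTAGAGCATATTG"
            "AAACTAAATAAAATAGACATTTTAAACAC")

for name, a, b in [("E+F", PRIMER_E, PRIMER_F), ("G+H", PRIMER_G, PRIMER_H)]:
    overlap = assembly_overlap(a, b)
    print(name, len(overlap), overlap)
```

For the sequences as listed, both pairs share a 30-nucleotide stretch, so the two ends of each linear PCR product carry the same sequence and can be joined in the assembly reaction.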

Growth Dynamics of Plasmid-Containing Cultures—

Starter cultures of the wild type (WT) Bacillus subtilis BEST7003 strain and of Bacillus subtilis BEST7003 strains containing Construct #1 or Construct #2 were grown in 3 ml Luria-Bertani (LB) broth until the growth curve indicated stationary phase. The bacterial cells were then diluted at a ratio of 1/100 in LB broth or in LB broth supplemented with a SAIRGA (SEQ ID NO: 269) peptide at a concentration range of 62.5 nM-1000 nM. Optical density (O.D. at 600 nm) and GFP fluorescence levels (488 nm excitation/518 nm emission) were measured over time in a 96-well plate using a TECAN Infinite 200 plate reader.
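A plate-reader time course of this kind is often summarized as fluorescence per unit of optical density. The text does not state how the peptide concentrations were spaced or whether fluorescence was normalized, so the sketch below rests on two labeled assumptions: a two-fold dilution series spanning the stated 62.5 nM-1000 nM range, and blank-subtracted GFP divided by blank-subtracted OD600. All names and readings are hypothetical.

```python
def twofold_series(top_nM=1000.0, steps=5):
    """ASSUMPTION: two-fold dilutions from the top concentration downward."""
    return [top_nM / 2 ** i for i in range(steps)]

def gfp_per_od(gfp, od600, gfp_blank=0.0, od_blank=0.0, min_od=0.01):
    """Blank-subtracted fluorescence divided by blank-subtracted OD600.

    Time points whose corrected OD is below min_od return NaN, because the
    ratio is dominated by noise when the culture is very dilute.
    """
    values = []
    for f, d in zip(gfp, od600):
        od = d - od_blank
        values.append((f - gfp_blank) / od if od >= min_od else float("nan"))
    return values

print(twofold_series())
# Hypothetical readings at three time points.
print(gfp_per_od([150.0, 1100.0, 4100.0], [0.005, 0.5, 1.0], gfp_blank=100.0))
```

Normalizing to optical density separates changes in expression per cell from changes in cell number, which matters when comparing cultures that may grow at different rates.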

Example 1
A Short Peptide is Released to the Medium Following phi3T Phage Infection Affecting the Lysis/Lysogeny Decision

Cultures of Bacillus subtilis str. 168 were infected with one of four different phages: phi29 (Podoviridae, obligatory lytic), phi105 (Siphoviridae, temperate, lambda-like), rho14 (Siphoviridae, temperate, lambda-like) or phi3T (Siphoviridae, temperate, spBeta-like) in control and in phage-derived conditioned media (see FIG. 1A). Surprisingly, for one of the tested phages, phi3T, the infection dynamics in the conditioned medium, as inferred from the bacterial growth curve, were dramatically different from the dynamics in the control medium. As shown in FIG. 1B, whereas a substantial fraction of the phi3T infected bacterial culture had lysed in the control medium two hours post infection, the culture grown in the conditioned medium appeared to be largely protected from lysis. This effect was not detected for any of the other three phages tested, for which no difference in infection dynamics between the control and conditioned media was observed. Moreover, the conditioned medium prepared from phi3T infection did not affect the infection dynamics of other phages, and vice versa, conditioned media prepared from other phages did not affect the infection dynamics of phi3T.

Taken together, these results imply that a small molecule is released to the medium during infection of B. subtilis by phi3T and this molecule can affect infection dynamics of downstream infections of this phage.

It is known that quorum sensing (QS) in Bacilli and other Firmicutes is typically based on short peptides that are secreted to the medium and sensed by intra-cellular or membrane bound receptors¹⁻³. Thus, to test whether the active substance in the medium is proteinaceous, the conditioned medium was treated with proteinase K. As shown in FIG. 1E, infection dynamics in the proteinase-treated conditioned medium were similar to the dynamics in the control medium, suggesting that the active component in the medium is indeed proteinaceous. Since communication peptides in Bacillus quorum sensing systems are frequently imported into the cell by the oligopeptide permease transporter (OPP), phage infection dynamics were tested in bacteria in which an essential subunit of the OPP transporter, oppD, was deleted (FIG. 1C). The phage-derived conditioned medium lost its effect when the bacteria lacking the functional OPP were infected by phi3T, suggesting that the active substance in the conditioned medium is a 3aa-aa long peptide, which is the size range of peptides that can be imported by the OPP transporter of Gram positive bacteria⁴.

A close examination of the phage infection dynamics in the oppD mutant showed increased culture lysis in both control and conditioned media as compared to infection of wild type bacteria (FIGS. 1B-C). Phage phi3T is a temperate phage that may choose to infect either through the lytic or the lysogenic cycles⁶,⁷. Whereas the lytic cycle leads to lysis of the bacterial cell, in the lysogenic cycle the phage genome integrates into the bacterial genome, and the lysogenized bacterium becomes protected from further infection by the same phage⁷. In accordance, the growth curve of bacteria infected by phi3T in the control medium presents partial, but not full, lysis of the culture, followed by culture recovery due to growth of lysogenized bacteria. The observation that the infection dynamics curve of the oppD mutant presents a complete lysis of the culture suggests that the active peptide released to the medium may promote lysogeny of the phage. Hence, to test whether the reduced bacterial lysis observed during infection in the phage-derived conditioned medium is due to increased lysogeny, phage phi3T integration into the B. subtilis genome during infection was examined using a semi-quantitative PCR assay. Indeed, as shown in FIG. 1D, increased lysogeny was observed when the bacterial culture was infected in the conditioned medium.

Taken together, these results suggest that during phi3T infection a short peptide is released to the medium and, as this peptide accumulates, it acts as a communication agent affecting the lysis/lysogeny decision of later generations of the phage progeny. This newly discovered putative communication molecule is denoted herein as arbitrium (the Latin word for “decision”).

Example 2
The Peptide Affecting the Lysis/Lysogeny Decision is Encoded by the aimP Gene

Phi3T was isolated 4 decades ago and was characterized as belonging to the spBeta family of phages, although to date its genome had not been sequenced⁶. To search for the possible genetic system encoding the arbitrium peptide, the genome of phi3T was sequenced and analyzed. This genome assembled into a single 128 kbp contig containing 185 predicted genes. To search for proteins likely to be secreted into the medium, all of the open reading frames (ORFs) in phi3T were screened for the presence of an N-terminal signal peptide using the SignalP software⁸. Three ORFs were predicted to have a signal peptide, suggesting that they are secreted or membrane-localized. While two of these genes seemed irrelevant (one was an integral membrane protein and the other was a large nuclease), the third gene exhibited features reminiscent of Bacillus quorum sensing peptides (FIGS. 2A-B). Peptides belonging to the Phr family of quorum sensing systems in B. subtilis are typically processed from a pre-pro-peptide that contains an N-terminal signal sequence, which is recognized by the Sec system and cleaved upon secretion¹. Once outside the cell, the pro-peptide is further processed by B. subtilis extracellular proteases to produce the mature short (5-6 amino acids) peptide that is typically found on the C-terminal end of the pro-peptide¹. The selected candidate gene encoded a short ORF (43 amino acids, SEQ ID NO: 227), and displayed both an N-terminal signal sequence and the consensus cleavage site for peptide maturation at its C-terminus (FIGS. 2A-B). If this phi3T-encoded protein is secreted and matured extracellularly, then the predicted mature communication peptide after pro-peptide cleavage would be Ser-Ala-Ile-Arg-Gly-Ala (SAIRGA, SEQ ID NO: 269). Indeed, mass spectrometry analysis confirmed the presence of the SAIRGA peptide in the conditioned medium but not in the control medium (FIG. 2C).

To test whether the predicted mature peptide is indeed the arbitrium molecule that influences the phage lysogeny decision, bacteria were infected with phi3T in LB medium supplemented with increasing amounts of synthesized SAIRGA (SEQ ID NO: 269) peptide. A clear concentration-dependent effect on the phage infection dynamics was observed, such that reduced culture lysis was apparent when the medium contained higher concentrations of the synthesized peptide (FIG. 2D). These effects were specific to that peptide, and were neither observed for shorter, 5 amino acids versions of the peptide (SAIRG or AIRGA, SEQ ID NOs: 355 and 354, respectively), nor for PhrC (SEQ ID NO: 350), a known quorum sensing peptide of B. subtilis (FIGS. 2D-E). The maximal effect on the culture growth curve was observed at SAIRGA (SEQ ID NO: 269) peptide concentration of 500 nM, above which the effect seemed saturated (FIG. 2D).

To verify that the observed effect of the SAIRGA peptide on the dynamics of the infected culture was the result of increased lysogeny, total DNA of bacteria collected from a time course experiment during infection by phi3T with and without the peptide was directly sequenced. By comparing the fraction of sequencing reads passing through the intact phage integration site in the bacterial genome to reads demonstrating phage integration at that site, the fraction of lysogenized bacteria at each time point was directly quantified. Remarkably, a consistently elevated lysogeny in the presence of the SAIRGA (SEQ ID NO: 269) peptide was observed, such that 48% (±7.9%) of the bacteria were lysogenized at 60 minutes post infection, as compared to 18% (±3.3%) of bacteria grown without the peptide (FIG. 2F). These results suggest that the phage-encoded gene identified is secreted and processed into the mature arbitrium communication peptide that further affects the phage lysis/lysogeny decision. This gene was denoted herein as aimP.
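The read-based estimate of lysogeny described in this paragraph reduces to a simple ratio. The sketch below is an illustration under stated assumptions: it presumes that reads spanning the integration site have already been classified as "intact site" or "phage integrated", and the function name and read counts are hypothetical.

```python
def lysogeny_fraction(intact_site_reads, integration_reads):
    """Fraction of genomes carrying an integrated phage at the attachment site.

    intact_site_reads: reads passing through the uninterrupted integration site.
    integration_reads: reads demonstrating phage integration at that site.
    """
    total = intact_site_reads + integration_reads
    if total == 0:
        raise ValueError("no reads cover the integration site")
    return integration_reads / total

# Hypothetical counts for one time point.
print(lysogeny_fraction(intact_site_reads=52, integration_reads=48))
```

Because an integrated prophage creates a junction at each of its two ends, an actual analysis would need to decide whether to count one junction or to average both; that choice is not specified in the text.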

The aimP gene is located immediately downstream of a gene (SEQ ID NO: 1) encoding a 378 amino acids long open reading frame (SEQ ID NO: 114), suggesting that these two genes may be co-transcribed from the phage genome as a polycistron. This upstream gene encodes a predicted tetratricopeptide repeat (TPR) domain, typical of intracellular peptide receptors of the RRNPP family in QS systems of Gram positive bacteria⁹⁻¹¹ (FIG. 2A). It was therefore hypothesized that this upstream gene, which was denoted aimR, is the receptor of the AimP-derived arbitrium peptide. To test this hypothesis, a C-terminal His-tagged AimR was purified, and microscale thermophoresis (MST) was used to measure the binding between the purified receptor and the synthesized arbitrium peptide. This analysis showed high-affinity binding, at an effective peptide concentration of EC₅₀=138 nM (118-162 nM at a confidence interval of 95%), between the phi3T AimR receptor and the cognate SAIRGA (SEQ ID NO: 269) peptide (FIG. 2G), confirming that AimR most probably functions as the intracellular receptor of the arbitrium SAIRGA peptide.
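To give a sense of what an EC₅₀ of 138 nM implies across the peptide concentrations used elsewhere in this work, the dose-response relation can be sketched with a Hill-type equation. The text does not state which model was fitted to the MST data, so the functional form and the Hill coefficient of 1 are assumptions, and the function name is hypothetical.

```python
def fraction_bound(peptide_nM, ec50_nM=138.0, hill=1.0):
    """Hill-type occupancy: c^n / (EC50^n + c^n).

    ASSUMPTION: a simple Hill model with coefficient 1; only the EC50 value
    (138 nM) is taken from the text.
    """
    c = peptide_nM ** hill
    return c / (ec50_nM ** hill + c)

for conc in (62.5, 138.0, 500.0, 1000.0):
    print(conc, round(fraction_bound(conc), 2))
```

Under these assumptions half of the receptor is peptide-bound at 138 nM, and occupancy approaches saturation in the high-nanomolar range.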

Example 3
A Conserved Peptide Communication Code Guiding Lysogeny in Bacillus Phages

To appreciate the abundance of this system in nature, a homology search was conducted to find homologs of the aimR gene in available sequenced genomes. Using this search, 112 instances of AimR homologs were detected, virtually all of them in Bacillus phages or in prophages found integrated within Bacilli genomes, suggesting that this gene primarily fulfills a phage-related function (FIG. 3A and Table 3 above). In all cases, aimR homologs were found upstream of aimP candidate genes, i.e., genes encoding short polypeptides with an N-terminal signal peptide, followed by a pro-peptide conforming with the processing maturation signal of the Bacillus extracellular proteases (Table 3 above; FIG. 2B). Although the sequences of the predicted mature peptides were diverse, all of them maintained strict rules for their sequence composition, with an obligatory glycine residue at the 5th position, glycine or alanine at the 6th position, and a preference for a positively charged residue at the 4th position (FIGS. 3B-C). To test the hypothesis that the phage-encoded communication peptides guide phage lysogeny in a sequence-specific manner, the infection dynamics of the spBeta phage, in which a homolog of the AimR-AimP system was identified, were evaluated. The predicted mature AimP-derived arbitrium peptide of spBeta was GMPRGA (SEQ ID NO: 271), a sequence that differs by the 3 N-terminal amino acids from the SAIRGA (SEQ ID NO: 269) peptide of phi3T. As shown in FIGS. 3D-F, whereas the GMPRGA (SEQ ID NO: 271) peptide promoted lysogeny of spBeta, it did not affect the lysogeny profile of phi3T; and similarly, the phi3T-derived SAIRGA (SEQ ID NO: 269) peptide had no effect on the infection dynamics of spBeta. In accordance, the spBeta-derived GMPRGA (SEQ ID NO: 271) peptide did not show specific binding to the phi3T AimR receptor (FIG. 2G).

Taken together, these results demonstrate a sequence-specific peptide code that guides phage lysogeny in a phage species-specific manner.
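The composition rules of the mature peptide stated in this Example (glycine at position 5, glycine or alanine at position 6, and a preference for a positively charged residue at position 4) can be expressed as a simple pattern test. The sketch below is illustrative; the function names are hypothetical, and treating only lysine and arginine as "positively charged" is an assumption.

```python
import re

# Six residues: any four standard amino acids, then G, then G or A (XXXXGG/A).
AIMP_SIGNATURE = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]{4}G[GA]$")

def matches_aimp_signature(peptide):
    """True if a 6-residue peptide obeys the obligatory XXXXGG/A rule."""
    return bool(AIMP_SIGNATURE.match(peptide.upper()))

def has_positive_residue_at_4(peptide):
    """True if position 4 is K or R (a preference, not an obligatory rule)."""
    return len(peptide) >= 4 and peptide.upper()[3] in "KR"

for pep in ("SAIRGA", "GMPRGA", "SAIRG", "AIRGA"):
    print(pep, matches_aimp_signature(pep), has_positive_residue_at_4(pep))
```

The two full-length peptides discussed in the text (SAIRGA and GMPRGA) satisfy the pattern, whereas the 5-residue variants SAIRG and AIRGA do not because the pattern requires exactly six residues.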

Example 4
The AimP Peptide Alters the Oligomeric State of its AimR Receptor and Alters the Expression of AimX

In communication systems of Gram positive bacteria, the binding of the communication peptide to its receptor usually leads to reprogramming of the transcriptional response. This can occur either directly, when the receptor is a transcription regulator such as in the cases of the PrgX¹²,¹³ in Enterococci, the PlcR¹⁴⁻¹⁶ of the Bacillus cereus group, and in other systems¹⁷,¹⁸; or indirectly, as in the case of Rap/Phr systems of Bacilli, in which the receptor is a phosphatase that regulates downstream transcriptional regulators by dephosphorylation¹⁹,²⁰, or steric interference²¹. The presence of a predicted helix-turn-helix (HTH) motif in the N-terminus of AimR suggested that the receptor of the arbitrium system directly binds DNA. To test whether AimR binds the phage DNA in vivo, a His-tagged aimR gene was engineered into a B. subtilis 168 strain in which a dCas9 (CRISPRi) technology²² was used to silence the expression of the phage AimR gene, but not the cloned His-tagged AimR (see Materials and Methods hereinabove). Subsequently, a ChIP-seq assay was performed 15 minutes after phage infection with and without the presence of the arbitrium peptide. Sequencing of the DNA bound to AimR clearly showed that AimR binds a single site in the phage genome, directly downstream of the aimP gene (FIGS. 4A-B). Moreover, this binding only occurred when the arbitrium peptide was absent in the medium, suggesting that binding of the arbitrium peptide to its AimR receptor leads to dissociation of the receptor from its binding site on the phage DNA.

During the process of AimR purification it was noticed that the protein migrates as a homodimer in a gel filtration column. Upon addition of the phi3T-derived SAIRGA (SEQ ID NO: 269) peptide, however, the protein strictly migrated as a monomer (FIG. 4C). These results suggest that the arbitrium peptide transfers the signal via alteration of the oligomeric state of its AimR receptor from a DNA binding dimer to a peptide-bound, dissociated monomer. Addition of the spBeta GMPRGA (SEQ ID NO: 271) peptide did not lead to a change in the AimR oligomeric state, pointing, again, to the high specificity between the peptide and its receptor in the arbitrium system (FIG. 4C).

To examine whether binding of the arbitrium peptide to its AimR receptor leads to a transcriptional response in the phage genome, RNA-seq was applied to RNA extracted from bacteria during a time course of infection with and without the peptide. As shown in FIGS. 4D-I, the most dramatic change in expression was observed for a single gene, denoted herein as aimX, that was immediately downstream of the AimR DNA binding site. This gene showed substantial expression in the absence of the arbitrium peptide starting 10 minutes following infection, but its expression was reduced more than 20 fold when the medium was supplemented with 1 μM of the SAIRGA (SEQ ID NO: 269) peptide (FIGS. 4D-G).

These results suggest that AimR, when bound to the phage DNA as a dimer in the absence of the arbitrium peptide, is a transcriptional activator of AimX. Indeed, when AimR was silenced using dCas9, the expression of AimX was dramatically reduced (FIG. 4D). Moreover, silencing of AimR resulted in increased lysogeny, suggesting that binding of AimR to the phage DNA inhibits lysogeny (or promotes lysis), possibly by activating the expression of AimX (FIG. 4H).

Since the AimR knockdown did not lead to a dramatic transcriptional effect for any gene except AimX at 20 minutes post infection (FIG. 4I), it was hypothesized that the main function of AimR is to control the expression of the aimX gene, and that the AimX gene product works downstream of the AimR-AimP communication system to execute the lysis-lysogeny decision. Consistent with this hypothesis, knockdown of AimX using dCas9 resulted in increased lysogeny (FIG. 4H). Moreover, complementing with an ectopic AimX on the background of AimR knockdown (in which AimX expression is naturally silenced, FIG. 4D) resulted in culture lysis, demonstrating that AimX functions downstream of AimR, and works as the inhibitor of the lysogenic cycle or the promoter of the lytic cycle (FIG. 4H).

Taken together, the present inventors have shown that a large family of phages uses communication peptides in order to decide whether to enter a lytic cycle or lysogenize the infected bacterium. In a sense, the communication mechanism described allows an offspring phage particle to communicate with its ancestors, i.e., measure the amount of the predecessor phages that completed successful infections in prior cycles. The biological logic behind this strategy is clear: when a single phage encounters a bacterial colony, there is ample prey for the progeny phages that are produced from the first cycles of infection, and hence a lytic cycle is preferred. In later stages of the infection dynamics the number of bacterial cells is reduced to a point that progeny phages are at risk of no longer having a new host to infect. Then, it is logical for the phage to switch into lysogeny to preserve chances for viable reproduction.

The arbitrium system provides an elegant mechanism for a phage particle to estimate the amount of recent prior infections and hence decide whether to employ the lytic or lysogenic cycle. Without being bound by theory, the results point to the following model (FIGS. 5A-C): upon initial infection of a bacterial culture, phi3T expresses the early operon AimR-AimP. AimR, as a dimer, activates the expression of AimX, which, in turn, blocks the pathway to lysogeny and promotes the lytic cycle. At the same time, AimP is secreted into the medium and processed into the mature arbitrium peptide. Following several cycles of infection, arbitrium peptides will accumulate in the medium. At this stage, when a phage particle infects a yet-uninfected bacterium, the concentration of the arbitrium peptide, which is internalized into the bacteria by the OPP transporter, will be high enough to bind the AimR receptor. Upon binding, the AimR receptor changes its oligomeric state from the active dimer to the inactive monomer, silencing the AimX lysogeny-inhibitor and leading to lysogeny.

Example 5
The Arbitrium System as a System for Expressing an Expression Product of Interest

To demonstrate that the AimR-AimP system can function as a heterologous expression system in a phage-independent context, it was utilized to control the expression of a GFP reporter gene in Bacillus subtilis BEST7003.

To this end, the arbitrium locus (AimR-AimP-AimR binding site-AimX) was engineered such that the AimX gene was fused to a GFP reporter gene (Construct #1, FIG. 6). This construct was integrated into the genome of Bacillus subtilis BEST7003 at the AmyE locus (FIG. 7). As shown in FIG. 9A, integration of Construct #1 resulted in significant expression of GFP in liquid cultures of the bacteria, reaching a plateau after about 6 hours of culture, probably due to expression of the AimP peptide encoded by the construct. Moreover, when the bacteria were grown in liquid culture in the presence of various concentrations of the arbitrium peptide (SAIRGA, SEQ ID NO: 269), a concentration-dependent expression of GFP fluorescence was observed, such that maximal fluorescence was detected in the absence of the peptide, and fluorescence was gradually repressed in a manner proportional to the concentration of the peptide in the growth media.

Next, a minimized expression system containing only the AimR gene, the AimR binding site and a GFP gene downstream of the AimR binding site was engineered (Construct #2, FIG. 8) and integrated into the genome of Bacillus subtilis BEST7003 at the AmyE locus (i.e. this system did not contain the AimP and AimX genes). Similarly to bacteria containing Construct #1, integration of the minimized Construct #2 resulted in significant expression of GFP in liquid cultures of the bacteria; and when the bacteria were grown in the presence of various concentrations of the arbitrium peptide (SAIRGA, SEQ ID NO: 269), a differential expression of GFP dependent on the concentration of the peptide in the medium was observed (FIG. 9B). Importantly, neither SAIRGA (SEQ ID NO: 269) peptide concentration nor GFP expression affected the growth rate of the bacteria (FIGS. 9A-B).

Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.

All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.

REFERENCES
Additional References are Cited in Text

1. Pottathil, M. & Lazazzera, B. A. The extracellular Phr peptide-Rap phosphatase signaling circuit of Bacillus subtilis. Front. Biosci. 8, d32-45 (2003).

2. Perego, M. Forty years in the making: understanding the molecular mechanism of peptide regulation in bacterial development. PLoS Biol. 11, e1001516 (2013).

3. Waters, C. M. & Bassler, B. L. Quorum sensing: cell-to-cell communication in bacteria. Annu. Rev. Cell Dev. Biol. 21, 319-46 (2005).

4. Lanfermeijer, F. C., Detmers, F. J., Konings, W. N. & Poolman, B. On the binding mechanism of the peptide receptor of the oligopeptide transport system of Lactococcus lactis. EMBO J. 19, 3649-56 (2000).

5. Patrick, J. E. & Kearns, D. B. Laboratory Strains of Bacillus subtilis Do Not Exhibit Swarming Motility. J. Bacteriol. 191, 7129-7133 (2009).

6. Tucker, R. G. Acquisition of Thymidylate Synthetase Activity by a Thymine-requiring Mutant of Bacillus subtilis following Infection by the Temperate Phage 3. J. Gen. Virol. 4, 489-504 (1969).

7. Rutberg, L. in The Molecular Biology of Bacilli Vol. 1 Bacillus subtilis (ed Dubnau, D. A.) Ch. Temperate Bacteriophages of Bacillus Subtilis, 247-268 (Academic press, 1982).

8. Emanuelsson, O., Brunak, S., von Heijne, G. & Nielsen, H. Locating proteins in the cell using TargetP, SignalP and related tools. Nat. Protoc. 2, 953-971 (2007).

9. Rocha-Estrada, J., Aceves-Diez, A. E., Guarneros, G. & de la Torre, M. The RNPP family of quorum-sensing proteins in Gram-positive bacteria. Appl. Microbiol. Biotechnol. 87, 913-23 (2010).

10. Do, H. & Kumaraswami, M. Structural Mechanisms of Peptide Recognition and Allosteric Modulation of Gene Regulation by the RRNPP Family of Quorum-Sensing Regulators. J. Mol. Biol. 428, 2793-2804 (2016).

11. Perez-Pascual, D., Monnet, V. & Gardan, R. Bacterial Cell-Cell Communication in the Host via RRNPP Peptide-Binding Regulators. Front. Microbiol. 7, 706 (2016).

12. Dunny, G. M. & Berntsson, R. P.-A. Enterococcal Sex Pheromones: Evolutionary Pathways to Complex, Two-Signal Systems. J. Bacteriol. 198, 1556-1562 (2016).

13. Shi, K. et al. Structure of peptide sex pheromone receptor PrgX and PrgX/pheromone complexes and regulation of conjugation in Enterococcus faecalis. Proc. Natl. Acad. Sci. U.S.A 102, 18596-601 (2005).

14. Lereclus, D., Agaisse, H., Gominet, M., Salamitou, S. & Sanchis, V. Identification of a Bacillus thuringiensis gene that positively regulates transcription of the phosphatidylinositol-specific phospholipase C gene at the onset of the stationary phase. J. Bacteriol. 178, 2749-56 (1996).

15. Slamti, L. & Lereclus, D. A cell-cell signaling peptide activates the PlcR virulence regulon in bacteria of the Bacillus cereus group. EMBO J. 21, 4550-9 (2002).

16. Declerck, N. et al. Structure of PlcR: Insights into virulence regulation and evolution of quorum sensing in Gram-positive bacteria. Proc. Natl. Acad. Sci. U.S.A 104, 18490-5 (2007).

17. Dubois, T. et al. Activity of the Bacillus thuringiensis NprR-NprX cell-cell communication system is co-ordinated to the physiological stage through a complex transcriptional regulation. Mol. Microbiol. 88, 48-63 (2013).

18. Fleuchot, B. et al. Rgg proteins associated with internalized small hydrophobic peptides: a new quorum-sensing mechanism in streptococci. Mol. Microbiol. 80, 1102-1119 (2011).

19. Parashar, V., Mirouze, N., Dubnau, D. A. & Neiditch, M. B. Structural Basis of Response Regulator Dephosphorylation by Rap Phosphatases. PLoS Biol. 9, e1000589 (2011).

20. Ishikawa, S., Core, L. & Perego, M. Biochemical characterization of aspartyl phosphate phosphatase interaction with a phosphorylated response regulator and its inhibition by a pentapeptide. J. Biol. Chem. 277, 20483-20489 (2002).

21. Baker, M. D. & Neiditch, M. B. Structural Basis of Response Regulator Inhibition by a Bacterial Anti-Activator Protein. PLoS Biol. 9, e1001226 (2011).

22. Peters, J. M. et al. A Comprehensive, CRISPR-based Functional Analysis of Essential Genes in Bacteria. Cell 165, 1493-1506 (2016).

23. Johnson, C. M. & Grossman, A. D. Integrative and Conjugative Elements (ICEs): What They Do and How They Work. Annu. Rev. Genet. 49, 577-601 (2015).

24. Auchtung, J. M., Lee, C. A., Monson, R. E., Lehman, A. P. & Grossman, A. D. Regulation of a Bacillus subtilis mobile genetic element by intercellular signaling and the global DNA damage response. Proc. Natl. Acad. Sci. U.S.A 102, 12554-9 (2005).

25. Hargreaves, K. R. et al. What Does the Talking?: Quorum Sensing Signalling Genes Discovered in a Bacteriophage Genome. PLoS One 9, e85131 (2014).

26. Jiang, M., Grau, R. & Perego, M. Differential Processing of Propeptide Inhibitors of Rap Phosphatases in Bacillus subtilis. J. Bacteriol. 182, 303-310 (2000).

27. Dodd, I. B., Shearwin, K. E. & Egan, J. B. Revisited gene regulation in bacteriophage lambda. Curr. Opin. Genet. Dev. 15, 145-52 (2005).

28. Oppenheim, A. B., Kobiler, O., Stavans, J., Court, D. L. & Adhya, S. Switches in bacteriophage lambda development. Annu. Rev. Genet. 39, 409-29 (2005).

29. Goldfarb, T. et al. BREX is a novel phage resistance system widespread in microbial genomes. EMBO J. 34, 169-83 (2015).

30. MacLean, B. et al. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966-8 (2010).

31. Erijman, A., Dantes, A., Bernheim, R., Shifman, J. M. & Peleg, Y. Transfer-PCR (TPCR): a highway for DNA cloning and protein engineering. J. Struct. Biol. 175, 171-177 (2011).

32. Garber, M. et al. A High-Throughput Chromatin Immunoprecipitation Approach Reveals Principles of Dynamic Gene Regulation in Mammals. Mol. Cell 47, 810-822 (2012).

33. Dar, D. et al. Term-seq reveals abundant ribo-regulation of antibiotics resistance in bacteria. Science, 352, 187 (2016).
