Isoprenoids comprise >55,000 natural products for which methods to access and diversify their structures are in high demand. Ultimately, the isoprene motif plays a critical role in modulating the biological activity of isoprenoids, determines their utility as tools to study and treat human diseases, and provides the basis to develop new fuels and chemicals. Notably, although several valuable isoprenoids have been accessed via heterologous expression, the ability to diversify isoprenoids is extremely limited largely due to critical limitations imposed by native isoprenoid biosynthesis.
Firstly, only the mevalonate (MEV) and 1-deoxy-D-xylulose-5-phosphate (DXP) pathways (
Secondly, terpene metabolism is highly regulated and places a burden on the carbon supply of the cell. For example, the MEV pathway uses three molecules of phosphate donor (ATP) and two reducing equivalents (NADPH) for each DMAPP/IPP, while the DXP pathway requires two phosphate donors (ATP and CTP) and two reducing equivalents (NADPH) (
Thirdly, given that native terpenes are typically essential for maintenance of the cell, genetic modification of native hemiterpene pathways would likely be lethal.
Accordingly, new methods of making isoprenoids are needed.
Together, the limitations associated with preparing isoprenoid derivatives utilizing existing biochemical pathways can be overcome by supplying a membrane-permeable carbon building block dedicated for a designer pathway that would function independent of native isoprenoid metabolism. A potential strategy for hemiterpene biosynthesis can start with an alcohol (e.g., isopentenol (ISO) and/or dimethylallyl alcohol (DMAA)), which can be converted to a pyrophosphate (diphosphate) via stepwise enzymatically catalyzed phosphorylation (see, for example,
Accordingly, provided herein are methods for synthesizing an isoprenoid subunit. In some embodiments, these methods can comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; with a phosphatase that exhibits bidirectional activity in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group.
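By way of non-limiting illustration, the cofactor demand of the two-step phosphorylation route can be compared with the demands of the native pathways noted above. The following Python sketch tallies cofactors per DMAPP/IPP equivalent. The figures for the MEV and DXP pathways are those stated above; the figures for the alcohol route are assumptions made only for this illustration, namely that each of steps (i) and (ii) consumes one ATP, that no reducing equivalent is consumed, and that the cost of producing the exogenously supplied alcohol is ignored. The names CofactorCost, NATIVE, ALCOHOL_ROUTE, and savings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CofactorCost:
    """Cofactors consumed per DMAPP/IPP equivalent."""
    phosphate_donors: int      # ATP and/or CTP
    reducing_equivalents: int  # NADPH

    @property
    def total(self) -> int:
        return self.phosphate_donors + self.reducing_equivalents


# Native pathway figures as stated in the text above.
NATIVE = {
    "MEV": CofactorCost(phosphate_donors=3, reducing_equivalents=2),
    "DXP": CofactorCost(phosphate_donors=2, reducing_equivalents=2),
}

# ASSUMPTION (illustration only): one ATP per phosphorylation step,
# two steps (alcohol -> phosphate -> pyrophosphate), no NADPH, and the
# alcohol itself is supplied exogenously at no cofactor cost.
ALCOHOL_ROUTE = CofactorCost(phosphate_donors=2, reducing_equivalents=0)


def savings(native: CofactorCost, designed: CofactorCost) -> CofactorCost:
    """Return the per-cofactor difference (native minus designed)."""
    return CofactorCost(
        native.phosphate_donors - designed.phosphate_donors,
        native.reducing_equivalents - designed.reducing_equivalents,
    )


if __name__ == "__main__":
    for name, cost in NATIVE.items():
        saved = savings(cost, ALCOHOL_ROUTE)
        print(
            f"{name}: {cost.total} cofactors; alcohol route: "
            f"{ALCOHOL_ROUTE.total}; difference: {saved.phosphate_donors} "
            f"phosphate donor(s), {saved.reducing_equivalents} NADPH"
        )
```

Under these assumptions, the difference relative to the native pathways lies mainly in the reducing equivalents, since the alcohol route as sketched includes no reductive step.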
In some embodiments, the phosphatase can comprise a non-specific acid phosphatase. In certain embodiments, the phosphatase can comprise PhoN.
In some embodiments, the kinase can comprise a kinase that uses a phosphate acceptor. For example, the kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the kinase can comprise isopentenyl phosphate kinase (IPK). In some cases, the phosphatase, the kinase, or a combination thereof can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding for the phosphatase that exhibits bidirectional activity and the kinase. The cell can be engineered to express (or overexpress) the genes encoding for the phosphatase and the kinase.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding for (1) a phosphatase that exhibits bidirectional activity, and (2) a kinase; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; thereby generating the isoprenoid subunit.
In some embodiments, the phosphatase can comprise a non-specific acid phosphatase. In certain embodiments, the phosphatase can comprise PhoN.
In some embodiments, the kinase can comprise a kinase that uses a phosphate acceptor. For example, the kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the kinase can comprise isopentenyl phosphate kinase (IPK). In some cases, the phosphatase, the kinase, or a combination thereof can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid, and isolating the resulting isoprenoid.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with the proviso that the primary alcohol defined by Formula I is not one of the following
with a first kinase in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a second kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group.
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. For example, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. For example, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the first kinase, the second kinase, or a combination thereof comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding for the first kinase and the second kinase. The cell can be engineered to express (or overexpress) the genes encoding for the first kinase and/or the second kinase.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding for a first kinase and a second kinase; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with the proviso that the primary alcohol defined by Formula I is not one of the following
thereby generating the isoprenoid subunit.
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. For example, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. For example, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the first kinase, the second kinase, or a combination thereof comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with a first kinase in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a second kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group; wherein the first kinase, the second kinase, or a combination thereof comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. For example, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. For example, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding for the first kinase and the second kinase. The cell can be engineered to express (or overexpress) the genes encoding for the first kinase and/or the second kinase.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding for a first kinase and a second kinase, wherein the first kinase, the second kinase, or a combination thereof comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; thereby generating the isoprenoid subunit.
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. For example, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. For example, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; with a single enzyme in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group, wherein the single enzyme comprises a phosphotransferase that can catalyze both a first phosphorylation and a second phosphorylation of the primary alcohol defined by Formula I to generate the isoprenoid subunit defined by Formula III.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses an alcohol acceptor. In some embodiments, the single enzyme can comprise a phosphotransferase that uses a phosphate acceptor. In some embodiments, the single enzyme can comprise isopentenyl phosphate kinase (IPK). In certain embodiments, the single enzyme can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, step (i) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, step (i) can be performed in a cell comprising a gene encoding for the single enzyme. The cell can be engineered to express (or overexpress) the gene encoding for the single enzyme.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) incubating a cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, thereby generating the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group; wherein the cell comprises a gene encoding a phosphotransferase that can catalyze both a first phosphorylation and a second phosphorylation of the primary alcohol defined by Formula I to generate the isoprenoid subunit.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses an alcohol acceptor. In some embodiments, the single enzyme can comprise a phosphotransferase that uses a phosphate acceptor. In some embodiments, the single enzyme can comprise isopentenyl phosphate kinase (IPK). In certain embodiments, the single enzyme can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
As discussed above, the methods described herein can be used to prepare a variety of isoprenoids. Accordingly, provided herein are a variety of new isoprenoids, including isoprenoids defined by Formula IV below
wherein R2 is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R3 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino.
In some embodiments, R2 is hydrogen. In other embodiments, R2 can be selected from the group consisting of 6-10 membered aryl, 5-10 membered heteroaryl, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups. In certain embodiments, R2 can comprise:
wherein n is 0, 1, or 2 and RX is as defined above with respect to Formula IV.
In some embodiments, R3 is not one of the following
In some embodiments, R3 can be one of the following:
Also provided are isoprenoids defined by Formula V below
wherein R3 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R4 is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino.
In some embodiments, R4 can be hydrogen.
In some embodiments, R3 is not one of the following
In some embodiments, R3 is one of the following:
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
At various places in the present specification, divalent linking substituents are described. Where the structure clearly requires a linking group, the Markush variables listed for that group are understood to be linking groups.
The term “n-membered” where n is an integer typically describes the number of ring-forming atoms in a moiety where the number of ring-forming atoms is n. For example, piperidinyl is an example of a 6-membered heterocycloalkyl ring, pyrazolyl is an example of a 5-membered heteroaryl ring, pyridyl is an example of a 6-membered heteroaryl ring, and 1,2,3,4-tetrahydro-naphthalene is an example of a 10-membered cycloalkyl group.
As used herein, the phrase “optionally substituted” means unsubstituted or substituted. As used herein, the term “substituted” means that a hydrogen atom is removed and replaced by a substituent. It is to be understood that substitution at a given atom is limited by valency.
Throughout the definitions, the term “Cn-m” indicates a range which includes the endpoints, wherein n and m are integers and indicate the number of carbons. Examples include C1-4, C1-6, and the like.
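By way of illustration only, the inclusive "Cn-m" convention can be sketched in Python. The function name `carbon_range` is hypothetical and is not part of the specification; the sketch assumes only that a label has the form "C" followed by two integers separated by a hyphen.

```python
import re


def carbon_range(label: str) -> range:
    """Return the carbon counts covered by a label such as 'C1-6'.

    Hypothetical helper for illustration; both endpoints are included,
    as stated in the definition of "Cn-m".
    """
    match = re.fullmatch(r"C(\d+)-(\d+)", label)
    if match is None:
        raise ValueError(f"not a Cn-m label: {label!r}")
    n, m = int(match.group(1)), int(match.group(2))
    if n > m:
        raise ValueError("n must not exceed m")
    return range(n, m + 1)  # m + 1 so that the upper endpoint is included


if __name__ == "__main__":
    print(list(carbon_range("C1-4")))
```

Under this reading, a C1-4 group may have 1, 2, 3, or 4 carbons.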
As used herein, the term “Cn-m alkyl”, employed alone or in combination with other terms, refers to a saturated hydrocarbon group that may be straight-chain or branched, having n to m carbons. Examples of alkyl moieties include, but are not limited to, chemical groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, tert-butyl, isobutyl, sec-butyl; higher homologs such as 2-methyl-1-butyl, n-pentyl, 3-pentyl, n-hexyl, 1,2,2-trimethylpropyl, and the like. In some embodiments, the alkyl group contains from 1 to 6 carbon atoms, from 1 to 4 carbon atoms, from 1 to 3 carbon atoms, or 1 to 2 carbon atoms.
The term “heteroalkyl,” by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or cyclic hydrocarbon radical, or combinations thereof, consisting of at least one carbon atom and at least one heteroatom selected from the group consisting of O, N, P, Si and S, and wherein the nitrogen, phosphorus, and sulfur atoms may optionally be oxidized and the nitrogen heteroatom may optionally be quaternized. The heteroatom(s) O, N, P, S, and Si may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule. Examples include, but are not limited to, —CH2—CH2—O—CH3, —CH2—CH2—NH—CH3, —CH2—CH2—N(CH3)—CH3, —CH2—S—CH2—CH3, —CH2—CH2—S(O)—CH3, —CH2—CH2—S(O)2—CH3, —CH═CH—O—CH3, —Si(CH3)3, —CH2—CH═N—OCH3, —CH═CH—N(CH3)—CH3, —O—CH3, —O—CH2—CH3, and —CN. Up to two or three heteroatoms may be consecutive, such as, for example, —CH2—NH—OCH3 and —CH2—O—Si(CH3)3. Similarly, the term “heteroalkylene” by itself or as part of another substituent means a divalent radical derived from heteroalkyl, as exemplified, but not limited by, —CH2—CH2—S—CH2—CH2— and —CH2—S—CH2—CH2—NH—CH2—. For heteroalkylene groups, heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxo, alkylenedioxo, alkyleneamino, alkylenediamino, and the like). Still further, for alkylene and heteroalkylene linking groups, no orientation of the linking group is implied by the direction in which the formula of the linking group is written. For example, the formula —C(O)OR′— represents both —C(O)OR′— and —R′OC(O)—. As described above, heteroalkyl groups, as used herein, include those groups that are attached to the remainder of the molecule through a heteroatom, such as —C(O)R′, —C(O)NR′, —NR′R″, —OR′, —SR′, and/or —SO2R′.
Where “heteroalkyl” is recited, followed by recitations of specific heteroalkyl groups, such as —NR′R″ or the like, it will be understood that the terms heteroalkyl and —NR′R″ are not redundant or mutually exclusive. Rather, the specific heteroalkyl groups are recited to add clarity. Thus, the term “heteroalkyl” should not be interpreted herein as excluding specific heteroalkyl groups, such as —NR′R″ or the like.
As used herein, “Cn-m alkenyl” refers to an alkyl group having one or more double carbon-carbon bonds and having n to m carbons. Example alkenyl groups include, but are not limited to, ethenyl, n-propenyl, isopropenyl, n-butenyl, sec-butenyl, and the like. In some embodiments, the alkenyl moiety contains 2 to 6, 2 to 4, or 2 to 3 carbon atoms.
As used herein, “Cn-m alkynyl” refers to an alkyl group having one or more triple carbon-carbon bonds and having n to m carbons. Example alkynyl groups include, but are not limited to, ethynyl, propyn-1-yl, propyn-2-yl, and the like. In some embodiments, the alkynyl moiety contains 2 to 6, 2 to 4, or 2 to 3 carbon atoms.
As used herein, the term “Cn-m alkylene”, employed alone or in combination with other terms, refers to a divalent alkyl linking group having n to m carbons. Examples of alkylene groups include, but are not limited to, ethan-1,2-diyl, propan-1,3-diyl, propan-1,2-diyl, butan-1,4-diyl, butan-1,3-diyl, butan-1,2-diyl, 2-methyl-propan-1,3-diyl, and the like. In some embodiments, the alkylene moiety contains 2 to 6, 2 to 4, 2 to 3, 1 to 6, 1 to 4, or 1 to 2 carbon atoms.
As used herein, the term “Cn-m alkoxy”, employed alone or in combination with other terms, refers to a group of formula —O-alkyl, wherein the alkyl group has n to m carbons. Example alkoxy groups include methoxy, ethoxy, propoxy (e.g., n-propoxy and isopropoxy), tert-butoxy, and the like. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylamino” refers to a group of formula —NH(alkyl), wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkoxycarbonyl” refers to a group of formula —C(O)O-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylcarbonyl” refers to a group of formula —C(O)-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylcarbonylamino” refers to a group of formula —NHC(O)-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylsulfonylamino” refers to a group of formula —NHS(O)2-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “aminosulfonyl” refers to a group of formula —S(O)2NH2.
As used herein, the term “Cn-m alkylaminosulfonyl” refers to a group of formula —S(O)2NH(alkyl), wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “di(Cn-m alkyl)aminosulfonyl” refers to a group of formula —S(O)2N(alkyl)2, wherein each alkyl group independently has n to m carbon atoms. In some embodiments, each alkyl group has, independently, 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “aminosulfonylamino” refers to a group of formula —NHS(O)2NH2.
As used herein, the term “Cn-m alkylaminosulfonylamino” refers to a group of formula —NHS(O)2NH(alkyl), wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “di(Cn-m alkyl)aminosulfonylamino” refers to a group of formula —NHS(O)2N(alkyl)2, wherein each alkyl group independently has n to m carbon atoms. In some embodiments, each alkyl group has, independently, 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “aminocarbonylamino”, employed alone or in combination with other terms, refers to a group of formula —NHC(O)NH2.
As used herein, the term “Cn-m alkylaminocarbonylamino” refers to a group of formula —NHC(O)NH(alkyl), wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “di(Cn-m alkyl)aminocarbonylamino” refers to a group of formula —NHC(O)N(alkyl)2, wherein each alkyl group independently has n to m carbon atoms. In some embodiments, each alkyl group has, independently, 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylcarbamyl” refers to a group of formula —C(O)—NH(alkyl), wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “thio” refers to a group of formula —SH.
As used herein, the term “Cn-m alkylsulfinyl” refers to a group of formula —S(O)-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m alkylsulfonyl” refers to a group of formula —S(O)2-alkyl, wherein the alkyl group has n to m carbon atoms. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “amino” refers to a group of formula —NH2.
As used herein, the term “aryl,” employed alone or in combination with other terms, refers to an aromatic hydrocarbon group, which may be monocyclic or polycyclic (e.g., having 2, 3 or 4 fused rings). The term “Cn-m aryl” refers to an aryl group having from n to m ring carbon atoms. Aryl groups include, e.g., phenyl, naphthyl, anthracenyl, phenanthrenyl, indanyl, indenyl, and the like. In some embodiments, aryl groups have from 6 to about 20 carbon atoms, from 6 to about 15 carbon atoms, or from 6 to about 10 carbon atoms. In some embodiments, the aryl group is a substituted or unsubstituted phenyl.
As used herein, the term “carbamyl” refers to a group of formula —C(O)NH2.
As used herein, the term “carbonyl”, employed alone or in combination with other terms, refers to a —C(═O)— group, which may also be written as C(O).
As used herein, the term “di(Cn-m-alkyl)amino” refers to a group of formula —N(alkyl)2, wherein the two alkyl groups each has, independently, n to m carbon atoms. In some embodiments, each alkyl group independently has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “di(Cn-m-alkyl)carbamyl” refers to a group of formula —C(O)N(alkyl)2, wherein the two alkyl groups each has, independently, n to m carbon atoms. In some embodiments, each alkyl group independently has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “halo” refers to F, Cl, Br, or I. In some embodiments, a halo is F, Cl, or Br. In some embodiments, a halo is F or Cl.
As used herein, “Cn-m haloalkoxy” refers to a group of formula —O-haloalkyl having n to m carbon atoms. An example haloalkoxy group is OCF3. In some embodiments, the haloalkoxy group is fluorinated only. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
As used herein, the term “Cn-m haloalkyl”, employed alone or in combination with other terms, refers to an alkyl group having from one halogen atom to 2s+1 halogen atoms which may be the same or different, where “s” is the number of carbon atoms in the alkyl group, wherein the alkyl group has n to m carbon atoms. In some embodiments, the haloalkyl group is fluorinated only. In some embodiments, the alkyl group has 1 to 6, 1 to 4, or 1 to 3 carbon atoms.
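The halogen-count limits in the haloalkyl definition can be illustrated with a short Python sketch. The function name `haloalkyl_halogen_bounds` is hypothetical; the sketch simply evaluates the stated bounds of one to 2s+1 halogen atoms for an alkyl group with s carbon atoms.

```python
from typing import Tuple


def haloalkyl_halogen_bounds(s: int) -> Tuple[int, int]:
    """Return (minimum, maximum) halogen atoms for a haloalkyl with s carbons.

    Hypothetical helper for illustration: the definition allows from one
    halogen atom up to 2s + 1 halogen atoms.
    """
    if s < 1:
        raise ValueError("an alkyl group has at least one carbon atom")
    return (1, 2 * s + 1)


if __name__ == "__main__":
    for carbons in (1, 2, 4):
        print(carbons, haloalkyl_halogen_bounds(carbons))
```

For example, a one-carbon haloalkyl can carry at most three halogen atoms (as in CF3), and a two-carbon haloalkyl at most five.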
As used herein, “cycloalkyl” refers to non-aromatic cyclic hydrocarbons including cyclized alkyl and/or alkenyl groups. Cycloalkyl groups can include mono- or polycyclic (e.g., having 2, 3 or 4 fused rings) groups and spirocycles. Cycloalkyl groups can have 3, 4, 5, 6, 7, 8, 9, or 10 ring-forming carbons (C3-10). Ring-forming carbon atoms of a cycloalkyl group can be optionally substituted by oxo or sulfido (e.g., C(O) or C(S)).
Cycloalkyl groups also include cycloalkylidenes. Example cycloalkyl groups include cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclopentenyl, cyclohexenyl, cyclohexadienyl, cycloheptatrienyl, norbornyl, norpinyl, norcarnyl, and the like. In some embodiments, cycloalkyl is cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, or adamantyl. In some embodiments, the cycloalkyl has 6-10 ring-forming carbon atoms. In some embodiments, cycloalkyl is adamantyl. Also included in the definition of cycloalkyl are moieties that have one or more aromatic rings fused (i.e., having a bond in common with) to the cycloalkyl ring, for example, benzo or thienyl derivatives of cyclopentane, cyclohexane, and the like. A cycloalkyl group containing a fused aromatic ring can be attached through any ring-forming atom including a ring-forming atom of the fused aromatic ring.
As used herein, “heteroaryl” refers to a monocyclic or polycyclic aromatic heterocycle having at least one heteroatom ring member selected from sulfur, oxygen, and nitrogen. In some embodiments, the heteroaryl ring has 1, 2, 3, or 4 heteroatom ring members independently selected from nitrogen, sulfur and oxygen. In some embodiments, any ring-forming N in a heteroaryl moiety can be an N-oxide. In some embodiments, the heteroaryl has 5-10 ring atoms and 1, 2, 3 or 4 heteroatom ring members independently selected from nitrogen, sulfur and oxygen. In some embodiments, the heteroaryl has 5-6 ring atoms and 1 or 2 heteroatom ring members independently selected from nitrogen, sulfur and oxygen. In some embodiments, the heteroaryl is a five-membered or six-membered heteroaryl ring. A five-membered heteroaryl ring is a heteroaryl with a ring having five ring atoms wherein one or more (e.g., 1, 2, or 3) ring atoms are independently selected from N, O, and S. Exemplary five-membered ring heteroaryls are thienyl, furyl, pyrrolyl, imidazolyl, thiazolyl, oxazolyl, pyrazolyl, isothiazolyl, isoxazolyl, 1,2,3-triazolyl, tetrazolyl, 1,2,3-thiadiazolyl, 1,2,3-oxadiazolyl, 1,2,4-triazolyl, 1,2,4-thiadiazolyl, 1,2,4-oxadiazolyl, 1,3,4-triazolyl, 1,3,4-thiadiazolyl, and 1,3,4-oxadiazolyl. A six-membered heteroaryl ring is a heteroaryl with a ring having six ring atoms wherein one or more (e.g., 1, 2, or 3) ring atoms are independently selected from N, O, and S. Exemplary six-membered ring heteroaryls are pyridyl, pyrazinyl, pyrimidinyl, triazinyl and pyridazinyl.
As used herein, “heterocycloalkyl” refers to non-aromatic monocyclic or polycyclic heterocycles having one or more ring-forming heteroatoms selected from O, N, or S. Included in heterocycloalkyl are monocyclic 4-, 5-, 6-, and 7-membered heterocycloalkyl groups. Heterocycloalkyl groups can also include spirocycles. Example heterocycloalkyl groups include pyrrolidin-2-one, 1,3-isoxazolidin-2-one, pyranyl, tetrahydropyran, oxetanyl, azetidinyl, morpholino, thiomorpholino, piperazinyl, tetrahydrofuranyl, tetrahydrothienyl, piperidinyl, pyrrolidinyl, isoxazolidinyl, isothiazolidinyl, pyrazolidinyl, oxazolidinyl, thiazolidinyl, imidazolidinyl, azepanyl, benzazepine, and the like. Ring-forming carbon atoms and heteroatoms of a heterocycloalkyl group can be optionally substituted by oxo or sulfido (e.g., C(O), S(O), C(S), or S(O)2, etc.). The heterocycloalkyl group can be attached through a ring-forming carbon atom or a ring-forming heteroatom. In some embodiments, the heterocycloalkyl group contains 0 to 3 double bonds. In some embodiments, the heterocycloalkyl group contains 0 to 2 double bonds. Also included in the definition of heterocycloalkyl are moieties that have one or more aromatic rings fused (i.e., having a bond in common with) to the heterocycloalkyl ring, for example, benzo or thienyl derivatives of piperidine, morpholine, azepine, etc. A heterocycloalkyl group containing a fused aromatic ring can be attached through any ring-forming atom including a ring-forming atom of the fused aromatic ring. In some embodiments, the heterocycloalkyl has 4-10, 4-7 or 4-6 ring atoms with 1 or 2 heteroatoms independently selected from nitrogen, oxygen, or sulfur and having one or more oxidized ring members.
At certain places, the definitions or embodiments refer to specific rings (e.g., an azetidine ring, a pyridine ring, etc.). Unless otherwise indicated, these rings can be attached to any ring member provided that the valency of the atom is not exceeded. For example, an azetidine ring may be attached at any position of the ring, whereas a pyridin-3-yl ring is attached at the 3-position.
The term “compound” as used herein is meant to include all stereoisomers, geometric isomers, tautomers, and isotopes of the structures depicted. Compounds herein identified by name or structure as one particular tautomeric form are intended to include other tautomeric forms unless otherwise specified.
Compounds provided herein also include tautomeric forms. Tautomeric forms result from the swapping of a single bond with an adjacent double bond together with the concomitant migration of a proton. Tautomeric forms include prototropic tautomers which are isomeric protonation states having the same empirical formula and total charge. Example prototropic tautomers include ketone-enol pairs, amide-imidic acid pairs, lactam-lactim pairs, enamine-imine pairs, and annular forms where a proton can occupy two or more positions of a heterocyclic system, for example, 1H- and 3H-imidazole, 1H-, 2H- and 4H-1,2,4-triazole, 1H- and 2H-isoindole, and 1H- and 2H-pyrazole. Tautomeric forms can be in equilibrium or sterically locked into one form by appropriate substitution.
In some embodiments, the compounds described herein can contain one or more asymmetric centers and thus occur as racemates and racemic mixtures, enantiomerically enriched mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures (e.g., including (R)- and (S)-enantiomers, diastereomers, (D)-isomers, (L)-isomers, (+) (dextrorotatory) forms, (−) (levorotatory) forms, the racemic mixtures thereof, and other mixtures thereof). Additional asymmetric carbon atoms can be present in a substituent, such as an alkyl group. All such isomeric forms, as well as mixtures thereof, of these compounds are expressly included in the present description. The compounds described herein can also or further contain linkages wherein bond rotation is restricted about that particular linkage, e.g. restriction resulting from the presence of a ring or double bond (e.g., carbon-carbon bonds, carbon-nitrogen bonds such as amide bonds). Accordingly, all cis/trans and E/Z isomers and rotational isomers are expressly included in the present description. Unless otherwise mentioned or indicated, the chemical designation of a compound encompasses the mixture of all possible stereochemically isomeric forms of that compound.
Optical isomers can be obtained in pure form by standard procedures known to those skilled in the art, and include, but are not limited to, diastereomeric salt formation, kinetic resolution, and asymmetric synthesis. See, for example, Jacques, et al., Enantiomers, Racemates and Resolutions (Wiley Interscience, New York, 1981); Wilen, S. H., et al., Tetrahedron 33:2725 (1977); Eliel, E. L. Stereochemistry of Carbon Compounds (McGraw-Hill, NY, 1962); Wilen, S. H. Tables of Resolving Agents and Optical Resolutions p. 268 (E. L. Eliel, Ed., Univ. of Notre Dame Press, Notre Dame, Ind. 1972), each of which is incorporated herein by reference in its entirety. It is also understood that the compounds described herein include all possible regioisomers, and mixtures thereof, which can be obtained in pure form by standard separation procedures known to those skilled in the art, and include, but are not limited to, column chromatography, thin-layer chromatography, and high-performance liquid chromatography.
Unless specifically defined, compounds provided herein can also include all isotopes of atoms occurring in the intermediates or final compounds. Isotopes include those atoms having the same atomic number but different mass numbers. Unless otherwise stated, when an atom is designated as an isotope or radioisotope (e.g., deuterium, [11C], [18F]), the atom is understood to comprise the isotope or radioisotope in an amount at least greater than the natural abundance of the isotope or radioisotope. For example, when an atom is designated as “D” or “deuterium”, the position is understood to have deuterium at an abundance that is at least 3000 times greater than the natural abundance of deuterium, which is 0.015% (i.e., at least 45% incorporation of deuterium).
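The arithmetic behind the 45% figure can be checked with a minimal Python sketch. The constant and function names are hypothetical; the two input values (a natural abundance of 0.015% and a factor of 3000) are taken from the preceding paragraph, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Values stated in the text; the names are illustrative only.
NATURAL_ABUNDANCE_PCT = Fraction(15, 1000)  # 0.015 % natural abundance of deuterium
ENRICHMENT_FACTOR = 3000                    # "at least 3000 times greater"


def minimum_incorporation_pct(
    natural_pct: Fraction = NATURAL_ABUNDANCE_PCT,
    factor: int = ENRICHMENT_FACTOR,
) -> Fraction:
    """Return the minimum deuterium incorporation, in percent."""
    return natural_pct * factor


if __name__ == "__main__":
    print(f"{float(minimum_incorporation_pct()):.0f}% minimum incorporation")
```

That is, 3000 × 0.015% = 45%, consistent with the stated minimum incorporation.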
All compounds, and pharmaceutically acceptable salts thereof, can be found together with other substances such as water and solvents (e.g. hydrates and solvates) or can be isolated.
In some embodiments, preparation of compounds can involve the addition of acids or bases to effect, for example, catalysis of a desired reaction or formation of salt forms such as acid addition salts.
Example acids can be inorganic or organic acids and include, but are not limited to, strong and weak acids. Some example acids include hydrochloric acid, hydrobromic acid, sulfuric acid, phosphoric acid, p-toluenesulfonic acid, 4-nitrobenzoic acid, methanesulfonic acid, benzenesulfonic acid, trifluoroacetic acid, and nitric acid. Some weak acids include, but are not limited to acetic acid, propionic acid, butanoic acid, benzoic acid, tartaric acid, pentanoic acid, hexanoic acid, heptanoic acid, octanoic acid, nonanoic acid, and decanoic acid.
Example bases include lithium hydroxide, sodium hydroxide, potassium hydroxide, lithium carbonate, sodium carbonate, potassium carbonate, and sodium bicarbonate. Some example strong bases include, but are not limited to, hydroxide, alkoxides, metal amides, metal hydrides, metal dialkylamides and arylamines, wherein: alkoxides include lithium, sodium and potassium salts of methyl, ethyl and t-butyl oxides; metal amides include sodium amide, potassium amide and lithium amide; metal hydrides include sodium hydride, potassium hydride and lithium hydride; and metal dialkylamides include lithium, sodium, and potassium salts of methyl, ethyl, n-propyl, iso-propyl, n-butyl, tert-butyl, trimethylsilyl and cyclohexyl substituted amides.
In some embodiments, the compounds provided herein, or salts thereof, are substantially isolated. By “substantially isolated” is meant that the compound is at least partially or substantially separated from the environment in which it was formed or detected. Partial separation can include, for example, a composition enriched in the compounds provided herein. Substantial separation can include compositions containing at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at least about 99% by weight of the compounds provided herein, or salts thereof. Methods for isolating compounds and their salts are routine in the art.
The expressions, “ambient temperature” and “room temperature” or “rt” as used herein, are understood in the art, and refer generally to a temperature, e.g. a reaction temperature, that is about the temperature of the room in which the reaction is carried out, for example, a temperature from about 20° C. to about 30° C.
The phrase “pharmaceutically acceptable” is employed herein to refer to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
The present application also includes pharmaceutically acceptable salts of the compounds described herein. As used herein, “pharmaceutically acceptable salts” refers to derivatives of the disclosed compounds wherein the parent compound is modified by converting an existing acid or base moiety to its salt form. Examples of pharmaceutically acceptable salts include, but are not limited to, mineral or organic acid salts of basic residues such as amines; alkali or organic salts of acidic residues such as carboxylic acids; and the like. The pharmaceutically acceptable salts of the present application include the conventional non-toxic salts of the parent compound formed, for example, from non-toxic inorganic or organic acids. The pharmaceutically acceptable salts of the present application can be synthesized from the parent compound which contains a basic or acidic moiety by conventional chemical methods. Generally, such salts can be prepared by reacting the free acid or base forms of these compounds with a stoichiometric amount of the appropriate base or acid in water or in an organic solvent, or in a mixture of the two; generally, non-aqueous media like ether, ethyl acetate, alcohols (e.g., methanol, ethanol, iso-propanol, or butanol) or acetonitrile (MeCN) are preferred. Lists of suitable salts are found in Remington's Pharmaceutical Sciences, 17th ed., Mack Publishing Company, Easton, Pa., 1985, p. 1418 and Journal of Pharmaceutical Science, 66, 2 (1977). Conventional methods for preparing salt forms are described, for example, in Handbook of Pharmaceutical Salts: Properties, Selection, and Use, Wiley-VCH, 2002.
As used in the specification and claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes a plurality of cells, including mixtures thereof.
As used herein, the terms “may,” “optionally,” and “may optionally” are used interchangeably and are meant to include cases in which the condition occurs as well as cases in which the condition does not occur.
The terms “about” and “approximately” are defined as being “close to” as understood by one of ordinary skill in the art. In one non-limiting embodiment the terms are defined to be within 10%. In another non-limiting embodiment, the terms are defined to be within 5%. In still another non-limiting embodiment, the terms are defined to be within 1%.
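By way of illustration only, the three non-limiting tolerances above can be expressed as a relative-difference check. The function name `is_about` and its `tolerance` parameter are hypothetical and are not part of the disclosure; the sketch assumes that “within 10%” is read as a deviation of no more than 10% of the stated value.

```python
def is_about(measured: float, stated: float, tolerance: float = 0.10) -> bool:
    """Return True if `measured` lies within a relative `tolerance` of `stated`.

    Assumption: "within 10%" means |measured - stated| <= 0.10 * |stated|.
    Pass tolerance=0.05 or tolerance=0.01 for the narrower embodiments.
    """
    return abs(measured - stated) <= tolerance * abs(stated)
```

For example, under this reading a measured value of 105 is “about” 100 at the 10% tolerance but not at the 1% tolerance.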
The term “nucleic acid” as used herein means a polymer composed of nucleotides, e.g. deoxyribonucleotides or ribonucleotides.
The terms “ribonucleic acid” and “RNA” as used herein mean a polymer composed of ribonucleotides.
The terms “deoxyribonucleic acid” and “DNA” as used herein mean a polymer composed of deoxyribonucleotides.
The term “oligonucleotide” denotes single- or double-stranded nucleotide multimers of from about 2 to up to about 100 nucleotides in length. Suitable oligonucleotides may be prepared by the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett., 22:1859-1862 (1981), or by the triester method according to Matteucci, et al., J. Am. Chem. Soc., 103:3185 (1981), both incorporated herein by reference, or by other chemical methods using either a commercial automated oligonucleotide synthesizer or VLSIPS™ technology. When oligonucleotides are referred to as “double-stranded,” it is understood by those of skill in the art that a pair of oligonucleotides exist in a hydrogen-bonded, helical array typically associated with, for example, DNA. In addition to the 100% complementary form of double-stranded oligonucleotides, the term “double-stranded,” as used herein is also meant to refer to those forms which include such structural features as bulges and loops, described more fully in such biochemistry texts as Stryer, Biochemistry, Third Ed., (1988), incorporated herein by reference for all purposes.
The term “polynucleotide” refers to a single or double stranded polymer composed of nucleotide monomers. In some embodiments, the polynucleotide is composed of nucleotide monomers of generally greater than 100 nucleotides in length and up to about 8,000 or more nucleotides in length.
The term “polypeptide” refers to a compound made up of a single chain of D- or L-amino acids or a mixture of D- and L-amino acids joined by peptide bonds.
The term “promoter” or “regulatory element” refers to a region or sequence determinant located upstream or downstream from the start of transcription that is involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Promoters need not be of bacterial origin; for example, promoters derived from viruses or from other organisms can be used in the compositions, systems, or methods described herein.
The term “recombinant” refers to a human manipulated nucleic acid (e.g. polynucleotide) or a copy or complement of a human manipulated nucleic acid (e.g. polynucleotide), or if in reference to a protein (i.e., a “recombinant protein”), a protein encoded by a recombinant nucleic acid (e.g. polynucleotide). In embodiments, a recombinant expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning-A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In another example, a recombinant expression cassette may comprise nucleic acids (e.g. polynucleotides) combined in such a way that the nucleic acids (e.g. polynucleotides) are extremely unlikely to be found in nature. For instance, human manipulated restriction sites or plasmid vector sequences may flank or separate the promoter from the second nucleic acid (e.g. polynucleotide). One of skill will recognize that nucleic acids (e.g. polynucleotides) can be manipulated in many ways and are not limited to the examples above.
The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length. As used herein, percent (%) amino acid sequence identity is defined as the percentage of amino acids in a candidate sequence that are identical to the amino acids in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software.
Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
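The percent identity calculation can be sketched as follows for two sequences that have already been aligned by one of the tools named above. This is a minimal illustration, not the algorithm of any particular software package: the function name `percent_identity`, the use of “-” as the gap character, and the choice of the candidate sequence's residue count as the denominator (one reading of “the percentage of amino acids in a candidate sequence”) are all assumptions.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str, gap: str = "-") -> float:
    """Percentage of candidate residues identical to the aligned reference residue.

    Both strings must come from the same alignment (equal length, gaps as `gap`).
    Assumption: the denominator is the number of non-gap residues in the
    candidate sequence; other conventions (e.g., alignment length) exist.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    candidate_residues = sum(1 for c in aligned_candidate if c != gap)
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    # Count aligned columns where both sequences carry the same residue.
    matches = sum(
        1
        for c, r in zip(aligned_candidate, aligned_reference)
        if c == r and c != gap
    )
    return 100.0 * matches / candidate_residues
```

For instance, a six-residue candidate aligned without gaps to a reference and differing at one position gives 5/6, or roughly 83% identity, under this convention.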
For sequence comparisons, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, and Altschul et al. (1990) J. Mol. Biol. 215:403-410, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. (1990) J. Mol. Biol. 215:403-410). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915) alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
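The seed-and-extend procedure described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped illustration and not the NCBI implementation: the function names `find_word_hits` and `extend_hit` are hypothetical, seeding uses exact word matches only (the threshold T is not modeled), and the drop-off value `x_drop=20` is an assumed placeholder because no default for X is stated here. The defaults W=11, M=5, and N=−4 follow the BLASTN values given above.

```python
def find_word_hits(query, subject, word_len=11):
    """Return (query_pos, subject_pos) pairs where identical words of length W occur."""
    index = {}
    for i in range(len(query) - word_len + 1):
        index.setdefault(query[i:i + word_len], []).append(i)
    hits = []
    for s in range(len(subject) - word_len + 1):
        for q in index.get(subject[s:s + word_len], []):
            hits.append((q, s))
    return hits


def extend_hit(query, subject, q_pos, s_pos, word_len,
               match=5, mismatch=-4, x_drop=20):
    """Extend one word hit in both directions without gaps.

    Returns (score, query_start, query_end) with query_end exclusive.
    x_drop is an assumed value for the drop-off parameter X.
    """
    def pair_score(a, b):
        return match if a == b else mismatch

    def extend(q, s, step, base):
        # `base` is the cumulative score accumulated before this extension.
        running = best = best_len = length = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            running += pair_score(query[q], subject[s])
            length += 1
            if running > best:
                best, best_len = running, length
            # Halt when the score falls X below its maximum, or when the
            # cumulative score reaches zero or below.
            if best - running >= x_drop or base + running <= 0:
                break
            q += step
            s += step
        # Reaching the end of either sequence also ends the loop.
        return best, best_len

    seed = sum(pair_score(query[q_pos + i], subject[s_pos + i])
               for i in range(word_len))
    right, right_len = extend(q_pos + word_len, s_pos + word_len, 1, seed)
    left, left_len = extend(q_pos - 1, s_pos - 1, -1, seed + right)
    return seed + left + right, q_pos - left_len, q_pos + word_len + right_len
```

Each extension keeps only the best-scoring prefix, so trailing mismatches that lower the score are trimmed from the reported segment.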
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01.
The term “gene” or “gene sequence” refers to the coding sequence or control sequence, or fragments thereof. A gene may include any combination of coding sequence and control sequence, or fragments thereof. Thus, a “gene” as referred to herein may be all or part of a native gene. A polynucleotide sequence as referred to herein may be used interchangeably with the term “gene”, or may include any coding sequence, non-coding sequence or control sequence, fragments thereof, and combinations thereof. The term “gene” or “gene sequence” includes, for example, control sequences upstream of the coding sequence (for example, the ribosome binding site).
The terms “culture”, “cultivate”, and “ferment” are used interchangeably and refer to the intentional growth, propagation, proliferation, and/or enablement of metabolism, catabolism, and/or anabolism of one or more cells (e.g., bacteria such as Bacillus cereus). The combination of both growth and propagation may be termed proliferation. Examples include production by an organism of a polyketide of interest. Culture does not refer to the growth or propagation of microorganisms in nature or otherwise without human intervention.
The term “growth” means an increase in cell size, total cellular contents, and/or cell mass or weight of a cell (e.g., bacteria such as Bacillus cereus).
A “growth media” or “growth medium” as used herein can be a solid, powder, or liquid mixture which comprises all or substantially all of the nutrients necessary to support the growth of cells, such as bacterial cells; various nutrient compositions are preferably prepared when particular species are being cultured. Amino acids, carbohydrates, minerals, vitamins and other elements known to those skilled in the art to be necessary for the growth of cells (e.g., bacteria such as Bacillus cereus) are provided in the medium. In one embodiment, the growth medium is liquid. In one embodiment, the growth medium is a production medium (for example, medium optionally containing higher concentrations of glucose and/or altered concentrations of nitrogen).
A polynucleotide sequence is “heterologous” to a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified by human action from its original form. For example, a promoter operably linked to a heterologous coding sequence refers to a coding sequence from a species different from that from which the promoter was derived, or, if from the same species, a coding sequence which is different from naturally occurring allelic variants.
Methods
Provided herein are methods for synthesizing an isoprenoid subunit. In some embodiments, these methods can comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; with a phosphatase that exhibits bidirectional activity in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group.
Phosphatases that exhibit bidirectional activity are known in the art, and are classified under Enzyme Commission (EC) numbers 3.1 and 3.2. In some embodiments, the phosphatase can comprise a non-specific acid phosphatase (e.g., an enzyme classified under EC number 3.1.3.2). In certain embodiments, the phosphatase can comprise PhoN. In certain embodiments, the phosphatase can comprise PhoC.
In some embodiments, the kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the kinase can comprise isopentenyl phosphate kinase (IPK).
In some cases, the phosphatase, the kinase, or a combination thereof can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding for the phosphatase that exhibits bidirectional activity and the kinase. The cell can be engineered to express (or overexpress) the genes encoding for the phosphatase and the kinase.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding for (1) a phosphatase that exhibits bidirectional activity, and (2) a kinase; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; thereby generating the isoprenoid subunit.
As described above, phosphatases that exhibit bidirectional activity are known in the art, and classified under Enzyme Commission (EC) numbers 3.1 and 3.2. In some embodiments, the phosphatase can comprise a non-specific acid phosphatase (e.g., an enzyme classified under EC number 3.1.3.2). In certain embodiments, the phosphatase can comprise PhoN. In certain embodiments, the phosphatase can comprise PhoC.
In some embodiments, the kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the kinase can comprise isopentenyl phosphate kinase (IPK).
In some cases, the phosphatase, the kinase, or a combination thereof can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with the proviso that the primary alcohol defined by Formula I is not one of the following
with a first kinase in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a second kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group.
The kinases can comprise any suitable kinases that employ small molecules as acceptors. Such enzymes are classified under EC numbers 2.7.1-2.7.9, and include phosphotransferases with an alcohol group as acceptor, phosphotransferases with a carboxy group as acceptor, phosphotransferases with a nitrogenous group as acceptor, phosphotransferases with a phosphate group as acceptor, phosphotransferases with regeneration of donors, apparently catalyzing intramolecular transfers, diphosphotransferases, nucleotidyltransferases, transferases for other substituted phosphate groups, and phosphotransferases with paired acceptors (dikinases). In some cases, the kinases are kinases that are expressed in soluble form (e.g., in E. coli and/or yeast).
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, b-glucoside 
kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP-thymidine kinases, ADP-thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside kinases, 
diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the first kinase, the second kinase, or a combination thereof comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding the first kinase and the second kinase. The cell can be engineered to express (or overexpress) the genes encoding the first kinase and/or the second kinase.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding a first kinase and a second kinase; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with the proviso that the primary alcohol defined by Formula I is not one of the following
thereby generating the isoprenoid subunit.
The kinases can comprise any suitable kinases that employ small molecules as acceptors. Such enzymes are classified under EC numbers 2.7.1-2.7.9, and include phosphotransferases with an alcohol group as acceptor, phosphotransferases with a carboxy group as acceptor, phosphotransferases with a nitrogenous group as acceptor, phosphotransferases with a phosphate group as acceptor, phosphotransferases with regeneration of donors, apparently catalyzing intramolecular transfers, diphosphotransferases, nucleotidyltransferases, transferases for other substituted phosphate groups, and phosphotransferases with paired acceptors (dikinases). In some cases, the kinases are expressed in soluble form (e.g., in E. coli and/or yeast).
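As an illustrative sketch only, the EC 2.7.1-2.7.9 groupings listed above can be captured in a small lookup that sorts a candidate enzyme into a possible pathway role. The names `EC_2_7_ACCEPTORS`, `sub_subclass`, and `pathway_role` are hypothetical and are not part of the disclosure; the mapping of EC 2.7.1 to the first kinase and EC 2.7.4 to the second kinase follows the embodiments described in the surrounding paragraphs.

```python
# Hypothetical helper: maps an EC number to the acceptor class named in the text.
# The nine entries mirror the order in which the text enumerates EC 2.7.1-2.7.9.
EC_2_7_ACCEPTORS = {
    "2.7.1": "alcohol group as acceptor",
    "2.7.2": "carboxy group as acceptor",
    "2.7.3": "nitrogenous group as acceptor",
    "2.7.4": "phosphate group as acceptor",
    "2.7.5": "regeneration of donors (apparent intramolecular transfer)",
    "2.7.6": "diphosphotransferase",
    "2.7.7": "nucleotidyltransferase",
    "2.7.8": "transfer of other substituted phosphate groups",
    "2.7.9": "paired acceptors (dikinase)",
}


def sub_subclass(ec_number: str) -> str:
    """Return the three-level EC prefix (e.g. '2.7.1') or raise ValueError."""
    parts = ec_number.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"EC number too short: {ec_number!r}")
    key = ".".join(parts[:3])
    if key not in EC_2_7_ACCEPTORS:
        raise ValueError(f"not an EC 2.7.1-2.7.9 enzyme: {ec_number!r}")
    return key


def pathway_role(ec_number: str) -> str:
    """Classify an enzyme as a first- or second-kinase candidate (assumed rule)."""
    key = sub_subclass(ec_number)
    if key == "2.7.1":  # alcohol acceptor: candidate for alcohol -> monophosphate
        return "first kinase candidate"
    if key == "2.7.4":  # phosphate acceptor: candidate for monophosphate -> diphosphate
        return "second kinase candidate"
    return "not an alcohol- or phosphate-acceptor kinase"


if __name__ == "__main__":
    # Mevalonate kinase (EC 2.7.1.36) and phosphomevalonate kinase (EC 2.7.4.2)
    for ec in ("2.7.1.36", "2.7.4.2"):
        print(ec, EC_2_7_ACCEPTORS[sub_subclass(ec)], "->", pathway_role(ec))
```

Such a lookup only narrows the list of candidates by EC class; whether a given enzyme accepts a particular primary alcohol of Formula I would still have to be determined experimentally.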
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, b-glucoside 
kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP-thymidine kinases, ADP-thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside kinases, 
diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the first kinase, the second kinase, or a combination thereof comprises a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, with a first kinase in the presence of ATP to form a phosphate defined by Formula II below
wherein R1, R1′, and RX are as defined above with respect to Formula I and P represents a phosphate group; and (ii) contacting the phosphate defined by Formula II with a second kinase in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group; wherein the first kinase, the second kinase, or a combination thereof comprises a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
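For orientation, steps (i) and (ii) can be summarized as the following reaction scheme. This is a sketch under one stated assumption: the text specifies ATP as the phosphate donor in each step but does not name the co-product, so ADP is shown here as the co-product typical of ATP-dependent kinases.

```latex
\begin{align*}
\text{Formula I (alcohol)} + \text{ATP}
  &\xrightarrow{\ \text{first kinase}\ } \text{Formula II (phosphate)} + \text{ADP} \\
\text{Formula II (phosphate)} + \text{ATP}
  &\xrightarrow{\ \text{second kinase}\ } \text{Formula III (pyrophosphate)} + \text{ADP} \\[4pt]
\text{Net:}\quad \text{Formula I} + 2\,\text{ATP}
  &\longrightarrow \text{Formula III} + 2\,\text{ADP}
\end{align*}
```

Under this assumption, each isoprenoid subunit consumes two ATP and no reducing equivalents, compared with the three ATP and two NADPH per DMAPP/IPP noted earlier for the MEV pathway.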
The kinases can comprise any suitable kinases that employ small molecules as acceptors. Such enzymes are classified under EC numbers 2.7.1-2.7.9, and include phosphotransferases with an alcohol group as acceptor, phosphotransferases with a carboxy group as acceptor, phosphotransferases with a nitrogenous group as acceptor, phosphotransferases with a phosphate group as acceptor, phosphotransferases with regeneration of donors, apparently catalyzing intramolecular transfers, diphosphotransferases, nucleotidyltransferases, transferases for other substituted phosphate groups, and phosphotransferases with paired acceptors (dikinases). In some cases, the kinases are expressed in soluble form (e.g., in E. coli and/or yeast).
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, b-glucoside 
kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP-thymidine kinases, ADP-thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside kinases, 
diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding the first kinase and the second kinase. The cell can be engineered to express (or overexpress) the genes encoding the first kinase and/or the second kinase.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) providing a cell comprising genes encoding a first kinase and a second kinase, wherein the first kinase, the second kinase, or a combination thereof comprises a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof; and (ii) incubating the cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; thereby generating the isoprenoid subunit.
The kinases can comprise any suitable kinases that employ small molecules as acceptors. Such enzymes are classified under EC numbers 2.7.1-2.7.9, and include phosphotransferases with an alcohol group as acceptor, phosphotransferases with a carboxy group as acceptor, phosphotransferases with a nitrogenous group as acceptor, phosphotransferases with a phosphate group as acceptor, phosphotransferases with regeneration of donors, apparently catalyzing intramolecular transfers, diphosphotransferases, nucleotidyltransferases, transferases for other substituted phosphate groups, and phosphotransferases with paired acceptors (dikinases). In some cases, the kinases are expressed in soluble form (e.g., in E. coli and/or yeast).
In some embodiments, the first kinase can comprise a kinase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, b-glucoside 
kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP-thymidine kinases, ADP-thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside kinases, 
diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the second kinase can comprise a kinase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases. In some embodiments, the second kinase can be chosen from a polyphosphate kinase, a phosphomevalonate kinase, a phosphomethylpyrimidine kinase, a farnesyl-diphosphate kinase, or a combination thereof. In certain embodiments, the second kinase can comprise isopentenyl phosphate kinase (IPK).
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) contacting a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino; with a single enzyme in the presence of ATP to generate the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group, wherein the single enzyme comprises a phosphotransferase that can catalyze both a first phosphorylation and a second phosphorylation of the primary alcohol defined by Formula I to generate the isoprenoid subunit defined by Formula III.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, 
b-glucoside kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP-thymidine kinases, ADP thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside 
kinases, diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include polyphosphate kinases, phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases.
In some embodiments, the single enzyme can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the single enzyme can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
In some embodiments, steps (i) and (ii) can be performed in a cell-free system. In some of these embodiments, the method can further comprise recovering the isoprenoid subunit from the cell-free system. In other embodiments, steps (i) and (ii) can be performed in a cell comprising genes encoding the first kinase and the second kinase. The cell can be engineered to express (or overexpress) the genes encoding the first kinase and/or the second kinase.
Methods can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. This can be performed within a cell or in a cell-free system.
Also provided are methods for synthesizing an isoprenoid subunit that comprise (i) incubating a cell in a fermentation broth with ATP and a primary alcohol defined by Formula I below
wherein R1 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R1′ is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino, thereby generating the isoprenoid subunit defined by Formula III below
wherein R1, R1′, and RX are as defined above with respect to Formula I and PP represents a pyrophosphate group; wherein the cell comprises a gene encoding for a phosphotransferase that can catalyze both a first phosphorylation and a second phosphorylation of the primary alcohol defined by Formula I to generate the isoprenoid subunit.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses an alcohol acceptor. Such enzymes are classified under EC numbers 2.7.1, and include hexokinases, glucokinases, ketohexokinases, fructokinases, rhamnulokinases, galactokinases, mannokinases, glucosamine kinases, phosphoglucokinases, 6-phosphofructokinases, gluconokinases, dehydrogluconokinases, sedoheptulokinases, ribokinases, ribulokinases, xylulokinases, phosphoribokinases, phosphoribulokinases, adenosine kinases, thymidine kinases, ribosylnicotinamide kinases, NAD+kinases, dephospho-CoA kinases, adenylyl-sulfate kinases, riboflavin kinases, erythritol kinases, triokinases, glycerone kinases, glycerol kinases, glycerate kinases, choline kinases, pantothenate kinase, pantetheine kinases, pyridoxal kinases, mevalonate kinases, homoserine kinases, pyruvate kinases, glucose-1-phosphate phosphodismutases, riboflavin phosphotransferases, glucuronokinases, galacturonokinases, 2-dehydro-3-deoxygluconokinases, L-arabinokinases, D-ribulokinases, uridine kinases, hydroxymethylpyrimidine kinases, hydroxyethylthiazole kinases, L-fuculokinases, fucokinases, L-xylulokinases, D-arabinokinases, allose kinases, 1-phosphofructokinases, 2-dehydro-3-deoxygalactonokinases, N-acetylglucosamine kinases, N-acylmannosamine kinases, acyl-phosphate-hexose phosphotransferases, phosphoramidate-hexose phosphotransferases, polyphosphate-glucose phosphotransferases, inositol 3-kinases, scyllo-inosamine 4-kinases, undecaprenol kinases, 1-phosphatidylinositol 4-kinases, 1-phosphatidylinositol-4-phosphate 5-kinases, protein-Npi-phosphohistidine-sugar phosphotransferases, shikimate kinases, streptomycin 6-kinases, inosine kinases, deoxycytidine kinases, deoxyadenosine kinases, nucleoside phosphotransferases, polynucleotide 5′-hydroxyl-kinases, diphosphate-glycerol phosphotransferases, diphosphate-serine phosphotransferases, hydroxylysine kinases, ethanolamine kinases, pseudouridine kinases, alkylglycerone kinases, 
b-glucoside kinases, NADH kinases, streptomycin 3″-kinases, dihydrostreptomycin-6-phosphate 3′a-kinases, thiamine kinases, diphosphate-fructose-6-phosphate 1-phosphotransferases, sphinganine kinases, 5-dehydro-2-deoxygluconokinases, alkylglycerol kinases, acylglycerol kinases, kanamycin kinases, S-methyl-5-thioribose kinases, tagatose kinases, hamamelose kinases, viomycin kinases, 6-phosphofructo-2-kinases, glucose-1,6-bisphosphate synthases, diacylglycerol kinases, dolichol kinases, deoxyguanosine kinases, AMP—thymidine kinases, ADP—thymidine kinases, hygromycin-B kinases, phosphoenolpyruvate-glycerone phosphotransferases, xylitol kinases, inositol-trisphosphate 3-kinases, tetraacyldisaccharide 4′-kinases, inositol-tetrakisphosphate 1-kinases, macrolide 2′-kinases, phosphatidylinositol 3-kinases, ceramide kinases, inositol-tetrakisphosphate 5-kinases, glycerol-3-phosphate-glucose phosphotransferases, diphosphate-purine nucleoside kinases, tagatose-6-phosphate kinases, deoxynucleoside kinases, ADP-dependent phosphofructokinases, ADP-dependent glucokinases, 4-(cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinases, 1-phosphatidylinositol-5-phosphate 4-kinases, 1-phosphatidylinositol-3-phosphate 5-kinases, inositol-polyphosphate multikinases, phosphatidylinositol-4,5-bisphosphate 3-kinases, phosphatidylinositol-4-phosphate 3-kinases, diphosphoinositol-pentakisphosphate kinases, adenosylcobinamide kinases, N-acetylgalactosamine kinases, inositol-pentakisphosphate 2-kinases, inositol-1,3,4-trisphosphate 5/6-kinases, 2′-phosphotransferases, CTP-dependent riboflavin kinases, N-acetylhexosamine 1-kinases, hygromycin B 4-O-kinases, O-phosphoseryl-tRNASec kinases, glycerate 2-kinases, 3-deoxy-D-manno-octulosonic acid kinases, D-glycero-beta-D-manno-heptose-7-phosphate kinases, D-glycero-alpha-D-manno-heptose-7-phosphate kinases, pantoate kinases, anhydro-N-acetylmuramic acid kinases, protein-fructosamine 3-kinases, protein-ribulosamine 3-kinases, nicotinate riboside 
kinases, diacylglycerol kinases (CTP dependent), maltokinases, UDP-N-acetylglucosamine kinases, and L-threonine kinases. In some embodiments, the first kinase can be chosen from a hexokinase, a glucokinase, a galactokinase, a fructokinase, a glycerol kinase, a choline kinase, a pantetheine kinase, a mevalonate kinase, a pyruvate kinase, an undecaprenol kinase, an ethanolamine kinase, a diacylglycerol kinase, a dolichol kinase, a macrolide 2′-kinase, a ceramide kinase, or a combination thereof.
In some embodiments, the single enzyme can comprise a phosphotransferase that uses a phosphate acceptor. Such enzymes are classified under EC numbers 2.7.4, and include polyphosphate kinases, phosphomevalonate kinases, adenylate kinases, nucleoside-phosphate kinases, nucleoside-diphosphate kinases, phosphomethylpyrimidine kinases, guanylate kinases, dTMP kinases, nucleoside-triphosphate-adenylate kinases, (deoxy)adenylate kinases, T2-induced deoxynucleotide kinases, (deoxy)nucleoside-phosphate kinases, cytidylate kinases, thiamine-diphosphate kinases, thiamine-phosphate kinases, 3-phosphoglyceroyl-phosphate-polyphosphate phosphotransferases, farnesyl-diphosphate kinases, 5-methyldeoxycytidine-5′-phosphate kinases, dolichyl-diphosphate-polyphosphate phosphotransferases, inositol-hexakisphosphate kinases, UMP kinases, ribose 1,5-bisphosphate phosphokinases, diphosphoinositol-pentakisphosphate kinases, (d)CMP kinases, isopentenyl phosphate kinases, (pyruvate, phosphate dikinase)-phosphate phosphotransferases, and (pyruvate, water dikinase)-phosphate phosphotransferases.
In some embodiments, the single enzyme can comprise isopentenyl phosphate kinase (IPK).
In certain embodiments, the single enzyme can comprise a mutant enzyme engineered to increase substrate promiscuity, improve enzyme activity, increase enzyme specificity with respect to a particular substrate, or a combination thereof.
In some embodiments, the primary alcohol defined by Formula I is not one of the following
Isoprenoids and Methods of Making Thereof
The methods described above can further comprise introducing the isoprenoid subunit into a natural or artificial isoprenoid biosynthetic pathway to synthesize an isoprenoid. Such biochemical pathways are well known in the art, and described, for example, in the examples below. The isoprenoid subunit can be introduced into a natural or artificial isoprenoid biosynthetic pathway within a cell or in a cell-free system.
As used herein, the term “isoprenoid” refers to a large and diverse class of naturally occurring organic compounds composed of two or more units of hydrocarbons, with each unit consisting of five carbon atoms arranged in a specific pattern. Isoprenoids represent an important class of compounds and include, for example, food and feed supplements, flavor and odor compounds, and anticancer, antimalarial, antifungal, and antibacterial compounds.
As a class of molecules, isoprenoids are classified based on the number of isoprene units comprised in the compound. Monoterpenes comprise ten carbons or two isoprene units, sesquiterpenes comprise 15 carbons or three isoprene units, diterpenes comprise 20 carbons or four isoprene units, sesterterpenes comprise 25 carbons or five isoprene units, and so forth. Steroids (generally comprising about 27 carbons) are the products of cleaved or rearranged isoprenoids.
As used herein, the term “terpenoid” refers to a large and diverse class of organic molecules derived from five-carbon isoprenoid units assembled and modified in a variety of ways and classified in groups based on the number of isoprenoid units used in group members. Hemiterpenoids have one isoprenoid unit. Monoterpenoids have two isoprenoid units. Sesquiterpenoids have three isoprenoid units. Diterpenoids have four isoprenoid units. Sesterterpenoids have five isoprenoid units. Triterpenoids have six isoprenoid units. Tetraterpenoids have eight isoprenoid units. Polyterpenoids have more than eight isoprenoid units.
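The unit-count classification above can be expressed as a small lookup. The following is only an illustrative sketch: the function and variable names are hypothetical, the mapping uses only the unit counts stated in the two preceding paragraphs, and a count of seven units, which the text does not name, is reported as such rather than assigned a class.

```python
# Illustrative sketch only; names are hypothetical and not part of the disclosure.
# Each isoprenoid unit contributes five carbon atoms, as stated in the text.
CARBONS_PER_UNIT = 5

# Unit counts and class names exactly as listed in the definition above.
UNITS_TO_CLASS = {
    1: "hemiterpenoid",
    2: "monoterpenoid",
    3: "sesquiterpenoid",
    4: "diterpenoid",
    5: "sesterterpenoid",
    6: "triterpenoid",
    8: "tetraterpenoid",
}


def classify_terpenoid(units: int) -> str:
    """Return the class name for a given number of five-carbon units."""
    if units < 1:
        raise ValueError("at least one five-carbon unit is required")
    if units > 8:
        return "polyterpenoid"
    # Seven units is not named in the text, so it falls through to the default.
    return UNITS_TO_CLASS.get(units, "not named in the text")


def backbone_carbons(units: int) -> int:
    """Nominal carbon count of the unmodified backbone."""
    return CARBONS_PER_UNIT * units


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 6, 8, 10):
        print(f"{n} unit(s): C{backbone_carbons(n)}, {classify_terpenoid(n)}")
```

The carbon counts are nominal: as noted above for steroids, cleavage or rearrangement can leave a final product with a carbon count that is not a multiple of five.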
Examples of isoprenoids that can be prepared using the isoprenoid subunits described above include the following (as well as derivatives thereof):
Flavors and Fragrances: Myrcene, linalool, limonene, pinene, humulene, caryophyllene, menthol, rose oxide, bisabolene, farnesene, farnesol, nootkatone, valencene, cuprene, epi-cubenol, epi-cedrol, α-santalene, vetispiradiene, (+)-curcumene, (+)-turmerone, (+)-dehydrocurcumene, (−)-cubebol, ionone, damascone, 7,11epoxymegastigma 5(6)-en-9-ol, theaspirane, ambrein, and ambrox;
Cannabinoids: tetrahydrocannabinol, cannabidiol, cannabinol, tetrahydrocannabinolic acid, cannabidiolic acid, cannabigerol, cannabichromene, cannabicyclol, cannabivarin, tetrahydrocannabivarin, cannabidivarin, cannabichromevarin, cannabigerovarin, cannabigerol monomethyl ether, cannabielsoin, and cannabicitran;
Anti-cancer Agents: Bistabercarpamines A and B, β-pinene, 10-O-acetylmacrophyllide, Stylosin, Tabernaelegantine B and D, Perovskiaol, Asperolide A, Clerodane diterpenoid, Caesalppans A-F, Salyunnanins A-F, 7-(2-oxohexyl)-11-hydroxy-6, 12-dioxo-7,9 (11),13-abietatriene, 15-O-β-d-apiofuranosyl-(1→2)-β-d-glucopyranosyl-18O-β-d-glucopyranosyll3(E)-ent-la bda-8 (9),13 (14)-diene3β,15,18-triol, 6E,10E,14Z-(3S)-17-hydroxygeranyllinalool17-O-β-d-glucopyranosyl(1→2)-[α-1-rhamnopyranosyl-(1→6)]β-d-glucopyranoside, Sterebins O, P1, and P2, α-Santalol, Hoaensieremone, Syreiteate A and B, Artemilinin A, isoartemisolide, α-Cadinol, (2R)-pterosin P, Bieremoligularolide, Arbusculin B, α-cyclocostunolide, costunolide, dehydrocostuslactone, Parthenolide, zaluzanin D, and eupatoriopicrin, 1-oxoeudesm-11, (13)eno-12,8α-lactone, Caesalpinone A, Abiesesquine A, Lanosta-7,9, (24)trien-26-oic acid, 1α,2α,8β,9β1,8-bis (acetyloxy)2,9-bis (benzoyloxy)14-hydroxy-β-dihydroagarofuran, Linderolide G, lindestrene, Dihydro-b-agarofuran sesquiterpenes, Dehydrooopodin, Lupeol, 9alisol B, alisol B 23-acetate, Kaunial, 30-hydroxy-11α-methoxy-18β-olean-12-en-3-one′, Asiatic acid, Euscaphic acids G, Hederagenin, Arjunic acid, Schisanlactone C, Schisanlactone D, Schisanlactone H, Kadsulactone, Triregeloic acid, 3-oxo-9-lanosta-7,22Z,24-trien-26,23-olide, 20-hydroxy-24-dammare-n 3-one, bourjotinolone B, (20S,24R) epoxydammarane-12,25diol-3-one, methyl shoreate, Brachyantheraoside A2, 6β-hydroxy-3-oxoolean-12-en-27-oic acid, 3β,6β-dihydroxyolean-12-en-27-oic acid, 3β,24 β dihydroxyolean-12-en-27-oic acid, Urmiensolide B. Urmiensic acid, Neoabiestrine F, Cipaferen H, granatumin E, Neoabieslactone I, taxadiene, Englerin A, cortistatin A, and cyclopamine; and
Anti-Infectious Disease Agents: Artemisinin, Artemisinic Acid, and Ouabagenin.
In some embodiments, the isoprenoid can comprise a hemiterpenoid, a monoterpenoid, a sesquiterpenoid, a diterpenoid, a sesterterpenoid, a triterpenoid, a tetraterpenoid, or a higher polyterpenoid. In some aspects, the hemiterpenoid is prenol (i.e., 3-methyl-2-buten-1-ol), isoprenol (i.e., 3-methyl-3-buten-1-ol), 2-methyl-3-buten-2-ol, or isovaleric acid. In some aspects, the monoterpenoid can be, without limitation, geranyl pyrophosphate, eucalyptol, limonene, or pinene. In some aspects, the sesquiterpenoid is farnesyl pyrophosphate, artemisinin, or bisabolol. In some aspects, the diterpenoid can be, without limitation, geranylgeranyl pyrophosphate, retinol, retinal, phytol, taxol, forskolin, or aphidicolin. In some aspects, the triterpenoid can be, without limitation, squalene or lanosterol. The isoprenoid can also be selected from the group consisting of abietadiene, amorphadiene, carene, α-farnesene, β-farnesene, farnesol, geraniol, geranylgeraniol, linalool, limonene, myrcene, nerolidol, ocimene, patchoulol, β-pinene, sabinene, γ-terpinene, terpinolene, and valencene.
In some aspects, the tetraterpenoid is lycopene or carotene (a carotenoid). As used herein, the term “carotenoid” refers to a group of naturally-occurring organic pigments produced in the chloroplasts and chromoplasts of plants, of some other photosynthetic organisms, such as algae, in some types of fungus, and in some bacteria. Carotenoids include the oxygen-containing xanthophylls and the non-oxygen-containing carotenes. In some aspects, the carotenoids are selected from the group consisting of xanthophylls and carotenes. In some aspects, the xanthophyll is lutein or zeaxanthin. In some aspects, the carotenoid is α-carotene, β-carotene, γ-carotene, β-cryptoxanthin or lycopene.
By employing subunits other than dimethylallyl pyrophosphate (DMAPP) and isopentenyl pyrophosphate (IPP), a variety of new isoprenoid structures can be prepared. By way of example, provided are isoprenoids defined by Formula IV below
wherein R2 is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R3 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino.
In some embodiments, R2 is hydrogen. In other embodiments, R2 can be selected from the group consisting of 6-10 membered aryl, 5-10 membered heteroaryl, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups. In certain embodiments, R2 can comprise:
wherein n is 0, 1, or 2 and RX is as defined above with respect to Formula IV.
In some embodiments, R3 can be one of the following:
In some embodiments, R3 is not one of the following
Also provided are isoprenoids defined by Formula V below
wherein R3 is selected from the group consisting of C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; R4 is selected from the group consisting of hydrogen, C1-10 alkyl, C1-10 heteroalkyl, C2-10 alkenyl, C2-10 heteroalkenyl, C2-10 alkynyl, C2-10 heteroalkynyl, C3-10 cycloalkyl, 6-10 membered aryl, 5-10 membered heteroaryl, 4-10 membered heterocycloalkyl, C3-10 cycloalkyl-C1-4 alkylene, C3-10 cycloalkyl-C1-4 heteroalkylene, 4-10 membered heterocycloalkyl-C1-4 alkylene, 4-10 membered heterocycloalkyl-C1-4 heteroalkylene, 6-10 membered aryl-C1-4 alkylene, 6-10 membered aryl-C1-4 heteroalkylene, 5-10 membered heteroaryl-C1-4 alkylene, and 5-10 membered heteroaryl-C1-4 heteroalkylene, each optionally substituted with 1, 2, 3, or 4 independently selected RX groups; and each RX, when present, is independently selected from OH, NO2, CN, halo, C1-6 alkyl, C2-6 alkenyl, C2-6 alkynyl, C1-4 haloalkyl, C1-6 alkoxy, C1-6 haloalkoxy, cyano-C1-3 alkyl, HO—C1-3 alkyl, amino, C1-6 alkylamino, di(C1-6 alkyl)amino, thio, C1-6 alkylthio, C1-6 alkylsulfinyl, C1-6 alkylsulfonyl, carbamyl, C1-6 alkylcarbamyl, di(C1-6 alkyl)carbamyl, carboxy, C1-6 alkylcarbonyl, C1-6 alkoxycarbonyl, C1-6 alkylcarbonylamino, C1-6 alkylsulfonylamino, aminosulfonyl, C1-6 alkylaminosulfonyl, di(C1-6 alkyl)aminosulfonyl, aminosulfonylamino, C1-6 alkylaminosulfonylamino, di(C1-6 alkyl)aminosulfonylamino, aminocarbonylamino, C1-6 alkylaminocarbonylamino, and 
di(C1-6 alkyl)aminocarbonylamino.
In some embodiments, R4 can be hydrogen.
In some embodiments, R3 is not one of the following
In some embodiments, R3 is one of the following:
In some embodiments, these isoprenoids can exhibit anticancer activity.
By way of non-limiting illustration, examples of certain embodiments of the present disclosure are given below.
Terpenes are a large class of natural products with wide-ranging biological activities and applications. Previous synthetic biology efforts in this area have focused on producing natural terpenes in microbes, either to increase product titers through pathway engineering or to alter product specificity. However, just two building blocks are used in nature to assemble the carbon scaffolds of terpenes, thus limiting the synthetic scope and utility of natural terpene biosynthetic pathways for the generation of non-natural analogues. In these examples, a comprehensive strategy employing synthetic biology, metabolic engineering, and protein engineering is described that can be used to produce terpenes from non-natural building blocks. This work also provides a platform for the production of terpenes that are site-selectively modified with non-natural chemical functionality, including handles for chemo-selective diversification. Cumulatively, these examples (1) reveal remarkable substrate promiscuity in natural and engineered enzymes, (2) expand the mechanistic understanding of several key enzymes, (3) provide an in vivo and in vitro platform for generation of non-natural terpene building blocks, and (4) provide meroterpene and ergot alkaloid analogues via in vitro and in vivo chemo-enzymatic synthesis.
Terpene natural product diversity and biosynthesis. Terpene natural products are used as pharmaceuticals (taxol and artemisinin), pesticides (coumarin and pyrethrin), flavors (hopanoids and menthol), fragrances (citronel and limonene), pigments (carotenoids and xanthophylls), potential biofuels (farnesene) and a variety of other commercial products (
Terpenes are biosynthesized by the successive condensation of the five-carbon (C5-) isoprenes isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) (
The generation of larger and more complex terpenes occurs through the polymerization of hemiterpenes into longer linear precursors. These precursors are biosynthesized by prenyltransferases that catalyze the elongation of hemiterpenes through the successive addition of IPP onto DMAPP using head-to-tail condensation (
Once the appropriate linear terpene is made, terpene cyclases cleave the allylic diphosphate to generate a highly reactive allylic carbocation (
It is this chemical reactivity that is essential to the biosynthesis of more complex terpenes. Enzymes involved in the reaction of these allylic diphosphates must delicately coordinate and facilitate this carbocationic chemistry to afford structurally diverse and mesmerizing compounds from simple achiral five-carbon building blocks.
Heterologous terpene production. Because terpenes are intimately involved in development, growth, reproduction, and signaling, they are tightly regulated and usually produced in limited quantity in non-engineered or native systems. Thus, terpene biosynthetic genes are often transplanted into various heterologous hosts to overcome these limitations. While genetic tools that allow genome engineering of plants for overproduction of terpenes and other natural products are increasingly becoming available, generation time, seasonal variations, land use, and the processing of plants, including extraction and isolation, still make this an extensive effort that is both costly and inefficient.
To increase yields of these valuable products, hosts, often plants, have been engineered by optimizing regulatory factors and increasing the expression of rate-limiting enzymes. Additionally, strategies for balancing biochemical precursor flux, such as plant breeding, genetic engineering, and development of scalable plant cell cultures, have been explored. Even with these efforts, extraction of these high-value products from natural sources is inefficient, low-yielding, and expensive. For example, a 100-year-old Pacific yew tree can be harvested to produce 300 mg of taxol which, on average, is a sixth of the amount needed to treat a single cancer patient. Plant overproduction is hampered by climate and cultivation limitations, product toxicity, and pests. A mere 60 mg/L of 10-deacetylbaccatin III (precursor) can be produced from optimized Taxus baccata cultures after 20 days of fermentation.
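The supply figures in the preceding paragraph can be put side by side with a short calculation. This is a rough sketch under stated assumptions: the constant and function names are hypothetical, the inputs are the averages quoted above, and the culture-volume figure compares masses only, ignoring the difference in molar mass between 10-deacetylbaccatin III and taxol and any loss in converting the precursor to the drug.

```python
# Rough illustration using only the figures quoted in the text above.
# All names are hypothetical; inputs are averages, not measured values.
TAXOL_PER_TREE_MG = 300        # taxol from one 100-year-old Pacific yew
TREES_PER_PATIENT = 6          # one tree supplies "a sixth" of one treatment
CULTURE_TITER_MG_PER_L = 60    # 10-deacetylbaccatin III after 20 days


def taxol_per_patient_mg() -> int:
    """Taxol implied by the text for treating a single patient, in mg."""
    return TAXOL_PER_TREE_MG * TREES_PER_PATIENT


def culture_volume_for_mass_l(mass_mg: float) -> float:
    """Litres of culture yielding the given mass of precursor.

    Assumption: a mass-for-mass comparison only; molar mass differences
    and conversion yields are deliberately ignored.
    """
    return mass_mg / CULTURE_TITER_MG_PER_L


if __name__ == "__main__":
    need_mg = taxol_per_patient_mg()
    print(f"Taxol per patient: {need_mg} mg ({TREES_PER_PATIENT} trees)")
    print(f"Culture volume for the same mass of precursor: "
          f"{culture_volume_for_mass_l(need_mg):.0f} L")
```

Under these assumptions, one course of treatment corresponds to about 1.8 g of taxol, or six mature trees, which illustrates why extraction from natural sources is described as low-yielding.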
Heterologous host systems employing Escherichia coli (E. coli) and Saccharomyces cerevisiae (S. cerevisiae) have greatly simplified efforts towards increasing titers of terpenes due to available genetic resources and cloning tools, which enable frameworks that serve as customizable microbial factories. Titers of terpenes have been increased in E. coli and S. cerevisiae by boosting precursor supply and modulating native genes involved in terpene synthesis. For example, while a non-engineered strain of E. coli is only able to generate 10 mg/L of taxadiene (a synthetic precursor to taxol), balancing the expression of genes responsible for the production and consumption of IPP enabled taxadiene titers as high as 10 g/L. Modification of the native DXP pathway in E. coli by a combination of gene deletions and alteration of expression resulted in a 50-fold increase in carotenoid (a conjugated polyunsaturated terpene) production over the non-engineered system. When the lower half of the mevalonate pathway from S. cerevisiae was transplanted into E. coli, feeding of mevalonate led to a 5-fold increase in production of carotenoids over the optimized DXP system. While S. cerevisiae has boasted the highest production of IPP, in silico profiling of E. coli and S. cerevisiae using various carbon sources predicts that maximum IPP production can be achieved from E. coli using glycerol as a carbon source.
To produce the maximum amount of terpene in a given system, biosynthetic approaches can be blended with more efficient chemical processes. Semi-synthetic routes to terpenes use biosynthesis to generate chemical precursors that through chemical transformations can be converted to their final products (
Strategies for terpene natural product diversification. Terpene structures can be diversified through a variety of biosynthetic transformations, most notably P450 oxidation. After oxidation, terpenes can be further modified by acylation, methylation, glycosylation, isomerization, and a variety of other biosynthetic reactions (
Terpene natural products can be functionalized by a variety of processes. Usually, the highly saturated and carbon-rich scaffolds of terpenes are not themselves the most facile starting material for chemical diversification as regiospecific oxidation of these highly saturated scaffolds is very challenging. While the tailoring diversity in natural terpenes is extensive, chemists are still limited to modifications on terpene scaffolds that are directed by the usually stringent regio-selectivity of P450s.
Nevertheless, beyond simple chemical transformations such as acylation of isolated functional groups, chemists have devised a variety of methods that rely on heavily oxidized natural products from nature. Metabolites can be isolated and purified from native hosts or from engineered heterologous hosts by a combination of extraction, chromatography, distillation, or recrystallization prior to chemical derivatization; however, this can increase costs and waste. Alternatively, methods are being developed for the direct diversification from crude extracts to produce pools of chemically altered natural products that can then be purified by similar methods (
As diversification of these structures hinges on the modification of existing oxidation and diversity, there has been enormous interest in generating non-naturally occurring oxidation patterns and functionality in terpenes. The most common method of doing so is the mixing of cytochrome P450 monooxygenases with their non-native substrates to capitalize on the substrate promiscuity and product specificity to generate non-naturally oxidized modifications. In these efforts, P450s have also been the target of extensive engineering using a variety of strategies such as DNA shuffling and rational engineering.
One potential route for diversifying the structures of terpenes includes feeding non-natural substrates to terpene biosynthetic pathways in vitro or in vivo. For example, terpene natural products have been inadvertently diversified by using analogues of structural precursors during mechanistic studies. To halt cyclization progression to dissect stepwise mechanisms utilized by terpene cyclases, fluorinated analogues are frequently employed (
These methods have been used to study protein prenyltransferases. Farnesyl pyrophosphate analogues containing terminal alkynes for “Click” reactions have been used to study ubiquitination pathways. The corresponding alcohols are synthesized and fed into various cell lines for in vivo phosphorylation by an unknown mechanism.
In order to produce terpene analogues in a sufficient amount for use in chemical libraries or potential drug studies, a different approach to terpene production must be considered. While insightful, the current method of synthesizing chemical precursors for in vitro cyclization is not scalable for adequate production. These pyrophosphorylated analogues are not cell permeable and therefore would be unavailable to whole-cell biocatalysis. In addition, synthesis of these analogues is limited in scope as each analogue needs a dedicated synthetic approach. Ideally a method for the production of analogues could be completed in vivo using cheap chemical precursors shuffled through a flexible pathway.
While there has been minimal insight into the catalytic promiscuity of prenyltransferases and terpene cyclases, the chemistry utilized by these enzymes suggests potential substrate flexibility and malleability for the production of natural product derivatives. A broad range of strategies has been developed to diversify other classes of natural products, but little progress has been made in applying these approaches for the diversification of terpenes, presumably due to the difficulty of applying these approaches in vivo.
Strategies for diversification of polyketides and non-ribosomal peptides. The templated and highly modular logic of polyketide and non-ribosomal peptide synthesis has spurred the development of various strategies that aim to diversify their structures. Precursor directed biosynthesis leverages unnatural building blocks and native biosynthetic machinery to produce unnatural products. An improvement of this strategy, termed mutasynthesis, blocks the availability of the natural substrate so that the unnatural substrate does not compete for incorporation, resulting in a single unnatural biosynthetic product. Potential building blocks can be synthesized using traditional organic synthesis and then used as substrates by biosynthetic machinery in vivo or in vitro for the production of natural product derivatives (
Chemical handles can also be incorporated into natural products via promiscuous or engineered biosynthetic machinery using unnatural building blocks. Such chemical handles may provide sites available for traditional organic synthesis enabling biochemical studies or chemical library construction (
While these strategies have been employed for the diversification of non-ribosomal peptides and polyketides, they have yet to be applied to terpene natural products. In systems where these approaches have been implemented, there exists standing chemical diversity already used in the biosynthesis of these natural products (
Isoprenoids are constructed in nature using hemiterpene building blocks that are biosynthesized from lengthy enzymatic pathways with little opportunity to deploy precursor-directed biosynthesis. Here, an artificial alcohol-dependent hemiterpene biosynthetic pathway was designed and coupled to several isoprenoid biosynthetic systems, affording lycopene and a prenylated tryptophan in robust yields. This approach affords a potential route to diverse non-natural hemiterpenes and by extension isoprenoids modified with non-natural chemical functionality. Accordingly, the prototype chemo-enzymatic pathway is a critical first step towards the construction of engineered microbial strains for bioconversion of simple scalable building blocks into complex isoprenoid scaffolds.
Introduction
Isoprenoids comprise >55,000 natural products for which methods to access and diversify their structures are in high demand. Ultimately, the isoprene motif plays a critical role in modulating the biological activity of isoprenoids, determines their utility as tools to study and treat human diseases, and provides the basis to develop new fuels and chemicals. Notably, although several valuable isoprenoids have been accessed via heterologous expression, our ability to diversify isoprenoids is extremely limited largely due to critical limitations imposed by native isoprenoid biosynthesis. Firstly, only the mevalonate (MEV) and 1-deoxy-D-xylulose-5-phosphate (DXP) pathways (
Together, these limitations could be overcome by supplying a membrane-permeable carbon building block dedicated for a designer pathway that would function independent of native isoprenoid metabolism. A potential strategy for hemiterpene biosynthesis could start with isopentenol (ISO) and dimethylallyl alcohol (DMAA) which are converted to the required diphosphates via stepwise phosphorylation catalyzed by two independent kinases (
In this example, as a first step to realizing this goal, the design and development of a prototype ADH pathway that is completely orthogonal to native hemiterpene biosynthesis is described. The ability of this pathway to access isoprenoids is demonstrated by coupling the ADH pathway to two different isoprenoid biosynthetic systems.
Results and Discussion
Inspired by the observation that several mammalian cell lines convert farnesol and farnesol analogues to the corresponding diphosphates, it was first determined whether E. coli harbors suitable enzymatic machinery that could convert exogenously provided ISO or DMAA into a pool of hemiterpenes for isoprenoid production. To test this, a reporter system that leverages lycopene biosynthesis was used that includes genes from the CrtEBI operon, a geranylgeranyl diphosphate synthase (CrtE or IspA Y80D), and the isopentenyl diphosphate isomerase ipi (
The native DXP pathway in the E. coli reporter strains supports production of lycopene independently of exogenously added DMAA/ISO. Because of this, fosmidomycin (Fs), an inhibitor of the first dedicated step in hemiterpene biosynthesis (
A protein recently found in archaea, isopentenyl phosphate kinase (IPK), is responsible for the generation of IPP from isopentenyl phosphate (IP) and forms a branch of the MEV pathway called the Archaeal MEV Pathway I (
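Table 2 (gene sequences; only the source-organism entries survive in this text): Thermoplasma acidophilum (codon-optimized for E. coli); Streptococcus mutans; Shigella flexneri (codon-optimized for E. coli); Aspergillus fumigatus.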
It was reasoned that over-expression of the E. coli gene product putatively responsible for DMAA/ISO phosphorylation would result in improved production of hemiterpenes. In a preliminary attempt to identify a suitable enzyme that could act as ‘kinase-1’ (
Source organism; Enzyme; EC number
S. cerevisiae; Glycerol kinase; EC 2.7.1.30
S. cerevisiae; Homoserine kinase; EC 2.7.1.39
E. coli; Phosphoribosylpyrophosphate synthetase; EC 2.7.6.1
E. coli; Homoserine kinase; EC 2.7.1.39
E. coli; Ethanolamine kinase; EC 2.7.1.82
E. coli; Hydroxyethylthiazole kinase; EC 2.7.1.50
E. coli; Undecaprenol kinase; EC 2.7.1.66
E. coli; Diacylglycerol kinase; EC 2.7.1.107
E. coli; Glycerol kinase; EC 2.7.1.30
E. coli; 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE); EC 2.7.1.148
Arabidopsis thaliana; Farnesol kinase; EC 2.7.1.216
Thermoplasma acidophilum; Isopentenyl phosphate kinase (IPK); EC 2.7.4.26
Shigella flexneri; Non-specific acid phosphatase (PhoN); EC 3.1.3.2
Streptococcus mutans; Diacylglycerol kinase (DGK); EC 2.7.1.107
Next, to determine whether the lycopene titers were dependent on the concentration of the exogenously provided alcohol substrates, a series of lycopene assays were carried out at various concentrations of DMAA/ISO using the strain harboring pETDuet-PhoN-IPK/pAC-LYCipi in the presence of Fs (
As an additional test of the ability of the designed prototype strain to provide an isoprenoid building block, the PhoN-IPK pathway was used to provide DMAPP which was then transferred to L-Trp using the prenyltransferase (PTase) FgaPT2 by providing an additional plasmid that expressed the PTase (
Conclusion
In summary, an artificial hemiterpene biosynthetic pathway dependent on the exogenous addition of DMAA/ISO was developed. The prototype ADH pathway performed similarly to previously established routes that depend on building blocks from primary metabolism. It is expected that ribosome binding site and/or promoter engineering can be leveraged to optimize the productivity of the pathway further. Notably, in contrast to synthetic biology efforts that have for example constructed a new entry point into the DXP pathway and provided new routes to DXP, the ADH pathway described here is the first to transform scalable simple precursors directly into the required pyrophosphates and couple them to isoprenoid biosynthesis. This provides a simple strategy to provide isoprenoids in good yields given that only two enzymes and DMAA/ISO need to be provided. Indeed, in the absence of ISO/DMAA, there was insufficient endogenous DMAPP in E. coli to support high level production of the prenylated tryptophan, even when Fs was not used to inhibit the native DXP pathway. These methods can be used as a future discovery tool that enables the in vivo biosynthesis of hemiterpene analogues, and by extension, non-natural isoprenoids.
For example, PhoN displays a broad specificity in vitro, and this is expected to extend to the in vivo system here. Furthermore, several features of PhoN have been targeted by enzyme engineering, including shifting its pH optimum for neutral media and improving its kinase activity with concomitant reduction in phosphatase activity. Similarly, although the promiscuity of IPK is largely under-explored, its substrate interacts with the enzyme active site through electrostatic forces dictated by the phosphate portion of the substrate, while the remaining alkyl portion of the substrate is simply sterically accommodated. Indeed, the substrate specificity of IPK has been expanded to include geranyl- and farnesylphosphate. An expanded set of non-natural hemiterpenes provided by the prototype or engineered ADH pathway could be coupled with downstream enzymes to probe the promiscuity and utility of isoprenoid biosynthesis. For example, it is expected that this precursor-directed approach to non-natural isoprenoids can be readily extended to natural product scaffolds that include L-Trp and/or other aromatics given the reported promiscuity of aromatic PTases. Subsequently, the ADH pathway may enable the production of prenylated and terpene natural products with non-natural alkyl groups, expanding upon the limited chemical diversity afforded by nature.
Materials and Methods
General. All plasmids were verified by DNA sequencing. Purifications of all DNA were performed with kits from BioBasic. Lycopene standard was purchased from Sigma Aldrich. Synthetic oligonucleotides were purchased from IDT (Coralville, Iowa, USA). Plasmid pAC-LYCipi was purchased from Addgene (Plasmid #53279, Addgene, Cambridge, Mass., USA). Restriction enzymes were purchased from New England Biolabs (Ipswich, Mass., USA). Polymerase chain reactions were conducted using Phire Hot Start II DNA Polymerase from ThermoFisher Scientific (Waltham, Mass., USA).
Cloning Candidate Kinase Genes. Genes were amplified from E. coli BL21 and S. cerevisiae EBY100 genomic DNA by taking 100 μL of cell pellet, adding 200 μL of water, followed by boiling in a 1.5 mL tube for 15 min. The cell debris was pelleted and 1 μL of the supernatant was used for PCR from genomic DNA using the primers listed in Table 3. Farnesol kinase from A. thaliana was PCR amplified from a cDNA library gifted from the lab of Dr. José M. Alonso (NC State, Department of Genetics) using the primers listed in Table 3. The PCR reaction contained 5× Phire II buffer, 0.2 mM dNTPs, 0.5 μM each primer, 1 μL Phire II DNA polymerase, and 1 μL of genomic DNA, in a total volume of 50 μL. The cycling parameters used were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 66° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following amplification, the amplified products were gel purified, digested with BamHI and NotI, and ligated into similarly treated ‘empty’ pETDuet and pETDuet-IPK (in MCS2). Ligation mixtures were transformed into chemically competent E. coli NovaBlue (DE3) cells (Novagen) and plated on LB agar supplemented with 50 μg/mL kanamycin for incubation overnight at 37° C. Colonies were then screened for the appropriate size insert by colony PCR using primers annealing to the T7 promoter and T7 terminator. Those colonies with correct sized inserts were then picked and grown in 3 mL LB supplemented with 50 μg/mL kanamycin for incubation overnight at 37° C. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing.
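The cycling parameters above can be written down as a small data structure, which makes the total number of cycles and the programmed hold time explicit. This is a minimal sketch: the step labels (denaturation, annealing, extension) and all function names are illustrative assumptions, and "repeat steps 2-4 34 times" is read here as one initial pass plus 34 repeats. Ramp times between temperatures are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str        # illustrative label, not from the source text
    temp_c: float    # hold temperature in degrees Celsius
    seconds: int     # hold time in seconds


# Temperatures and hold times taken from the cycling parameters in the text.
INITIAL = Step("initial denaturation", 98, 30)
CYCLE = (
    Step("denaturation", 98, 5),
    Step("annealing", 66, 15),
    Step("extension", 72, 20),
)
FINAL = Step("final extension", 72, 60)
REPEATS = 34  # assumption: steps 2-4 run once, then are repeated 34 more times


def total_cycles(repeats: int = REPEATS) -> int:
    """Number of passes through steps 2-4."""
    return repeats + 1


def programmed_seconds(repeats: int = REPEATS) -> int:
    """Sum of programmed hold times, excluding the final 4 degree hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + total_cycles(repeats) * per_cycle + FINAL.seconds


if __name__ == "__main__":
    print(total_cycles(), "cycles,", programmed_seconds() / 60, "min of hold time")
```

Changing the annealing temperature in `CYCLE` (for example to 63 or 64 °C) reproduces the other programs described in this section.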
Table 3 (primers; only the source-organism entries survive in this text): S. cerevisiae; E. coli; Arabidopsis thaliana; Pantoea ananatis; Thermoplasma acidophilum; Shigella flexneri; Streptococcus mutans; Aspergillus fumigatus.
Screening of Kinases for Phosphorylation of ISO and DMAA by LC-MS Analysis. Each plasmid containing a cloned kinase gene was transformed into chemically competent E. coli BL21(DE3) Tuner cells and plated on LB agar supplemented with 50 μg/mL kanamycin for incubation overnight at 37° C. Colonies were picked the following day and used to inoculate 3 mL LB supplemented with 50 μg/mL kanamycin for incubation overnight at 37° C. A 1 mL portion of this culture was then used to inoculate 100 mL of LB supplemented with 50 μg/mL kanamycin and grown at 37° C. at 250 rpm until the culture reached an OD600 of ~0.2 before the temperature was reduced to 18° C. and IPTG was added to a final concentration of 1 mM. The culture was incubated for 18 hours at 250 rpm. The culture was pelleted at 4,000 rpm for 10 min, the supernatant decanted, and the cell pellet resuspended in 5 mL of lysis buffer (100 mM Tris, 300 mM NaCl, 10% glycerol, pH 8.0) and lysed by sonication. The debris was then pelleted at 4,500 rpm for 20 min, decanted, and the soluble protein was spun down additionally at 15,000 rpm for 1 h. The resulting soluble fraction was then purified using loose Ni2+ resin from GE Healthcare. 200 μL of resin was added to the soluble fraction of protein and incubated on ice for 1 hour with intermittent agitation to suspend the resin. The resin was then spun down for 10 min at 4,500 rpm at 4° C. and the lysate was removed. The resin was then resuspended in 1 mL of wash buffer (50 mM Tris, 500 mM NaCl, 20 mM imidazole, pH 8.0) and transferred to a 1.5 mL tube. The mixture was allowed to incubate on ice for 10 min before the resin was spun down again as before. This washing procedure was repeated 4 more times before the protein was eluted with 200 μL of elution buffer (50 mM Tris, 500 mM NaCl, 200 mM imidazole, pH 8.0). The protein was then directly assayed and purity was verified by SDS-PAGE with comparison to the soluble fraction of E. coli BL21 (DE3) Tuner cells not harboring any plasmid. The assay consisted of 10 μL of purified protein in a total volume of 100 μL containing 5 mM ATP, 1 mM of ISO and DMAA (stock of 100 mM in DMSO), 50 mM Tris at pH 7.5, and 2.5 mM MgCl2. The reaction was incubated overnight at 37° C. before being quenched with an equal volume of methanol. The mixture was then analyzed by low-resolution LC-MS along with a synthetic standard of isopentenyl phosphate and dimethylallyl phosphate. LC-MS experiments were conducted using a Shimadzu LC-MS 2020 single quadrupole instrument with a Phenomenex Kinetex UPLC C18 column (2.1×50 mm, 2.6 μm particle, 100 Å pores). An aliquot (5 μL) was injected onto and separated using a series of linear gradients developed from 0.1% formic acid in H2O (A) to 0.1% formic acid in acetonitrile (B) at 0.2 mL/min using the following protocol: 0-2.2 min, 95-1% A; 2.21-2.6 min, 1% A; 2.61-2.62 min, 1-95% A; 2.63-3.5 min, 95% A.
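The gradient protocol above is a piecewise-linear function of time, which can be sketched as follows to look up the mobile-phase composition at any point in the run. This is an illustrative sketch only: the function names are hypothetical, and the 0.01 min gaps between the listed segments (for example 2.2 to 2.21 min) are assumed to be continuous with the neighbouring segment.

```python
from bisect import bisect_right

# (time in min, % solvent A) breakpoints read from the protocol in the text.
# Assumption: the short gaps between listed segments are treated as continuous.
BREAKPOINTS = [(0.0, 95.0), (2.2, 1.0), (2.61, 1.0), (2.62, 95.0), (3.5, 95.0)]
FLOW_ML_PER_MIN = 0.2  # flow rate stated in the text


def percent_a(t_min: float) -> float:
    """Linearly interpolate % A (0.1% formic acid in water) at time t_min."""
    start, end = BREAKPOINTS[0][0], BREAKPOINTS[-1][0]
    if not start <= t_min <= end:
        raise ValueError(f"time must be within {start}-{end} min")
    times = [t for t, _ in BREAKPOINTS]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(times) - 1)
    (t0, a0), (t1, a1) = BREAKPOINTS[i - 1], BREAKPOINTS[i]
    return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def percent_b(t_min: float) -> float:
    """% B (0.1% formic acid in acetonitrile) is the remainder."""
    return 100.0 - percent_a(t_min)


def run_volume_ml() -> float:
    """Total solvent volume delivered over the programmed run."""
    return FLOW_ML_PER_MIN * BREAKPOINTS[-1][0]


if __name__ == "__main__":
    for t in (0.0, 1.1, 2.4, 3.0):
        print(f"{t:.2f} min: {percent_a(t):.1f}% A / {percent_b(t):.1f}% B")
```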
Synthesis of IPP and DMAPP. The synthesis of IPP and DMAPP followed a published procedure (Keller, R. K.; Thompson, R., Rapid synthesis of isoprenoid diphosphates and their isolation in one step using either thin layer or flash chromatography. Journal of Chromatography 1993, 645 (1), 161-7). Briefly, to 4 mmol of the neat alcohol (ISO or DMAA) in a 50 mL polypropylene tube was added trichloroacetonitrile (10 mL) and the mixture was allowed to incubate at room temperature for 5 min. Bis-triethylammonium phosphate (TEAP) solution was prepared by slowly adding solution A (25 mL phosphoric acid, 94 mL acetonitrile) to solution B (110 mL triethylamine, 100 mL acetonitrile) to generate a solution that was 38% solution A and 62% solution B. To the mixture of alcohol and trichloroacetonitrile was added 10 mL of TEAP solution. The mixture was then incubated in a 37° C. water bath for 5 min before another addition of TEAP. A total of three additions of TEAP solution were made, each followed by incubation. The mixture was then separated by column chromatography using 6:2.5:0.5 iPrOH:conc. NH4OH:H2O with silica as the stationary phase. Prior to loading the column, the reaction mixture was diluted to 20% vol/vol with chromatography buffer and the resulting precipitate was pelleted by centrifugation prior to loading of the flash column. Fractions were analyzed using a Shimadzu single quadrupole LCMS-2020, and those containing the diphosphorylated compound ([M−H]−), free of tri- or monophosphorylated species, were pooled. The pooled fractions were then concentrated in vacuo to remove isopropanol and acetonitrile. The concentrated mixture was then filtered using a 0.2 μm cellulose filter and frozen at −80° C. After being frozen overnight, the sample was lyophilized, yielding a salt. The triammonium salt was then characterized and stored frozen in 250 μL aliquots at 25 mM each.
Dimethylallyl diphosphate (DMAPP): 1H NMR (400 MHz, D2O) δ 5.43 (t, J=7.0 Hz, 1H), 4.43 (dd, JH,P=7.0 Hz, JH,H=7.0 Hz, 2H), 1.76 (s, 3H), 1.71 (s, 3H); 13C NMR (101 MHz, D2O) δ 140.1, 119.7 (d, JC,P=8.2 Hz), 62.7 (d, JC,P=5.4 Hz), 25.0, 17.3; 31P NMR (162 MHz, D2O) δ −6.04 (d, J=21.7 Hz), −9.38 (d, J=21.6 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9986. Isopentenyl diphosphate (IPP): 1H NMR (400 MHz, D2O) δ 4.02-3.92 (m, 2H), 2.30 (t, J=6.7 Hz, 2H), 1.68 (s, 3H); 13C NMR (101 MHz, D2O) δ 143.8, 111.6, 64.1 (d, JC,P=6.0 Hz), 37.9 (d, JC,P=7.6 Hz), 21.7; 31P NMR (162 MHz, D2O) δ −6.22 (d, J=21.6 Hz), −9.54 (d, J=21.5 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9985.
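As a cross-check on the HRMS values above, the calculated m/z of the [M−H]− ion of C5H12O7P2 can be reproduced from monoisotopic atomic masses, and the agreement between calculated and found values expressed in ppm. The sketch below is illustrative: the function names are hypothetical, and the atomic masses are standard monoisotopic values rounded to eight decimal places.

```python
from typing import Mapping

# Monoisotopic atomic masses (u), rounded; the electron mass is needed for ions.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "P": 30.97376163}
ELECTRON = 0.00054858


def neutral_mass(formula: Mapping[str, int]) -> float:
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def deprotonated_mz(formula: Mapping[str, int]) -> float:
    """m/z of the singly charged [M-H]- ion: M minus one proton (H atom minus electron)."""
    proton = MONOISOTOPIC["H"] - ELECTRON
    return neutral_mass(formula) - proton


def ppm_error(found: float, calculated: float) -> float:
    """Mass accuracy in parts per million."""
    return (found - calculated) / calculated * 1e6


if __name__ == "__main__":
    dmapp_or_ipp = {"C": 5, "H": 12, "O": 7, "P": 2}  # DMAPP and IPP are isomers
    print(f"[M-H]- calculated: {deprotonated_mz(dmapp_or_ipp):.5f}")
    print(f"DMAPP error: {ppm_error(244.9986, 244.9985):.2f} ppm")
```

Because DMAPP and IPP share the same molecular formula, the same calculated m/z applies to both; they are distinguished by the NMR data rather than by accurate mass.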
Preliminary Characterization of IPK via Lycopene Colorimetric Assay. Colonies of E. coli BL21Tuner(DE3) harboring pCDFDuet-GGPP (see below) and pACYCDuet-Lyc (see below) were used to inoculate separate wells of a deep-well plate containing 1 mL of LB media supplemented with ampicillin (150 μg/mL), chloramphenicol (35 μg/mL), and streptomycin (200 μg/mL). The deep-well plate was incubated overnight in a rotary shaker at 37° C. with orbital shaking at 350 rpm. Then, 50 μL of the culture was used to inoculate 400 μL of LB media supplemented with ampicillin (120 μg/mL), chloramphenicol (28 μg/mL), and streptomycin (160 μg/mL). After 3 h of incubation at 37° C. with shaking at 350 rpm, the OD600 of the culture was approximately 0.1, and IPTG, DMAA/ISO, and Fs were each added to give final concentrations of 1 mM, 5 mM, and 0.5 μM, respectively, to bring the cultures to a final volume of 500 μL. For controls that lacked one or more of these components, LB media was added instead. Plates were then incubated in the dark in an incubator/shaker at 30° C. with shaking at 350 rpm for 48 h. Then, the deep-well plate was centrifuged at 3,000 rpm for 7 min to pellet the cells. After removal of the growth media from each well, the pellets were resuspended in 1 mL of phosphate-buffered saline with vigorous vortexing and the resuspended pellets were visualized and photographed.
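The additions in this assay follow the usual C1V1 = C2V2 dilution relationship. The sketch below shows how the volume of each stock could be worked out for a 500 μL well. The stock concentrations are hypothetical placeholders (the source does not state them for this assay), and the function names are illustrative.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that final_conc is reached in final_volume_ul.

    final_conc and stock_conc must share the same unit (e.g. both mM).
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume_ul / stock_conc


def diluted_conc(conc: float, volume_before_ul: float, volume_after_ul: float) -> float:
    """Concentration after bringing volume_before_ul up to volume_after_ul."""
    return conc * volume_before_ul / volume_after_ul


if __name__ == "__main__":
    final_volume = 500.0  # final well volume in microlitres, from the text
    # Hypothetical stock concentrations, chosen only for illustration.
    additions = {
        "IPTG (mM)": (1.0, 100.0),
        "DMAA/ISO (mM)": (5.0, 100.0),
        "Fs (uM)": (0.5, 50.0),
    }
    total = 0.0
    for name, (final, stock) in additions.items():
        volume = stock_volume_ul(final, stock, final_volume)
        total += volume
        print(f"{name}: add {volume:.1f} uL of stock")
    # 450 uL of culture is already in the well; the rest is made up with LB.
    print(f"LB to add: {final_volume - 450.0 - total:.1f} uL")
```

As a side observation rather than something the source states, the antibiotic concentrations quoted for the second medium are 0.8 times those of the first (for example, `diluted_conc(150, 400, 500)` gives 120), which is the factor expected when 400 μL is brought to 500 μL.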
Construction of pCDFDuet-GGPP and pACYCDuet-Lyc. pCDFDuet-GGPP contained a mutated version of the E. coli ispA gene (Y80D) and idi from E. coli that were subcloned sequentially into pCDFDuet. Briefly, ispA and idi were each PCR amplified from E. coli DH5α using the primers listed in Table 3. Each PCR reaction mixture contained 5× Phire II buffer, 0.2 mM dNTPs, 0.5 μM each primer, 1 μL Phire II DNA polymerase, and 1 μL of template DNA in a total volume of 50 μL. The cycling parameters used were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 63° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following amplification, the products were purified by gel electrophoresis and digested with NcoI and NotI for ispA and BglII for idi and ligated into the appropriately treated pCDFDuet vector. Each ligation mixture was transformed into chemically competent E. coli DH5α and plated on an LB agar plate containing 200 μg/mL streptomycin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing. The mutation Y80D was introduced by site-directed mutagenesis.
pACYCDuet-Lyc contains the CrtEBI operon genes from Pantoea ananatis that were sub-cloned sequentially using pACmod-crtE-crtB-crtI as a template. Briefly, crtB and crtI were each PCR amplified from the template pACmod-crtE-crtB-crtI using the primers listed in Table 3. Each PCR reaction mixture contained 5× Phire II buffer, 0.2 mM dNTPs, 0.5 μM each primer, 1 μL Phire II DNA polymerase, and 1 μL of template DNA, in a total volume of 50 μL. The cycling parameters used were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 63° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following amplification, the products were purified by gel electrophoresis and digested with BglII and XhoI for crtB and NcoI and HindIII for crtI and ligated into the appropriately treated pCDFDuet vector. Each ligation mixture was transformed into chemically competent E. coli DH5α and plated on an LB agar plate containing 35 μg/mL chloramphenicol. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing.
Cloning of IPK, PhoN, and DGK. The sequence of ipk (Table 2) codon-optimized for heterologous expression in E. coli was synthesized by IDT and PCR amplified using the primers listed in Table 3. The PCR reaction mixture contained 5× Phire II buffer, 0.2 mM dNTPs, 0.5 μM each primer, 1 μL Phire II DNA polymerase, and 1 μL of template DNA, in a total volume of 50 μL. The cycling parameters used were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 63° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following amplification, the product was purified by gel electrophoresis, digested with NdeI and XhoI and ligated into similarly treated pETDuet-1 (into multi-cloning site two). The ligation mixture was transformed into chemically competent E. coli DH5α and plated on LB agar containing 100 μg/mL of ampicillin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing using the primer ‘DuetUP2’ (5′-TTGTACACGGCCGCATAATC-3′; SEQ. ID 47). For cloning into pET28a, the ipk gene was PCR amplified using the primers listed in Table 3 and the same PCR conditions as described above. Following amplification, the product was purified by gel electrophoresis and was digested with NdeI and XhoI and ligated into similarly treated pET28a. The ligation mixture was then transformed into chemically competent E. coli DH5α and plated on LB agar containing 50 μg/mL kanamycin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing.
The sequence of phoN (Table 2) codon-optimized for expression in E. coli was synthesized by IDT and PCR amplified using the primers listed in Table 2. The PCR reaction contained 5× Phire II buffer, 0.2 mM dNTPs, 0.5 μM each primer, 1 μL Phire II DNA polymerase, and 20 ng of template DNA, in a total volume of 50 μL. The cycling parameters used were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 66° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following amplification, the amplified product was gel purified, digested with BamHI and NotI, and ligated into similarly treated ‘empty’ pETDuet and pETDuet-IPK (into MCS1). The ligation mixture was transformed into chemically competent E. coli DH5α and plated on LB agar containing 100 μg/mL of ampicillin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing using the primer ‘pET Upstream’ (ATGCGTCCGGCGTAGA; SEQ. ID 48).
The sequence of DGK from Streptococcus mutans codon optimized for expression in E. coli (Table 2) was synthesized and subcloned into pET28a according to the same protocol for cloning candidate kinase genes above, using the primers listed in Table 3.
Expression and Protein Purification of PhoN. pETDuet-PhoN was transformed into chemically competent E. coli BL21(DE3) for protein expression. An overnight 3 mL culture in LB broth of PhoN-pETDuet containing 100 μg/mL of ampicillin was grown at 37° C. and 270 rpm. A 1 L culture in LB broth was inoculated with 1 mL of the overnight culture and grown at 37° C. and 250 rpm to an OD600 of 0.6. The culture was cooled to 18° C., and protein expression was induced by the addition of IPTG to a final concentration of 1 mM. The culture was incubated at 18° C. and 200 rpm for an additional 20 h. The culture was spun down into a pellet which was stored at −20° C. until purification. The cell pellets were thawed and resuspended in 20 mL of lysis buffer (250 mM sodium chloride, 50 mM sodium phosphate, 10 mM imidazole, pH 7.9). The cells were lysed by sonication and spun down. The cell lysate was then separated from the insoluble cell debris, and the His6-tagged proteins were purified from the lysate using Ni2+ beads on agarose using a low-imidazole buffer (50 mM Tris, 300 mM NaCl, 20 mM imidazole, pH 8.0) as the wash buffer and a high-imidazole buffer (50 mM Tris, 300 mM NaCl, 200 mM imidazole, pH 8.0) to isolate the protein. Then, a 10 kDa spin filter was used to concentrate and buffer exchange the protein into storage buffer (50 mM Tris-HCl, 500 mM NaCl, 20% glycerol, pH 7.4). The concentration of protein was determined using a Bradford assay, and small aliquots of protein were stored at −80° C. until needed.
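Buffer compositions such as the low-imidazole wash buffer above translate into weighed masses via mass = molarity × molecular weight × volume. The following sketch works this out for one batch. The batch volume, the use of Tris free base, and the function and variable names are assumptions made for illustration; pH adjustment is not modelled.

```python
# Molecular weights in g/mol (Tris is assumed to be weighed as the free base).
MOLECULAR_WEIGHT = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08}

# Low-imidazole wash buffer from the text, concentrations in mol/L.
WASH_BUFFER = {"Tris": 0.050, "NaCl": 0.300, "imidazole": 0.020}


def grams_needed(molar: float, mol_weight: float, volume_l: float) -> float:
    """Mass of solute required for the given molarity and volume."""
    return molar * mol_weight * volume_l


def recipe(buffer: dict, volume_l: float) -> dict:
    """Map each component to the mass (g) to weigh out for volume_l of buffer."""
    return {
        name: grams_needed(conc, MOLECULAR_WEIGHT[name], volume_l)
        for name, conc in buffer.items()
    }


if __name__ == "__main__":
    batch_volume_l = 1.0  # hypothetical batch size
    for component, grams in recipe(WASH_BUFFER, batch_volume_l).items():
        print(f"{component}: {grams:.3f} g per {batch_volume_l} L")
```

The high-imidazole elution buffer differs only in its imidazole entry (0.200 mol/L), so it can be obtained by changing a single value in the dictionary.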
Expression and Protein Purification of IPK. pET28a-IPK was transformed into chemically competent E. coli BL21(DE3) for protein expression. An overnight 3 mL culture in LB broth supplemented with kanamycin (50 μg/mL) was grown at 37° C. with shaking at 270 rpm. A 1 L culture in LB broth was inoculated with 1 mL of the overnight culture and grown at 37° C. with shaking at 250 rpm to an OD600 of 0.6. The culture was cooled to 30° C. and protein expression was induced by the addition of IPTG to a final concentration of 1 mM. The culture was incubated at 30° C. and 200 rpm for an additional 20 h. The culture was spun down into a pellet which was stored at −20° C. until purification. Cell pellets were thawed and resuspended in 25 mL lysis buffer (250 mM sodium chloride, 50 mM sodium phosphate, 10 mM imidazole, pH 7.9). The cells were lysed by sonication and centrifuged. The cell lysate was then separated from the insoluble cell debris, and the His6-tagged proteins were purified from the lysate using the Bio-Rad Profinia system and Bio-Scale Mini Nuvia IMAC Ni-Charged 5-mL columns. Following loading of the sample onto the column, the system washed first with 6 column volumes of 2× Native IMAC Wash 1 solution (1 M NaCl, 100 mM Tris, 10 mM imidazole, pH 8.0) and then with 6 column volumes of 2× Native IMAC Wash 2 solution (1 M NaCl, 100 mM Tris, 40 mM imidazole, pH 8.0). The sample was eluted in 3 column volumes of 2× Native IMAC Elution buffer (1 M NaCl, 100 mM Tris, 500 mM imidazole, pH 8.0). Then the eluent was concentrated and buffer exchanged into protein storage buffer (50 mM Tris-HCl, 500 mM NaCl, 20% glycerol, pH 7.4) using 10 kDa molecular-weight cutoff filters. The concentration of protein was determined using a Bradford assay and small aliquots of protein were stored at −80° C. until needed.
Lycopene quantification. E. coli NovaBlue (DE3) containing pAC-LYCipi and various pETDuet constructs were grown in 250 mL LB supplemented with ampicillin (100 μg/mL) and chloramphenicol (35 μg/mL) at 37° C. overnight with shaking at 250 rpm after inoculation with 0.25 mL of the starter culture. After 5 h the OD600 of the culture was ˜0.2 at which point combinations of DMAA/ISO (in DMSO), IPTG, and Fs were added to give final concentrations of 5 mM, 1 mM, and 0.5 μM, respectively. In controls that lacked DMAA/ISO, DMSO was added to give the equivalent volume. At various time points, 600 μL of culture was then removed and the lycopene was extracted and quantified.
Extraction and Quantification of Lycopene. Each aliquot (600 μL) of culture was subjected to acetone extraction and quantification of lycopene by HPLC. The remaining 500 μL was centrifuged at 10,000 rpm, and the supernatant removed. The cell pellets were then dried using a speed vacuum without heat until the pellets were dry. Then, 200 μL of acetone was added to the pellets, and the tubes were sonicated at 37° C. for 20 min before incubation at 55° C. for 30 min. The tubes were then sonicated again as before. The pellets were spun down and 100 μL was removed for HPLC analysis. HPLC was performed by injecting 10 μL of the clarified extract onto a Phenomenex Kinetex EVO C18 column (250×4.6 mm, 5 μm, 100 Å pores) with an isocratic elution buffer consisting of 8:1.5:0.5 isopropanol:acetonitrile:methanol over 20 min. Lycopene was assayed at 470 nm. Areas were extracted and compared to the standard curve for quantification.
Lycopene Standard Curve. A standard curve for lycopene was derived by adding various known amounts of commercial lycopene standard to E. coli NovaBlue(DE3) cell pellets, then extracting and quantifying the lycopene as outlined above.
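The standard-curve quantification described above can be illustrated with a minimal sketch. It assumes a linear detector response at 470 nm, and the calibration points are invented placeholders, not measured values; the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_area(area, slope, intercept):
    """Invert the calibration line to convert a peak area into an amount."""
    return (area - intercept) / slope


# Hypothetical calibration data: spiked lycopene (ng) versus HPLC peak area.
standard_ng = [0.0, 50.0, 100.0, 200.0, 400.0]
peak_area = [2.0, 52.0, 102.0, 202.0, 402.0]

slope, intercept = fit_line(standard_ng, peak_area)
sample_ng = amount_from_area(152.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_ng:.1f} ng")
```

Spiking the standard into cell pellets before extraction, as the text describes, means the fitted line already accounts for extraction losses, so no separate recovery correction is applied in this sketch.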
Cloning of FgaPT2. The sequence of fgaPT2 codon-optimized for expression in E. coli (Table 2) was synthesized by IDT and PCR amplified using the primers listed in Table 3. The 50 μL reaction for amplification of fgaPT2 contained 5× Phire buffer, 0.2 mM dNTPs, 0.25 μM each primer, 1 μL Phire II DNA polymerase, and 1 μL template DNA. The cycling parameters used for each were as follows: 1) 98° C., 30 s; 2) 98° C., 5 s; 3) 64° C., 15 s; 4) 72° C., 20 s; 5) repeat steps 2-4 34 times; 6) 72° C., 1 min; 7) 4° C., hold. Following PCR amplification, the amplified product was purified by gel electrophoresis, digested with HindIII and NdeI, and ligated into similarly treated pET28a to generate pET28a-FgaPT2. The ligation mixture was then transformed into chemically competent E. coli DH5α and plated on LB agar containing 50 μg/mL of kanamycin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing using the T7 promoter primer. For cloning into pCDFDuet (used in conjunction with the ADH module in pETDuet-PhoN-IPK), fgaPT2 was PCR amplified using the same conditions as above but with different primers, also listed in Table 3. Following amplification, the product was digested with BamHI and HindIII and ligated into similarly treated pCDFDuet. The ligation mixture was then transformed into chemically competent E. coli DH5α and plated on LB agar plates containing 100 μg/mL of streptomycin. Plasmid was prepared from a single colony and the gene sequence confirmed by DNA sequencing using the DuetUP2 and T7 terminator primers.
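As a small worked check of the cycling program above, the following sketch tallies the number of amplification cycles and the programmed hold time. It assumes that "repeat steps 2-4 34 times" means one initial pass plus 34 repeats, and it ignores ramp times between temperatures, which depend on the instrument.

```python
# Steps as (temperature in deg C, duration in seconds), taken from the text.
INITIAL_DENATURATION = (98, 30)
CYCLE_STEPS = [(98, 5), (64, 15), (72, 20)]  # steps 2-4
REPEATS = 34                                 # "repeat steps 2-4 34 times"
FINAL_EXTENSION = (72, 60)


def total_cycles(repeats):
    """One initial pass through steps 2-4 plus the stated repeats."""
    return 1 + repeats


def programmed_seconds():
    """Sum of all programmed holds, excluding ramping and the final 4 deg C hold."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return (INITIAL_DENATURATION[1]
            + total_cycles(REPEATS) * per_cycle
            + FINAL_EXTENSION[1])


print(total_cycles(REPEATS), "cycles;", programmed_seconds() / 60, "min of programmed holds")
```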
Expression and Protein Purification of FgaPT2. pET28a-FgaPT2 was transformed into chemically competent Rosetta pLysS cells for expression. An overnight 3 mL culture of these cells containing 50 μg/mL of kanamycin and 25 μg/mL of chloramphenicol in LB media was grown at 37° C. and 270 rpm. A 1 L culture in terrific broth containing 30 μg/mL of kanamycin and 35 μg/mL of chloramphenicol was inoculated with 2 mL of overnight culture and grown at 37° C. and 250 rpm to an OD600 of 0.6. Once this OD600 was reached, the culture was cooled to 24° C. and induced by the addition of IPTG to 0.5 mM final concentration. The culture was incubated for 24 h at 24° C. The culture was pelleted and stored in two aliquots at −20° C. until purification. Cell pellets were thawed and resuspended in 25 mL lysis buffer (250 mM sodium chloride, 50 mM sodium phosphate, 10 mM imidazole, pH 7.9). The cells were lysed by sonication and centrifuged. The cell lysate was then separated from the insoluble cell debris, and the His6-tagged proteins were purified from the lysate using the Bio-Rad Profinia system and Bio-Scale Mini Nuvia IMAC Ni-Charged 5-mL columns. Following loading of the sample onto the column, the system washed first with 6 column volumes of 2× Native IMAC Wash 1 solution (1 M NaCl, 100 mM Tris, 10 mM imidazole, pH 8.0) and then with 6 column volumes of 2× Native IMAC Wash 2 solution (1 M NaCl, 100 mM Tris, 40 mM imidazole, pH 8.0). The sample was eluted in 3 column volumes of 2× Native IMAC Elution buffer (1 M NaCl, 100 mM Tris, 500 mM imidazole, pH 8.0). Then the eluent was concentrated and buffer exchanged into protein storage buffer (50 mM Tris-HCl, 500 mM NaCl, 20% glycerol, pH 7.4) using 50 kDa molecular-weight cutoff filters. The concentration of protein was determined using a Bradford assay, and small aliquots of protein were stored at −80° C. until needed.
In vitro FgaPT2 assay. FgaPT2 reactions were run at pH 7.5 in 200 μL containing 50 mM Tris-HCl, 5 mM CaCl2, 1 mM L-tryptophan, 2 mM DMAPP, and 40 μg of FgaPT2. The reactions were incubated at 37° C. for 1 h and then quenched by the addition of an equal volume of methanol. Reactions with PhoN-IPK generated DMAPP contained 25 mM Tris-HCl, 5 mM magnesium chloride, 1 mM L-tryptophan, 1.8 mM ATP, 30 mM DMAA, 270 ng/μL FgaPT2, 20 ng/μL IPK, and 87 ng/μL PhoN at pH 8.0 in a total volume of 50 μL. The reactions were incubated at 37° C. overnight and then quenched by the addition of an equal volume of methanol. For analytical-scale HPLC analysis, FgaPT2 reactions were followed at 269 nm using a Phenomenex Kinetex 5u EVO C18 column (250×4.6 mm; 100 Å) at a flow rate of 1 mL/min. A linear gradient of 20-70% acetonitrile in 0.1% aqueous trifluoroacetic acid over 20 min was used.
In vivo FgaPT2 assay. A 3 mL culture of E. coli Rosetta(DE3) pLysS harboring pETDuet-PhoN-IPK and pCDFDuet-FgaPT2 was grown overnight at 37° C. and 250 rpm in LB media containing ampicillin (100 μg/mL), chloramphenicol (35 μg/mL), and streptomycin (100 μg/mL). Aliquots (100 μL) of the overnight culture were used to inoculate 10 mL cultures in TB media containing the same antibiotics as before, and those cultures were grown at 30° C. for 6 h at 250 rpm until induction with 0.5 mM IPTG (final concentration) and addition of DMAA, ISO, and Trp to final concentrations of 5 mM, 5 mM, and 10 mM, respectively. The cultures were grown for 48 h after induction. The culture supernatant was diluted 1:1 in methanol before analysis by HPLC and LC-MS.
Unnatural linear terpene precursors with enhanced chemical diversity beyond a carbon-hydrogen scaffold would expand the scope of available chemo-, regio-, and stereo-specific organic transformations that could be used to diversify terpene scaffolds generated by cyclases. In addition, such unnatural analogues may provide uncharacterized modes of reactivity for terpene cyclases. Importantly, such unnatural linear terpene precursors could also be appended to other natural products such as meroterpenoids and ergot alkaloids.
The generation of unnatural terpenes through, for example, a mixed synthetic biology and precursor-directed diversification strategy, would provide chemists with previously unavailable chemical handles needed for synthetic derivatization for use in structure-activity relationships, pharmaceutical production, and biochemical studies. A similar strategy can also be explored for the production of unnatural biosynthetic polyketides (
Both terpene synthases and terpene cyclases have already been shown to be at least partially promiscuous towards analogues of their natural substrates. A platform enabling the production of hemiterpene analogues would allow for generation of diversified building blocks that prenyltransferases and terpene cyclases could be engineered to accept.
Terpene precursors are naturally synthesized by the DXP and mevalonate pathways. Terpenes, produced in E. coli by the DXP pathway, are essential for the construction of lipid carriers used in the transportation of glycan components for the maintenance of the cell envelope. Because terpenes are essential for cell survival, modification of the native anabolic pathway may be lethal and in addition, would require the engineering of up to 7 enzymes and methodical planning of how to generate their corresponding substrate analogues from primary metabolites.
In an attempt to generate unnatural hemiterpenes, the lower part of the mevalonate pathway was investigated for plasticity by providing chemically synthesized analogues for sequential biocatalytic conversions. However, these substrates are not commercially available and require extensive synthesis (9 steps minimum, 4 steps after divergence, 8 chromatographic purifications, 47% overall yield maximally achieved, and racemic). Analogues produced from such a route can only contain the homoallylic diphosphate core of IPP and would result in minimal and restricted modifications to IPP (Schemes 1 and 2). This limited flexibility for analogue generation is a consequence of using biocatalysts that have stereochemical preferences in addition to the requirement of structural motifs essential for full substrate maturation.
As an alternative to engineering endogenous metabolism to accept structural analogues of their native substrates, unnatural hemiterpenes can be produced by consecutive enzymatic phosphorylation of alcohols, as is accomplished synthetically. Use of an artificial pathway for the generation of natural hemiterpenes from the corresponding alcohols of IPP and DMAPP can be accomplished by the use of PhoN and IPK (Example 2). Instead of supplementing this novel pathway with a carbon precursor that would be converted to natural terpenes, it was envisioned that this pathway could potentially be used to generate hemiterpene analogues by providing the pathway with an alternative substrate.
Conversion of the alcohols to the corresponding hemiterpene analogues would require just two equivalents of ATP, a small energy cost compared to that of the mevalonate pathway (3×ATP and 2×NADH) and the DXP pathway (1×ATP, 1×CTP, and 3×NADH). As hemiterpene production via the mevalonate and DXP pathways often limits downstream production of terpenes, the energy benefit alone justifies inquiry of a novel biosynthesis platform.
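The cofactor comparison above can be tabulated in a short sketch. The per-pathway numbers are copied from this paragraph (including its NADH labeling); the dictionary layout and function names are illustrative assumptions, and the tally deliberately ignores the cost of the carbon source itself.

```python
# Cofactors consumed per hemiterpene diphosphate, as stated in the paragraph above.
COFACTOR_COST = {
    "PhoN-IPK (from alcohol)": {"ATP": 2},
    "mevalonate": {"ATP": 3, "NADH": 2},
    "DXP": {"ATP": 1, "CTP": 1, "NADH": 3},
}


def phosphate_donors(cost):
    """Count nucleotide triphosphates consumed (ATP plus CTP)."""
    return cost.get("ATP", 0) + cost.get("CTP", 0)


def reducing_equivalents(cost):
    """Count reduced nicotinamide cofactors consumed."""
    return cost.get("NADH", 0)


for pathway, cost in COFACTOR_COST.items():
    print(f"{pathway}: {phosphate_donors(cost)} phosphate donors, "
          f"{reducing_equivalents(cost)} reducing equivalents")
```

On these numbers the alcohol-based route consumes no reducing equivalents at all, which is the main difference from both native pathways.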
A two-step biosynthetic platform using a non-specific acid phosphatase (PhoN) from Shigella flexneri and IPK from Thermoplasma acidophilum was validated in Example 2 by demonstrating that the production of carotenoids in an engineered E. coli reporter strain was dependent on feeding isopentenol (ISO) and dimethylallyl alcohol (DMAA). Importantly, both component enzymes exhibit significant substrate promiscuity. PhoN can phosphorylate a wide variety of alcohols with various phosphate donors (
Results and Discussion
Design and synthesis of a hemiterpene monophosphate analogue library. In order to describe the substrate promiscuity of IPK, a panel of isopentenyl monophosphate analogues was designed and synthesized (
The most straightforward and robust method was initially developed by Cramer before being modified by Keller and Thompson. This modified synthesis did not need to be carried out under anhydrous conditions, and a single purification step could be used to isolate the desired compounds. While this reaction was simpler to carry out for the production of many analogues, the reaction produces a mixture of the corresponding monophosphate, diphosphate, and triphosphate. This chemistry was carried out with starting materials that had a single alcohol in order to avoid the isolation and purification of multiple regioisomers. Building blocks containing other nucleophiles besides a single alcohol were omitted as the phosphorylation reaction would have produced mixed phosphorylation patterns.
Alcohols were mixed with trichloroacetonitrile before the addition of triethylammonium phosphate (TEAP) solution. After TEAP was added, the mixture was incubated at 37° C. for a few minutes before another portion of TEAP was added, and the process was repeated for a total of three additions. Next, the mixture was diluted by adding 20% v/v modified chromatography buffer to precipitate insoluble contaminants. The mixture was then centrifuged to pellet the contaminants before the mixture was loaded onto a silica column for separation. Initial iterations of the column chromatography were conducted in 60 mL syringes before being transferred to a flash column for larger scale syntheses. Syntheses were optimized for isolation of the diphosphates; however, the monophosphate-containing fractions could also be collected. Initially, monophosphate fractions were identified by MS and concentrated in vacuo. While the resulting residue was not pure, the residues were diluted to 5 mg/mL and IPK was tested with and without ATP to observe whether the compounds were indeed substrates. If sufficient activity was observed, the syntheses were scaled for isolation of the monophosphates.
Because the chromatography in the syringes has a limited flowrate due to elution being dictated by gravity, this chromatographic separation takes more than eight hours and typically yields ˜15 mg of product. After scaling the reaction 20-fold, removing precipitate to increase flow rates, and using flash columns to hold larger volumes of reaction mixtures as well as pressure to increase flow rates, upwards of 150 mg of monophosphate could be isolated with the synthesis and chromatography taking a total of 2.5 hours.
Characterization of the substrate promiscuity of isopentenyl phosphate kinase from Thermoplasma acidophilum. Recently, several isopentenyl phosphate kinases (IPKs) from archaea have been characterized. In addition to a crystal structure being available, engineering a wider substrate tolerance of IPK from Thermoplasma acidophilum has been successful for the phosphorylation of geranyl monophosphate and farnesyl monophosphate. It was envisioned that this successful rational approach could be used to broaden substrate specificity if the catalytic use of IPK was found to be limiting. Several pieces of evidence suggest that IPK could be successfully engineered to broaden its substrate specificity. For example, the kinetic parameters and optimal conditions for IPK from T. acidophilum have already been determined by others, and the enzyme has been found to have the highest kcat/Km at pH 7.5 and to be stable up to 70° C. The thermostability suggests a rigid structure, a feature that is often amenable to engineering. Further, IPK was also found to be active towards a variety of C4 and C5 monophosphorylated substrates. After finding some basal level of geranyl monophosphate (GP) phosphorylation, IPK has also been engineered to use GP, resulting in a 130-fold increase in kcat/Km at the cost of specificity with IP. As the authors simply used structure-guided alanine scanning mutagenesis to identify this mutant, further engineering could result in an increase of kcat/Km toward longer length terpene precursors (
Since the substrate scope of IPK has yet to be fully described, here IPK was tested against a wider panel of substrates. The IPK gene from T. acidophilum was codon optimized and subcloned into pET28a. Following expression in E. coli BL21 DE3, the enzyme was purified via metal-chelation affinity chromatography. Next, the substrate specificity of the enzyme was determined in vitro using a panel of synthesized alcohol monophosphates using low resolution mass spectrometry (
This initial study revealed broad promiscuity of IPK towards a wide variety of substrates. IPK showed some activity with nearly every substrate tested, while 11, 15, geranyl monophosphate (GP), FP, and neryl monophosphate (NP) did not result in detection of the corresponding pyrophosphate, as judged by MS analysis. Substrates 10, 12, 13, 14, 16, and 17 supported detectable levels of phosphorylations as found by MS, but as the conversions were very low after overnight reactions, they were deemed too poor for subsequent kinetic analysis. These results suggest that substrate promiscuity is limited by simple sterics, whereby substrates longer than 17 were not accepted by the enzyme. With the exception of substrate 14, all of the smaller compounds were substrates for IPK. To better understand the limits and utility of IPK for these phosphorylations, kinetics parameters required determination with a wide variety of representative substrates. A simple moderate throughput microplate assay was employed for this purpose. To examine the kinetic parameters of IPK towards novel substrates, a commonly used NADH-coupled assay was employed (
Steady-state kinetic parameters of the IPK-catalyzed turnover of successful substrates were determined by measuring initial rates using a fixed concentration of phosphate donor and variable concentrations of alcohol monophosphate (Table 4, see methods for details). The data were fitted to the Michaelis-Menten equation using SigmaPlot, and the kinetic parameters (kcat, Km, kcat/Km) were extracted.
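The fitting step can be sketched without SigmaPlot. The following is a minimal standard-library illustration, not the procedure used here: for any fixed Km the Michaelis-Menten model is linear in Vmax, so Vmax is solved in closed form and a golden-section search over log(Km) minimizes the remaining sum of squared errors. The search bounds, the assumption that the error surface has a single minimum in that range, and the synthetic data are all assumptions.

```python
import math


def michaelis_menten(s, vmax, km):
    """Initial rate predicted for substrate concentration s."""
    return vmax * s / (km + s)


def best_vmax(km, conc, rate):
    """Closed-form least-squares Vmax for a fixed Km (model is linear in Vmax)."""
    x = [s / (km + s) for s in conc]
    return sum(xi * vi for xi, vi in zip(x, rate)) / sum(xi * xi for xi in x)


def sse(km, conc, rate):
    """Sum of squared residuals at a given Km with Vmax profiled out."""
    vmax = best_vmax(km, conc, rate)
    return sum((vi - michaelis_menten(s, vmax, km)) ** 2
               for s, vi in zip(conc, rate))


def fit_michaelis_menten(conc, rate, km_low=1e-3, km_high=1e5, iterations=100):
    """Golden-section search over log(Km); returns (Vmax, Km)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_low), math.log(km_high)
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(math.exp(c), conc, rate) < sse(math.exp(d), conc, rate):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    return best_vmax(km, conc, rate), km


# Synthetic, noise-free example with hypothetical true values (Vmax = 2.0, Km = 300 uM).
conc_um = [3125, 1562, 781, 625, 391, 195, 98, 24, 12, 3]
rates = [michaelis_menten(s, 2.0, 300.0) for s in conc_um]
vmax_fit, km_fit = fit_michaelis_menten(conc_um, rates)
print(f"Vmax = {vmax_fit:.4f}, Km = {km_fit:.2f} uM")
```

Dividing the fitted Vmax by the molar enzyme concentration in the well gives kcat, and kcat/Km follows directly; that step needs the molecular weight of the enzyme, which is not restated in this sketch.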
Gratifyingly, the Km and kcat values were within 10% of previously reported kinetic constants for T. acidophilum IPK with IP, DMAP, and 1. Due to limits in solubility for many of the substrates, kinetic values were not obtained for a large portion of the substrates initially found using LC-MS (10, 12, 13, 14, 16, and 17). Notably, IPK displayed a higher kcat with many of the substrates compared to the natural substrate, IP. For example, the kcat with 1 and 3 was 4.25- and 5.1-fold higher, respectively, than that with IP. However, this increase in kcat was offset by a large increase in Km, which resulted in lower catalytic efficiencies (kcat/Km) for all non-natural substrates tested when compared to the natural substrate IP. While Km is not a perfect descriptor for affinity, the analogues tested all had significantly higher Km values, presumably due to IPK having a high specificity for IP at low concentrations. This could be a mechanism by which IPK only phosphorylates IP instead of any monophosphorylated metabolite in the cellular context. Interestingly, and consistent with previous data, some correlation is observed between overall length of the substrate and catalytic efficiency. For instance, there was a 3-fold drop in kcat/Km between 1 and 6. Branching at C2 also seems to dramatically lower kcat/Km, as observed with 5 and the detectable but kinetically irrelevant use of 2. Interestingly, a 17-fold decrease in kcat/Km is observed when the π-bond geometry of 7 is switched from Z to E in substrate 8. Overall, longer substrates and especially those with structural rigidity (high proportions of sp2 and sp) are poorer substrates for phosphorylation by IPK (
As diphosphate moieties are presumably the primary component contributing to the binding energy between prenyltransferases and these short alkyl monophosphates, it makes sense that IPK has a wide substrate tolerance. Predicting cLogP values of the substrates as a measure of hydrophobicity and plotting these against measured Km and kcat values provided no correlation. No correlation was observed when Km and kcat values were plotted against molecular volumes. While IPK exhibits a wide substrate tolerance, Km values vary greatly. This indicates that, although the phosphate was hypothesized to be the primary governing force in substrate binding, the remaining alkyl portions of the monophosphates have a large impact on Km. Residues found to allow for the turnover of larger substrates are not involved in other aspects of catalysis and seem to sterically accommodate larger alkyl chains (
Conclusions and Future Work
The data in this example point towards the plausibility of using IPK to convert small monophosphorylated substrates (from four up to eight carbons) into their corresponding pyrophosphates. Consistent with the promiscuity of PhoN, the two enzymes can be coupled to convert a broad variety of alcohols into diphosphorylated compounds via the consumption of just two phosphate donors. Notably, Nature has not been afforded the opportunity to select against the use of 1-15 as substrates, and this is effectively leveraged by PhoN-IPK as a platform for hemiterpene production.
In order to generate hemiterpene analogues for non-naturally prenylated compounds and terpene natural product derivatives, the next step is to couple these enzymes in vivo.
Methods
General. All plasmids were verified by DNA sequencing. Purifications of all DNA were performed with kits from BioBasic. Synthetic oligonucleotides were purchased from IDT (Coralville, Iowa, USA). All plate reader assays were performed using a BioTek Hybrid Synergy 4 plate reader (Winooski, Vt., USA). Restriction enzymes were purchased from New England Biolabs (Ipswich, Mass., USA). Polymerase chain reactions were conducted using Phire Hot Start II DNA Polymerase from ThermoFisher Scientific (Waltham, Mass., USA). Chemicals were purchased from Sigma Aldrich (St. Louis, Mo., USA) and Alfa Aesar (Haverhill, Mass., USA).
Gene Cloning. Isopentenyl monophosphate kinase (IPK) from Thermoplasma acidophilum was codon-optimized and synthesized by Genewiz, Inc. The ipk gene was PCR amplified from the provided template and then cloned into pET28a using NdeI and XhoI restriction sites. PCR was performed using Phire Hot Start II polymerase (ThermoFisher) according to the supplier's protocol. PCR product was purified prior to and after digestion by agarose gel electrophoresis. Digested PCR product and similarly treated pET28a were ligated at room temperature with T4 ligase (New England BioLabs) according to the supplier's protocol. Ligated plasmid was then transformed into DH5α and plated onto LB agar plates containing 50 μg/mL kanamycin. Individual colonies were picked and grown in the presence of kanamycin, plasmids were purified, and the ipk gene sequence and frame were verified by DNA sequencing (Genewiz).
Expression and Purification of IPK. pET28a-IPK plasmid was transformed into E. coli BL21 (DE3) for protein expression. A single colony was used to inoculate a 3 mL culture in LB media supplemented with 50 μg/mL kanamycin. A 1 L culture containing 50 μg/mL kanamycin in LB media was then inoculated with 1 mL of the overnight culture and grown to an OD600 of ˜0.6 at 37° C. with shaking at 300 rpm, at which point protein expression was induced by the addition of 1 mM IPTG. The temperature of the incubator-shaker was reduced to 30° C. and the culture incubated for approximately 18 hours. The culture was pelleted at 4000 rpm for 10 min, the supernatant was decanted, and the cell pellet was resuspended in 15 mL of lysis buffer (100 mM Tris, 300 mM NaCl, 10% glycerol, pH 8.0) and lysed by sonication. The lysate was then centrifuged at 4500 rpm for 10 min, decanted, and the supernatant was centrifuged at 15,000 rpm for 1 hour. The resulting soluble fraction was then purified by fast protein liquid chromatography (FPLC) using nickel-bead column chromatography for the extraction of His6-tagged proteins. The column was first equilibrated with wash buffer (50 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0) prior to loading of the soluble fraction. The bound protein was then eluted with elution buffer (50 mM Tris-HCl, 500 mM NaCl, 200 mM imidazole, pH 8.0) using a gradient of 0% elution buffer 0-7.5 min, 0-50% 7.5-18 min, 50-100% 18-22 min, 100% 22-27.5 min, and the column was re-equilibrated for additional runs with 0% elution buffer 27.5-35 min. Fractions containing the desired protein were identified by SDS-PAGE and pooled. The pooled protein was then concentrated using a 10 kDa molecular weight cut-off filter (Millipore Amicon-Ultra) and the buffer was exchanged with protein storage buffer (50 mM Tris-HCl, 100 mM NaCl, and 20% glycerol at pH 8.0). Protein aliquots were flash frozen with a dry ice isopropanol bath before storage at −80° C.
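The FPLC gradient described above can be written as a piecewise-linear schedule. This sketch assumes that the balance of the mobile phase is the wash buffer (20 mM imidazole), so that the percentage of elution buffer (200 mM imidazole) maps linearly onto imidazole concentration; the function names are hypothetical.

```python
# Segments as (start min, end min, % elution buffer at start, % at end), from the text.
GRADIENT = [
    (0.0, 7.5, 0.0, 0.0),
    (7.5, 18.0, 0.0, 50.0),
    (18.0, 22.0, 50.0, 100.0),
    (22.0, 27.5, 100.0, 100.0),
    (27.5, 35.0, 0.0, 0.0),  # re-equilibration
]
WASH_IMIDAZOLE_MM = 20.0
ELUTION_IMIDAZOLE_MM = 200.0


def percent_elution_buffer(t_min):
    """Linearly interpolate the programmed % elution buffer at time t_min."""
    for start, end, pct_start, pct_end in GRADIENT:
        if start <= t_min < end:
            fraction = (t_min - start) / (end - start)
            return pct_start + fraction * (pct_end - pct_start)
    if t_min == GRADIENT[-1][1]:
        return GRADIENT[-1][3]
    raise ValueError(f"time {t_min} min is outside the 0-35 min program")


def imidazole_mm(t_min):
    """Imidazole concentration, assuming wash buffer makes up the remainder."""
    fraction_b = percent_elution_buffer(t_min) / 100.0
    return WASH_IMIDAZOLE_MM + fraction_b * (ELUTION_IMIDAZOLE_MM - WASH_IMIDAZOLE_MM)


for t in (5.0, 12.75, 20.0, 25.0, 30.0):
    print(f"{t:5.2f} min: {percent_elution_buffer(t):5.1f}% B, {imidazole_mm(t):6.1f} mM imidazole")
```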
Protein purity was confirmed by SDS-PAGE while concentration was determined by absorbance using a Pierce Bradford Protein Assay kit.
General Procedure for the Synthesis of Isoprenoid Monophosphates. 400 μmol of the neat alcohol substrate was added to a 15 mL falcon tube. Trichloroacetonitrile (1 mL, 10 mmol) was then added and the mixture was allowed to incubate at room temperature for 5 min. Bis-triethylammonium phosphate (TEAP) solution was prepared by slowly adding solution A (25 mL phosphoric acid, 94 mL acetonitrile) to solution B (110 mL triethylamine, 100 mL acetonitrile) to generate a solution that was 38% solution A and 62% solution B. To the mixture of alcohol and trichloroacetonitrile was added 1 mL of TEAP solution. The mixture was then incubated in a 37° C. water bath for 5 min before another portion of TEAP was added. A total of three additions of TEAP solution were made, each followed by incubation. The mixture was then separated by column chromatography using 6:2.5:0.5 iPrOH:conc. NH4OH:H2O with silica as the stationary phase. Prior to loading the column, the reaction mixture was diluted 20% v/v with chromatography buffer and the resulting precipitate was pelleted by centrifugation. Generally, each column was eluted with a total of 400 mL of eluent with a total silica load of 50 mL pre-equilibrated stationary phase slurry. Fractions of 10 mL (around 24 total) were collected after the yellow color of the solvent front disappeared. Fractions were analyzed by using a Shimadzu single quadrupole LCMS-2020 and those containing the diphosphorylated compound, (M−H)−, free of tri- or monophosphorylated species were pooled. The pooled fractions were then concentrated in vacuo to remove isopropanol and acetonitrile. The concentrated mixture was then filtered using a 0.2 μm cellulose filter and frozen at −80° C. After being frozen overnight, the sample was lyophilized, yielding a salt. The triammonium salt was then characterized and stored frozen as 250 μL 25 mM aliquots.
Mass Spectrometry. Samples from synthesized monophosphates were subjected to negative-mode mass analysis on a Thermo Fisher Scientific Exactive Plus operating with a heated ESI source connected to a UV detector with a Phenomenex Kinetex UPLC C18 column (2.1×50 mm, 2.6 μm particle, 100 Å pores). A 1 μL sample was injected onto the column and separated using a series of linear gradients developed from 20 mM NH4HCO3 in H2O (A) to 4:1 acetonitrile:H2O (B) at 0.2 mL/min using the following protocol: 0-2 min, 100-80% A; 2-6 min, 80-0% A; 6-7 min, 0% A; 7-7.1 min, 0-100% A; 7.1-12 min, 100% A.
Assay for Initial Activity of Monophosphates with IPK. Fractions from the synthesis of the diphosphates on a 400 μmol scale containing the monophosphates were concentrated in vacuo and resuspended to 5 mg/mL in water. Enzymatic reaction mixtures contained 50 mM Tris (pH 8.0), 2.5 mM MgCl2, 0.05 mM DTT, 1 mM ATP, and 4.2 μg of enzyme in a 200 μL reaction with 40 μL of substrate (1 mg/mL final). A reaction mixture without enzyme was set up as a control. Reactions were incubated overnight at 37° C. and checked by low-resolution LC-MS for diphosphate product formation. Standards from the isolated diphosphates were used to confirm retention time and mass. Reactions with >10% diphosphate generation as compared to the no-enzyme control were selected for kinetic experiments. LC-MS experiments were conducted using a Shimadzu LC-MS 2020 single quadrupole instrument with a Phenomenex Kinetex UPLC C18 column (2.1×50 mm, 2.6 μm particle, 100 Å pores). A 5 μL sample was injected onto the column and separated using a series of linear gradients developed from 0.1% formic acid in H2O (A) to 0.1% formic acid in acetonitrile (B) at 0.2 mL/min using the following protocol: 0-2.2 min, 95-1% A; 2.21-2.6 min, 1% A; 2.61-2.62 min, 1-95% A; 2.63-3.5 min, 95% A. Products of enzymatic reactions were verified by mass and comparison with diphosphate standards previously synthesized.
NADH-Coupled Kinetic Assays. NADH coupled assays were performed with purified enzymes. Reaction progress was monitored by absorbance at 340 nm at 30° C. in a 96-well plate using a Biotek Synergy 4 plate reader (Winooski, Vt.). 200 μL enzymatic mixtures contained 50 mM Tris (pH 8.0), 25 mM KCl, 2.5 mM MgCl2, 0.05 mM DTT, 1 mM ATP, 320 μM NADH, 400 μM phosphoenolpyruvate, 0.5 U pyruvate kinase, 0.7 U lactate dehydrogenase, and various amounts of substrate. Conditions were verified by doubling enzyme and verifying the initial rate was doubled as well.
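In the coupled assay, each phosphorylation releases one ADP, which pyruvate kinase and lactate dehydrogenase convert into the oxidation of one NADH, so the slope of the 340 nm trace reports the kinase rate. A minimal sketch of that conversion using the Beer-Lambert law follows. The NADH molar absorptivity (6220 M⁻¹ cm⁻¹ at 340 nm) is the commonly used literature value; the optical path length is an assumption that must be measured for the plate and fill volume used, and the slope in the example is a made-up number.

```python
NADH_EXTINCTION_M_CM = 6220.0  # molar absorptivity of NADH at 340 nm, M^-1 cm^-1


def rate_um_per_min(delta_a340_per_min, path_cm):
    """Convert an absorbance slope (A340 per min) into a reaction rate (uM per min).

    The slope is negative because NADH is consumed; one NADH is oxidized
    per ADP released, so the sign is flipped to report product formation.
    """
    molar_per_min = -delta_a340_per_min / (NADH_EXTINCTION_M_CM * path_cm)
    return molar_per_min * 1e6


def initial_a340(nadh_um, path_cm):
    """Expected starting absorbance for a given NADH concentration."""
    return nadh_um * 1e-6 * NADH_EXTINCTION_M_CM * path_cm


# Hypothetical reading: slope of -0.0622 A340/min with an assumed 1 cm path.
example_rate = rate_um_per_min(-0.0622, path_cm=1.0)
print(f"{example_rate:.2f} uM/min")
```

The same relation gives a quick sanity check on the starting signal: with 320 μM NADH the initial absorbance scales with the path length, so a shallow well reads proportionally lower than a 1 cm cuvette.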
Kinetics of IPK were measured with purified enzyme using 0.09 μg of enzyme per well. Serial dilution was used to generate specific concentrations of substrates (3125, 1562, 781, 625, 391, 195, 98, 24, 12, and 3 μM). Each condition was performed in triplicate. Nonlinear regression was performed using SigmaPlot (Systat Software Inc., San Jose, Calif., USA).
In silico Modeling of Molecular Properties of Substrates. Compounds were modeled in Chem3D Pro 13.0 (Perkin Elmer, Waltham, Mass., USA). cLogP values were determined by using the cLogP driver. Surface areas were calculated after MM2 minimization using the Connolly Solvent Excluded Volume.
Sequence of Codon-Optimized IPK from Thermoplasma acidophilum:
Compounds
3-Methylbut-2-en-1-yl monophosphate (dimethylallyl monophosphate, DMAP): 1H NMR (400 MHz, D2O) δ 5.40 (t, J=5.8 Hz, 1H), 4.34 (dd, JH,P=6.8 Hz, JH,H=6.8 Hz, 2H), 1.75 (s, 3H), 1.70 (s, 3H); 31P NMR (162 MHz, D2O) δ 1.58; HRMS m/z calculated for C5H11O4P [M−H+]− 165.0322, found: 165.0316.
3-Methylbut-3-en-1-yl monophosphate (isopentenyl monophosphate, IP): 1H NMR (400 MHz, D2O) δ 3.92 (dt, JH,P=6.6 Hz, JH,H=6.6 Hz, 2H), 2.34 (t, J=6.6 Hz, 2H), 1.74 (s, 3H); 31P NMR (162 MHz, D2O) δ 1.56; HRMS m/z calculated for C5H11O4P [M−H+]− 165.0322, found: 165.0317.
But-3-en-1-yl monophosphate (1): 1H NMR (400 MHz, D2O) δ 5.70 (ddtd, J=17.0, 9.9, 6.8 Hz, 1H), 4.99 (dd, J=17.0, 3.2 Hz, 1H), 4.97-4.87 (m, 1H), 3.70 (dt, JH,P=6.6 Hz, JH,H=6.6, 2H), 2.20 (dt, J=6.8, 6.6 Hz, 2H); 31P NMR (162 MHz, D2O) δ 2.31; HRMS m/z calculated for C4H9O4P [M−H+]− 151.0166, found: 151.0164.
Pent-4-en-2-yl monophosphate (2): 1H NMR (400 MHz, D2O) δ 5.87 (ddt, J=17.3, 10.5, 7.1 Hz, 1H), 5.19-5.06 (m, 2H), 4.29 (dtq, JH,P=6.5 Hz, JH,H=6.5, 6.3 Hz, 1H), 2.38-2.32 (m, 2H), 1.24 (d, J=6.3 Hz, 3H); 31P NMR (162 MHz, D2O) δ 1.82; HRMS m/z calculated for C5H11O4P [M−H+]− 165.0322, found: 165.0320.
3-Bromobut-3-en-1-yl monophosphate (3): 1H NMR (400 MHz, D2O) δ 5.81-5.71 (m, 1H), 5.53 (t, J=2.3 Hz, 1H), 3.94 (dt, JH,P=6.4 Hz, JH,H=6.3 Hz, 2H), 2.72 (t, J=6.3, 2H); 31P NMR (162 MHz, D2O) δ 1.59; HRMS m/z calculated for C4H8BrO4P [M−H+]− 228.9271; found: 228.9272.
Pent-4-yn-1-yl monophosphate (4): 1H-NMR (400 MHz, D2O): δ 3.90 (dt, JH,P=6.4 Hz, JH,H=6.4, 2H), 2.33-2.28 (m, 4H), 1.83-1.80 (m, 1H); 31P NMR (162 MHz, D2O) δ 1.73; HRMS m/z calculated for C5H9O4P [M−H+]− 163.01656, found: 163.0164.
2-Methylallyl monophosphate (5): 1H NMR (400 MHz, D2O) δ 5.75-5.43 (m, 2H), 4.14 (d, JH,P=6.7 Hz, 2H), 0.72 (s, 3H); 31P NMR (162 MHz, D2O) δ 2.08; HRMS m/z calculated for C4H9O4P [M−H+]− 151.0166, found: 151.0162.
Pent-4-en-1-yl monophosphate (6): 1H-NMR (400 MHz, D2O): δ 5.91-5.88 (m, 1H), 5.11-4.99 (m, 2H), 3.85 (dt, JH,P=6.6 Hz, JH,H=6.6 Hz, 2H), 2.13 (q, J=6.4 Hz, 2H), 1.70 (p, J=6.6 Hz, 2H); 31P NMR (162 MHz, D2O) δ 2.33; HRMS m/z calculated for C5H11O4P [M−H+]− 165.03221, found: 165.0320.
(Z)-Hex-2-en-1-yl monophosphate (7): 1H-NMR (400 MHz, D2O): δ 5.67-5.61 (m, 2H), 4.41 (dd, JH,P=6.2 Hz, JH,H=8.0 Hz, 2H), 2.10-2.07 (m, 2H), 1.52-1.48 (m, 2H), 0.87 (t, J=6.3 Hz, 3H); 31P NMR (162 MHz, D2O) δ 1.51; HRMS m/z calculated for C6H13O4P [M−H+]− 179.0479; found: 179.0477.
(E)-2-Hexen-1-yl monophosphate (8): 1H-NMR (400 MHz, D2O): δ 5.83-5.80 (m, 1H), 5.66-5.61 (1H, m), 4.29 (dd, JH,P=6.8 Hz, JH,H=6.8 Hz), 2.04 (q, J=7.0 Hz, 2H), 1.40 (q, J=7.5 Hz, 2H), 0.88 (t, J=7.5 Hz, 3H); 31P NMR (162 MHz, D2O) δ 1.92; HRMS m/z calculated for C6H13O4P [M−H+]− 179.0479, found: 179.0477.
But-3-yn-1-yl monophosphate (9): 1H-NMR (400 MHz, D2O): δ 3.91-3.95 (2H, m), 2.52-2.55 (2H, m), 2.36-2.40 (1H, m); 31P NMR (162 MHz, D2O) δ 2.38; HRMS m/z calculated for C4H7O4P [M−H+]− 149.0009, found: 149.0007.
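The calculated HRMS values listed above can be reproduced from monoisotopic masses. The sketch below is an illustration only: it parses simple molecular formulas (no parentheses or charges), computes the m/z of the singly deprotonated ion by removing one hydrogen atom and adding back one electron, and reports the error of a found value in ppm. The isotope masses are standard tabulated values rounded to the digits shown; the function names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes; Br is 79Br.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "P": 30.97376163,
    "Br": 78.9183371,
}
ELECTRON_MASS = 0.00054858


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C5H11O4P'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def deprotonated_mz(formula):
    """m/z of the [M-H]- ion: neutral mass minus one H atom plus one electron."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in counts.items())
    return neutral - MONOISOTOPIC_MASS["H"] + ELECTRON_MASS


def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


# Formulas and found values taken from the compound list above.
for name, formula, found in [("DMAP", "C5H11O4P", 165.0316),
                             ("3", "C4H8BrO4P", 228.9272),
                             ("9", "C4H7O4P", 149.0007)]:
    calc = deprotonated_mz(formula)
    print(f"{name}: calculated {calc:.4f}, found {found:.4f}, "
          f"error {ppm_error(found, calc):+.1f} ppm")
```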
Terpene natural products are used in pharmaceuticals (taxol and artemisinin), pesticides (coumarin and pyrethrin), flavors (hopanoids and menthol), fragrances (citronel and limonene), pigments (carotenoids and xanthophylls), potential biofuels (bisabolane) and a variety of other commercial products. Biosynthesis of terpenes proceeds through condensation of dimethylallyl pyrophosphate (DMAPP) with consecutive isopentenyl pyrophosphate extender units (IPP) to generate prenyl diphosphates. These linear precursors are then cyclized to generate cyclic terpenes via terpene cyclases (
In this example, the ability of prenyltransferases to condense non-natural hemiterpenes to form potential linear precursors for terpene cyclases is explored.
Prenyltransferases catalyze elongation of the hemiterpene starter unit, DMAPP, utilizing sequential additions of the hemiterpene extender unit, IPP (Scheme 4). Prenyltransferases are responsible for the generation of linear terpenoid intermediates from these two interconvertible endogenous building blocks and therefore directly impact composition of the final terpene natural product. Prenyltransferases can utilize DMAPP analogues containing alternative diphosphate moieties, alkyl extensions at the methyl positions, epoxidized alkenes, and unsaturated alkenes. Even though a range of DMAPP analogues can potentially serve as substrates for prenyltransferases, the structural diversity of known analogues is limited to derivatives of allylic diphosphates containing trisubstituted alkenes. While extensive studies on DMAPP derivatives have been conducted, minimal work has been carried out to describe the promiscuity of prenyltransferases with IPP analogues. Work done previously with IPP analogues has been limited to extender units containing the natural homoallylic diphosphate core with no exploration into alternative nucleophiles and only a single study using one substrate altering the spacing between the nucleophile and diphosphate moiety. To overcome this severely limited scope, it was hypothesized that prenyltransferases might be able to use a variety of nucleophiles in place of the natural extender unit IPP as a general strategy to biosynthesize isoprenoid analogues. In support of this, prenyltransferases catalyze carbon-carbon bond formation simply by directing the nucleophilic attack of a relatively non-nucleophilic homoallylic alkene to an allylic carbocation followed by stereospecific desaturation. Importantly, additional diversity accessed through this approach could provide new chemical handles or varying oxidation states of carbons in terpene backbones not afforded by endogenous P450s.
As the substrate scope of prenyltransferases has not been found to be limited, the tolerance of prenyltransferases towards unnatural nucleophiles should be further characterized for their ability to catalyze irreversible carbon-carbon bond formation. To more fully understand the inherent promiscuity of prenyltransferases, IspA, a farnesyl diphosphate synthase (FPPase) from Escherichia coli, was characterized for its ability to utilize a panel of unnatural extender units for the generation of extended prenyl diphosphate analogues.
Results and Discussion
Design and Synthesis of a Panel of DMAPP/IPP Analogues. A panel of potential extender and starter units was synthesized from their respective commercially available alcohols, as outlined in Example 3. A wide panel of alcohols was selected for phosphorylation and included functionalities such as alkynes, aromatics, and heteroatomic moieties as potential nucleophiles in chain extension (
IspA Specificity with Starter Unit Analogues. The majority of research on prenyltransferase substrate tolerance has been focused on varying DMAPP. While the substrate scope of these enzymes has not been fully characterized, all substrates previously tested by others in place of DMAPP have had structural similarity (
Prenyltransferase reactions were run with 200 μM starter unit and 600 μM extender unit in 50 mM Tris and 10 mM MgCl2 at pH 7.5 with enzyme. Reactions were allowed to proceed overnight at 37° C. before being quenched with an equal volume of methanol. Products were analyzed by high-resolution mass spectrometry.
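The final concentrations above can be translated into pipetting volumes with the usual C1·V1 = C2·V2 relation. The following is a minimal sketch, assuming the 25 mM diphosphate stock aliquots and the 100 μL reaction volume described in the Methods; the function name `stock_volume_ul` is illustrative and not part of any described procedure.

```python
def stock_volume_ul(final_um: float, stock_mm: float, reaction_ul: float) -> float:
    """Return the stock volume (uL) needed to reach final_um (uM) in reaction_ul (uL).

    Uses C1 * V1 = C2 * V2 with the stock concentration given in mM.
    """
    stock_um = stock_mm * 1000.0  # convert mM to uM
    if final_um > stock_um:
        raise ValueError("final concentration exceeds stock concentration")
    return final_um * reaction_ul / stock_um


# Assumed values: 25 mM stocks and a 100 uL reaction (taken from the Methods).
starter_ul = stock_volume_ul(final_um=200.0, stock_mm=25.0, reaction_ul=100.0)
extender_ul = stock_volume_ul(final_um=600.0, stock_mm=25.0, reaction_ul=100.0)
print(f"starter unit: {starter_ul:.1f} uL, extender unit: {extender_ul:.1f} uL")
```

Under these assumptions the starter and extender units are added at a 1:3 volume ratio, mirroring the 200 μM to 600 μM concentration ratio.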
None of the substrates resulted in the detection of the predicted condensation product, as judged by LC-MS analysis of the product mixtures. Accordingly, none of the diphosphates appears able to act as an unnatural starter unit, most likely because most of them lack the trisubstituted allylic double bond featured in previously reported DMAPP analogues. Only analogues 27, 30, and 32 included trisubstituted allylic double bonds; however, these analogues may have been too sterically demanding to be accommodated in the active site of IspA.
Use of Extender Unit Analogues. As IPP recognition is presumably driven principally by the electrostatic interactions of the diphosphate rather than by the hydrophobic interactions of the alkyl tail, it was envisioned that the enzyme may be agnostic towards the remaining part of the extender unit. Besides the pyrophosphate segment, IPP must sterically fit into the extender unit binding pocket and must be recognized by low energy lipophilic binding forces facilitated by the displacement of water. The homoallylic alkene of IPP is unactivated, suggesting that the FPPases may also accommodate alternative π-bonds as nucleophiles. While studies have shown that farnesyl pyrophosphate synthases (FPPases) are quite promiscuous towards DMAPP analogues, only a few reports provide evidence for the incorporation of IPP analogues, which have consisted of aliphatic analogues and a chlorinated analogue, all of which contain the seemingly requisite homoallylic diphosphate.
Next, the panel members were tested as potential replacements of IPP (unnatural extender units) by incubating IspA with 18-37 and DMAPP as the usual starter unit. Remarkably, half of the diphosphate extender units (18, 20-23, 26, 28-29, 31, and 33; 10 of 20) were used by IspA to generate the corresponding geranyl pyrophosphate (GPP) analogues, as judged by LC-MS analysis of the product mixtures (
Alkene diphosphate 18 was efficiently utilized by IspA to extend the natural starter unit DMAPP. Compared to the natural extender unit IPP, 18 lacks only a methyl group, and it is expected to occupy the IspA extender unit binding pocket as well as IPP does. Notably, the vinyl bromide 20 was used almost as efficiently as IPP. The low activity of IspA towards 22 is of note, as extender units do not typically have allylic diphosphates unless the prenyltransferase is catalyzing head-to-head condensation. The alkene 23 was also found to be a good substrate, indicating that IspA was able to catalyze carbon-carbon bond-forming reactions between unactivated π-systems and DMAPP, as is the case with the natural substrate IPP (
As there is ample space for IPP analogues to occupy the active site in the presence of DMAPP, it is not entirely unexpected that a properly positioned nucleophile (e.g., 18, 20-23) can participate in catalysis (
Cumulatively, these data suggest that simple sterics and nucleophile positioning are sufficient to predict the substrate promiscuity of IspA. If this is accurate, then the extender unit specificity when an alternative, longer starter unit is used should largely parallel that with DMAPP. To test this hypothesis, diphosphates 18-37 were each incubated with IspA in the presence of GPP as a starter unit and production of the corresponding farnesyl pyrophosphate (FPP) analogues was determined by LC-MS analysis of the product mixtures (
Most of the IPP analogues were efficiently utilized by IspA with GPP as the starter unit. For example, conversion of 26 and 29 was almost as good as that with IPP. Almost quantitative conversion was detected with 18 and 20. The only analogues that were not detectable substrates were 23, 28, and 31. While the specificity of analogue use in place of IPP is not identical with DMAPP and GPP as starter units, the similarity is striking. This implies that the binding pocket that the starter unit occupies is entirely separate from that in which the extender unit resides.
In addition to the analogues discussed so far, utilization of 21, 23, 26, 28, 29, 31, and 33 reveals that IspA displays unprecedented use of unnatural nucleophiles, including non-homoallylic alkenic (23), alkynyl (21, 26, 28, 31, and 33), and aromatic π electrons (29). While the aromatic analogues 27, 30, 32, and 37 did not serve as extender units, 29 was accepted by IspA with both DMAPP and GPP as the starter unit. It is likely that the electron-rich thiophene motif increases the nucleophilicity of the aromatic π-electrons.
Unprecedented Use of Alkynes as Nucleophiles by IspA. Of particular interest were the alkynic analogues utilized by IspA (21, 26, 28, 31, and 33). If the alkynic analogues were activated in the same manner as the natural substrate and assuming the same stepwise mechanism was utilized, the π electrons of the alkyne would behave as a nucleophile and would subsequently generate an alkenyl cation before desaturation (Scheme 4). In the case of a terminal alkyne as the extender unit, desaturation by abstraction of Ha or Hb would generate the corresponding internal alkyne or allene, respectively. For the internal alkyne as the extender unit (28 and 31), only desaturation by abstraction of Ha is possible, forming the allene as the only possible product. While mechanistically indicative of an allene, 28 and 31 were very poor substrates (<5% conversion) and as such were not considered for scale-up and isolation.
The generation of an allene via prenyltransferase catalysis, regardless of substrate identity, is unprecedented and prompts multiple mechanistic considerations. First, if ionization of the starter unit precedes attack of the terminal alkyne on the carbocation, it is possible that the resulting leaving group, pyrophosphate, can act as a base and generate the zwitterionic intermediate DMAPP and 26a (Scheme 4, Route A). While this is possible, the presence of a zwitterionic species in an active site is unlikely. Another possibility is the attack of the terminal alkyne to generate a vicinal carbocation (26b), as is seen in the addition of HCl across alkynes (Route B). A third possibility would be an E2′/SN2′ mechanism in which little carbocationic character of the extended species is achieved due to the concerted nature of addition and elimination, analogous to the generation of hopene from squalene by hopanoid cyclases in triterpene biosynthesis (Route C). Several lines of experimental evidence support the generation of allenes via IspA catalysis.
First, the internal alkynes (28 and 31) were utilized by IspA, albeit very poorly with DMAPP as the starter unit, and the allene must be generated as only a single proton is available (31a) to regenerate a neutral species (Scheme 4).
To fully elucidate this elusive structure, the enzymatic reaction using 26 and DMAPP was scaled up for product isolation. Removal of the protein followed by chromatography provided a product with the expected mass for the allene GPP-26b. For comparison, the hypothetical alkyne GPP-26a was chemically synthesized.
The 13C NMR spectrum of the isolated product included a signal at 206 ppm characteristic of a carbon at the center of an allene. At the same time, signals characteristic of an internal alkyne (75-85 ppm) were not observed. Moreover, the 1H, 13C, and 1H-1H COSY NMR spectra were in full agreement with the assigned structure of GPP-26b (
To summarize, all the available evidence is fully consistent with the IspA-catalyzed formation of an allene from alkynyl IPP substrate analogues. Allenes are usually biosynthesized through enzyme catalyzed isomerization of alkyltriynes or base catalyzed elimination of hydrogen peroxide from allyl peroxides. This proposed manner of generating allenes from the addition of an electrophile is unprecedented in both biosynthesis and organic chemistry (
While there are many methods for the stereospecific generation of allenes, the most common is probably through an SN2′ mechanism. For instance, the chemical synthesis of panacene used a stereospecific anti-SN2′ addition on a propargylic mesylate with LiCuBr2. This mechanism is often employed with the use of Gilman reagents and propargylic leaving groups. The mechanism utilized by IspA for generation of an allene using an alkyne-containing extender unit is most like mechanism B. This mechanism is usually employed with an intramolecular nucleophile. When trying to synthesize the allene in panacene, Feldman attempted addition of a halogen using NBS; however, this reaction did not occur stereospecifically. While most similar to mechanism B, use of alkynes by IspA presumably occurs stereospecifically due to steric constraints provided by the active site of the enzyme. Mechanism B utilizes the addition of a nucleophile to a conjugated alkene to facilitate a second addition to an electrophile, while the biosynthesis of GPP-26b is most likely a Brønsted base catalyzed addition of an alkyne to an electrophile in a stereospecific manner. Optical rotation of the isolated allene GPP-26b suggests this is the case.
Substrate Specificity of IspA “Chain Extension” Mutants. Beyond understanding the substrate scope and mechanism of IspA, this enzyme and other prenyltransferases have been the subject of multiple engineering efforts to better understand and manipulate carbon-carbon bond formation using the natural substrates. Perhaps most notably, enzyme engineering efforts afforded by chimeric shuffling have resulted in biocatalysts capable of all four reaction modes available to prenyltransferases: head-to-tail, head-to-head, and head-to-middle.
In addition to forming various types of structures, these enzymes are also capable of catalyzing multiple extension events. For instance, IspA catalyzes two extensions of DMAPP with IPP to first generate geranyl pyrophosphate (GPP) and then farnesyl pyrophosphate (FPP). Using random mutagenesis, the product specificity of IspA has been shifted by discovering mutants capable of a third extension to generate geranylgeranyl diphosphate (GGPP) or limit IspA to a single extension to generate GPP. These mutations were made using site-directed mutagenesis and their activities were confirmed in vitro (
These mutations were mapped onto an IspA crystal structure and localize to the bottom of a cavity in the active site where the prenyl acceptors, DMAPP and GPP, extend. The mutation S81F appears to prematurely block the active site to only allow the binding of DMAPP, but not the extended product GPP, effectively limiting IspA to a single extension event. In contrast, Tyr80 is a residue that appears to limit the number of additional extensions to produce isoprenoids no longer than FPP. The mutation Y80D introduces a shorter amino acid sidechain that appears to point away from the active site, allowing for binding of FPP and the subsequent production of GGPP. These mutations therefore act as a molecular ruler (
While this molecular ruler hypothesis has been confirmed in vitro with the natural substrates, it was of interest to determine whether these mutations had any effect on the ability of IspA to use unnatural extender units. Accordingly, the mutations S81F and Y80D were each introduced into the wild-type IspA gene sequence by site-directed mutagenesis. The mutant proteins were expressed in E. coli and purified as hexa-histidine fusion proteins via immobilized metal affinity chromatography. Next, the wild-type, S81F, and Y80D IspA were each incubated with IPP and each of 18-37, and the resulting product mixtures were analyzed by LC-MS in order to quantify the total conversion of starter unit into C10 (GPP) products. As was observed before with the wild-type IspA, product could not be detected when any of the non-natural pyrophosphates were used in place of DMAPP as the starter unit with the IspA mutants S81F and Y80D. Thus, consistent with the hypothesized ‘molecular ruler’ role of these two positions, these two mutations are not able to impact utilization of non-natural starter units in this context.
Notably, when the mutants were tested with the unnatural extender units in place of IPP, the specificity of the mutants appeared quite similar to that of the wild-type enzyme (
to product could not be detected with IspA S81F and Y80D, likely because the mutants are less active than the wild-type with all substrates, and the level of activity with these analogues falls below the detection limit of the LC-MS assay. It is well established that wild-type IspA produces a single GPP (C10) product with the natural substrates, DMAPP and IPP, highlighting the strict stereo- and regio-specificity of proton abstraction from the intermediate. Notably, however, LC-MS analysis of the product profile generated by the wild-type, Y80D, and S81F IspA using some of the novel non-natural extender units (18 and 22) revealed several product ions with unique retention times. For example, using 18 as the non-natural IPP analogue, two unique ions with masses consistent with the predicted C10 GPP analogue were detected (
Conclusions and Future Work
Without engineering IspA, the wild-type enzyme was able to catalyze chain extension using 10 of 20 unnatural extender units containing various functionalities. While some diversification of prenyl units has been achieved using unnatural prenyl diphosphates, this unprecedented use of non-natural extender units emphasizes not only the substrate promiscuity of FPPases, but also their mechanistic plasticity. Use of aromatic and alkynyl π-systems as nucleophiles suggests these enzymes are capable of far more than addition of an alkene to an allylic diphosphate.
As engineering of chain elongating prenyltransferases has been accomplished in vivo by screening using volatiles or colorimetric methods, it is expected that the same could be done with these analogues. Products generated by terpene cyclases could generate chiral quaternary carbons replacing the natural non-chiral gem-dimethyl moieties. If the terpene cyclases naturally generate alkenic methyls, proposed substitutions with the brominated substrate could afford unnatural terpenes with sp2 halogens enabling Heck coupling.
Diversification of prenyl units and by extension, generation of GPP and FPP analogues, may provide unprecedented diversity in cyclized terpenes. As terpene cyclases have been shown to be quite promiscuous towards unnatural analogues, these analogues may provide further diversity in the extensive collection of cyclized terpenes afforded by the elaborate and highly coordinated cyclization of linear precursors. As chemical alteration of this expansive natural product class is limited towards leveraging existing oxidation or providing oxidation through C—H activation, the diversity afforded by this approach could complement synthetic efforts towards alteration of such complex ring systems. By directly incorporating oxidation and diversity into the backbone of these structures, chemists may be afforded with built-in chemical handles or even new structures not provided by Nature due to the potential of novel ring structures generated from substrate directed cyclization patterns instead of strictly enzyme guided bond formation.
Methods
General. All plasmids were verified by DNA sequencing. Purifications of all DNA were performed with kits from BioBasic. Synthetic oligonucleotides were purchased from IDT (Coralville, Iowa, USA). All plate reader assays were performed using a BioTek Hybrid Synergy 4 plate reader (Winooski, Vt., USA). Restriction enzymes were purchased from New England Biolabs (Ipswich, Mass., USA). Polymerase chain reactions were conducted using Phire Hot Start II DNA Polymerase from ThermoFisher Scientific (Waltham, Mass., USA). Chemicals were purchased from Sigma Aldrich (St. Louis, Mo., USA) and Alfa Aesar (Haverhill, Mass., USA).
Gene Cloning. IspA (NP_414955.1) was PCR amplified from E. coli BL21 genomic DNA using the oligos IspA-BamHI-FWD and IspA-XhoI-REV, and cloned into pET28a using BamHI and XhoI restriction sites. PCR was performed using Phire Hot Start II polymerase (ThermoScientific) according to the manufacturer's protocol. PCR product was purified prior to and after digestion by agarose gel electrophoresis. Digested PCR product and similarly treated pET28a were ligated at room temperature with T4 ligase (New England BioLabs) according to supplier's protocol. Ligated plasmid was then transformed into DH5α and plated onto LB agar plates containing 50 μg/mL kanamycin. Individual colonies were picked, grown in the presence of kanamycin, plasmids purified and the IspA gene sequence verified by DNA sequencing.
Site-Directed Mutagenesis of IspA. The mutations S81F and Y80D were introduced into the wild-type IspA template by QuikChange II site-directed mutagenesis. Primers used previously were employed (ref. 106). Briefly, mutagenic primers for each mutation were used to amplify the IspA gene from the pET28a-IspA template using Pfu turbo polymerase, digested with DpnI to remove the parent template, and ligated using T4 DNA ligase. Ligation mixtures were then transformed into E. coli DH5α and plated onto LB agar plates containing 50 μg/mL kanamycin. Individual colonies were then used to prepare plasmids, and the desired mutations confirmed by sequencing. No spurious mutations were identified. Purified plasmid was transformed into E. coli BL21 (DE3) for expression.
Expression and Purification of IspA. Each pET28a-IspA, pET28a-IspA-S81F, and pET28a-IspA-Y80D plasmid was transformed into E. coli BL21 (DE3) for protein expression. A single colony was used to inoculate a 3 mL culture in LB media supplemented with 50 μg/mL kanamycin. A 1 L culture containing 50 μg/mL kanamycin in LB media was then inoculated with 1 mL of the overnight culture and grown to an OD600 of ˜0.6 at 37° C. with shaking at 300 rpm at which point protein expression was induced by the addition of 1 mM IPTG. The temperature of the incubator-shaker was reduced to 18° C. and the culture incubated for approximately 18 hours. The culture was pelleted at 4000 rpm for 10 mins, the supernatant was decanted, the cell pellet resuspended in 15 mL of lysis buffer (100 mM TRIS-HCl, 300 mM NaCl, 10% glycerol, pH 8.0) and lysed by sonication. The lysate was then pelleted at 4500 rpm for 10 mins, decanted, and the soluble protein was spun down at 15,000 rpm for 1 hour. The resulting soluble fraction was then purified by fast protein liquid chromatography (FPLC) using nickel-bead column chromatography for the extraction of His6-tagged proteins. The column was first equilibrated with wash buffer (50 mM TRIS-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0) prior to loading of the soluble fraction. The soluble fraction was then eluted with elution buffer (50 mM TRIS-HCl, 500 mM NaCl, 200 mM imidazole, pH 8.0) using a gradient of 0% elution buffer 0-7.5 min., 0-50% 7.5-18 min., 50-100% 18-22 min., 100% 22-27.5 min, and equilibrated for additional runs with 0% elution buffer 27.5-35 min. Fractions containing the desired protein were identified by SDS-PAGE and pooled. The pooled protein was then concentrated using a 10,000 molecular weight cut-off filter (Millipore Amicon-Ultra) and the buffer was exchanged with protein storage buffer (50 mM TRIS-HCl, 100 mM NaCl, and 20% glycerol at pH 8.0). Protein aliquots were flash frozen with a dry ice ethanol bath before storage at −80° C. 
Protein purity was confirmed by SDS-PAGE while concentration was determined by absorbance using a Pierce Bradford Protein Assay kit.
In silico Modeling of IspA Mutants. IspA mutants were modeled using PDB file 1RQI with PyMOL. The PyMOL mutagenesis wizard was used for visualization of different chain length determinant mutants. Modeling of analogues with IspA was conducted using Glide (Schrödinger) with DMAPP docked.
General Procedure for the Synthesis of Isoprenoid Diphosphates. 4000 μmol of the neat alcohol substrate was added to a 50 mL polypropylene tube. Alcohols were purchased from Sigma Aldrich. 7-Methyloct-6-en-3-yn-1-ol was synthesized using a method adapted from Brunel, Y. and Rousseau, G. J. Org. Chem., 1996, 61 (17), pp 5793-5800. Trichloroacetonitrile (10 mL) was then added and the mixture was allowed to incubate at room temperature for 5 min. Bis-triethylammonium phosphate (TEAP) solution was prepared by slowly adding solution A (25 mL phosphoric acid, 94 mL acetonitrile) to solution B (110 mL triethylamine, 100 mL acetonitrile) to generate a solution that was 38% solution A and 62% solution B. To the mixture of alcohol and trichloroacetonitrile was added 10 mL of TEAP solution. The mixture was then incubated in a 37° C. water bath for 5 min before another addition of TEAP was made. A total of three additions of TEAP solution were made, with incubation after each. The mixture was then separated by column chromatography using 6:2.5:0.5 iPrOH:conc. NH4OH:H2O with silica as the stationary phase. Prior to loading the column, the reaction mixture was diluted 20% v/v with chromatography buffer and the resulting precipitate was pelleted by centrifugation. Fractions were analyzed using a Shimadzu single quadrupole LCMS-2020 and those containing the diphosphorylated compound, (M−H)−, free of tri- or mono-phosphorylated species were pooled. The pooled fractions were then concentrated in vacuo to remove isopropanol and acetonitrile. The concentrated mixture was then filtered using a 0.2 μm cellulose filter and frozen at −80° C. After being frozen overnight, the sample was lyophilized, yielding a salt. The triammonium salt was then characterized and stored frozen as 250 μL 25 mM aliquots.
Mass Spectrometry. Samples were subjected to negative-mode mass analysis on a Thermo Fisher Scientific Exactive Plus with a heated ESI source, connected to a UV detector and a Phenomenex Kinetex UPLC C18 column (2.1×50 mm, 2.6 μm particle, 100 Å pores). A 1 μL sample was injected onto the column and separated using a series of linear gradients from 20 mM NH4HCO3 in H2O (A) to 4:1 acetonitrile:H2O (B) at 0.2 mL/min using the following protocol: 0-2 min, 100-80% A; 2-6 min, 80-0% A; 6-7 min, 0% A; 7-7.1 min, 0-100% A; 7.1-12 min, 100% A. Linear detection ranges for GPP and FPP were determined by serial dilution and were found to be from 5 μM to 500 μM.
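The analytical gradient above can be read as a piecewise-linear function of time. The sketch below transcribes the stated breakpoints and interpolates the percentage of mobile phase A at any time point; it is an illustration of the protocol as written, not instrument control code, and the names `GRADIENT` and `percent_a` are hypothetical.

```python
# (time in min, % mobile phase A) breakpoints transcribed from the protocol:
# 0-2 min, 100-80% A; 2-6 min, 80-0% A; 6-7 min, 0% A;
# 7-7.1 min, 0-100% A; 7.1-12 min, 100% A.
GRADIENT = [(0.0, 100.0), (2.0, 80.0), (6.0, 0.0), (7.0, 0.0), (7.1, 100.0), (12.0, 100.0)]


def percent_a(t_min: float) -> float:
    """Linearly interpolate % A at time t_min (minutes) within the run."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-12 min run")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable for times inside the run")


for t in (1.0, 4.0, 6.5, 10.0):
    a = percent_a(t)
    print(f"t = {t:4.1f} min: {a:5.1f}% A, {100.0 - a:5.1f}% B")
```

Because mobile phase B is itself 4:1 acetonitrile:H2O, the acetonitrile content at any time is 0.8 times the % B value returned here.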
Prenyltransferase Assays. 1 μL IspA WT, IspA S81F, or IspA Y80D (4.24, 3.13, and 5.78 μg/L, respectively). An aliquot (5 μL) of wild-type or mutant IspA was added to a total volume of 100 μL 50 mM Tris buffer containing 200 μM dimethylallyl diphosphate (starter unit), 600 μM isopentenyl diphosphate (extender unit), and 5 mM MgCl2 at pH 7.5 and incubated at 37° C. Analogue reactions were conducted in the same manner. Reactions were initiated by the addition of purified enzyme and were incubated overnight. Aliquots were quenched after 16 hours with twice the volume of methanol and stored at −20° C. until analysis. Conversions were internally quantified by dividing the extracted ion counts (EIC) of the products by the sum of the EIC of DMAPP and the EIC of the products, and multiplying the resulting fraction by 100.
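The conversion calculation described above amounts to 100 × EIC(product) / (EIC(DMAPP) + EIC(product)). A minimal sketch follows; the function name `percent_conversion` and the ion counts used in the example are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_conversion(eic_product: float, eic_starter: float) -> float:
    """Percent conversion = 100 * EIC(product) / (EIC(starter) + EIC(product))."""
    if eic_product < 0 or eic_starter < 0:
        raise ValueError("ion counts cannot be negative")
    total = eic_product + eic_starter
    if total == 0:
        raise ValueError("no ion counts detected for starter unit or product")
    return 100.0 * eic_product / total


# Hypothetical ion counts for illustration only (not measured data).
conversion = percent_conversion(eic_product=3.0e6, eic_starter=1.0e6)
print(f"{conversion:.1f}% conversion")
```

This ratio treats the ion counts of starter unit and product as directly comparable, so it is best regarded as a relative measure for ranking analogues under identical conditions.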
Isolation of Prenyltransferase Product. The prenyltransferase reactions were scaled to 50 mL and were carried out in 50 mL polypropylene tubes with incubation at 37° C. in a shaker at 250 rpm. Importantly, agitation and introduction of ambient air resulted in a loss of product formation potentially attributable to oxidation of IspA. Reactions were run for 48 hours and monitored by LC-MS. Upon complete consumption of DMAPP, Chelex (200 mg) was added to the mixture and the reaction was incubated as before for 3 hours in order to remove Mg2+. The resin was then pelleted by centrifugation and the reaction was then passed through a 3K MWCO filter (Millipore) to remove protein. The mixture was then lyophilized to yield a white precipitate. The mixture was then suspended in 10 mL of 0.1 M ammonium bicarbonate.
Semipreparative HPLC was carried out using a Phenomenex Kinetex HPLC C18 column (10×150 mm, 2.6 μm particle, 100 Å pores) with a gradient consisting of an aqueous mobile phase of 25 mM ammonium bicarbonate (A) and an organic mobile phase of acetonitrile (B) at 4.5 mL/min using the following protocol: 0-5 min, 100% A; 5-30 min, 100-60% A; 30-35 min, 0% A; 35-40 min, 100% A. Fractions were collected using a fraction collector with 60 second windows. Fractions containing the desired product were identified by LC-MS, pooled, and lyophilized.
PiPer™ Assay. The PiPer assay (Invitrogen) was carried out according to manufacturer's instructions. Reactions were carried out as outlined in Prenyltransferase Assays.
Compounds in this Study
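The calculated HRMS values reported for the compounds in this section correspond to the singly deprotonated ion of the neutral formula. As a rough cross-check, the sketch below sums monoisotopic atomic masses and subtracts one proton mass; the atomic masses are standard literature values rounded to about seven decimals, the helper names are hypothetical, and the 0.0005 tolerance is an assumption chosen to absorb rounding differences.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard values, rounded.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462,
    "P": 30.97376200, "S": 31.97207117, "Br": 78.9183376,
}
PROTON = 1.00727647  # mass of a proton (u)


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C4H9BrO7P2' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_deprotonated(formula: str) -> float:
    """Monoisotopic m/z of the [M-H]- ion for a neutral molecular formula."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    return neutral - PROTON


# Calculated values as listed in this section, keyed by neutral formula.
listed = {"C5H12O7P2": 244.9985, "C4H10O7P2": 230.9829,
          "C4H9BrO7P2": 308.8935, "C6H10O7P2S": 286.9550}
for formula, reported in listed.items():
    mz = mz_deprotonated(formula)
    status = "agrees" if abs(mz - reported) < 5e-4 else "differs"
    print(f"{formula}: computed {mz:.5f}, listed {reported:.4f} ({status})")
```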
3-Methylbut-2-en-1-yl diphosphate (dimethylallyl diphosphate, DMAPP): 1H NMR (400 MHz, D2O) δ 5.43 (t, J=7.0 Hz, 1H), 4.43 (dd, JH,P=7.0 Hz, JH,H=7.0 Hz, 2H), 1.76 (s, 3H), 1.71 (s, 3H); 13C NMR (101 MHz, D2O) δ 140.1, 119.7 (d, JC,P=8.2 Hz), 62.7 (d, JC,P=5.4 Hz), 25.0, 17.3; 31P NMR (162 MHz, D2O) δ −6.04 (d, J=21.7 Hz), −9.38 (d, J=21.6 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9986.
3-Methylbut-3-en-1-yl diphosphate (isopentenyl diphosphate, IPP): 1H NMR (400 MHz, D2O) δ 4.02-3.92 (m, 2H), 2.30 (t, J=6.7 Hz, 2H), 1.68 (s, 3H); 13C NMR (101 MHz, D2O) δ 143.8, 111.6, 64.1 (d, JC,P=6.0 Hz), 37.9 (d, JC,P=7.6 Hz), 21.7; 31P NMR (162 MHz, D2O) δ −6.22 (d, J=21.6 Hz), −9.54 (d, J=21.5 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9985.
But-3-en-1-yl diphosphate (18): 1H NMR (400 MHz, D2O) δ 5.93-5.78 (m, 1H), 5.18-5.10 (m, 1H), 5.06 (ddd, J=10.4, 2.2, 1.1 Hz, 1H), 3.95 (dt, JH,P=1.0 Hz, JH,H=6.7 Hz, 2H), 2.37 (dt, J=6.7, 6.4 Hz, 2H); 13C NMR (101 MHz, D2O) δ 135.4, 116.9, 64.7, 34.5 (d, JC,P=7.2 Hz); 31P NMR (162 MHz, D2O) δ −7.03 (d, J=21.7 Hz), −9.52 (d, J=21.7 Hz); HRMS m/z calculated for C4H10O7P2 [M−H+]− 230.9829, found: 230.9831.
Pent-4-en-2-yl diphosphate (19): 1H NMR (400 MHz, D2O) δ 5.88 (ddt, J=17.3, 10.3, 7.1 Hz, 1H), 5.22-5.04 (m, 2H), 4.44-4.33 (m, 1H), 2.36 (dq, J=25.1, 7.2 Hz, 2H), 1.25 (d, J=6.3 Hz, 3H); 13C NMR (101 MHz, D2O) δ 134.9, 117.6, 73.2 (d, JC,P=5.7 Hz), 41.4, 20.45; 31P NMR (162 MHz, D2O) δ −8.96 (d, J=20.5 Hz), −11.51 (d, J=20.6 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9987.
3-Bromobut-3-en-1-yl diphosphate (20): 1H NMR (400 MHz, D2O) δ 5.80 (s, 1H), 5.56 (s, 1H), 4.10 (dd, JH,P=6.8 Hz, JH,H=6.2 Hz, 2H), 2.78 (t, J=6.2 Hz, 2H); 13C NMR (101 MHz, D2O) δ 129.9, 119.4, 63.6 (d, JC,P=5.6 Hz), 41.70; 31P NMR (162 MHz, D2O) δ −8.05 (d, J=21.4 Hz), −10.72 (d, J=21.1 Hz); HRMS m/z calculated for C4H9BrO7P2 [M−H+]− 308.8935, found: 308.8940.
Pent-4-yn-1-yl diphosphate (21): 1H NMR (400 MHz, D2O) δ 5.85-5.98 (s, 1H), 5.24-4.98 (m, 2H), 3.95-3.90 (m, 2H), 2.14 (q, J=7.9 Hz, 2H), 1.72 (t, J=4.0 Hz, 2H); 13C NMR (101 MHz, D2O) δ 85.1, 69.5, 65.1, 28.8, 14.3; 31P NMR (162 MHz, D2O) δ −7.03 (d, J=20.1 Hz), −9.72 (d, J=21.9 Hz); HRMS m/z calculated for C5H10O7P2 [M−H+]− 242.9828, found: 242.9830.
2-Methylallyl diphosphate (22): 1H NMR (400 MHz, D2O) δ 5.06 (s, 1H), 4.93 (s, 1H), 4.35 (d, J=7.0 Hz, 2H), 1.75 (s, 3H); 13C NMR (101 MHz, D2O) δ 142.3, 111.4, 69.2, 18.4; 31P NMR (162 MHz, D2O) δ −7.13 (d, J=20.8 Hz), −9.61 (d, J=20.9 Hz); HRMS m/z calculated for C4H10O7P2 [M−H+]− 230.9829, found: 230.9824.
Pent-4-en-1-yl diphosphate (23): 1H NMR (400 MHz, D2O) δ 5.98-5.83 (m, 1H), 5.07 (d, J=17.4 Hz, 1H), 4.98 (d, J=10.5 Hz, 1H), 3.91 (dt, JH,P=6.6 Hz, JH,H=3.3 Hz, 2H), 2.12 (dt, J=7.5, 7.4 Hz, 2H), 1.78-1.62 (m, 2H); 13C NMR (101 MHz, D2O) δ 139.1, 114.8, 65.8, 29.4, 29.2 (d, JC,P=7.3 Hz); 31P NMR (162 MHz, D2O) δ −6.71 (d, J=21.2 Hz), −9.38 (d, J=21.5 Hz); HRMS m/z calculated for C5H12O7P2 [M−H+]− 244.9985, found: 244.9988.
(Z)-Hex-2-en-1-yl diphosphate (24): 1H NMR (400 MHz, D2O) δ 5.74-5.55 (m, 2H), 4.50 (dd, JH,P=6.8 Hz, JH,H=2.1 Hz, 2H), 2.12-2.02 (m, 2H), 1.36 (dq, J=13.3, 7.4 Hz, 2H), 0.86 (t, J=7.4 Hz, 3H); 13C NMR (101 MHz, D2O) δ 135.2, 125.3 (d, JC,P=8.0 Hz), 61.9, 29.0, 22.2, 13.1; 31P NMR (162 MHz, D2O) δ −7.08 (d, J=21.4 Hz), −9.42 (d, J=21.3 Hz); HRMS m/z calculated for C6H14O7P2 [M−H+]− 259.0142, found: 259.0145.
(E)-Hex-2-en-1-yl diphosphate (25): 1H NMR (400 MHz, D2O) δ 5.85 (dt, J=14.0, 6.8 Hz, 1H), 5.64 (dt, J=14.0, 6.4 Hz, 1H), 4.36 (dd, JH,P=6.8 Hz, JH,H=6.8 Hz, 2H), 2.02 (dt, J=6.4, 7.4 Hz, 2H), 1.36 (dtd, J=7.4, 7.3 Hz, 2H), 0.85 (td, J=7.3 Hz, 3H); 13C NMR (101 MHz, D2O) δ 136.2, 125.6, 66.9 (d, JC,P=5.4 Hz), 33.8, 21.7, 13.1; 31P NMR (162 MHz, D2O) δ −6.92 (d, J=21.4 Hz), −9.55 (d, J=21.4 Hz); HRMS m/z calculated for C6H14O7P2 [M−H+]− 259.0142, found: 259.0145.
But-3-yn-1-yl diphosphate (26): 1H NMR (400 MHz, D2O) δ 3.85 (dt, JH,P=6.8 Hz, JH,H=6.8 Hz, 2H), 2.44 (m, 2H), 2.13 (t, J=2.7 Hz, 1H); 13C NMR (101 MHz, D2O) δ 82.1, 70.6, 64.1 (d, JC,P=96.6, 5.5 Hz), 20.2; 31P NMR (162 MHz, D2O) δ −7.82 (d, J=20.4 Hz), −9.94 (d, J=20.9 Hz); HRMS m/z calculated for C4H8O7P2 [M−H+]− 228.9672, found: 228.9675.
Cinnamyl diphosphate (27): 1H NMR (400 MHz, D2O) δ 7.54-7.49 (m, 1H), 7.39 (td, J=7.5, 1.8 Hz, 1H), 7.34-7.30 (m, 1H), 6.60 (d, J=15.2 Hz, 1H), 6.30 (dt, J=15.2, 6.2 Hz, 1H), 4.50 (dd, JH,P=6.2 Hz, JH,H=6.2 Hz, 2H); 13C NMR (101 MHz, D2O) δ 136.4, 132.2, 128.9, 126.6, 125.6 (d, JC,P=8.0 Hz), 66.5 (d, JC,P=5.1 Hz); 31P NMR (162 MHz, D2O) δ −6.98 (d, J=21.0 Hz), −9.50 (d, J=20.8 Hz); HRMS m/z calculated for C9H12O7P2 [M−H+]− 292.9985, found: 292.9991.
Hept-3-yn-1-yl diphosphate (28): 1H NMR (400 MHz, D2O) δ 4.00-3.85 (m, 2H), 2.47 (t, J=6.5 Hz, 2H), 2.08 (d, J=7.4 Hz, 2H), 1.42 (tq, J=7.4, 7.3 Hz, 2H), 0.88 (t, J=7.3 Hz, 3H); 13C NMR (101 MHz, D2O) δ 83.3, 77.6, 64.4 (d, JC,P=5.0 Hz), 23.3, 21.7, 20.4 (d, JC,P=Hz), 20, 12.9; 31P NMR (162 MHz, D2O) δ −6.91 (d, J=21.1 Hz), −9.79 (d, J=21.2 Hz); HRMS m/z calculated for C7H14O7P2 [M−H+]− 271.0142, found: 271.0146.
2-(Thiophen-3-yl)ethyl diphosphate (29): 1H NMR (400 MHz, D2O) δ 7.20 (dd, J=4.9, 3.0 Hz, 1H), 7.04 (dd, J=3.0, 1.3 Hz, 1H), 6.92 (dd, J=4.9, 1.3 Hz, 1H), 3.93 (dt, JH,P=7.0 Hz, JH,H=6.6 Hz, 2H), 2.79 (t, J=6.6 Hz, 2H); 13C NMR (101 MHz, D2O) δ 138.8, 128.6, 125.9, 121.8, 65.7 (d, JC,P=5.8 Hz), 30.6 (d, JC,P=7.6 Hz); 31P NMR (162 MHz, D2O) δ −7.19 (d, J=20.9 Hz), −10.53 (d, J=21.0 Hz); HRMS m/z calculated for C6H10O7P2S [M−H+]− 286.9550, found: 286.9554.
Furan-3-ylmethyl diphosphate (30): 1H NMR (400 MHz, D2O) δ 7.50 (d, J=3.2 Hz, 1H), 6.49 (dd, J=3.7, 3.2 Hz, 1H), 6.41 (d, J=3.7 Hz, 1H), 4.88 (d, JH,P=6.8 Hz, 1H); 13C NMR (101 MHz, D2O) δ 150.7, 143.7, 110.8, 110.1, 59.7 (d, JC,P=5.3 Hz); 31P NMR (162 MHz, D2O) δ −6.62 (d, J=21.8 Hz), −9.91 (d, J=21.3 Hz); HRMS m/z calculated for C5H8O8P2 [M−H+]− 256.9622, found: 256.9626.
But-2-yn-1-yl diphosphate (31): 1H NMR (400 MHz, D2O) δ 3.17 (dd, JH,P=7.4 Hz, JH,H=7.3 Hz, 2H), 1.25 (t, J=7.3 Hz, 3H); 13C NMR (101 MHz, D2O) δ 79.8 (d, JC,P=9.5 Hz), 75.8, 54.2 (d, JC,P=4.6 Hz), 1.7; 31P NMR (162 MHz, D2O) δ −6.13 (d, J=21.5 Hz), −9.78 (d, J=21.6 Hz); HRMS m/z calculated for C4H8O7P2 [M−H+]− 228.9672, found: 228.9673.
Pyridin-3-ylmethyl diphosphate (32): 1H NMR (400 MHz, D2O) δ 8.54 (s, 1H), 8.43 (d, J=5.0 Hz, 1H), 7.97 (d, J=8.1 Hz, 1H), 7.35-7.26 (m, 1H), 5.02 (d, JH,P=7.4 Hz, 2H); 13C NMR (101 MHz, D2O) δ 147.0, 146.5, 138.1, 134.6, 124.6, 64.9 (d, JC,P=5.0 Hz); 31P NMR (162 MHz, D2O) δ −6.79 (d, J=21.5 Hz), −9.75 (d, J=21.4 Hz); HRMS m/z calculated for C6H9NO7P2 [M−H+]− 267.9781, found: 267.9785.
Hex-5-yn-1-yl diphosphate (33): 1H NMR (400 MHz, D2O) δ 3.86 (dt, JH,P=6.4 Hz, JH,H=6.4 Hz, 2H), 2.32 (d, J=2.7 Hz, 1H), 2.28-2.17 (m, 2H), 1.76-1.65 (m, 2H), 1.62-1.52 (m, 2H); 13C NMR (101 MHz, D2O) δ 85.8, 69.2, 65.2 (d, JC,P=5.5 Hz), 28.8 (d, JC,P=7.0 Hz), 24.0, 17.1; 31P NMR (162 MHz, D2O) δ −6.58 (d, J=22.1 Hz), −10.66 (d, J=22.1 Hz); HRMS m/z calculated for C6H12O7P2 [M−H+]− 256.9985, found: 256.9989.
2-(Allyloxy)ethyl diphosphate (34): 1H NMR (400 MHz, D2O) δ 5.93 (ddt, J=13.6, 7.2, 3.9 Hz, 1H), 5.32 (dt, J=13.6, 2.1 Hz, 1H), 5.28-5.20 (m, 1H), 4.08-3.95 (m, J=10.4, 7.9, 2.6 Hz, 4H), 3.75-3.68 (m, 2H); 13C NMR (101 MHz, D2O) δ 133.6, 118.7, 90.3, 73.1, 69.4 (d, JC,P=7.3 Hz); 31P NMR (162 MHz, D2O) δ −9.46 (d, J=25.1 Hz), −9.74 (d, J=25.1 Hz); HRMS m/z calculated for C5H12O8P2 [M−H+]− 260.9935, found: 260.9937.
2-(1H-Indol-3-yl)ethyl diphosphate (35): 1H NMR (400 MHz, D2O) δ 7.65 (d, J=8.1 Hz, 1H), 7.41 (d, J=8.1 Hz, 1H), 7.23 (s, 1H), 7.15 (t, J=7.6 Hz, 1H), 7.07 (t, J=7.6 Hz, 1H), 4.12 (dd, JH,P=7.3 Hz, JH,H=7.0 Hz, 2H), 3.05 (t, J=7.0 Hz, 2H); 13C NMR (101 MHz, D2O) δ 136.1, 127.0, 123.8, 121.9, 119.2, 118.8, 111.9, 111.22, 66.0 (d, JC,P=2.6 Hz), 26.0; 31P NMR (162 MHz, D2O) δ −5.63 (d, J=21.6 Hz), −9.34 (d, J=21.6 Hz); HRMS m/z calculated for C10H13NO7P2 [M−H+]− 320.0094, found: 320.0098.
Prop-2-yn-1-yl diphosphate (36): 1H NMR (400 MHz, D2O) δ 4.40 (d, JH,P=7.8 Hz, 2H), 2.57 (s, 1H); 13C NMR (101 MHz, D2O) δ 77.9, 75.9, 52.6 (d, JC,P=7.8 Hz); 31P NMR (162 MHz, D2O) δ −7.69 (d, J=20.8 Hz), −10.11 (d, J=20.8 Hz); HRMS m/z calculated for C3H6O7P2 [M−H+]− 214.9516, found: 214.9519.
3,7-dimethyloct-6-en-1-yl diphosphate (37): 1H NMR (400 MHz, D2O) δ 5.21 (t, J=7.4 Hz, 1H), 3.99-3.88 (m, 2H), 2.07-1.91 (m, 2H), 1.66 (s, 3H), 1.59 (s, 3H), 1.49-1.37 (m, 2H), 1.38-1.26 (m, 2H), 1.21-1.19 (m, J=6.3, 1.5 Hz, 1H), 0.87 (d, J=6.3 Hz, 3H); 13C NMR (101 MHz, D2O) δ 133.0, 125.2, 64.8 (d, JC,P=5.8 Hz), 36.7 (d, JC,P=7.3 Hz), 36.3, 24.7, 24.7, 23.1, 18.5, 16.8; 31P NMR (162 MHz, D2O) δ −7.08 (d, J=17.7 Hz), −9.39 (d, J=21.4 Hz); HRMS m/z calculated for C10H22O7P2 [M−H+]− 315.0768, found: 315.0770.
(E)-3,7-Dimethylocta-2,6-dien-1-yl diphosphate (geranyl diphosphate, GPP): 1H NMR (400 MHz, D2O) δ 5.17 (t, J=6.0 Hz, 1H), 5.00-4.82 (m, 1H), 4.20 (dt, JH,P=7.2 Hz, JH,H=6.0 Hz, 2H), 1.92-1.72 (m, 4H), 1.44 (s, 3H), 1.41 (s, 3H), 1.34 (s, 3H); 13C NMR (101 MHz, D2O) δ 142.8, 133.2, 124.1, 119.7 (d, JC,P=8.4 Hz), 62.8 (d, JC,P=5.3 Hz), 39.0, 25.8, 25.0, 17.1, 15.7; 31P NMR (162 MHz, D2O) δ −8.18 (d, J=20.2 Hz), −9.59 (d, J=20.7 Hz); HRMS m/z calculated for C10H20O7P2 [M−H+]− 313.0611, found: 313.0616.
7-methyloct-6-en-3-yn-1-yl diphosphate (GPP-26a): 1H NMR (400 MHz, D2O) δ 5.05 (tdd, J=5.5, 2.0, 1.5 Hz, 1H), 3.79 (dt, JH,P=6.8 Hz, JH,H=6.8 Hz J=6.9, 2H), 2.77-2.66 (m, 2H), 2.35 (td, J=6.9, 3.6 Hz, 2H), 1.53 (s, 3H), 1.46 (s, 3H); 13C NMR (101 MHz, D2O) δ 135.7, 118.7, 81.5, 77.2, 64.3 (d, JC,P=5.8 Hz), 24.8, 20.5, 17.1, 16.9; 31P NMR (162 MHz, D2O) δ −6.95 (d, J=25.6 Hz), −9.85 (d, J=21.5 Hz); HRMS m/z calculated for C9H16O7P2 [M−H+]− 297.0298, found: 297.0299.
7-methylocta-2,3,6-trien-1-yl diphosphate (GPP-26b): 1H NMR (700 MHz, D2O) δ 5.32 (m, 2H), 5.26-5.21 (m, 1H), 4.31 (dd, JH,P=8.2 Hz, JH,H=6.8 Hz, 2H), 2.70 (d, J=7.7 Hz, 2H), 1.70-1.66 (m, 3H), 1.60 (s, 2H); 13C NMR (176 MHz, D2O) δ 203.1, 133.6, 119.9, 90.6, 87.5, 62.7 (d, JC,P=6.4 Hz), 25.1, 23.3, 15.5; 31P NMR (162 MHz, D2O) δ −6.79 (d, J=20.4 Hz), −9.83 (d, J=21.8 Hz); [α]D25=2.9°; HRMS m/z calculated for C9H16O7P2 [M−H+]− 297.0298, found: 297.0291.
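The "calculated" HRMS values listed above are consistent with the monoisotopic mass of each neutral diphosphate minus one proton. The following is a minimal sketch of that arithmetic, not the software used in this work; the function names and the rounded isotope masses are illustrative assumptions.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, rounded to 8 decimals.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
}
PROTON_MASS = 1.00727647  # mass of H+ (u)


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C4H8O7P2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion of the neutral formula."""
    return monoisotopic_mass(formula) - PROTON_MASS


def ppm_error(found, calculated):
    """Mass error of a measured m/z relative to the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


if __name__ == "__main__":
    # But-3-yn-1-yl diphosphate (26): calculated 228.9672, found 228.9675.
    print(f"{mz_deprotonated('C4H8O7P2'):.4f}")
    print(f"{ppm_error(228.9675, 228.9672):.1f} ppm")
```

For compound 26 this gives a mass error of roughly 1.3 ppm between the found and calculated values.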
Unnatural terpenes have been used in a variety of reactions and applications. Initially, terpene analogues were applied as substrate mimics for the purpose of developing inhibitors. For instance, bisphosphonate drugs are substrate mimics that are used to target enzymes utilizing pyrophosphorylated substrates.
Terpene analogues have also been used in a variety of chemical biology studies. For example, farnesyl pyrophosphate analogues that contain click-chemistry handles have been used for in vivo studies of protein farnesylation, which is used to study ubiquitination pathways in cancer. More recently, terpene analogues have been used to study the mechanisms utilized by terpene cyclases. Fluorinated analogues of terpene cyclase substrates halt reaction cascade processes by restricting hydride shifts and deprotonation events that may occur in the process of product maturation.
Generation of novel substrates for studying terpene cyclases and protein prenyltransferases has previously been accomplished through traditional organic synthesis in the absence of suitable biosynthetic approaches. While organic synthesis has generated a modest pool of rationally designed potential substrates, biosynthesis of non-natural and/or non-native precursors has been the subject of only limited inquiry.
The platform presented herein provides numerous opportunities to diversify terpenes using a synthetic biology approach.
Unnatural Diphosphate Cyclization
Preliminary investigation of terpene cyclase activity. Terpene cyclases afford incredible molecular diversity from relatively simple starting materials. Using carbocationic chemistry, these enzymes elegantly direct complex cascade reactions by navigating high energy intermediates to complex ring structures (
Before activity was measured in vitro, an analytical method was developed to lower detection limits so that, if unnatural terpenes were generated, even the smallest amount could be detected. GC-FID was qualitatively compared to GC-MS for detection limits and signal to noise (
Following this result, we tested whether terpenes could be detected from in vitro reactions. Aristolochene synthase was expressed from ATAS. Aristolochene synthase was then tested with synthesized FPP to observe the natural reaction in vitro, and products were confirmed by EI-MS using the NIST database.
Following the confirmation of activity in vitro, production over time was then measured using FID (
These results cumulatively suggest a workflow for screening terpene cyclases. 1) Semi-purify terpene cyclases to increase the concentration of catalyst. 2) Screen for product cyclization using a high sensitivity method (FID). 3) Confirm product identity by GC-MS. 4) Elucidate structure by isolation and NMR.
Unnatural diphosphates as substrates for terpene cyclases. In vitro studies in prior examples have shown the production of unnatural farnesyl and geranyl diphosphates. To generate unnatural terpenes, the GPP and FPP analogues afforded through IspA-catalyzed incorporation of unnatural substrates must be coupled with terpene cyclases.
These analogues may be incorporated into cyclic structures by terpene cyclases, but also may require enzyme engineering for a few possible reasons. 1) The substrates did not contain the seemingly requisite allylic diphosphate moiety. This would require the engineering of a terpene cyclase capable of directing an intramolecular SN2 reaction for the release of diphosphate. 2) Terpene cyclases have a strict substrate specificity at the head of the diphosphates (
Attempts to cyclize substrates diversified at the head portion of the molecule have been limited. In one example, attempts were made to cyclize a halogenated derivative of FPP, but no ionization was detected. While it was found that the halogenated analogue directly inhibited the terpene cyclase, it was hypothesized that ionization of the halogenated analogue was not energetically feasible as the resulting allylic carbocation was too high in energy. In direct opposition to this assertion, halogenated DMAPP analogues have been used for prenyltransferase reactions. As these reactions would proceed through similar intermediates, engineering of terpene cyclases to accept halogenated derivatives should be feasible (
Another effort that would address these issues of substrate specificity would be screening for a terpene cyclase capable of cyclizing non-allylic diphosphates. An enzyme capable of cyclizing such a substrate would proceed through a concerted reaction mechanism, which would bypass the requirement for an allylic carbocation (
Transfer of Non-Natural Prenyl Donors to Aromatic Acceptors
In vitro studies of aromatic prenyltransferases. Beyond use of hemiterpenes for elongation and subsequent cyclization by terpene cyclases, hemiterpenes can be appended to a variety of other natural products. Meroterpenoids are natural products such as polyketides or non-ribosomal peptides that have been prenylated. The prenyl groups often are critical for bioactivity as the prenyl side chains alter C log P values to increase bioavailability.
ABBA aromatic prenyltransferases are a class of soluble magnesium-independent enzymes that append prenyl groups to various aromatic prenyl acceptors. These enzymes have been noted for their broad promiscuity in terms of prenyl acceptors, but less so in terms of prenyl donors.
From the enzymes studied to date, prenyltransferases have shown remarkable promiscuity in terms of prenyl identity suggesting some catalytic flexibility never before noted (
As many of the substrates utilized do not contain an allylic diphosphate, this work cumulatively suggests that ABBA prenyltransferases can utilize a concerted mechanism for prenylation. This mechanistic tolerance of analogues suggests that these catalysts can be used for general C—C bond formation between various alkyl diphosphates and aromatic structures.
In vivo studies of aromatic prenyltransferases. As several aromatic prenyltransferases were shown to be promiscuous in vitro, we next evaluated whether these activities could be coupled with the putative hemiterpene analogue production platform in vivo. The system consisted of BL21 (DE3) harboring pETDuet-PhoN+IPK and pET28a-FgaPT2. Analogues found to be substrates in vitro for all three enzymes were the candidates for analogue production in vivo. Briefly, cells containing the vectors and single enzyme omission controls were grown to a density of OD600=0.6 before protein expression was induced by the addition of IPTG. At this same point, substrate was provided to the cultures and the fermentation was allowed to proceed for 48 hours. Aliquots were taken and the cell lysate was examined by high resolution LC-MS (
Once assembled in vivo, the parts functioned as found in vitro. The hemiterpene production pathway was successfully coupled with FgaPT2 to generate two tryptophan derivatives in vivo. Controls omitting any of the enzymes or substrate did not provide any product consistent with the tryptophan derivatives generated in vitro. When the hemiterpene production pathway was used with dimethylallyl alcohol, as was done with the carotenoid assay, prenylated tryptophan was observed at twice the concentration of that observed without the artificial pathway. While these vectors were not optimized for in vivo production, analogues were detected nonetheless. Future studies will use compatible vectors to increase analogue production for scale-up, isolation, and structural characterization.
ABBA Prenyltransferases with IspA Generated Terpene Analogues
Isolation of the prenyltransferase generated GPP and FPP analogues is very difficult due to the ease of product decomposition, challenging separation, and the preparation of starting material. Stemming from the promiscuity observed with the ABBA prenyltransferases, it is envisioned that the mechanistic plasticity of these prenyltransferases can be used for the prenylation of aromatic systems using these unnatural prenyl donors generated by IspA (
Using a sufficiently promiscuous prenyltransferase for the addition of unnatural prenyl groups to aromatic systems should ease difficulties with isolation as the prenylated aromatic rings are much more stable and easier to separate by chromatography. Ideally such a system could be coupled in vivo for scale up.
Conclusions
The work outlined in this example sets the foundation for the biosynthesis of hemiterpene analogues from chemical precursors. These precursors show broad application for potential use in terpene and prenylated natural product derivatives. This can provide non-natural chemical handles for synthetic diversification not available to the native products themselves (
By validating this platform part by part in vitro, the promiscuity and limits of each catalyst are more fully realized. Kinases such as PhoN from S. flexneri and IPK from T. acidophilum carry out simple phosphorylations and appear to have a broad range in substrate specificity based on the simplicity of the chemical transformation and mechanism. While IspA catalyzes several distinct steps to elongate prenyl groups, promiscuity towards non-natural nucleophiles is observed. While no success was had in generating cyclic terpene analogues, these enzymes can be engineered by methodically screening for use of target substrates. Another application of terpene analogues is their appendage to aromatic rings, as is observed with the aromatic prenyltransferases NovQ and FgaPT2. This work lays the foundation and validates use of this platform as a path for precursor-directed terpene diversification that has not yet been investigated or attempted for this class of natural products (
Methods
General methods and material. All plasmids were verified by DNA sequencing. Purifications of all DNA were performed with kits from BioBasic. Synthetic oligonucleotides were purchased from IDT (Coralville, Iowa, USA). Restriction enzymes were purchased from New England Biolabs (Ipswich, Mass., USA). Polymerase chain reactions were conducted using Phire Hot Start II DNA Polymerase from ThermoFisher Scientific (Waltham, Mass., USA). Standards of trans-caryophyllene and α-humulene as well as farnesyl pyrophosphate lithium salt were purchased from Sigma Aldrich (St. Louis, Mo., USA).
Expression and purification of IspA. ATAS plasmid was transformed into E. coli BL21 (DE3) for protein expression. A single colony was used to inoculate a 3 mL culture in LB media supplemented with chloramphenicol (35 μg/mL). A 1 L culture containing chloramphenicol (35 μg/mL) in LB media was then inoculated with 1 mL of the overnight culture and grown to an OD600 of ˜0.6 at 37° C. with shaking at 300 rpm at which point protein expression was induced by the addition of 1 mM IPTG. The temperature of the incubator-shaker was reduced to 18° C. and the culture incubated for approximately 18 hours. The culture was pelleted at 4000 rpm for 10 mins, the supernatant was decanted, the cell pellet resuspended in 15 mL of lysis buffer (100 mM TRIS-HCl, 300 mM NaCl, 10% glycerol, pH 8.0) and lysed by sonication. The lysate was then pelleted at 4500 rpm for 10 mins, decanted, and the soluble protein was spun down at 15,000 rpm for 1 hour. The resulting soluble fraction was then purified by fast protein liquid chromatography (FPLC) using nickel-bead column chromatography for the extraction of His6-tagged proteins. The column was first equilibrated with wash buffer (50 mM TRIS-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0) prior to loading of the soluble fraction. The soluble fraction was then eluted with elution buffer (50 mM TRIS-HCl, 500 mM NaCl, 200 mM imidazole, pH 8.0) using a gradient of 0% elution buffer 0-7.5 min., 0-50% 7.5-18 min., 50-100% 18-22 min., 100% 22-27.5 min, and equilibrated for additional runs with 0% elution buffer 27.5-35 min. Fractions containing the desired protein were identified by SDS-PAGE and pooled. The pooled protein was then concentrated using a 10,000 molecular weight cut-off filter (Millipore Amicon-Ultra) and the buffer was exchanged with protein storage buffer (50 mM TRIS-HCl, 100 mM NaCl, and 20% glycerol at pH 8.0). Protein aliquots were flash frozen with a dry ice ethanol bath before storage at −80° C. 
Protein purity was confirmed by SDS-PAGE while concentration was determined by absorbance using a Pierce Bradford Protein Assay kit.
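The FPLC elution program described above can be read as a piecewise-linear function of time. The sketch below encodes the stated segments and estimates the imidazole concentration reaching the column, assuming that 0% elution buffer corresponds to the wash buffer (20 mM imidazole) and 100% to the elution buffer (200 mM imidazole); both buffers otherwise share the same TRIS-HCl and NaCl content. The function names are illustrative and do not correspond to any instrument software.

```python
# Gradient segments from the text: (start_min, end_min, start_pct, end_pct)
SEGMENTS = [
    (0.0, 7.5, 0.0, 0.0),       # isocratic wash
    (7.5, 18.0, 0.0, 50.0),     # shallow ramp
    (18.0, 22.0, 50.0, 100.0),  # steep ramp
    (22.0, 27.5, 100.0, 100.0), # hold at 100% elution buffer
    (27.5, 35.0, 0.0, 0.0),     # re-equilibration for additional runs
]
WASH_IMIDAZOLE_MM = 20.0      # imidazole in wash buffer
ELUTION_IMIDAZOLE_MM = 200.0  # imidazole in elution buffer


def percent_elution_buffer(t_min):
    """Percent elution buffer at time t_min, by linear interpolation."""
    if not 0.0 <= t_min <= SEGMENTS[-1][1]:
        raise ValueError("time outside the 0-35 min program")
    for start, end, pct_start, pct_end in SEGMENTS:
        if start <= t_min < end:
            fraction = (t_min - start) / (end - start)
            return pct_start + (pct_end - pct_start) * fraction
    return SEGMENTS[-1][3]  # t_min equals the final end point


def imidazole_mM(t_min):
    """Imidazole concentration of the mixed buffer at time t_min."""
    fraction = percent_elution_buffer(t_min) / 100.0
    return WASH_IMIDAZOLE_MM + fraction * (ELUTION_IMIDAZOLE_MM - WASH_IMIDAZOLE_MM)


if __name__ == "__main__":
    for t in (5.0, 12.75, 18.0, 25.0, 30.0):
        print(t, percent_elution_buffer(t), imidazole_mM(t))
```

Under these assumptions, the midpoint of the shallow ramp (12.75 min) corresponds to 25% elution buffer, and the start of the steep ramp (18 min) to 110 mM imidazole.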
GC analysis of terpenes. Standards were diluted serially in ethyl acetate. GC-MS and GC-FID were performed on an Agilent Technologies 5975 GC/MS equipped with an HP-5MS capillary column (0.25 mm i.d., 30 m) (Agilent Technologies). The GC was operated using splitless injections at a volume of 1 μL with an injector temperature of 250° C. The initial oven temperature was set to 50° C. for 5 minutes before being ramped at 15° C./min to 230° C. and then held at 240° C. for 1 minute. Product peaks were integrated using Agilent ChemStation software.
Reactions of aristolochene synthase. Reactions were performed in 1.5 mL polypropylene tubes. Reactions were run in volumes of 200 μL overlaid with 200 μL of ethyl acetate. Reactions contained 2.5 mM farnesyl pyrophosphate, 2.5 mM MgCl2, 25 mM Tris-HCl at pH 7.5, and 7.8 μg of enzyme. Time points were taken by halting the reaction by vortexing the mixture and removing the ethyl acetate layer for subsequent analysis.
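For reference, the stated concentrations can be converted into absolute amounts per 200 μL reaction. The following is a minimal sketch of that unit arithmetic; the helper names are illustrative assumptions, and only the numerical inputs come from the text.

```python
def nmol_in_reaction(conc_mM, volume_uL):
    """Amount of solute in nmol (mM x uL = nmol)."""
    return conc_mM * volume_uL


def enzyme_ug_per_mL(mass_ug, volume_uL):
    """Enzyme concentration in ug/mL for a given mass and reaction volume."""
    return mass_ug / volume_uL * 1000.0


if __name__ == "__main__":
    volume_uL = 200.0
    print("FPP (nmol):", nmol_in_reaction(2.5, volume_uL))
    print("MgCl2 (nmol):", nmol_in_reaction(2.5, volume_uL))
    print("Tris-HCl (nmol):", nmol_in_reaction(25.0, volume_uL))
    print("Enzyme (ug/mL):", enzyme_ug_per_mL(7.8, volume_uL))
```

Each reaction therefore contains 500 nmol of farnesyl pyrophosphate, which sets the upper bound on the amount of terpene product that can be recovered in the ethyl acetate layer.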
The following data demonstrates the ability of the artificial isoprenoid pathway (PhoN-IPK) to support production of natural isoprenoids (via DMAPP/IPP) and non-natural isoprenoids (via various non-natural alkyl-pyrophosphates). The PhoN-IPK is coupled to various downstream enzymes (FgaPT2, IspA, CpaD, FtmPT1) to afford a range of compounds.
Production of prenylated amino acids in vitro via PhoN-IPK-FgaPT2 pathway. Reactions including purified PhoN, IPK, and FgaPT2 with ATP, DMAA, and Trp were run with initial conditions similar to individual enzyme reactions.
Buffer was optimized, followed by iterative optimization of substrate and enzyme concentrations. Reactions were followed by HPLC, and percent conversion was calculated in the same manner as FgaPT2 in vitro reactions. The results are shown in Table 8 below.
Production of prenylated amino acids in vivo via the PhoN-IPK, FgaPT2 pathway. Alcohols of interest (5 mM) were fed into cultures expressing FgaPT2, PhoN, and IPK in E. coli Rosetta PLysS. Media and cell lysate were analyzed by HPLC detecting at 269 nm and HR-LCMS searching for mass ion consistent with expected product. Experiments were performed in duplicate. The results are illustrated in
The following experiments demonstrate the broad specificity and utility of the individual component enzymes.
Broad substrate specificity of wild-type PhoN. Each commercially available alcohol (100 mM) was tested for activity with purified PhoN (20 μg/mL) and ATP (50 mM). The product mixtures were analyzed by high resolution liquid chromatography mass spectrometry (HR-LCMS) to quantify the conversion of alcohol to mono-phosphate. Quantification was achieved by the addition of an internal standard (either geranyl phosphate or cinnamyl monophosphate) at known fixed concentration. The results are illustrated in
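Quantification against an internal standard of known concentration is commonly carried out by scaling the analyte-to-standard peak area ratio. The sketch below illustrates one such calculation. The relative response factor (defaulting to 1.0), the function names, and the example peak areas are assumptions for illustration; the text states only that an internal standard at a known fixed concentration was used. Any dilution of the sample before analysis would also need to be accounted for.

```python
def analyte_conc_mM(area_analyte, area_istd, istd_conc_mM, rrf=1.0):
    """Estimate analyte concentration from peak areas.

    rrf is the relative response factor of analyte versus internal
    standard (hypothetical; 1.0 assumes equal detector response).
    """
    if area_istd <= 0 or rrf <= 0:
        raise ValueError("internal standard area and rrf must be positive")
    return (area_analyte / area_istd) * istd_conc_mM / rrf


def percent_conversion(product_mM, initial_alcohol_mM=100.0):
    """Percent of the starting alcohol converted to monophosphate."""
    return 100.0 * product_mM / initial_alcohol_mM


if __name__ == "__main__":
    # Hypothetical peak areas and internal standard concentration.
    product = analyte_conc_mM(2.0e6, 1.0e6, istd_conc_mM=5.0)
    print(product, percent_conversion(product))
```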
Broad substrate specificity of the wild-type prenyltransferase FgaPT2 in vitro. Each chemically synthesized pyrophosphate (3 mM) was tested for activity with purified FgaPT2 (200 μg/mL) and Trp (1 mM). The product mixtures were analyzed by HR-LCMS to identify the mass ions consistent with the expected alkylated product. The substrate (Trp) and product peak areas were determined by high performance liquid chromatography (HPLC) at a detection wavelength of 269 nm, and the conversion was calculated. The results are illustrated in
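One common way to calculate conversion from substrate and product peak areas is to take the product area as a fraction of the combined area. The sketch below shows that calculation under the assumption that Trp and its alkylated product absorb equally at the detection wavelength; the exact formula used in this work is not stated, and the function name and example areas are illustrative.

```python
def conversion_from_peak_areas(substrate_area, product_area):
    """Percent conversion assuming equal molar absorptivity of
    substrate and product at the detection wavelength."""
    total = substrate_area + product_area
    if total <= 0:
        raise ValueError("no substrate or product signal detected")
    return 100.0 * product_area / total


if __name__ == "__main__":
    # Hypothetical peak areas at 269 nm.
    print(conversion_from_peak_areas(substrate_area=750.0, product_area=250.0))
```

If the product absorbs more or less strongly than the substrate, the areas would first need to be corrected by their respective response factors.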
Altered substrate specificity of the prenyltransferase FgaPT2 mutant M328G in vitro. Each chemically synthesized pyrophosphate (3 mM) was tested for activity with purified FgaPT2 M328G (350 μg/mL) and Trp (1 mM). The product mixtures were analyzed by HPLC and confirmed by low-resolution liquid chromatography mass spectrometry (LR-LCMS) with mass ion consistent with expected product. The substrate (Trp) and product peak areas were determined by HPLC at a detection wavelength of 269 nm, and the conversion was calculated. The results are illustrated in
Broad substrate specificity of the prenyltransferase CpaD in vitro. Each chemically synthesized pyrophosphate (0.3 mM) and cyclic dipeptide (0.25 mM) were tested for activity with purified CpaD (1 mg/mL). The product mixtures were analyzed by HPLC and confirmed by LR-LCMS with mass ion consistent with expected product. The substrate (cyclic dipeptide) and product peak areas were determined by HPLC at a detection wavelength of 254 nm, and the conversion was calculated. The results are illustrated in
Altered substrate specificity of the prenyltransferase mutant CpaD I329G in vitro. Each chemically synthesized pyrophosphate (2 mM) and cyclic dipeptide (1 mM) were tested for activity with purified CpaD I329G (1 mg/mL). The product mixtures were analyzed by HPLC and confirmed by LR-LCMS with mass ion consistent with expected product. The substrate (cyclic dipeptide) and product peak areas were determined by HPLC at a detection wavelength of 254 nm, and the conversion was calculated. The results are illustrated in
Broad substrate specificity of the prenyltransferase FtmPT1 in vitro. Each chemically synthesized pyrophosphate (2 mM) and cyclic dipeptide (1 mM) were tested for activity with purified FtmPT1 (100 mg/mL). The product mixtures were analyzed by HPLC and confirmed by LR-LCMS with mass ion consistent with expected product. The substrate (cyclic dipeptide) and product peak areas were determined by HPLC at a detection wavelength of 254 nm, and the conversion was calculated. The results are illustrated in
Broad substrate specificity of the prenyltransferase mutant FtmPT1 M364G in vitro. Each chemically synthesized pyrophosphate (2 mM) and cyclic dipeptide (1 mM) were tested for activity with purified FtmPT1 M364G (100 mg/mL). The product mixtures were analyzed by HPLC and confirmed by LR-LCMS with mass ion consistent with expected product. The substrate (cyclic dipeptide) and product peak areas were determined by HPLC at a detection wavelength of 254 nm, and the conversion was calculated. The results are illustrated in
The compositions, systems, and methods of the appended claims are not limited in scope by the specific compositions, systems, and methods described herein, which are intended as illustrations of a few aspects of the claims. Any compositions, systems, and methods that are functionally equivalent are intended to fall within the scope of the claims. Various modifications of the compositions, systems, and methods in addition to those shown and described herein are intended to fall within the scope of the appended claims. Further, while only certain representative compositions, systems, and method steps disclosed herein are specifically described, other combinations of the components, compositions, systems, and method steps also are intended to fall within the scope of the appended claims, even if not specifically recited. Thus, a combination of steps, elements, components, or constituents may be explicitly mentioned herein or less, however, other combinations of steps, elements, components, and constituents are included, even though not explicitly stated.
The term “comprising” and variations thereof as used herein is used synonymously with the term “including” and variations thereof and are open, non-limiting terms. Although the terms “comprising” and “including” have been used herein to describe various embodiments, the terms “consisting essentially of” and “consisting of” can be used in place of “comprising” and “including” to provide for more specific embodiments of the invention and are also disclosed. Other than where noted, all numbers expressing geometries, dimensions, and so forth used in the specification and claims are to be understood at the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, to be construed in light of the number of significant digits and ordinary rounding approaches.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.
This application claims benefit of U.S. Provisional Application No. 62/792,523, filed Jan. 15, 2019, which is hereby incorporated by reference in its entirety.
This invention was made with Government Support under Grant No. 556943 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/013663 | 1/15/2020 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62792523 | Jan 2019 | US |