Design and implementation of SIMTeGES
Considering the cytotoxic effects of sanguinarine at a concentration as low as 0.1 mM (Supplementary Fig. 1), we focused on the development of a temperature-responsive GAL regulatory system to rigorously decouple cell growth from product synthesis. Inteins have been described as ‘protein introns’ that are autonomously spliced out during post-translational processing. Intein splicing is an intramolecular process that requires neither cofactors nor an external energy source and proceeds mainly through bond rearrangement33. Therefore, conditionally splicing intein variants have the potential to enable temperature-responsive expression of target proteins in various host systems (Supplementary Fig. 2).
The V-type proton ATPase catalytic subunit A (VMA1) from S. cerevisiae contains a well-characterized intein, VMA34. We integrated genes involved in the biosynthesis of lycopene (tHMG1, CrtE, CrtYB11M, and CrtI)35 into the yeast chromosome under the control of GAL1-GAL10 bidirectional promoters and knocked out the transcriptional repressor gene GAL80, allowing the colored metabolite lycopene to serve as a reporter system for GAL4 activity (Supplementary Fig. 3). The GAL4 activator comprises a DNA binding domain (residues 1-106) and two transcription activation domains (residues 148–196 and 768–881). We inserted the VMA sequence at cysteine 21 (C21) of GAL4 (GAL4-INT) and the formation of colored colonies on the galactose plates (YPG) indicated that the activity of GAL4 was not significantly affected by intein insertion (Fig. 1b). To further demonstrate that GAL4 activity was dependent on intein splicing, we introduced the splicing-resistant mutation (N454Q) into VMA (dINT)33. As expected, GAL4-dINT failed to produce any lycopene. In light of this, we proposed SIMTeGES-GAL4 to dynamically regulate the expression of heterologous genes with temperature as an input signal (Fig. 1a): at non-permissive temperature (e.g., 30 °C, optimal for cell growth), GAL4 was maintained inactive to prevent the transcription of downstream promoters (GAL1p, GAL10p, GAL2p, and GAL7p) and minimize metabolic burden to promote cell growth; at permissive temperature (e.g., 25 °C, ideal for plant enzyme expression and folding)36, GAL4 was activated to drive the expression of the GAL promoters for the biosynthesis of target compounds. We investigated several intein variants at 25 °C and 30 °C (Fig. 1b and Supplementary Fig. 4) and found that only VMAL212P (named as tsINT) exhibited a temperature-responsive phenotype, no lycopene production (white colonies, without GAL4 activity) at non-permissive temperature (30 °C) and lycopene production (with GAL4 activity) at permissive temperature (25 °C). 
Hence, we chose GAL4-tsINT for subsequent studies.
Systematic characterization of SIMTeGES
GAL4M9 is a temperature-sensitive GAL4 mutant generated by directed evolution and has been applied in the dynamic regulation of lycopene25, vitamin E26, and lutein24,37 biosynthetic pathways. Unfortunately, the temperature sensitivity of GAL4M9 came at the cost of reduced activation capacity, which might hinder the production of plant secondary metabolites with intricate and lengthy biosynthetic pathways (e.g., sanguinarine)26. Hence, we further investigated the transcriptional activity and dynamic regulation properties of the temperature-responsive systems (GAL4-tsINT and GAL4M9) using mCherry under the control of the GAL1 promoter as a reporter gene. We measured the fluorescence signals in galactose media at 25 °C and 30 °C using a fluorescence microplate reader and flow cytometry. After 36 h, both GAL4-tsINT and GAL4M9 showed temperature-sensitive activity when compared with the positive control (GAL4) and the negative control (GAL4-dINT) (Fig. 1c and Supplementary Fig. 5). GAL4-tsINT exhibited stronger fluorescence intensity than GAL4M9 at 25 °C, indicating higher transcriptional activation capacity. At 30 °C, both systems showed low yet comparable levels of leaky expression, and GAL4-tsINT showed an approximately 14.1-fold activation between the two temperatures.
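The fold activation quoted above is a ratio of reporter signals at the two temperatures. The sketch below shows one way such a ratio could be computed; the function name, the optional background subtraction, and the fluorescence readings are illustrative assumptions and are not taken from the measurements in this study.

```python
def fold_activation(permissive, non_permissive, background=0.0):
    """Ratio of background-corrected reporter signal at the permissive
    temperature to that at the non-permissive temperature."""
    on = permissive - background
    off = non_permissive - background
    if off <= 0:
        raise ValueError("non-permissive signal must exceed background")
    return on / off


# Hypothetical readings (arbitrary units), chosen only to illustrate the ratio.
signal_25c = 1410.0  # permissive temperature
signal_30c = 100.0   # non-permissive temperature (leaky expression)
fold = fold_activation(signal_25c, signal_30c)
print(f"{fold:.1f}-fold activation")
```

A low signal at 30 °C raises the ratio as much as a high signal at 25 °C does, which is why leaky expression and maximal activation are reported separately in the text.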
Moreover, we evaluated the temperature-responsive kinetics of GAL4-tsINT and GAL4M9. After 24 h incubation, we shifted the temperature from 30 °C to 25 °C, and collected samples at 6 h intervals for analysis. As shown in Supplementary Fig. 6, GAL4-tsINT and GAL4M9 showed similar temperature-sensitive kinetic properties, whose expression was triggered at 6 h and gradually slowed down at 24 h, suggesting the most robust temperature-controlled transcriptional activation between 6 h and 24 h. Compared with GAL4M9, GAL4-tsINT consistently exhibited more pronounced changes in fluorescence intensity, highlighting that SIMTeGES preserves the maximal transcriptional activation capacity of GAL4 while displaying decent temperature sensitivity.
Afterward, we employed SIMTeGES for the biosynthesis of lycopene to further compare GAL4-tsINT and GAL4M9. Given the inhibitory effect of glucose on the GAL system, we investigated the effect of various carbon sources on the performance of the temperature-responsive systems and, accordingly, the production of lycopene. Under all conditions, GAL4-tsINT consistently outperformed GAL4M9 (Supplementary Fig. 7), particularly when both glycerol and galactose were present (YPGlyG), with the lycopene titer of GAL4-tsINT 15.5-fold higher than that of GAL4M9. Compared with the mCherry reporter system with only one gene, the lycopene biosynthesis pathway involving four steps of enzyme catalysis likely accentuated the disparities in transcriptional activity between GAL4-tsINT and GAL4M9. Surprisingly, GAL4-tsINT failed to exhibit significant advantages in the presence of glucose, probably due to the interplay of multiple signals, including the glucose repression signal and temperature-dependent activation signal. Nevertheless, the lycopene titer with GAL4-tsINT was still 3.7-fold higher than that with GAL4 under YPGlyG culture conditions. Consequently, we successfully reset the decoupling signal of the GAL system from high-concentration glucose to temperature. SIMTeGES not only maintained high expression levels but also established a more rigorous and easily controllable growth and production decoupling system, making it more suitable for the biosynthesis of complex plant secondary metabolites (Supplementary Table 1).
Mechanism exploration of SIMTeGES
The intein splicing mechanism is a well-defined process, delineated into four principal steps33 (Supplementary Fig. 8). Typically, the initial step is considered the key limiting step, in which the S/O atom of the N-terminal cysteine or threonine of the intein attacks the carbonyl carbon of the preceding amino acid, resulting in an N–S or N–O acyl rearrangement38,39. It is worth noting that the conformational strain on the backbone of the precursor protein is a vital indicator of this process. These main-chain distortions can be quantified as the deviation from peptide bond planarity (ω value, with 180° as the ideal value) near the scissile peptide bond, and each distorted residue has been associated with an energy increase of over 5 kcal/mol39 (Fig. 2a). Therefore, we employed AlphaFold 2.0 and MD simulations to model GAL4-tsINT, simulated the structures at both 30 °C and 25 °C, and analyzed their conformational changes to further explore the mechanism of conditional splicing.
After molecular dynamics equilibration (Supplementary Fig. 9), we computed the average ω value of the peptide bond formed by residues C21 and F22 within the GAL4-tsINT precursor protein at 30 °C over the 40–50 ns interval to be 176.0° (Fig. 2b), close to the ideal value of 180°. This indicated minimal or negligible distortion in the backbone at 30 °C, insufficient to trigger N–O or N–S acyl rearrangement. As a result, the intein within the precursor protein remained unspliced, leaving GAL4 devoid of transcriptional activation capacity. Conversely, at 25 °C, the average ω value of the peptide bond at the cleavage site in the precursor protein was 166.7°, deviating from the ideal value by 13.3°. This pronounced conformational strain at 25 °C facilitated the nucleophilic attack by the S/O atom of cysteine or threonine, thereby enabling intein self-splicing for the generation of functional target proteins.
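The ω value discussed above is the dihedral angle defined by the atoms Cα(i), C(i), N(i+1), and Cα(i+1), and the planarity deviation is its distance from 180°. The following is a minimal sketch of that geometry in plain Python. The coordinates are synthetic points, not atoms from the simulated GAL4-tsINT structure, and averaging the absolute ω over trajectory frames is an assumption about how an average could be taken.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points.
    For omega: p0 = CA(i), p1 = C(i), p2 = N(i+1), p3 = CA(i+1)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def planarity_deviation(omega_deg):
    """Deviation of a trans peptide bond from the ideal 180 degrees."""
    return 180.0 - abs(omega_deg)


def mean_abs_omega(omegas_deg):
    """Average |omega| over frames (assumed averaging scheme)."""
    return sum(abs(w) for w in omegas_deg) / len(omegas_deg)


# Synthetic geometry: the last point is rotated to give omega = 166.7 degrees.
theta = math.radians(166.7)
omega = dihedral_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0),
                     (1.0, 0.0, 0.0), (1.0, math.cos(theta), math.sin(theta)))
print(round(omega, 1), round(planarity_deviation(omega), 1))
```

Applied to the averages reported in the text, `planarity_deviation` gives 4.0° at 30 °C and 13.3° at 25 °C.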
Exploration of general applicability of SIMTeGES
Building on our understanding of the temperature-sensitive intein splicing mechanism, we further evaluated the performance of SIMTeGES for temperature-controlled expression of additional proteins (e.g., GAL80 and mCherry) in various hosts (e.g., P. pastoris and mammalian cells). First, we inserted INT, dINT, and tsINT at position C277 of the GAL80 protein (SIMTeGES-GAL80) and integrated the encoding genes into the lycopene-producing strain (Supplementary Fig. 10). Given the role of GAL80 in repressing GAL4 transcriptional activation, colonies remained colorless when GAL80 was active but developed the carotenoid color upon GAL80 inactivation. Our results showed that GAL80-INT led to a modest decrease in the inhibitory activity of GAL80, as colonies displayed a pale-yellow color (Fig. 2c). Notably, GAL80-tsINT exhibited robust temperature-dependent behavior: at the non-permissive temperature, it was nonfunctional and the colonies resembled those of the gal80-disrupted strain in color; at the permissive temperature, its performance closely mirrored that of GAL80-INT, preserving most of the inhibitory activity of GAL80.
In addition to transcriptional factors (e.g., GAL4 and GAL80), we further evaluated its application in temperature-controlled expression of target proteins directly, with mCherry chosen as a reporter protein (SIMTeGES-mCherry, Supplementary Fig. 10). As shown in Fig. 2d and Supplementary Fig. 11, we successfully achieved temperature-regulated and minimal leakage expression of mCherry-tsINT under non-permissive conditions (~5.21% determined by flow cytometry), while the positivity rate under permissive conditions reached 99.8%, further demonstrating strong activity and low leakage expression levels of SIMTeGES. Furthermore, we validated SIMTeGES in different host cells, with a temperature-controlled expression of mCherry achieved in both P. pastoris and mammalian cells (Supplementary Fig. 12 and 13). These results underscored the versatility of SIMTeGES for multiple proteins across diverse species.
Consistent with our genetic analysis, we verified intein splicing at the protein level. Specifically, mCherry-INT was predominantly detected in its spliced form (~35 kDa), showcasing the efficient self-splicing capability of inteins (Fig. 2e). Conversely, mCherry-dINT existed as an unspliced form, around 85 kDa. The temperature-dependent mCherry-tsINT exhibited an unspliced form like mCherry-dINT under non-permissive conditions yet aligned with mCherry-INT in a spliced form under permissive conditions. These findings further validated the direct correlation between protein activity and intein splicing. Moreover, the ratio of spliced to unspliced proteins in mCherry-tsINT at various time points provides an intuitive insight into the splicing kinetics of the temperature-sensitive intein. During the transition from non-permissive to permissive conditions, we observe a gradual decrease in the proportion of unspliced proteins over time, accompanied by an increase in the proportion of spliced proteins (Fig. 2f), with most proteins present in the spliced form after 8 h.
Stable production of sanguinarine using SIMTeGES
Initially, considering the involvement of six cytochrome P450 enzymes in the sanguinarine biosynthesis pathway (Fig. 3), we implemented several strategies to optimize the microenvironment for the expression and activity of P450s (Supplementary Fig. 14). We constructed strain SAN219 by overexpressing ZWF1, GAPN, ICE2, INO2, and EcCFS in the previously reported19 sanguinarine-producing yeast strain SAN006. Increasing the expression of glucose-6-phosphate dehydrogenase40 (ZWF1) and introducing glyceraldehyde-3-phosphate dehydrogenase41 (GAPN) from Streptococcus mutans are general approaches to enhance the availability of NADPH, a crucial cofactor for electron transfer in the CYP system. Moreover, the INO2/INO4 transcription factor complex activates phospholipid biosynthesis, and the transmembrane protein ICE2 stabilizes cytochrome P450 reductase (CPR) and other membrane proteins in the endoplasmic reticulum (ER)42,43. Unfortunately, none of these genome modifications resulted in a significant increase in sanguinarine production (Supplementary Fig. 15). Meanwhile, knocking out GAL80 in strain SAN220 resulted in the loss of sanguinarine synthesis capability during passaging. We speculated that leaky expression of the conventional GAL regulon under low glucose concentrations led to an inadequate decoupling of cell growth and product synthesis, which in turn exerted heavy metabolic burdens and even cytotoxicity on the yeast strain. Thus, we proposed SIMTeGES as a promising and effective strategy for alleviating the bottleneck in the development of high-yield sanguinarine-producing strains.
Consequently, we constructed two strains, SAN220-tsINT and SAN220-M9, by integrating two temperature-responsive GAL systems into the sanguinarine-producing strain SAN219. In the presence of galactose, both GAL4M9 and GAL4-tsINT exhibited significantly improved phenotypes compared with GAL4. Especially when a mixed carbon source of galactose and glycerol was used, GAL4-tsINT achieved a sanguinarine titer of up to 14.71 mg L−1, surpassing GAL4 by 3.31-fold and GAL4M9 by 2.24-fold, respectively (Fig. 1d). In contrast to the lycopene-producing strain, all strains showed low sanguinarine titers when cultivated with glucose. We hypothesized that glucose, serving as a repressive carbon source, exerts a post-inhibitory effect on GAL promoters, especially in the intricate and lengthy sanguinarine biosynthetic pathway. Additionally, the mutual interference between the glucose repression signal and the temperature-controlled activation signal led to suboptimal performance. SIMTeGES effectively mitigated these limitations by replacing glucose with temperature as the decoupling signal without sacrificing heterologous gene expression levels. After subjecting the yeast strain SAN220-tsINT to five consecutive passages, we observed relatively stable sanguinarine titers, indicating the alleviation of the strain stability challenge caused by sanguinarine cytotoxicity (Supplementary Fig. 16).
Subsequently, we performed a comprehensive transcriptome analysis focusing on the cluster of genes involved in central carbon metabolism and the heterologous pathway to investigate the underlying reasons for significant variations in sanguinarine production with different carbon sources (Supplementary Fig. 17). As anticipated, when glycerol and galactose were used as mixed carbon sources, the expression levels of most heterologous genes and galactose assimilation associated genes were significantly increased. Although some genes involved in glycolysis and the pentose phosphate pathway (PPP) were downregulated, genes associated with the tricarboxylic acid (TCA) cycle displayed upregulation, thereby enhancing cellular robustness. Additionally, we identified several upregulated genes, including FBP1 (encoding fructose-1,6-bisphosphatase), RKI1 (encoding ribose-5-phosphate ketol-isomerase) and TPI1 (encoding triose phosphate isomerase), which played crucial roles in gluconeogenesis and the non-oxidative stages of the PPP, facilitating the assimilation of galactose and enhancing the supply of erythrose-4-phosphate (E4P). These findings further supported the beneficial effects of using glycerol and galactose as mixed carbon sources in the GAL system, where temperature acted as the sole decoupling signal.
Identification and functional expression of the flavoprotein berberine bridge enzyme
Reticuline oxidases belong to the berberine bridge enzyme (BBE) family and are responsible for the stereospecific conversion of (S)–reticuline to (S)–scoulerine (Fig. 4a). Our previous work has shown that increasing reticuline titer did not significantly improve the yield of the final product sanguinarine19, indicating that BBE might be the key rate-limiting step in the pathway. In vivo experiments have confirmed that PsBBE and EcBBE, derived from Papaver somniferum and Eschscholzia californica, respectively, exhibit reticuline oxidase activity44,45. To explore the catalytic potential of different reticuline oxidase enzymes, we constructed a sequence similarity network (SSN) for the BBE family (Pfam PF08031) (Supplementary Fig. 18). An SSN allows for the visualization of protein sequence relatedness and clustering based on similarity thresholds46. By setting the alignment score threshold (AST) to 94 and restricting sequence lengths to 450–660 amino acids, we identified 8,812 sequences annotated with 15 different SwissProt descriptions. The BBE-like enzymes, reticuline oxidases, and tetrahydroberberine oxidases all clustered together in the SSN (Fig. 4b). From this cluster, we selected CjBBE from Coptis japonica and McBBE from M. cordata for functional characterization.
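The two filters described above, a sequence-length window and an alignment score threshold, can be illustrated with a small clustering sketch. It assumes that an edge is retained when the pairwise alignment score meets or exceeds the threshold and that clusters correspond to connected components. The sequence identifiers, lengths, and scores are invented for illustration and do not come from the BBE dataset.

```python
def build_ssn_clusters(lengths, pair_scores, score_threshold=94,
                       min_len=450, max_len=660):
    """Return clusters (connected components) of a toy sequence similarity
    network after length filtering and edge thresholding."""
    nodes = {s for s, n in lengths.items() if min_len <= n <= max_len}
    parent = {s: s for s in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        if a in nodes and b in nodes and score >= score_threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for s in nodes:
        clusters.setdefault(find(s), []).append(s)
    # Largest clusters first; members sorted for reproducible output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical sequences and pairwise alignment scores.
lengths = {"seq_a": 535, "seq_b": 540, "seq_c": 538,
           "seq_d": 520, "seq_e": 600, "seq_short": 300}
pair_scores = {("seq_a", "seq_b"): 120, ("seq_b", "seq_c"): 101,
               ("seq_c", "seq_d"): 60, ("seq_d", "seq_e"): 95,
               ("seq_a", "seq_short"): 150}
clusters = build_ssn_clusters(lengths, pair_scores)
print(clusters)
```

Raising the threshold splits clusters into smaller, more closely related groups, while lowering it merges them, so the chosen value determines how finely the family is partitioned.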
PsBBE has been observed to be expressed in plant compartment vesicles with a 23-amino acid signal peptide47. According to the Philius prediction server from the Yeast Resource Center48, all BBEs were predicted to be non-cytoplasmic proteins. Considering the variation of signal peptide processing and localization in a heterologous host, we truncated the signal peptide of BBEs from different species (Supplementary Fig. 19) and fused an MBP (maltose binding protein) tag to their N-terminus to facilitate functional expression in the cytoplasm. To evaluate the activity of the BBE variants, we constructed the reticuline-accumulating strain RET202 by knocking out PsBBE, EcCFS, and EcSTS from the sanguinarine-producing strain SAN006. Then, we evaluated and compared four BBEs (PsBBE, EcBBE, CjBBE, and McBBE), their signal peptide-truncated variants (tPsBBE, tEcBBE, tCjBBE, and tMcBBE), and variants with the signal peptide replaced with an MBP tag (MBP-tPsBBE, MBP-tEcBBE, MBP-tCjBBE, and MBP-tMcBBE). Our results indicated that McBBE exhibited high reticuline oxidase activity, and the MBP-tMcBBE variant had a higher scoulerine conversion rate (~87% compared to a conversion rate of ~23% for the most commonly used PsBBE) (Fig. 4c). Interestingly, all MBP fusion variants showed improved conversion rates. Notably, while CjBBE and tCjBBE exhibited almost no reticuline oxidase activity, MBP-tCjBBE had a conversion rate of ~22%, highlighting the significance of the MBP tag for the functional expression of BBEs in yeast.
To ascertain the localization of BBEs in yeast and evaluate the effect of signal peptide and MBP fusion tag on protein expression and cellular localization, we fused EGFP to the C-terminus of CjBBE, McBBE, and their variants. While the expression of BBEs and tBBEs was too low to determine their localization using confocal microscopy, MBP-tBBEs were clearly expressed in the cytoplasm (Fig. 4d). The fluorescence intensity significantly increased after signal peptide truncation and MBP fusion, indicating a positive effect of MBP on soluble expression of BBEs. Consequently, integrating MBP-tMcBBE into the genome of SAN220-tsINT resulted in the construction of SAN221, which significantly decreased reticuline accumulation and accordingly increased the production of sanguinarine by 1.53-fold to a titer of 21.05 mg L−1 (Fig. 4e).
Transmembrane domain engineering for proper expression and localization of protopine 6-hydroxylase
The final cytochrome P450 enzyme (protopine 6-hydroxylase, P6H) in the sanguinarine pathway catalyzes the 6-hydroxylation of protopine, followed by non-enzymatic intramolecular rearrangement to form the benzophenanthridine scaffold of dihydrosanguinarine36,49. Analysis of intermediate metabolites in strain SAN221 revealed significant accumulation of protopine, suggesting that McP6H from M. cordata is one of the major bottlenecks (Fig. 4e). Previous studies have reported that McP6H exhibited higher catalytic activity than EcP6H from E. californica, but its protein expression level was much lower in yeast8. Thus, we attempted to engineer the N-terminal α-helix to achieve proper localization and functional expression of McP6H (Fig. 5a).
Through sequence alignment between EcP6H and McP6H, a non-aligning region of 14 consecutive serine residues KKSSSSSSSSSSSSSS was identified in McP6H (Supplementary Fig. 20a). We used AlphaFold 2.0 to predict the structure of McP6H and obtained a low confidence score for this sequence (Supplementary Fig. 20b), indicating that this sequence might not contribute to protein activity but impair protein stability. Thus, we removed this sequence to create a McP6H variant (McP6Hs). We speculated that correct protein expression and localization might be more critical than catalytic activity in the reconstruction of heterologous biosynthetic pathways. Thus, we designed two chimeric proteins with engineered transmembrane domains: EcCFS1–83–McP6Hs84-522 and EcP6H1–39–McP6Hs33-522.
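The construct names above encode residue ranges: for example, EcP6H1–39–McP6Hs33-522 joins residues 1–39 of one protein to residues 33–522 of another. The sketch below illustrates that bookkeeping with 1-based inclusive coordinates. The sequences are placeholder strings of the right lengths, not the real enzyme sequences, and the helper names are hypothetical.

```python
def residues(seq, start, end):
    """Return residues start..end using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} outside sequence of length {len(seq)}")
    return seq[start - 1:end]


def remove_motif(seq, motif):
    """Delete the first occurrence of a motif (e.g. a low-complexity run)."""
    idx = seq.find(motif)
    if idx < 0:
        raise ValueError("motif not found")
    return seq[:idx] + seq[idx + len(motif):]


def make_chimera(donor, donor_end, acceptor, acceptor_start):
    """Fuse donor residues 1..donor_end to acceptor residues acceptor_start..end."""
    head = residues(donor, 1, donor_end)
    tail = residues(acceptor, acceptor_start, len(acceptor))
    return head + tail


# Placeholder sequences (NOT real proteins), used only to check the indexing.
poly_ser = "KK" + "S" * 14
toy_with_run = "MAAA" + poly_ser + "GGG"
toy_trimmed = remove_motif(toy_with_run, poly_ser)

donor = "D" * 39 + "X" * 10        # stands in for the transmembrane donor
acceptor = "A" * 32 + "B" * 490    # stands in for a 522-residue acceptor
chimera = make_chimera(donor, 39, acceptor, 33)
print(toy_trimmed, len(chimera))
```

Keeping the numbering 1-based and inclusive matches the convention in the construct names, which avoids off-by-one errors at the fusion junction.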
To evaluate the activity of P6H variants, we knocked out McP6H in strain SAN221, resulting in the construction of the protopine-producing strain PRO222. We then introduced McP6H, McP6Hs, EcP6H, EcCFS1–83–McP6Hs84-522, and EcP6H1–39–McP6Hs33-522 into PRO222 and evaluated the conversion of protopine to sanguinarine. Compared with the 35% conversion rate of McP6H, McP6Hs only showed a slight increase in sanguinarine production. In contrast, EcP6H exhibited much higher activity in yeast, significantly increasing the conversion rate to 73% (Fig. 5b). For the two chimeric proteins, EcCFS1–83–McP6Hs84-522 showed a relatively low activity with a conversion rate of only 4.7%, while EcP6H1–39–McP6Hs33-522 demonstrated optimal catalytic activity, almost eliminating the accumulation of protopine and increasing the titer of sanguinarine to 41.31 mg L−1 with a conversion rate of 85%. This suggested that the transmembrane domain of EcP6H was more compatible and significantly improved the localization and functional expression of McP6Hs in yeast.
The results of confocal fluorescence microscopy confirmed our hypothesis (Fig. 5c). EcP6H localized on ER with a circular distribution in yeast cells. In contrast, McP6H and McP6Hs accumulated at the edges of the cells, indicating incorrect localization, which impaired their catalytic activities. While the cellular localization of EcCFS1–83–McP6Hs84-522 could not be determined due to low expression level, EcP6H1–39–McP6Hs33-522 displayed a distribution similar to EcP6H and localized correctly on ER. Notably, the fluorescence intensity of EcP6H1–39–McP6Hs33-522 was significantly higher than that of EcP6H, demonstrating that fusion of the transmembrane domain of EcP6H and McP6Hs indeed facilitated protein expression and correct localization. The transmembrane domain engineering provides a useful example for enhancing the in vivo activity of cytochrome P450 enzymes in yeast.
Comprehensive pathway optimization
After addressing two key rate-limiting steps in the pathway, our focus shifted to enhancing precursor supply, cofactor engineering, and cellular detoxification capacity to redirect metabolic flux towards sanguinarine production. PEP derived from glycolysis and E4P derived from PPP serve as important precursors in the upstream pathway. Phosphoketolase (XFPK) from Leuconostoc mesenteroides has been shown to enable the conversion of xylulose-5-phosphate (X5P) and/or fructose-6-phosphate (F6P) into acetyl phosphate and glyceraldehyde-3-phosphate (GAP)/E4P, thereby increasing the availability of E4P40. Additionally, expression of Escherichia coli shikimate kinase II (EcoAROL) has been reported to enhance the shikimic acid pathway, and deletion of the aldehyde reductase ARI1 to minimize the reduction of 4-HPAA to tyrosol28. Consequently, based on SAN223-4, we introduced LmXFPK and EcoAROL while knocking out the redundant oxidoreductase ARI1 to construct SAN224, resulting in a sanguinarine titer of 46.78 mg L−1 with the accumulation of norcoclaurine and reticuline increased by 1.55-fold and 1.37-fold, respectively (Fig. 6a). Replenishing methionine synthesis is crucial for establishing a complete SAM cycle and thereby for optimizing the microenvironment of the four SAM-dependent methyltransferases (Supplementary Fig. 21). By employing the ACT1 promoter to alleviate feedback inhibition of the MET17 promoter, the catalytic activity of SAM-dependent methyltransferases (6OMT, CNMT, 4’OMT, and TNMT) in the pathway was further enhanced, leading to a 1.33-fold increase in reticuline accumulation and a 1.22-fold increase in sanguinarine production in SAN225, which reached 57.22 mg L−1. Moreover, previous studies have demonstrated that sanguinarine exhibits antimicrobial activity by generating reactive oxygen species (ROS) within cells at high doses, leading to DNA damage and cell apoptosis50.
Superoxide dismutase (SOD) plays a vital role in eliminating ROS by converting superoxide anions to hydrogen peroxide51. Overexpressing the endogenous SOD1 in SAN226 reduced ROS levels (Supplementary Fig. 22), leading to a slight improvement in sanguinarine production to 61.30 mg L−1.
To further investigate the rate-limiting steps in the pathway, we employed qPCR to assess the expression levels of 16 heterologous genes in the SAN220-tsINT strain (Supplementary Fig. 23), among which the expression levels of EcSTS and Ps6OMT were significantly lower than the others. Therefore, we introduced an additional copy of the EcSTS expression cassette into SAN225, resulting in a 5.58-fold decrease in cheilanthifoline accumulation, a 2.23-fold increase in protopine accumulation, and a 1.53-fold increase in sanguinarine production (with a titer of 87.50 mg L−1), respectively. Furthermore, we overexpressed the glycerol/H+ symporter encoding gene STL152,53 and adenosine kinase gene ADO140 to improve glycerol utilization and replenish homocysteine in the SAM cycle. This slightly promoted sanguinarine synthesis, and as a result, SAN229 exhibited a sanguinarine titer of 97.94 mg L−1. It has been reported that a class of BIA uptake permeases, known as BIA importers (BUP), localized in the Opium poppy plasma membrane, was able to transport various BIAs and certain pathway precursors (e.g., dopamine) and enhance the uptake rate of codeine and intermediates in yeast cells54,55. Unfortunately, the introduction of BUP1 into the SAN229 strain failed to significantly increase sanguinarine production, possibly due to mislocalization of BUP1 in the yeast organelle (Supplementary Fig. 24). Finally, we engineered SAN231 with enhanced expression of Ps6OMT, resulting in a 1.77-fold decrease in norcoclaurine accumulation and the increase of sanguinarine production to 106.87 mg L−1. We performed qPCR analysis on strains SAN223-4 and SAN231 to assess the abundance of each enzyme involved in heterologous expression or endogenous overexpression, thereby confirming the efficacy of our engineered modifications at the transcriptional level (Supplementary Fig. 25). 
We also constructed strain SAN334 by replacing GAL4-tsINT in strain SAN231 with the wild-type GAL4, resulting in a significant decrease in sanguinarine production, further validating the effectiveness of SIMTeGES (Supplementary Fig. 26).
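The fold changes quoted in this section follow directly from the reported titers. As a quick arithmetic check, the sketch below recomputes two of them from values stated in the text; the dictionary keys are descriptive labels chosen here, not strain names from the study.

```python
def fold_change(new_titer, old_titer):
    """Ratio of two titers measured in the same units."""
    if old_titer <= 0:
        raise ValueError("reference titer must be positive")
    return new_titer / old_titer


# Sanguinarine titers in mg/L, as reported in the text.
titers = {
    "SAN224": 46.78,
    "SAN225": 57.22,
    "SAN225 + extra EcSTS copy": 87.50,
}

fc_sam_cycle = fold_change(titers["SAN225"], titers["SAN224"])
fc_ecsts = fold_change(titers["SAN225 + extra EcSTS copy"], titers["SAN225"])
print(f"{fc_sam_cycle:.2f}-fold, {fc_ecsts:.2f}-fold")
```

Both ratios round to the values given in the text (1.22-fold and 1.53-fold).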
For the temperature-regulated fed-batch fermentation, we constructed the prototrophic haploid strain SAN232 by complementing the auxotrophic markers in SAN231. In the early stages of fermentation, we maintained the temperature at 30 °C to support strain growth with hardly any sanguinarine production (Fig. 6b). When the strain entered the mid to late logarithmic growth phase at 59 h, we lowered the culture temperature to 25 °C to activate the temperature-sensitive GAL4-tsINT, thereby initiating the expression of heterologous genes. Notably, the relatively lower temperature has been found to favor the functionality of several plant-derived P450 enzymes36. After 70 h, while biomass was only slightly increased, sanguinarine started to accumulate to high levels. Finally, biomass reached an OD600 of 84.32, and we achieved a sanguinarine titer of 448.64 mg L−1 after 125.5 h of fermentation. Furthermore, different from flask fermentation, most intermediates were not accumulated to high levels in the fermentation broth, underscoring the high efficiency of sanguinarine biosynthesis (Fig. 6c).
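The two-stage process described above amounts to a simple temperature setpoint schedule. The sketch below encodes the reported shift at 59 h and derives an average volumetric productivity from the final titer and run time. That productivity figure is a quantity calculated here for illustration, not one reported in the study, and the function names are hypothetical.

```python
GROWTH_TEMP_C = 30.0      # non-permissive: GAL4-tsINT inactive, cells grow
PRODUCTION_TEMP_C = 25.0  # permissive: GAL4-tsINT spliced, pathway expressed
SHIFT_TIME_H = 59.0       # temperature shift reported in the text


def temperature_setpoint(elapsed_h):
    """Setpoint in degrees Celsius for a given elapsed fermentation time."""
    if elapsed_h < 0:
        raise ValueError("elapsed time cannot be negative")
    return GROWTH_TEMP_C if elapsed_h < SHIFT_TIME_H else PRODUCTION_TEMP_C


def mean_volumetric_productivity(titer_mg_per_l, duration_h):
    """Average productivity in mg/L/h over the whole run."""
    return titer_mg_per_l / duration_h


productivity = mean_volumetric_productivity(448.64, 125.5)
print(temperature_setpoint(24.0), temperature_setpoint(70.0),
      round(productivity, 2))
```

Because almost no sanguinarine is made before the shift, productivity averaged over the whole run understates the rate achieved during the production phase.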
Efficient yeast cell factory for biosynthesis of halogenated BIA derivatives
Previously, we introduced a tyrosine decarboxylase variant56, TyDCY350F, which converts l-tyrosine into 4-hydroxyphenylacetaldehyde (4-HPAA), a precursor of norcoclaurine synthase (NCS). Meanwhile, l-tyrosine can be transformed into l-3,4-dihydroxyphenylalanine (l-DOPA) by tyrosine hydroxylase (CYP76AD5), which is further utilized in the synthesis of l-dopamine, another precursor of NCS. By supplementing our high-yield sanguinarine cell factory with halogenated tyrosine (3-F-tyrosine, 3-Cl-tyrosine, and 3-I-tyrosine), we successfully synthesized diverse halogenated benzylisoquinoline derivatives. Our findings showed that feeding the cells with 0.5 mM 3-F-tyrosine resulted in the production of monofluorinated derivatives of all available pathway intermediates, including the production of fluorinated sanguinarine (Fig. 7a–e, Supplementary Fig. 27–31). As the derivatives could potentially be fluorinated at the 3’ position via the TyDCY350F route or at the 8 position via the CYP76AD5 route, we employed LC-MS/MS (QQQ) in product ion mode to select the [M + H]+ ions of the halogenated derivatives and obtain their corresponding MS2 spectra to determine the halogenation site57. For instance, the monofluorinated norcoclaurine had an [M + H]+ m/z of 290.1, and the characteristic fragment at m/z 125 indicated fluorination at the 3’ position, while the fragment at m/z 179 indicated fluorination at the 8 position (Supplementary Fig. 27). The abundance ratio of the m/z 124.9 fragment to the m/z 179.1 fragment in the mass spectrum was 3.28, indicating that fluorination predominantly occurred at the 3’ position of norcoclaurine, which could be due to the higher metabolic flux through the TyDCY350F pathway compared with the CYP76AD5 pathway or to steric hindrance at the 8 position. It is worth noting that we also detected a difluorinated derivative of norcoclaurine, which was less likely to undergo further enzymatic catalysis due to its increased molecular polarity.
Due to the lack of structural references with characteristic fragments, we determined the fluorinated sanguinarine by comparing the MS2 spectra before and after fluorination and identifying the major fragments with a mass difference of 18 (Fig. 7c).
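The mass shift used above to recognize fluorinated species can be reproduced from monoisotopic masses. The sketch below replaces hydrogen with a halogen in the elemental formula of norcoclaurine (C16H17NO3) and computes the expected [M + H]+ value. The formula and the rounded isotope masses are standard reference values rather than data from this study, and the helper names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded to 6 decimals.
MONOISOTOPIC = {"H": 1.007825, "C": 12.000000, "N": 14.003074,
                "O": 15.994915, "F": 18.998403,
                "Cl": 34.968853, "I": 126.904473}
PROTON = 1.007276  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items())


def mz_protonated(formula):
    """m/z of the singly charged [M + H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


def halogenate(formula, halogen, count=1):
    """Replace `count` hydrogens with the given halogen."""
    if formula.get("H", 0) < count:
        raise ValueError("not enough hydrogens to substitute")
    new = dict(formula)
    new["H"] -= count
    new[halogen] = new.get(halogen, 0) + count
    return new


NORCOCLAURINE = {"C": 16, "H": 17, "N": 1, "O": 3}

mz_parent = mz_protonated(NORCOCLAURINE)
mz_fluoro = mz_protonated(halogenate(NORCOCLAURINE, "F"))
shift = mz_fluoro - mz_parent
print(round(mz_parent, 1), round(mz_fluoro, 1), round(shift, 2))
```

The computed value for the monofluorinated ion rounds to the 290.1 quoted in the text, and the H-to-F substitution accounts for the nominal mass difference of 18. Passing `"Cl"` or `"I"`, or `count=2`, gives the corresponding values for the other derivatives.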
When fed with 0.5 mM 3-Cl-tyrosine, we observed the production of chlorinated norcoclaurine and chlorinated reticuline, but not chlorinated scoulerine. Similarly, when supplemented with 0.5 mM 3-I-tyrosine, we only identified iodinated norcoclaurine. To further validate our findings, we employed high-resolution mass spectrometry (LC-MS-TOF)27, facilitating the determination of precise molecular weights and isotopic distribution of major fragments, yielding consistent results (Fig. 7d, e, Supplementary Fig. 32–41, and Supplementary Table 2 and 3). These results suggest that variations in molecular size and polarity can influence subsequent catalysis. The biosynthesis of fluorinated sanguinarine highlights the potential of yeast as an expandable platform for constructing various benzylisoquinoline alkaloids.