In-depth stability characterization and engineering of bacterial N-terminal motifs and their protective tags
Date
2025
Publisher
University of Delaware
Abstract
Protein degradation plays a pivotal role in the maintenance of the cell by recycling abnormally expressed proteins and regulating stimuli-responsive protein lifespans. A protein degradation pathway of particular interest is the N-degron pathway, where short N-terminal motifs known as N-degrons can modulate protein half-life from two minutes to over ten hours. Initially, substrate specificity in this pathway was thought to depend solely on the identity of the first amino acid; however, recent studies have revealed that residues up to the fifth position from the N-terminus can significantly influence protein stability. Consequently, there remain open questions about how precisely the amino acid sequence of N-terminal regions governs protein stability and how generalizable previously observed sequence trends are. Clarifying substrate preferences within this pathway would enable biological engineers to finely control protein half-life and prevent unwanted degradation of recombinant proteins.

To monitor protein degradation in vivo, we adapted the ubiquitin fusion technique to create a dual fluorescent reporter, allowing us to differentiate destabilizing degrons and stable sequences over three orders of signal magnitude. Using this assay, we screened novel pathway candidates, including highly destabilizing sequences derived from human and plant N-degrons. With these imported sequences as templates, we established a high-throughput screening platform combining DNA library generation, fluorescence-activated cell sorting (FACS), and next-generation sequencing (NGS), allowing simultaneous analysis of numerous sequences. After validating our platform with a 60-member library screen, we scaled the platform to analyze combinatorially mutagenized sequences across the first five N-terminal amino acids, generating detailed sequence-specificity maps from over 800,000 sequences at a depth of at least 20 reads per sequence.

This extensive dataset revealed previously undiscovered trends in the bulk data, such as the destabilizing impact of glutamine and the stabilizing effects of glycine and proline residues downstream of the N-terminal position. It also revealed trends that are specific to certain sequence contexts, such as how two bulky residues at positions two and three can occasionally convert a sequence with a canonically stable N-terminal residue, such as Nt-Cys, into an N-degron. Furthermore, we characterized sequence stability trends for synthetic N-termini commonly used for small molecule or protein ligation, such as Nt-Cys, Gly, and Ser. Leveraging these insights, we developed N-FIVE, a machine learning model capable of predicting and recommending N-terminal sequences based on desired stability profiles. Collectively, these findings represent the most comprehensive characterization of the Escherichia coli N-degron pathway to date and provide practical insights for protein lifespan modulation.

Despite its potential, the N-degron pathway remains underutilized in a biological engineering context, partly due to limited methods for dynamically exposing neo-N-termini. Addressing this challenge could unlock valuable applications, including dynamic protein lifespan switches. To resolve fundamental obstacles in neo-N-termini generation, we first evaluated the stability of widely used protective tags. Notably, we observed unexpected cleavage of ubiquitin and SUMO fusions in common E. coli strains. By identifying and knocking out four candidate deubiquitinases in BL21, we engineered the Zero observable Ubiquitin cleavage (sUbZero) strain, significantly enhancing ubiquitin-fusion stability and improving yields by over 50%. Next, we developed a SUMO protease (Ulp1) that is dependent on the nonstandard amino acid o-methyltyrosine for expression. Using this engineered protease, we created an inducible protein stability switch and demonstrated its ability to conditionally remove protective tags, enabling degradation of N-degron-tagged proteins. We then used this technology to regulate the lifespan of a toxin, demonstrating as a proof of concept that the toxin’s function can be negated in the presence of our engineered protease. Future directions and optimization strategies for conditional cell death in a biocontainment application are also discussed.

In summary, this thesis advances both the understanding of how N-terminal protein motifs govern protein stability and the application of that knowledge. By screening and analyzing millions of N-degron candidates and overcoming key barriers related to engineered protective tag removal, we provide new insights and practical tools that empower biological engineers to precisely tune protein half-life.
Keywords
Genetic circuits, Machine learning, Protein degradation, Synthetic biology, Escherichia coli