Self-interactions and aggregation of therapeutic proteins

Date
2024
Publisher
University of Delaware
Abstract
Protein-based therapies are a prominent class of drug products used in the treatment of a broad range of chronic illnesses such as cancers and immune-related disorders, and more recently infectious diseases such as SARS-CoV-2 and RSV. Many of the highest-selling drugs globally are protein-based therapeutics, typically monoclonal antibodies (MAbs) or structurally derivative proteins such as Fc-fusion proteins and bispecific antibodies. The development process for therapeutic proteins is particularly uncertain, expensive, and resource-intensive compared to small-molecule drugs, so there is sizable interest in the biopharmaceutical industry in methods that can improve predictions of how likely a protein drug candidate is to be successfully developed into a commercial product (also known as “developability”), and in methods that can streamline the development process.

Many of the challenges that are faced during drug development of therapeutic proteins arise from protein-protein self-interactions, which are “weak” intermolecular forces (i.e., weak in comparison with “lock-and-key” specific binding events) between proteins of the same species in solution. The influence of self-interactions on solution nonidealities and problematic behaviors is increased at elevated protein concentrations, which is of particular relevance as the preferred liquid dosage form for many protein-based therapies is at relatively high protein concentration (on the order of 100 mg/mL). Static light scattering (SLS) and dynamic light scattering (DLS) are commonly used to measure net self-interactions in early-stage development to screen for attractive self-interactions that are fundamentally associated with a host of challenging behaviors and properties such as reversible self-association, irreversible aggregation, elevated viscosity, liquid-liquid phase separation, opalescence, and low solubility.
Irreversible aggregation is especially problematic because proteins have a common tendency to aggregate and methods to predict changes in aggregation rates or mechanisms between different proteins or as a function of solution conditions are not well-developed. The presence of aggregates can be a liability in a number of manufacturing processes, reduce the efficacy and shelf-life of the product, and elicit a dangerous immunogenic response when administered to a patient. This thesis is focused on the development and assessment of methods to characterize and predict self-interactions and aggregation rates for therapeutic proteins with emphasis on practical applications in streamlining industrial drug development. The experimental datasets are for solution conditions and proteins similar to those in commercial protein-based therapies, fairly diverse in the behaviors they represent, and large compared to many other publicly available datasets.

Coarse-grained (CG) molecular simulations are applied throughout this thesis to model self-interactions, predict net self-interactions at high-concentration conditions, and probe specific electrostatic interactions between charged residues that were involved in attractive self-interactions. A range of coarse-grained models for therapeutic proteins were evaluated based on the tradeoffs between computational efficiency and accuracy in calculating net self-interactions. A dataset of previously reported experimental values of the second osmotic virial coefficient (B22) from SLS for five MAbs at multiple solution conditions (i.e., different pH and ionic strength conditions) was used as a test case.
Lower resolution (e.g., domain-level) models allowed for higher throughput and more intensive simulation algorithms (e.g., simulations with many protein molecules to simulate high concentrations) but were limited in their representation of interactions between specific sites in the protein, such as attractive electrostatic interactions between specific charged amino acids. Higher resolution models were able to capture specific electrostatic attractions, but at great cost to computational efficiency. A hybrid model that combines features from the domain-level and higher resolution models was introduced that can capture specific electrostatic attractions and was tractable for simulations at high concentrations like those representative of commercial therapeutic protein products.

Net self-interactions via SLS and DLS experiments were measured systematically for four MAbs, two Fc-fusion proteins, and the associated fusion partner (FP) protein as a function of solution pH and ionic strength. The measurements for the Fc-fusion proteins and FP protein were confined to low protein concentration (e.g., B22 values), while for the four MAbs, the measurements spanned low to high protein concentrations. The solution conditions were chosen to represent fundamental features of commercial drug products within typical bounds of each feature (i.e., pH, ionic strength, and protein concentration). The proteins displayed a broad range of net self-interactions from strong repulsions to strong attractions that were sensitive to the changes in solution conditions that were assessed.

The two Fc-fusion proteins and FP protein displayed reversible self-association at some conditions, which is linked to many industrial development challenges and is also a possible precursor to irreversible aggregation.
The reversible self-association appeared to be related to attractive electrostatic self-interactions, so a high-resolution CG model was used to investigate the origins of attractive electrostatic self-interactions for the two Fc-fusion proteins. The results indicated that they were due to cross-domain interactions between the FP and Fc domains, which suggests that reversible self-association was due to those interactions as well.

A previously developed method to combine low-concentration experimental values of B22 with CG molecular simulations to make predictions of high-concentration net self-interactions was improved by the integration of the hybrid CG model. The domain-level and hybrid CG models were directly compared based on how well they predicted high-concentration net self-interactions, using B22 values from SLS for six MAbs (two from prior work) to parameterize the CG models for a given MAb and pH. The predicted net self-interactions were compared against high-concentration SLS measurements. The findings and guidance from the CG model comparison described above with respect to low-concentration net self-interactions were also generally applicable for high-concentration net self-interactions: domain-level CG models were only able to reliably capture net repulsions and weak non-electrostatic attractions, while the hybrid CG model could capture strong electrostatic attractions as well. Inaccurate predictions of high-concentration behavior with the hybrid CG model at certain conditions were improved by methods that represented charge equilibria more precisely.

The four MAbs were also used for studies with the overall goal of understanding and predicting MAb aggregation rates between different MAbs and as a function of solution conditions.
Conformational stability of the four MAbs in four different formulations was measured by differential scanning calorimetry, and aggregation rates were measured via isothermal stability studies in four formulations (varying both pH and ionic strength) as a function of protein concentration and incubation temperature. Prediction of aggregation rates for solutions at high protein concentration stored at refrigerated conditions was of particular interest as it was intended to directly represent the protein concentration and storage condition of many commercial products. Studies at higher temperatures, where aggregation rates were generally faster, were judged by how they might relate to aggregation rates at refrigerated conditions and how similar the fundamental factors that mediated aggregation rates were. Overall, studies at elevated temperatures were poor predictors of aggregation rates at refrigerated storage conditions. Interpretable machine learning (ML) models were developed to rigorously deconvolute the impacts of fundamental phenomena on aggregation rates, which included the net self-interactions and conformational stability measurements. At the highest temperatures, conformational stability was the most influential phenomenon, while at lower and refrigerated temperatures, net valence was the most influential, perhaps due to the influence of repulsive electrostatic self-interactions. The ML methods were also used to more thoroughly assess whether results from stability studies at higher incubation temperatures or lower protein concentrations could be useful for predicting aggregation rates. Another goal in developing the ML models was to provide a robust platform for predicting aggregation rates with the vast datasets that are not publicly available but presumably exist in the archives of many pharmaceutical companies.

The studies in this thesis developed computational and statistical methods that were validated by or trained with fairly large, systematic datasets of experimental biophysical characterization, especially with respect to self-interactions. The results demonstrate how to 1) select a CG molecular model for a given application, 2) use CG molecular simulations in close connection with experimental measurements to extract additional knowledge about self-interactions and predict net self-interactions at other conditions (e.g., higher protein concentrations), and 3) understand and predict MAb aggregation rates as a function of protein concentration, incubation temperature, and solution conditions. These findings can be applied to various phases of industrial drug development for MAbs, Fc-fusion proteins, or other therapeutic proteins to improve selection of protein candidates (i.e., candidate selection) and optimization of formulation conditions (i.e., formulation development).
Keywords
Molecular simulations, Fc-fusion proteins, Monoclonal antibodies, Protein aggregates, Protein formulation, Protein-protein self-interactions