- Briefings in Bioinformatics Sep 2023
For refining and designing protein structures, it is essential to have an efficient protein folding and docking framework that generates a protein 3D structure based on given constraints. In this study, we introduce OPUS-Fold3, a gradient-based, all-atom protein folding and docking framework that accurately generates 3D protein structures in compliance with specified constraints, such as a potential function, provided the constraint can be expressed as a function of the positions of heavy atoms. Our tests show that, for example, OPUS-Fold3 achieves performance comparable to pyRosetta in backbone folding and significantly better performance in side-chain modeling. Developed using Python and TensorFlow 2.4, OPUS-Fold3 is amenable to source-code-level modification and can be seamlessly combined with other deep learning models, thus facilitating collaboration between the biology and AI communities. The source code of OPUS-Fold3 can be downloaded from http://github.com/OPUS-MaLab/opus_fold3. It is freely available for academic usage.
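To illustrate what "gradient-based" folding against a constraint on atom positions means in general, here is a minimal sketch in plain Python. It is not OPUS-Fold3 code and does not use TensorFlow: the harmonic distance restraint, the function names, the step size, and the three-atom toy system are all assumptions chosen for illustration.

```python
import math

def restraint_energy(coords, restraints):
    """Sum of harmonic penalties (d_ij - target)^2 over restrained atom pairs."""
    total = 0.0
    for i, j, target in restraints:
        total += (math.dist(coords[i], coords[j]) - target) ** 2
    return total

def restraint_gradient(coords, restraints):
    """Analytic gradient of restraint_energy with respect to every coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j, target in restraints:
        d = math.dist(coords[i], coords[j])
        if d == 0.0:
            continue  # direction undefined for coincident atoms; skip this pair
        scale = 2.0 * (d - target) / d
        for k in range(3):
            g = scale * (coords[i][k] - coords[j][k])
            grad[i][k] += g
            grad[j][k] -= g
    return grad

def minimize(coords, restraints, step=0.05, n_steps=500):
    """Plain gradient descent on the atom positions (hypothetical settings)."""
    coords = [list(p) for p in coords]
    for _ in range(n_steps):
        grad = restraint_gradient(coords, restraints)
        for p, g in zip(coords, grad):
            for k in range(3):
                p[k] -= step * g[k]
    return coords

# Toy system: three atoms and three target distances (hypothetical values).
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 5.0)]
final = minimize(start, restraints)
print(restraint_energy(start, restraints), restraint_energy(final, restraints))
```

Any differentiable potential of the coordinates could replace `restraint_energy`; an automatic-differentiation library would then supply the gradient instead of the hand-written `restraint_gradient`.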
Topics: Models, Molecular; Proteins; Software; Protein Folding
PubMed: 37833840
DOI: 10.1093/bib/bbad365
- Annual Review of Biophysics May 2021
Review
Cooperativity is a hallmark of protein folding, but the thermodynamic origins of cooperativity are difficult to quantify. Tandem repeat proteins provide a unique experimental system to quantify cooperativity due to their internal symmetry and their tolerance of deletion, extension, and in some cases fragmentation into single repeats. Analysis of repeat proteins of different lengths with nearest-neighbor Ising models provides values for repeat folding (ΔG_i) and inter-repeat coupling (ΔG_{i-1,i}). In this article, we review the architecture of repeat proteins and classify them in terms of ΔG_i and ΔG_{i-1,i}; this classification scheme groups repeat proteins according to their degree of cooperativity. We then present various statistical thermodynamic models, based on the 1D-Ising model, for analysis of different classes of repeat proteins. We use these models to analyze data for highly and moderately cooperative and noncooperative repeat proteins and relate their fitted parameters to overall structural features.
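A nearest-neighbor (1D Ising) treatment of a repeat protein can be sketched as follows. This is a generic homopolymer formulation under stated assumptions, not the review's own code: every repeat is either folded or unfolded, each folded repeat contributes an intrinsic folding free energy, and each pair of adjacent folded repeats contributes a coupling free energy. The function and parameter names, the kJ/mol units, and the example values are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def partition_function(n_repeats, dg_intrinsic, dg_coupling, temp=298.15):
    """Partition function of an n-repeat homopolymer, relative to fully unfolded.

    dg_intrinsic: folding free energy of one isolated repeat (kJ/mol).
    dg_coupling:  free energy of the interface between two adjacent
                  folded repeats (kJ/mol).
    """
    kappa = math.exp(-dg_intrinsic / (R * temp))  # weight of a folded repeat
    tau = math.exp(-dg_coupling / (R * temp))     # weight of a folded-folded pair
    # Running sums over all chains whose last repeat is unfolded / folded.
    end_unfolded, end_folded = 1.0, kappa
    for _ in range(n_repeats - 1):
        end_unfolded, end_folded = (
            end_unfolded + end_folded,
            kappa * (end_unfolded + tau * end_folded),
        )
    return end_unfolded + end_folded

def fraction_fully_folded(n_repeats, dg_intrinsic, dg_coupling, temp=298.15):
    """Population of the state in which every repeat is folded."""
    kappa = math.exp(-dg_intrinsic / (R * temp))
    tau = math.exp(-dg_coupling / (R * temp))
    weight = kappa ** n_repeats * tau ** (n_repeats - 1)
    return weight / partition_function(n_repeats, dg_intrinsic, dg_coupling, temp)

# Unfavorable intrinsic folding offset by favorable coupling (hypothetical values).
for n in (2, 4, 8):
    print(n, fraction_fully_folded(n, dg_intrinsic=5.0, dg_coupling=-12.0))
```

Fitting such a model to unfolding data for constructs of several lengths is what allows the intrinsic and coupling terms to be separated, since a single construct length constrains only their combination.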
Topics: Models, Molecular; Protein Folding; Tandem Repeat Sequences; Thermodynamics
PubMed: 33606943
DOI: 10.1146/annurev-biophys-102220-083020
- Progress in Molecular Biology and... 2020
Topics: Computational Biology; Computer Simulation; Protein Folding
PubMed: 32145954
DOI: 10.1016/S1877-1173(20)30032-6
- International Journal of Molecular... Mar 2023
Review
We review the key steps leading to an improved analysis of thermal protein unfolding. Thermal unfolding is a dynamic cooperative process with many short-lived intermediates. Protein unfolding has been measured by various spectroscopic techniques that reveal structural changes, and by differential scanning calorimetry (DSC) that provides the heat capacity change C_p(T). The corresponding temperature profiles of enthalpy ΔH(T), entropy ΔS(T), and free energy ΔG(T) have thus far been evaluated using a chemical equilibrium two-state model. Taking a different approach, we demonstrated that the temperature profiles of enthalpy ΔH(T), entropy ΔS(T), and free energy ΔG(T) can be obtained directly by a numerical integration of the heat capacity profile C_p(T). DSC thus offers the unique possibility to assess these parameters without resorting to a model. These experimental parameters now allow us to examine the predictions of different unfolding models. The standard two-state model fits the experimental heat capacity peak quite well. However, neither the enthalpy nor entropy profiles (predicted to be almost linear) are congruent with the measured sigmoidal temperature profiles, nor is the parabolic free energy profile congruent with the experimentally observed trapezoidal temperature profile. We introduce three new models, an empirical two-state model, a statistical-mechanical two-state model and a cooperative statistical-mechanical multistate model. The empirical model partially corrects for the deficits of the standard model. However, only the two statistical-mechanical models are thermodynamically consistent. The two-state models yield good fits for the enthalpy, entropy and free energy of unfolding of small proteins. The cooperative statistical-mechanical multistate model yields perfect fits, even for the unfolding of large proteins such as antibodies.
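The model-free evaluation described in this abstract rests on standard thermodynamic identities: the enthalpy is the integral of the heat capacity over temperature, the entropy is the integral of the heat capacity divided by temperature, and the free energy follows as ΔG(T) = ΔH(T) - TΔS(T). The sketch below applies the trapezoidal rule to a sampled heat capacity profile. It is an illustration rather than the authors' procedure: the function name, the choice of the first temperature as the integration origin, and the synthetic Gaussian peak are assumptions, and any baseline handling is left to the caller.

```python
import math

def integrate_heat_capacity(temps, cp):
    """Cumulative enthalpy, entropy, and free energy from a heat capacity profile.

    temps: strictly increasing temperatures in kelvin.
    cp:    heat capacity at each temperature (e.g. kJ/(mol K)).
    All three profiles are measured relative to the first temperature.
    """
    if len(temps) != len(cp) or len(temps) < 2:
        raise ValueError("need two or more matching temperature/heat-capacity points")
    d_h, d_s = [0.0], [0.0]
    for k in range(1, len(temps)):
        dt = temps[k] - temps[k - 1]
        # Trapezoidal rule for the integrals of C_p and C_p / T.
        d_h.append(d_h[-1] + 0.5 * (cp[k] + cp[k - 1]) * dt)
        d_s.append(d_s[-1] + 0.5 * (cp[k] / temps[k] + cp[k - 1] / temps[k - 1]) * dt)
    d_g = [h - t * s for h, t, s in zip(d_h, temps, d_s)]
    return d_h, d_s, d_g

# Synthetic profile (hypothetical numbers): a Gaussian peak centered at 335 K.
temps = [300.0 + 0.1 * i for i in range(701)]
cp = [50.0 * math.exp(-((t - 335.0) / 4.0) ** 2) for t in temps]
d_h, d_s, d_g = integrate_heat_capacity(temps, cp)
print(d_h[-1], d_s[-1], d_g[-1])
```

Because the profiles come straight from the measured curve, they can be compared against the predictions of any unfolding model without assuming that model in the analysis.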
Topics: Protein Denaturation; Thermodynamics; Protein Unfolding; Entropy; Proteins; Calorimetry, Differential Scanning; Protein Folding
PubMed: 36982534
DOI: 10.3390/ijms24065457
- The Journal of Physical Chemistry. B Apr 2023
Protein stability is important in many areas of life sciences. Thermal protein unfolding is investigated extensively with various spectroscopic techniques. The extraction of thermodynamic properties from these measurements requires the application of models. Differential scanning calorimetry (DSC) is less common, but is unique as it directly measures a thermodynamic property, that is, the heat capacity C_p(T). The analysis of C_p(T) is usually performed with the chemical equilibrium two-state model. This is not necessary and leads to incorrect thermodynamic consequences. Here we demonstrate a straightforward model-independent evaluation of heat capacity experiments in terms of protein unfolding enthalpy ΔH(T), entropy ΔS(T), and free energy ΔG(T). This now allows the comparison of the experimental thermodynamic data with the predictions of different models. We critically examined the standard chemical equilibrium two-state model, which predicts a positive free energy for the native protein, and diverges distinctly from the experimental temperature profiles. We propose two new models which are equally applicable to spectroscopy and calorimetry. The Θ(T)-weighted chemical equilibrium model and the statistical-mechanical two-state model provide excellent fits of the experimental data. They predict sigmoidal temperature profiles for enthalpy and entropy, and a trapezoidal temperature profile for the free energy. This is illustrated with experimental examples for heat and cold denaturation of lysozyme and β-lactoglobulin. We then show that the free energy is not a good criterion to judge protein stability. More useful parameters are discussed, including protein cooperativity. The new parameters are embedded in a well-defined thermodynamic context and are amenable to molecular dynamics calculations.
Topics: Hot Temperature; Protein Denaturation; Proteins; Thermodynamics; Cold Temperature; Protein Unfolding; Calorimetry, Differential Scanning; Protein Folding
PubMed: 37040567
DOI: 10.1021/acs.jpcb.3c00882
- Proceedings of the National Academy of... Nov 2014
Review
How do proteins fold, and why do they fold in that way? This Perspective integrates earlier and more recent advances over the 50-y history of the protein folding problem, emphasizing unambiguously clear structural information. Experimental results show that, contrary to prior belief, proteins are multistate rather than two-state objects. They are composed of separately cooperative foldon building blocks that can be seen to repeatedly unfold and refold as units even under native conditions. Similarly, foldons are lost as units when proteins are destabilized to produce partially unfolded equilibrium molten globules. In kinetic folding, the inherently cooperative nature of foldons predisposes the thermally driven amino acid-level search to form an initial foldon and subsequent foldons in later assisted searches. The small size of foldon units, ∼ 20 residues, resolves the Levinthal time-scale search problem. These microscopic-level search processes can be identified with the disordered multitrack search envisioned in the "new view" model for protein folding. Emergent macroscopic foldon-foldon interactions then collectively provide the structural guidance and free energy bias for the ordered addition of foldons in a stepwise pathway that sequentially builds the native protein. These conclusions reconcile the seemingly opposed new view and defined pathway models; the two models account for different stages of the protein folding process. Additionally, these observations answer the "how" and the "why" questions. The protein folding pathway depends on the same foldon units and foldon-foldon interactions that construct the native structure.
Topics: Kinetics; Models, Chemical; Protein Folding
PubMed: 25326421
DOI: 10.1073/pnas.1411798111
- The Journal of Chemical Physics Nov 2020
Molecular dynamics simulations are an invaluable tool to characterize the dynamic motions of proteins in atomistic detail. However, the accuracy of models derived from simulations inevitably relies on the quality of the underlying force field. Here, we present an evaluation of current non-polarizable and polarizable force fields (AMBER ff14SB, CHARMM 36m, GROMOS 54A7, and Drude 2013) based on the long-standing biophysical challenge of protein folding. We quantify the thermodynamics and kinetics of the β-hairpin formation using Markov state models of the fast-folding mini-protein CLN025. Furthermore, we study the (partial) folding dynamics of two more complex systems, a villin headpiece variant and a WW domain. Surprisingly, the polarizable force field in our set, Drude 2013, consistently leads to destabilization of the native state, regardless of the secondary structure element present. All non-polarizable force fields, on the other hand, stably characterize the native state ensembles in most cases even when starting from a partially unfolded conformation. Focusing on CLN025, we find that the conformational space captured with AMBER ff14SB and CHARMM 36m is comparable, but the ensembles from CHARMM 36m simulations are clearly shifted toward disordered conformations. While the AMBER ff14SB ensemble overstabilizes the native fold, CHARMM 36m and GROMOS 54A7 ensembles both agree remarkably well with experimental state populations. In addition, GROMOS 54A7 also reproduces experimental folding times most accurately. Our results further indicate an over-stabilization of helical structures with AMBER ff14SB. Nevertheless, the presented investigations strongly imply that reliable (un)folding dynamics of small proteins can be captured in feasible computational time with current additive force fields.
Topics: Molecular Dynamics Simulation; Protein Conformation; Protein Folding; Protein Unfolding; Proteins
PubMed: 33187403
DOI: 10.1063/5.0022135
- ACS Chemical Biology Apr 2024
Review
The role of nucleic acids in protein folding and aggregation is an area of continued research, with relevance to understanding both basic biological processes and disease. In this review, we provide an overview of the trajectory of research on both nucleic acids as chaperones and their roles in several protein misfolding diseases. We highlight key questions that remain on the biophysical and biochemical specifics of how nucleic acids have large effects on multiple proteins' folding and aggregation behavior and how this pertains to multiple protein misfolding diseases.
Topics: Humans; Nucleic Acids; Protein Folding; Molecular Chaperones; Proteostasis Deficiencies
PubMed: 38477936
DOI: 10.1021/acschembio.3c00695
- The Protein Journal Oct 2020
Review
The protein folding problem has been extensively studied for decades, and hundreds of thousands of protein structures have been solved. Yet, how proteins fold from a linear peptide chain to their unique 3D structures is not fully understood. With key clues having emerged unexpectedly from the field of nanoscience, a "Confined Lowest Energy Fragment" (CLEF) hypothesis was proposed. The CLEF hypothesis states that a protein chain can be divided into CLEFs, the semi-independent folding units, by a small number of key residues that form key long-range interactions. The native structure of a CLEF is the lowest energy state under the constraints of the key long-range interactions, but the native structure of the whole protein is not necessarily the lowest energy state as Anfinsen's thermodynamic hypothesis suggested. The CLEF hypothesis proposes a unified CLEF mechanism for protein folding, basically a two-step process. In the first step, the favorable enthalpy of CLEFs for native structures quickly brings those residues for the key long-range interactions together, forming intermediates corresponding to the so-called hydrophobic collapse. In the second step, those collapsed key residues shuffle for the right combination to form the native key long-range interactions. The CLEF hypothesis provides a simple solution to all protein folding paradoxes, and proposes a "CLEF Age" or "Stone Age" for the prebiotic evolution of proteins.
Topics: Hydrophobic and Hydrophilic Interactions; Kinetics; Models, Molecular; Protein Folding; Proteins; Thermodynamics
PubMed: 33040262
DOI: 10.1007/s10930-020-09925-w
- Computers in Biology and Medicine Mar 2023
Protein folding is a complex physicochemical process whereby a polymer of amino acids samples numerous conformations in its unfolded state before settling on an essentially unique native three-dimensional (3D) structure. To understand this process, several theoretical studies have used a set of 3D structures, identified different structural parameters, and analyzed their relationships using the natural logarithmic protein folding rate (ln(k)). Unfortunately, these structural parameters are specific to a small set of proteins that are not capable of accurately predicting ln(k) for both two-state (TS) and non-two-state (NTS) proteins. To overcome the limitations of the statistical approach, a few machine learning (ML)-based models have been proposed using limited training data. However, none of these methods can explain plausible folding mechanisms. In this study, we evaluated the predictive capabilities of ten different ML algorithms using eight different structural parameters and five different network centrality measures based on newly constructed datasets. In comparison to the other nine regressors, support vector machine was found to be the most appropriate for predicting ln(k) with mean absolute differences of 1.856, 1.55, and 1.745 for the TS, NTS, and combined datasets, respectively. Furthermore, combining structural parameters and network centrality measures improves the prediction performance compared to individual parameters, indicating that multiple factors are involved in the folding process.
Topics: Protein Folding; Proteins; Algorithms; Amino Acids; Models, Theoretical
PubMed: 36848800
DOI: 10.1016/j.compbiomed.2022.106436