Although originally a scientific curiosity, protein folding has become of primary medical importance, given that incorrect folding of proteins is responsible for many devastating diseases. To understand protein folding, one needs to know, on the one hand, how a protein chain finds its unique native fold within a plethora of alternatives and, on the other hand, how to predict this unique folding from the amino acid sequence of the protein chain.
The first problem basically has been solved for single-domain proteins, partly due to the work of my group. Specifically, we suggested a theory that solves the famous "Levinthal paradox" of protein folding. This paradox stresses that, among the myriad potential protein folds, the most stable cannot be found even by an exhaustive enumeration of the chain structures within any reasonable time frame (it would take longer than the lifetime of the universe). Therefore, for a long time it was thought that the amino acid sequence must encode not only the native fold but also the pathway to it; the goal of the researcher, therefore, should be to identify the specific pathway leading to the native structure of a protein. However, we put forth a theory that any chain fold that is much more stable than its competitors (i.e., separated from them by a significant "energy gap," as it is currently called) is automatically a focus for rapid folding pathways. When a chain's amino acid sequence provides such a gap, it folds via a first-order phase transition; a nucleation process typical of these transitions occurs rapidly and unambiguously brings this chain into its native fold. As a result, we developed a general theory of protein folding, which paved the way to specific theories about and algorithms concerning protein folding rates and protein folding nuclei.
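The scale of the Levinthal estimate can be checked with a few lines of arithmetic. The sketch below is illustrative only: the chain length of 100 residues, the three conformations per residue, and the one picosecond spent per conformation are conventional textbook assumptions, not figures taken from this summary.

```python
import math

# Assumed constants (illustrative, not from the text above).
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 1.38e10


def levinthal_search_time_years(n_residues, states_per_residue=3,
                                seconds_per_state=1e-12):
    """Time, in years, to enumerate every chain conformation once.

    Works in log10 space so that long chains do not overflow a float.
    """
    log10_conformations = n_residues * math.log10(states_per_residue)
    log10_seconds = log10_conformations + math.log10(seconds_per_state)
    return 10 ** (log10_seconds - math.log10(SECONDS_PER_YEAR))


search_years = levinthal_search_time_years(100)
print(f"Exhaustive search: about {search_years:.1e} years")
print(f"Ratio to the age of the universe: {search_years / AGE_OF_UNIVERSE_YEARS:.1e}")
```

Under these assumptions the exhaustive search takes many orders of magnitude longer than the age of the universe, which is why a mechanism that avoids enumeration, such as nucleation, is needed to account for observed folding times.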
The simple nucleation mechanism that we and others have studied so far mostly pertained to small proteins exhibiting two-state folding kinetics. For larger proteins, however, molten globule–like folding intermediates are often observed. The presence of folding intermediates complicates both the proteins' folding and the underlying theory. Thus, folding of larger proteins, especially of oligomeric proteins, as well as folding in a complicated environment remains a problem that has not been solved and that we now wish to address. The molten globule state, the third physical state of the protein molecule, was theoretically predicted, experimentally discovered, and characterized in detail in our laboratory (headed at that time by Oleg B. Ptitsyn). This discovery enabled us to understand the physical nature of protein denaturation. Later, we discovered and investigated a folding-intermediate state ("pre-molten globule") that is distinct from the molten globule state. We proposed and experimentally demonstrated that the molten and pre-molten globule states are involved in some physiological processes and may even be involved in some genetic diseases. We also showed that the molten globule state can be formed under mild denaturing conditions in the cell, such as during protein-membrane interactions.
We and others have demonstrated a molten globule–like folding intermediate in apomyoglobin, which is an object of our current study. This intermediate (I) rapidly forms from the unfolded state (U) and then undergoes a slow transition to the native state (N). We are trying to determine whether the same or different residues are involved in folding nuclei of I↔N and U↔N transitions. However, we did not find conditions for pure I↔N and U↔N transitions; the admixture of the third state is always present, which presents an obstacle for conventional (Fersht's) Φ-analysis (used to outline the folding nucleus experimentally). To overcome this, we developed a method to estimate the admixture of transient intermediates in the U↔N transition from the amplitude of the burst U↔I transition, whose rate is too high to be measured by a stopped-flow technique. This new technique allows us to dissect a complicated U↔I↔N transition, separating out the rates of I↔N and U↔N transitions. Thus, we can now return to the question of whether the same or different residues are involved in nucleation of I↔N and U↔N transitions, a task that is in progress.
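For readers unfamiliar with Φ-analysis, the following minimal sketch shows the conventional two-state calculation: the mutation-induced change in the folding activation free energy divided by the change in the overall stability. It does not implement the burst-phase correction described above; the function name, the rate constants, and the stability change are hypothetical values chosen for illustration.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ/(mol*K)


def phi_value(kf_wild_type, kf_mutant, ddg_equilibrium_kj, temperature_k=298.15):
    """Conventional two-state Phi value for one point mutation.

    kf_wild_type, kf_mutant: folding rate constants (same units).
    ddg_equilibrium_kj: change in the N-minus-U free energy difference caused
        by the mutation (mutant minus wild type; positive if destabilizing).
    """
    # Change in the free energy of the transition state relative to U.
    ddg_transition_state = (-R_KJ_PER_MOL_K * temperature_k
                            * math.log(kf_mutant / kf_wild_type))
    return ddg_transition_state / ddg_equilibrium_kj


# Hypothetical mutation: folding slows tenfold, stability drops by 5.7 kJ/mol.
phi = phi_value(kf_wild_type=100.0, kf_mutant=10.0, ddg_equilibrium_kj=5.7)
print(f"Phi = {phi:.2f}")
```

A Φ value near 1 suggests that the mutated residue is as structured in the transition state as in the native state (that is, it belongs to the folding nucleus), whereas a value near 0 suggests that it is still unfolded there. The calculation assumes a pure two-state transition, which is exactly the condition that the admixture of a third state violates.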
Folding of oligomeric proteins is a comparatively new field of investigation. It is known that their folding is a complicated multistate process, with various monomeric and oligomeric folding intermediates. Recent studies, including some of our own, of equilibrium unfolding of the oligomeric human chaperonin 10 have shown that urea unfolds the oligomer's subunits without oligomer disruption, whereas guanidine hydrochloride disrupts the oligomer and unfolds subunits simultaneously. Kinetic studies have shown that formation of non-native oligomer is slow, while subsequent folding of subunits within the oligomer is a fast two-state process. We are now addressing the following questions: the nature of the conformational state(s) of monomers that is a starting point for their unique oligomerization, and whether the starting point depends on the number of subunits and on their internal structure. To answer these questions, we are studying folding and unfolding of oligomers with various numbers of subunits, molecular weights, and internal (single- and multidomain) structures.
We propose to study the following proteins: (1) isopropylmalate dehydrogenase from Thermus thermophilus (consisting of two 37-kDa two-domain subunits); (2) GroES co-chaperonin from Escherichia coli (consisting of seven 10-kDa single-domain subunits); and (3) GroEL chaperonin from E. coli (consisting of fourteen 60-kDa three-domain subunits). The three-dimensional structures of these proteins are known, but investigation of their folding has only begun. Preliminary studies show that all these proteins fold from a urea-unfolded state, yielding almost 100 percent of native oligomers. However, they have different folding mechanisms: GroES unfolds and refolds reversibly in a two-state manner; GroEL, a larger and more complicated protein, has a native-like monomeric folding intermediate, which yields oligomers only in the presence of ATP and ADP.
The other key problem, how to predict the unique native fold from the amino acid sequence, has not been solved with the accuracy necessary for biological applications except when the three-dimensional structure of a protein can be copied from that of the known structure of a homologue. The underlying reason is the insufficient accuracy of force field potentials used to predict protein structures in a water environment. Therefore, the second aim of this project is to develop new estimates of force field potentials for protein structure prediction and drug design. To increase the accuracy of force field estimates, we plan to extract them from the solubility of molecular crystals instead of extracting them partly from crystal sublimation and partly from vapor solubility, as is done now. Moreover, the new force field potentials will cover multiatom interactions in addition to the currently used pairwise ones. We will also use feedback from predictions and experiments, of great biological interest in themselves, to improve search algorithms and estimates of force field potentials.
Last updated August 2009