Harnessing Evolution for the Study of Biological and Synthetic Molecules

Summary: David Liu develops and applies evolution-based methods to the discovery and study not only of molecules found in living systems but also of synthetic molecules and new chemical reactions.
Two of the most common challenges in chemistry and chemical biology are (1) how to control chemical reactivity and (2) how to discover molecules with desired properties from many possible structures. My laboratory has initiated a program to develop a new approach to these two challenges that differs fundamentally from the approaches most frequently taken by chemists. Our approach is based on nature's effective molarity-based control of reactivity and nature's evolution-based discovery of functional molecules, both of which offer advantages compared with traditional approaches to synthesis and discovery. Our group uses the principles of biological evolution in conjunction with synthetic organic chemistry and molecular biology to discover and study small molecules, macromolecules, and new chemical reactions.
DNA-Templated Organic Synthesis
We discovered that DNA-duplex formation exerts remarkable control over the effective molarity of DNA-linked reactants without requiring structural mimicry of the DNA backbone. As a result, DNA-templated organic synthesis (DTS) is a surprisingly general phenomenon that can direct a wide range of chemical reactions, including carbon-carbon bond-forming reactions and organometallic-coupling reactions, even if the structures of the reactants or products do not resemble the DNA backbone. For many DNA-templated reactions, products form efficiently even when reactive groups are separated by large distances on the template (long-distance DNA-templated synthesis). DTS is sufficiently sequence specific that a single DNA-linked template can react primarily with its sequence-programmed partner reagent in a single solution containing a 1,000-fold excess of nonpartner, sequence-mismatched reagents.
Since our initial findings, we have developed a suite of linker and purification strategies that have enabled us to translate DNA templates into small-molecule products of multistep DNA-templated syntheses. For example, a 5'-amine-terminated DNA template was sequence specifically translated into a non-natural tripeptide or a branched thioether product through three DNA-templated steps, with each step encoded by a different region of a 30-base DNA template. More recently, we used multistep DTS to translate DNA templates into N-acyloxazolidines and macrocyclic N-acyloxazolidines. To date, multistep DTS has been used to create more than a dozen classes of diverse, organic small-molecule structures.
In addition to exploring and expanding the synthetic capabilities of DTS, we have also shown that DTS enables new modes of controlling reactivity that are not possible using current synthetic approaches. For example, in a DNA-templated format, many starting materials can undergo multiple, otherwise incompatible, reaction types in a single solution to generate exclusively a set of sequence-programmed products. The analogous experiment in a traditional reaction format would generate an uncontrolled mixture of all possible products. We have used this reaction mode to diversify synthetic small-molecule libraries, using iterated branching reaction pathways in a single solution, in contrast to the more common diversification approach of using different building blocks in one type of reaction. We have also found that DTS enables heterocoupling reactions to take place efficiently between reactants that preferentially homocouple in a conventional synthesis format, and also allows multistep ordered small-molecule synthesis to take place in one pot between multiple reactants of comparable reactivity. For example, an orchestrated series of changes in template secondary structure was used to synthesize an ordered triolefin or an ordered tripeptide in a single solution in which all reactants are initially present. If combined under conventional synthesis conditions, these reactants would instead generate a vast mixture of predominantly non-ordered products.
Our other advances in this area include the development of two new template architectures that expand the synthetic capabilities of DTS by (1) allowing virtually any DNA-templated reaction to be encoded by any region of a DNA template and (2) enabling two reactions to take place on a single DNA template in one step. The use of both of these architectures together with more recently developed DNA-templated synthetic reactions proved crucial in the DNA-templated N-acyloxazolidines syntheses mentioned above. We also discovered the ability of a DNA template to induce stereoselectivity in a DNA-templated reaction that generates products unrelated to the DNA backbone, and we have traced the origins of this stereoselectivity to the macromolecular conformation of the templates. We used this stereoselectivity as a sensitive measure of the conditions under which the DNA templates can directly influence a reaction beyond simple modulation of the effective molarity of the reactants, and we found that even a small number of rotatable bonds abrogates observed template-induced effects.
In some cases it may not be possible or convenient to link reactants to oligonucleotides. To develop an alternative strategy for DNA-programmed synthesis that enables the participation of non-DNA-linked reagents, we developed DNA-templated functional group transformations that convert azide groups into amines, thiols, or carboxylic acids in a sequence-specific manner. These functional group transformations were used in conjunction with four non-DNA-tethered electrophilic reactants to convert four template-linked azides in a single solution into four sequence-programmed sulfonamide, carbamate, urea, and thiourea products.
We have also developed highly sensitive in vitro selections for DNA-linked synthetic small molecules (such as the products of DNA-templated library synthesis) with protein-binding affinity and specificity. These selections can be iterated to achieve enormous enrichments for functional DNA-linked synthetic small molecules.
Integrating many of the above concepts, we recently translated a library of 65 DNA templates into a pilot library of complex synthetic small-molecule macrocycles, using a "genetic code" that dictates which reactants are recruited by each 10-base coding sequence. The resulting library of DNA-linked macrocycles was selected for binding to a target protein, and the DNA templates encoding macrocycles with target protein affinity were amplified by PCR and characterized by DNA sequencing. A single template of the pilot library that encodes a synthetic macrocycle with affinity for the target protein was enriched in this manner. This work represents the translation, selection, and amplification of a library of DNA sequences that encode synthetic small molecules, rather than proteins. Encouraged by these developments, we and others are now applying this approach to small-molecule synthesis and discovery on libraries of much larger complexities and structural diversities. For example, we have translated a library of DNA templates into a library of up to 1,000 N-acyloxazolidine heterocycles, and we are analyzing the results of selection of this library for affinity to a wide variety of target proteins of biological interest.
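The idea of a "genetic code" for small molecules can be illustrated with a minimal sketch, assuming a template made of consecutive 10-base coding regions, each recruiting one reagent through a complementary oligonucleotide. The codon sequences and building-block names below are hypothetical placeholders, not the sequences or reagents used in the work described above.

```python
# Hypothetical codon table: every sequence and building-block name here is
# invented for illustration and does not come from the original library.
CODON_LENGTH = 10

GENETIC_CODE = {
    "ACGTACGTAC": "building_block_A",
    "TTGACCATGA": "building_block_B",
    "GGCATTCGAT": "building_block_C",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the sequence of the reagent-linked oligonucleotide that would
    hybridize to a given coding region."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def translate_template(template, code=GENETIC_CODE, codon_length=CODON_LENGTH):
    """Split a template into fixed-length coding regions and look up the
    building block each region is assumed to recruit."""
    if len(template) % codon_length != 0:
        raise ValueError("template length must be a multiple of the codon length")
    codons = [
        template[i:i + codon_length]
        for i in range(0, len(template), codon_length)
    ]
    return [code[codon] for codon in codons]


# A 30-base template with three coding regions, one per synthetic step.
example_template = "ACGTACGTAC" + "TTGACCATGA" + "GGCATTCGAT"
encoded_blocks = translate_template(example_template)
```

In this simplified picture, sequencing a template that survives selection is enough to read back which building blocks made up the selected molecule.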
We have begun to apply these principles to synthetic polymers in addition to small molecules. Based on the distance dependence of DNA-templated reductive amination and on the previous findings of David Lynn (Emory University) and his co-workers, we have translated DNA templates into synthetic sequence-defined peptide nucleic acid (PNA) polymers, using DNA-templated polymerization of PNA aldehydes. This polymerization proceeds with remarkable efficiency and excellent sequence specificity, and it can generate synthetic polymers of length similar to that of proteins and nucleic acids known to possess functional binding or catalytic properties. These findings are the basis of our ongoing efforts to evolve sequence-defined synthetic heteropolymers through processes of translation, selection, amplification, and diversification previously available only to natural biopolymers.
This novel approach to creating and discovering functional molecules offers significant advantages compared with existing methods. DNA-templated libraries of synthetic molecules can be subjected to true in vitro selections (as opposed to screens) for desired binding or catalytic activities, obviating the need to separate each library member spatially or to spend effort characterizing uninteresting molecules. Only minute quantities of material (~1,000 molecules of each different library member) are required for these selections because the information that directs each member's synthesis can be amplified by PCR; the syntheses and selections described above were typically executed on a nanomole to subfemtomole scale. The small amount of material required, coupled with the suitability of these molecules to undergo selection, in theory enables libraries of unprecedented complexity (much larger than the current total size of the CAS Registry synthetic structure database) to be generated and evaluated. In addition, the new modes of controlling reactivity enabled by DNA-templated synthesis may allow diverse regions of structure space to be explored more effectively than is possible using existing library creation strategies. Finally, the infrastructure requirements to perform library synthesis and evaluation in this format are modest compared with those of conventional approaches.
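To put the quoted quantity in perspective, the short calculation below converts roughly 1,000 molecules per library member into moles. The library size of one billion members is a hypothetical figure chosen only to illustrate the scaling, not a number from the work described here.

```python
# Avogadro constant (exact by SI definition), in molecules per mole.
AVOGADRO = 6.02214076e23


def molecules_to_moles(molecule_count):
    """Convert a count of molecules into an amount in moles."""
    return molecule_count / AVOGADRO


# About 1,000 molecules of each library member, as stated in the text.
per_member_mol = molecules_to_moles(1_000)  # about 1.66e-21 mol (1.66 zeptomoles)

# Hypothetical library size, used only to show how the total scales.
hypothetical_library_size = 10**9
total_mol = per_member_mol * hypothetical_library_size  # about 1.66e-12 mol
```

Under these assumptions, even a billion-member library would need only on the order of picomoles of total material.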
A New Approach to Reaction Discovery
Unique features of DNA-templated organic synthesis have also led to a new approach for the discovery of bond-forming chemical reactions. In contrast with traditional reaction discovery methods, our approach does not focus on a specific combination of substrates or on the formation of one type of product structure. Instead, we combine pools of many DNA-linked substrates in one solution and select all possible pairwise combinations of substrates simultaneously for bond-forming combinations in a single experiment. The identity of bond-forming reactant pairs is revealed by exposing DNA sequences that survive the in vitro selection to DNA microarrays containing sequences that represent every possible combination of substrates. Because the results of this reaction discovery selection can be amplified by PCR, we perform this process on a femtomole scale that is unprecedented for reaction discovery.
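The bookkeeping behind this selection can be sketched as follows, assuming each substrate in two pools carries a short identifying sequence tag and each pairwise combination is represented by the concatenation of the two tags. The substrate names, tags, and the lookup-table stand-in for the microarray are hypothetical simplifications, not the actual encoding scheme.

```python
from itertools import product

# Hypothetical substrate pools: names and 4-base tags are invented for
# illustration only.
pool_a = {"alkyne": "AAGC", "amine": "CTTG", "aldehyde": "GGAT"}
pool_b = {"alkene": "TCCA", "azide": "GATC"}


def build_pair_index(first_pool, second_pool):
    """Map the combined tag of every pairwise substrate combination to the
    pair it represents (a stand-in for the microarray layout)."""
    return {
        tag_a + tag_b: (name_a, name_b)
        for (name_a, tag_a), (name_b, tag_b) in product(
            first_pool.items(), second_pool.items()
        )
    }


def decode_survivors(surviving_sequences, pair_index):
    """Report which substrate pairs are represented among the sequences
    that survived selection, ignoring unrecognized sequences."""
    return sorted(
        {pair_index[seq] for seq in surviving_sequences if seq in pair_index}
    )


pair_index = build_pair_index(pool_a, pool_b)

# Pretend these sequences survived the selection for bond formation.
survivors = ["AAGCTCCA", "AAGCTCCA", "NNNNNNNN"]
hits = decode_survivors(survivors, pair_index)
```

Because every combination has its own entry in the index, a single pooled experiment can be read out for all pairs at once, which is the feature the approach relies on.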
We validated this approach to reaction discovery by "rediscovering" several known reactions mediated by transition metals or organic reagents. We have since used this system in a 96- and 168-reaction matrix format to discover several new transition metal-catalyzed bond-forming reactions that have been confirmed in a DNA-templated format. One of the discovered reactions, a carbon-carbon bond-forming macrocyclization between a simple alkyne and alkene mediated by catalytic quantities of Pd(II) in neutral water or mixed organic solvent at room temperature to form a macrocyclic trans-enone in high yield, has also been confirmed by extensive characterization in a non-DNA-templated, conventional synthesis format. Our exploration of this enone-forming reaction recently led to its successful use in an intermolecular (rather than macrocyclization) format, as well as the elucidation of a novel mechanism as the plausible basis of this transformation. This approach enables a broad and unbiased search of functional group space for new reactions at a rate of thousands of combinations of reactants and reaction conditions per two-day experiment.
The development of these new areas merges the creativity of the chemist with the powerful principles underlying the evolution of living systems.
Expanding the Scope of Protein and Nucleic Acid Evolution
We are also interested in developing and applying new methods for evolving biological macromolecules. We developed a new method for diversifying nucleic acid libraries by nonhomologous random recombination (NRR), and we used this method to evolve DNA aptamers with significantly higher affinities than those evolved using error-prone PCR under identical selection conditions. NRR has also proved to be a valuable strategy for minimizing a functional nucleic acid and for rapidly identifying structure-function relationships among evolved nucleic acids. We recently developed a modified version of NRR that enables protein evolution to access structures not previously accessible using existing protein evolution methods. The functional requirements of chorismate mutase, a natural protein enzyme, were explored in a broad and unbiased manner using protein NRR. Functional chorismate mutases emerging from protein NRR-diversified libraries included enzymes containing major rearrangements of secondary structural elements. We have also applied NRR to functionally dissect natural nucleic acids such as sRNA translational regulators, mRNA sequences involved in subcellular localization, or promoter elements that render gene expression dependent on specific cellular responses. In all three cases, NRR has been an efficient and comprehensive method for revealing essential and nonessential elements and for providing insights into the molecular requirements for these complex and important biological functions.
We have also developed methods for evolving functional RNA molecules in vivo from random RNA libraries. For example, from random RNA sequences expressed in yeast cells we evolved RNA sequences capable of activating gene transcription to a degree comparable to that of the most potent natural protein transcriptional activators, and we recently used a similar approach to evolve RNA-based gene silencers. We have used site-directed mutagenesis in conjunction with chemical studies to characterize the nature of these artificially evolved RNA-protein transcription factors. The ability of RNA to be efficiently engineered and evolved enabled us to develop a variant of the evolved RNA transcriptional activator that is dependent on the presence of a cell-permeable synthetic small molecule.
Our protein evolution efforts have focused on the evolution of inteins (protein-splicing domains) that are only active in the presence of a cell-permeable synthetic small molecule, the evolution of nucleases with tailor-made DNA-cleavage specificities, the evolution of protein-protein interfaces, and the evolution of nonribosomal peptide synthetases with altered activities. In all of these cases, we have linked protein activity both positively and negatively with the survival of a bacterial cell. Coupled positive and negative in vivo selections enabled us to evolve homing endonucleases with altered (rather than merely broadened) substrate specificity.
We have integrated a variety of directed evolution methods to evolve inteins that undergo protein splicing only in the presence of a cell-permeable synthetic small molecule (called 4-HT), and we have shown that these inteins enable the functions of several unrelated proteins to be controlled in living cells in a rapid, dose-dependent, and post-translational manner. Recently we demonstrated (in collaboration with Andrew McMahon's group, Harvard University) that these evolved inteins render the localization and activity of Gli1 and Gli3, two transcription factors involved in embryogenesis, dependent on 4-HT in living mammalian cells. Mammalian cells expressing evolved intein-inserted Gli1 differentiate into osteoblasts in a 4-HT-dependent manner, demonstrating that a complex biological signaling pathway can be regulated by a small molecule using the evolved intein. Small-molecule-activated inteins may serve as powerful and general tools for rapidly perturbing virtually any protein's activity in living cells with a degree of temporal control, spatial resolution, and dose dependence not achievable by intervening at the DNA or RNA levels. Although the use of evolved ligand-activated inteins requires genetic intervention, this approach does not require the discovery of a specific small-molecule modulator for each protein of interest and therefore may serve as a complementary approach to chemical genetics.
This research was funded in part by the National Institutes of Health, the National Science Foundation, the Office of Naval Research, the Arnold and Mabel Beckman Foundation, the Alfred P. Sloan Foundation, the Searle Scholars Program, the Camille and Henry Dreyfus Foundation, and the American Chemical Society.
Last updated: November 16, 2006