Mouse geneticists get a boost with the release of an encyclopedia containing more than 360,000 genetic sequences.
The laboratory mouse, small and easy to breed, has long been biologists' favorite model organism for studying mammalian development. Now, a multi-laboratory project with collaboration from the Howard Hughes Medical Institute has produced a free, publicly accessible catalog of mouse gene fragments that will help ensure that the mouse remains an important model in the genomic era.
The catalog, a collection of more than 360,000 gene fragments, is helping scientists get a handle on particular genes that they want to study. The database should also be a valuable tool for interpreting and comparing the genome sequences of mouse and human as vast stretches of chromosome sequence are cranked out in the months and years ahead.
Marco Marra of the Washington University Genome Sequencing Center in St. Louis led a team of 42 scientists who surveyed the genes of the mouse, hoping to snare many genes that are important during development. In the February 1999 issue of the journal Nature Genetics, the researchers describe how they generated their large database of expressed sequence tags (ESTs).
ESTs provide quick access to genes. They are fragments of sequences from messenger RNA molecules, the molecules that act as intermediaries between genes and proteins.
Since the array of messenger RNA molecules in a cell varies according to the type of cell and its developmental stage, the biologists generated ESTs from a broad range of cell types to get the largest possible sample of genes. The ESTs arose from various adult mouse organs as well as from mice in the earliest stages of development.
"The mouse provides a very powerful mammalian model system," Marra points out. "If we want to understand the function of a human gene, we can cross reference to the mouse sequence and identify the mouse gene, which can be the starting point for elucidating the biological function of that sequence. The ESTs are an entry point."
Scientists can also "knock out," or inactivate, a gene in mice to try to determine its function. Today, researchers routinely create knock-outs in yeast and in nematode worms, model organisms whose genomes are now thoroughly sequenced. With the new mouse EST encyclopedia, such studies are likely to become more common in mice.
The scientists have released their data over the Internet as it has been generated during the last three years. The mouse ESTs Marra and his colleagues have produced represent 93 percent of all mouse ESTs available in the public domain.
Researchers at Washington University, working with Marra, sequenced the gene fragments, analyzed them, and submitted the ESTs to public databases. Their colleagues at the University of Iowa and the Oklahoma Medical Research Foundation created the libraries of messenger RNA clones that were used for sequencing, and a team of scientists from the Lawrence Livermore National Laboratory in California distributed the clones.
Many scientists have not waited for a formal announcement to begin using the data, some of which has been available since early 1996. Marra says, "This is a useful resource not only for mouse biologists but for all biologists. It emphasizes the universal nature of DNA." Tools such as the mouse EST database permit scientists to look for related genes among a range of organisms. A gene can then be studied in parallel in different organisms, each of which may be suited for revealing different aspects of the gene's role.
Now that the EST database is well stocked, scientists at Washington University are trying to determine just how many genes the 360,000 ESTs represent. Since the mouse is thought to have fewer than 100,000 genes, many of the ESTs must represent portions of the same gene.
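To illustrate why counting genes from ESTs is a grouping problem, here is a minimal Python sketch that merges fragments sharing a short stretch of identical sequence and then counts the resulting groups. It is not the method used by the Washington University team, which the article does not describe. The function name `cluster_ests`, the parameter `k`, and the two example "genes" are invented for illustration, and the sketch ignores sequencing errors, reverse-complement reads, and closely related genes, all of which real EST clustering has to handle.

```python
from collections import defaultdict


def cluster_ests(ests, k=8):
    """Toy clustering: ESTs that share any k-letter substring are merged.

    Returns a list of clusters, each a list of indices into `ests`.
    Hypothetical helper for illustration only.
    """
    parent = list(range(len(ests)))  # union-find forest, one node per EST

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first EST in which it appeared
    for idx, seq in enumerate(ests):
        for pos in range(len(seq) - k + 1):
            kmer = seq[pos:pos + k]
            if kmer in first_seen:
                root_a, root_b = find(first_seen[kmer]), find(idx)
                if root_a != root_b:
                    parent[root_b] = root_a  # merge the two clusters
            else:
                first_seen[kmer] = idx

    clusters = defaultdict(list)
    for idx in range(len(ests)):
        clusters[find(idx)].append(idx)
    return list(clusters.values())


# Two made-up "genes", each sampled as two overlapping fragments.
gene_a = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
gene_b = "TTGACCGGTAAACCCGGGTTTAAACGCGCGA"
ests = [gene_a[0:20], gene_a[10:31], gene_b[0:18], gene_b[8:31]]

clusters = cluster_ests(ests)
print(len(ests), "ESTs fall into", len(clusters), "clusters")
```

In this toy case four fragments collapse into two clusters, one per invented gene. The number of clusters is only an estimate of the number of genes: fragments from opposite ends of a long gene may share no sequence and land in separate clusters, while related genes may be merged.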
Though ESTs are versatile tools, helping to map genes onto chromosomes and revealing disease gene candidates, for example, they do not provide the whole picture. "What you need in the end is genome sequence," Marra says. "But when you have genome sequence, ESTs aid massively in analyzing the structure and organization of the genome. Having the two together is more powerful than having either alone."