
Egnor is trying to better understand the elements of these mouse whispers. “When you look at mouse vocalizations, there appears to be some acoustic structure to them. What is it for?” she says.
To find out, she is collaborating with another Janelia fellow, Elena Rivas, who is starting to process the communications using a statistical analysis tool called a hidden Markov model. The software is similar to what Sean Eddy uses to compare millions of DNA strands.
“The cool thing about hidden Markov models is, you can tell them ‘Look, here's what I think are good examples of what I want you to characterize. Learn them, and then I'm going to give you unlabeled vocalizations and I want you to see which match and which don't,’” Egnor says.
Rivas has reworked a standard protein analysis program called HMMER3 to handle Egnor's data, which comes in one-terabyte chunks. “The core of the programs is identical,” Rivas says. “We're going to try to determine the types of [mouse] vocalizations and try to model each one.”
Once those models, or “families,” are delineated, researchers can then test new mouse vocalizations against these templates. “Then we'll try to catalog everything the mouse says,” Rivas explains. The analysis takes the computing cluster only a few moments to run.
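For readers who want to see the idea in miniature, here is a rough sketch of how scoring a vocalization against competing "family" models might look. It is not Rivas's code and not HMMER3, which uses far richer profile models. Everything in it is an assumption made for illustration: the two family names, the hand-set probabilities (in practice these would be learned from labeled examples), and the reduction of a vocalization to a string of 0s (low-pitched syllables) and 1s (high-pitched syllables).

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    Assumes every observation has nonzero probability under the model;
    a zero would make math.log raise a ValueError.
    """
    n = len(start)
    # Start probabilities times the emission probability of the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    total = sum(alpha)
    log_p = math.log(total)
    alpha = [a / total for a in alpha]  # rescale to avoid underflow
    for symbol in obs[1:]:
        # Sum over all previous states, then emit the current symbol.
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
        total = sum(alpha)
        log_p += math.log(total)
        alpha = [a / total for a in alpha]
    return log_p

# Hypothetical two-state models. State 0 mostly emits symbol 0 (low),
# state 1 mostly emits symbol 1 (high). The families differ only in
# how readily they switch between states.
EMIT = [[0.9, 0.1], [0.1, 0.9]]
FAMILIES = {
    "trill": {"start": [0.5, 0.5], "trans": [[0.1, 0.9], [0.9, 0.1]], "emit": EMIT},
    "flat": {"start": [0.5, 0.5], "trans": [[0.9, 0.1], [0.1, 0.9]], "emit": EMIT},
}

def classify(obs):
    """Return the family whose model assigns obs the highest likelihood."""
    return max(
        FAMILIES,
        key=lambda name: log_likelihood(
            obs,
            FAMILIES[name]["start"],
            FAMILIES[name]["trans"],
            FAMILIES[name]["emit"],
        ),
    )

print(classify([0, 1, 0, 1, 0, 1]))  # alternating pitch
print(classify([0, 0, 0, 0, 0, 0]))  # steady pitch
```

A rapidly alternating call scores best against the model that switches states often, while a steady call scores best against the one that tends to stay put. Cataloging new vocalizations amounts to running this comparison across many such templates.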
“The beautiful thing about Janelia is that I stream that [information] to the data share, and Elena picks it up and starts working on it,” Egnor says.
Making that transfer possible is another hidden attribute of the Janelia research complex—its internal network. It's the pipeline that carries huge image or auditory files without clogging or slowing down the system. In the startup phase, that meant overbuilding the fiber infrastructure as much as possible and designing it to handle unpredictable loads through 10-gigabit ports. Janelia's network is fully meshed and runs at “line rate”—meaning that the 40-gigabit/second data-center backbone is available to every user at all times, rather than being designed to serve only a small percentage of researchers as they need it while the rest ponder their research or go to lunch.
The computing cluster communicates with the rest of the campus through 450 miles of fiber optic cable, operating at 1 gigabit/second to users' desktops.
The updated cluster also runs at an impressive 84 percent efficiency, based on the global standard, called the Linpack Benchmark, traditionally used to measure performance and rank top supercomputers. Right now, the Janelia system would rank roughly in the top 200 of existing computing clusters, says Ceric. Janelia plans to enter its cluster in the next edition of the Linpack ranking system this summer.
The installation's increased efficiency is also better for the planet, since it gobbles less electricity. The old cluster ran at 25 million operations per second per watt. The new one delivers 200 million operations per second per watt, eight times the work on the same amount of power. And it throws off less heat that ultimately must be air conditioned away. “It takes less power and we produce fewer BTUs,” says Cicerchia.
As Janelia researchers go about their day thinking up novel ways to explore neural networks, few contemplate the silicon marvel that quietly makes much of their work possible. But ask any of them to consider their research without the cluster and you quickly enter the realm of the unthinkable.
“In a single day at Janelia we can do something that would take 11 years on a single-processor workstation,” says Eddy. “We breathe CPU cycles like air.”