HHMI Bulletin
May '06
FEATURES: There's Gold In Those Archives

PAGE 5 OF 5

When Articles and Data Go AWOL
Jeremy Nathans: If you're counting on your published articles serving as a record of your research, he warns, think again.

Jeremy Nathans, an HHMI investigator at the Johns Hopkins University School of Medicine, remembers searching for an article he considers to be a landmark publication. Citation in hand, he figured the fastest way to find it would be a quick PubMed search to link to the original article, which appeared in the journal Nature in 1978. He found nothing. The article had been missed in the process of adding pre-computer-era articles to the PubMed database, which includes citations and abstracts for virtually all published biomedical literature.

Eventually, Nathans tracked down the article by contacting the author, who scanned the original print document and sent a grainy PDF file. But Nathans was still left with an uneasy feeling. Because scientists rely so heavily on PubMed searches, he reasoned, if an article doesn't appear there “it's as if it had never existed.” (Nature has since added that particular article to its electronic archive.)

Research results can also disappear when a journal relegates them to the ranks of “supplemental data” at publication. These data are available only online and do not always print out along with the main article. “A lot of us believe that the best way to store data is by publishing it,” says Nathans. “But now journals are telling us to put so much in supplemental data, and that gets divorced from the published article.”

“This issue of supplemental data is becoming bigger and bigger,” says Edwin Sequeira, policy coordinator for PubMed Central, an electronic complement to PubMed that offers free access to full-text journal articles at the National Library of Medicine. “I see it as an economic decision not to put all of the data into print, but I would argue that if the data are important enough to include at all, they are an integral part of an article and should be treated as such.”

Further, says Sequeira, not all journal publishers provide supplemental data when sending their articles for archiving. If a publisher goes out of business, there's no guarantee that those types of materials in its possession will survive. He thinks that as long as scientists are providing such supplemental materials, they should make sure the journals are supplying them to PubMed Central along with the article they complement.

Traditionally, publishers have relied on libraries to maintain long-term archives, but in the digital age that role is in transition. Librarians, publishers, and the scientific community are grappling with how libraries will continue to store published articles and their supplemental data.

One potential solution is now being explored by a consortium organized by Stanford University Libraries. The system, called LOCKSS, uses a Web crawler to collect newly published content from participating publishers. It then compares the content it has collected with the same content collected by other LOCKSS users and repairs any discrepancies. The system, initiated by a small team of librarians and engineers, provides a mechanism to guarantee libraries long-term access to complete content by storing copies of published data at all participating sites. If one site has a technical problem, data can be restored from any of the other sites. Some scientific publishers have begun to buy into the system, which is still in its infancy. To date, 80 major research libraries in the United States and 25 in Europe, as well as others scattered around the world, are participating.
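The compare-and-repair idea described above can be illustrated with a short sketch. This is not the LOCKSS protocol or its software; it is a simplified illustration that assumes a simple majority vote over content fingerprints. The site names, the sample content, and the helper functions `digest` and `audit_and_repair` are all hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of one archived copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: Dict[str, bytes]) -> List[str]:
    """Compare every site's copy and overwrite copies that disagree
    with the majority. Returns the names of the repaired sites.

    Assumption: a strict majority of sites hold an intact copy.
    """
    digests = {site: digest(data) for site, data in copies.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        raise ValueError("no majority; cannot decide which copy is intact")

    # Pick any site whose copy matches the majority fingerprint.
    good_site = next(s for s, d in digests.items() if d == majority_digest)

    repaired = []
    for site, d in digests.items():
        if d != majority_digest:
            copies[site] = copies[good_site]  # restore from an intact copy
            repaired.append(site)
    return repaired


# Hypothetical example: three sites hold the same file, one copy is damaged.
copies = {
    "library_a": b"Figure S1: raw measurements",
    "library_b": b"Figure S1: raw measurements",
    "library_c": b"Figure S1: raw measurem\x00nts",  # simulated damage
}
repaired = audit_and_repair(copies)
print(repaired)  # ['library_c']
```

Comparing fingerprints rather than full files keeps the check cheap, and the majority rule means a single damaged site cannot overwrite the intact copies held elsewhere.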

“If publishers go out of business their online resources can vanish,” says Michael Seadle, assistant director for information technology at Michigan State University and a LOCKSS user. “We want to make sure that scholars, 10 or 100 years from now, will still have access to this data. LOCKSS is a way to make sure that published information doesn't disappear, while respecting the publishers' copyright. It's a security policy for everyone.”

—Karyn Hede

Photo: Bill Denison


HHMI INVESTIGATOR

Jeremy Nathans
Related Links

AT HHMI

New Recipe for Discovery: An Online Blend of Worms, Flies, Yeast (03.10.06)
Making the Right Moves: A Guide to Lab Management

ON THE WEB

WormBase
WormAtlas
Mouse Genome Informatics
San Diego Supercomputer Center
Drosophila RNAi Screening Center at Harvard Medical School
The Explorer's Club
PubMed Central
LOCKSS
Scientific Sources Material

© 2012 Howard Hughes Medical Institute. A philanthropy serving society through biomedical research and science education.
4000 Jones Bridge Road, Chevy Chase, MD 20815-6789 | (301) 215-8500 | e-mail: webmaster@hhmi.org