Experiences Applying Metadata to Bioinformatics

Terence Critchlow, Tom Slezak
Center for Applied Scientific Computing
Lawrence Livermore National Laboratory
P.O. Box 808, L-561, Livermore, CA 94551
{critchlow, slezak}@llnl.gov

Ron Musick*
iKuni, Inc
Palo Alto, CA 94304
musick@ikuni.com

Abstract

Bioinformatics is facing the daunting challenge of providing geneticists and biologists effective, efficient access to data currently distributed among dynamic, heterogeneous data sources. Complicating the problem is the speed at which the underlying science and technology evolve, leaving the terminology, databases, and interfaces to catch up. As the genomics community moves from sequences to functional genomics, the pressure to find a solution is increasing. Realistically addressing this problem, whether through a data warehouse, multi-database, federated database, or other approach, requires development of a scalable, flexible infrastructure that can quickly adapt to meet user needs in this extremely dynamic environment. This is best accomplished by extensively using metadata to reduce the application's maintenance costs. Using the DataFoundry project as an example, this paper discusses the first steps and practical problems of developing a metadata-based infrastructure capable of meeting the demands of an active scientific community. It also demonstrates how much bioinformatics must still progress before it can truly satisfy its users.

Appeared in

Information Sciences, Volume 139 (1-2), November 2001.

Look at the Paper (pdf.gz)

Companion paper appeared in

Fifth Joint Conference on Information Sciences, Vol. 2, February 2000, Atlantic City, NJ.

Look at the Paper (ps.gz, pdf.gz)