Ad hoc Query Support for Very Large Simulation Mesh Data: the Metadata Approach

Byung S. Lee, Robert R. Snapp
Department of Computer Science
University of Vermont
{bslee, snap}@cs.uvm.edu

Terence Critchlow
Center for Applied Scientific Computing
Lawrence Livermore National Laboratory
P.O. Box 808, L-561, Livermore, CA 94551
critchlow@llnl.gov

Ron Musick*
iKuni, Inc
Palo Alto, CA 94304
musick@ikuni.com

Abstract

We present our approach to enabling approximate ad hoc queries on terabyte-scale mesh data generated from large scientific simulations through the extension and integration of database, statistical, and data mining techniques. There are several significant barriers to overcome in achieving this objective. First, large-scale simulation data is already at the multi-terabyte scale and growing quickly, which renders traditional forms of interactive data exploration and query processing untenable. Second, a priori knowledge of user queries is not available, making it impossible to tune special-purpose solutions. Third, the data has spatial and temporal aspects, as well as arbitrarily high dimensionality, which complicates the task of finding compact, accurate, and easy-to-compute data models.

Our approach is to preprocess the mesh data to generate highly compressed, lossy models that are used in lieu of the original data to answer users' queries. This approach leads to several interesting challenges. First, the model (equivalently, the content-oriented metadata) being generated must be smaller than the original data by at least an order of magnitude. Second, the metadata must contain enough information to support a broad class of queries. Finally, the accuracy and speed of the queries must be within the tolerances required by users. In this paper we give an overview of ongoing development efforts, with an emphasis on extracting metadata and using it in query processing.
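
As a rough illustration of answering a query from metadata instead of the original data, the following minimal sketch summarizes a one-dimensional array of cell values into per-block (count, min, max) records and then bounds the answer to a "how many cells exceed a threshold" query using only those records. This is not the models developed in the paper; the names `BlockSummary`, `summarize`, `count_above`, and the `block_size` parameter are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class BlockSummary:
    """Hypothetical lossy metadata kept for one block of mesh cells."""
    count: int
    lo: float
    hi: float


def summarize(values: Sequence[float], block_size: int) -> List[BlockSummary]:
    """Preprocess: replace each block of raw values with a small summary."""
    summaries = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        summaries.append(BlockSummary(len(block), min(block), max(block)))
    return summaries


def count_above(summaries: List[BlockSummary], threshold: float) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the number of cells > threshold,
    computed from the summaries alone."""
    lower = upper = 0
    for s in summaries:
        if s.lo > threshold:
            # Every cell in the block qualifies.
            lower += s.count
            upper += s.count
        elif s.hi > threshold:
            # Mixed block: the maximum qualifies, the minimum does not.
            lower += 1
            upper += s.count - 1
        # Otherwise no cell in the block qualifies.
    return lower, upper


values = [1, 2, 3, 4, 10, 11, 12, 13, 5, 9, 2, 8]
summaries = summarize(values, block_size=4)
bounds = count_above(summaries, threshold=6)
```

In this sketch, a larger `block_size` gives smaller metadata but wider bounds, which mirrors the trade-off between model size and query accuracy described above.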

Best Paper Nominee at the Brazilian Symposium on Databases, Rio de Janeiro, Brazil, October 2001.

Look at the Paper (pdf.gz)

Reprinted in Brazilian Computer Society, Vol. 8, No. 1, July 2002.

Look at the Paper (pdf.gz)

* Work done while the author was at
Center for Applied Scientific Computing
Lawrence Livermore National Laboratory
P.O. Box 808, L-561, Livermore, CA 94551