Abstract:
Delivering NASA Earth Observing System (EOS) Data with Digital Content Repository Technology: A Software System to Promote Best Practice in Digital Provenance and Effective Access of Content and Associated Metadata
Digital content, including Earth Science observations and model output, is an essential part of contemporary scientific research activities. Not only is the rate of archiving for such ... content increasing rapidly, but there is also an increase in derived and on-demand data product creation and consumption. As a result of these trends, scientific digital content has become even more heterogeneous in format and more distributed across the Internet. In turn, this makes the content more difficult for providers to manage and preserve and for users to locate, understand, and consume. Specifically, it is increasingly difficult to deliver relevant metadata and data processing lineage information along with the actual content, particularly when there are multiple ways of delivering the content, including the increasing use of web services. Readme files, data quality information, production provenance, and other descriptive metadata are often separated at the storage level as well as in the data search and retrieval interfaces available to a user. Critical archival metadata, such as audit trails and integrity checks, are often even more difficult for users to access, if they exist at all.
We propose to address these challenges by using and extending the capabilities of a contemporary digital object repository to work for science data and metadata delivery.
Digital repository technology has been used for digital libraries with great success, and we believe it can also be applied to the more complex needs of Earth Science data management. We will demonstrate this capability in two contexts: an existing modeling and synthesis data center project for the North American Carbon Program (NACP) as the primary science context, and one of the more complex data projects for the ORNL Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC) as a second context.
There are three high-level objectives in this project:
(1) Demonstrate the applicability of digital object repository technology to science data. Based on our preliminary work, we expect to couple the Fedora Repository and a Drupal-based Graphical User Interface (GUI) as key elements of a next-generation NASA Earth system science data center infrastructure, using datasets collected as part of the NACP Modeling and Synthesis Thematic Data Center (MAST-DC) as the primary science context.
(2) Use this implementation to enable better and more consistent access to critical metadata, including processing lineage information and administrative metadata, using the capabilities inherent in a digital repository (multiple streams for a given object and remote data streams). The enhanced metadata access ensures that science digital content becomes more transparent to the end user, with provenance and quality control information readily available.
(3) Demonstrate how data providers can more easily and effectively manage science data sets, associated metadata, processing lineage, and quality control/data provenance information. A consistent process, with associated user interfaces and application programming interfaces (APIs), can be used by the data provider to ingest, update, and modify a dataset for metadata changes or additional content dissemination avenues. Particularly in the context of the ORNL DAAC data, this work will demonstrate potential technology migration paths for existing data operations.
Originators:
Jerry Pan, Christopher Lenhardt, Biva Shrestha, Yaxing Wei, Robert Cook, Giri Palanisamy, Bruce Wilson
Title:
ESDORA: A Digital Repository System Facilitating Data Preservation, Provenance, Discovery, and Access
Release_Date:
2010
Provider:
Oak Ridge National Laboratory
URL:
http://esdora2.ornl.gov/
Name:
JERRY
YUN
PAN
Phone:
678-770-3721
Fax:
865-241-3685
Email:
pany at ornl.gov
Contact Address:
Environmental Sciences Division
Oak Ridge National Laboratory
City:
Oak Ridge
Province or State:
TN
Postal Code:
37831-6407
Country:
USA
Distribution Media
Distribution_Media:
Online
Fees:
No fees
Personnel
JERRY
YUN
PAN
Role:
TECHNICAL CONTACT
Phone:
678-770-3721
Fax:
865-241-3685
Email:
pany at ornl.gov
Contact Address:
Environmental Sciences Division
Oak Ridge National Laboratory
City:
Oak Ridge
Province or State:
TN
Postal Code:
37831-6407
Country:
USA
TYLER
B.
STEVENS
Role:
SERF AUTHOR
Phone:
(301) 614-6898
Fax:
301-614-5268
Email:
Tyler.B.Stevens at nasa.gov
Contact Address:
NASA Goddard Space Flight Center
Global Change Master Directory
City:
Greenbelt
Province or State:
MD
Postal Code:
20771
Country:
USA
Creation and Review Dates
SERF Creation Date:
2011-11-14
SERF Last Revision Date:
2011-11-15