IPY Data and Information Service for Distributed Data Management
Project Description
Short Title: IPY DIS
Project URL: http://nsidc.org/ipydis/status.html
Proposal URL: http://classic.ipy.org/development/eoi/proposal-details.php?id=49
The WDC for Glaciology, Boulder, and the Electronic Geophysical Year (eGY), in collaboration with many others, propose to host the IPY DIS described in the IPY Framework Document. The DIS will work closely with the Data Policy and Management Sub-Committee (Data Committee) and other data management bodies and observing networks to develop the IPY data and information policy and strategy. The DIS will then be the primary implementer of that strategy and policy, recognizing that the strategy will need to evolve with the science needs and developments of IPY.
Although much will depend on the strategy that is developed, we envision the DIS as an overall data management consultant and coordinator and as a central data portal for an internationally distributed data management system. The DIS will continue to establish close partnerships with data centers and organizations around the world to build on existing systems. We will also work with each specific IPY cluster to ensure appropriate centralized data description and distributed archiving. Regional or discipline-specific affinity centers coordinated by the DIS will facilitate appropriate data description and archiving. For example, the Frozen Ground Data Center at the WDC, Boulder, is working closely with the permafrost cluster, while the proposed Arctic Peoples Observations Center (EoI 358) could coordinate community-based monitoring data. Other potential affinity centers based on our current partners could include Russian data, Chinese data, data for education and outreach, remote sensing data, geospatial data infrastructures (regional and global), paleoenvironmental data, marine biological data, bibliographic data, and others (a detailed spreadsheet is available on request). Many of these affinity centers will likely create their own means of access to the data. It is unrealistic for a central DIS to be the single or even primary means of access, but we would like to establish a means to automatically share metadata across the system through a common (perhaps XML-based) framework.
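As a rough illustration of what automated metadata sharing through a common XML-based framework might look like, the following Python sketch serializes a catalog record to XML at an affinity center and merges harvested records at a central portal. The element names (id, title, affinity_center, archive_url), the function names, and the example URL are all hypothetical; they are assumptions made for illustration and are not part of any IPY or DIS specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical field set for a catalog record; a real framework would
# follow an agreed metadata standard rather than this ad hoc list.
FIELDS = ("id", "title", "affinity_center", "archive_url")


def record_to_xml(record: dict) -> str:
    """Serialize one catalog record to a small XML document (affinity center side)."""
    root = ET.Element("record")
    for field in FIELDS:
        child = ET.SubElement(root, field)
        child.text = record[field]
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> dict:
    """Parse a record produced by record_to_xml back into a dict (portal side)."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def merge_catalogs(*feeds) -> dict:
    """Merge XML records harvested from several centers, keyed by record id.

    The data themselves stay in the distributed archives; only the
    descriptive metadata is collected centrally.
    """
    catalog = {}
    for feed in feeds:
        for xml_text in feed:
            record = xml_to_record(xml_text)
            catalog[record["id"]] = record
    return catalog
```

In this sketch each affinity center would publish a list of such XML records, and the central portal would call merge_catalogs on the lists it harvests, leaving data access to the centers themselves.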
Specific activities of the DIS could include:
-Collection (automated, where possible) of catalog metadata for all IPY projects and provision of Web-based portals to all IPY data archived around the world.
-Examination and implementation of data discovery tools and data presentation schemes such as an interoperable web-based map server to enhance data access through a Web portal (could include a locator map for all IPY projects).
-Identification of existing tools to facilitate data management, and building on those to meet the needs of the IPY community. For example, the Global Change Master Directory's (GCMD) metadata authoring tool, docBuilder, could be customized.
-Serving as a focal point for cross-disciplinary data integration, especially across the natural and social sciences.
-Creation of appropriate management tools for non-numerical data such as interview transcripts, photographs, and videotapes.
-Collaboration with eGY to make data management best practices and principles available to researchers and agencies, via Web pages, workshops, and other channels.
-Acting as a clearinghouse and facilitator for data management and integration issues that need research, discussion, and resolution.
-Working with eGY to increase awareness of the value of data management for both numerical and non-numerical data.
-Provision of responsive service to the IPY research community regarding data management.
The DIS will take advantage of existing data management infrastructures, organizations, and technologies such as National and World Data Centers, the Joint Committee for Antarctic Data Management (JCADM), the GCMD, virtual observatories, and the Global Spatial Data Infrastructure. This distributed system will allow for appropriate management of the various types of data including social science and physical science data, and analog collections. The distributed nature of the system will also encourage development of new and experimental data access methods, including data mining technology and innovative data presentation methods that facilitate data integration.
It is essential, however, to ensure ready and equitable access to the data and their effective long-term preservation. The DIS will assist distributed archives in adhering to sound data management principles and best practices as defined by the Data Committee, eGY, CODATA, WCRP CliC, JCADM, our partners, and other entities. We will ensure that these principles build upon existing international standards such as the Open Archival Information System Reference Model and the ISO 19115 metadata standard. The DIS will take advantage of emerging shared resources in the geosciences community, such as the effort to develop an international geophysical sample number (IGSN). In addition, the DIS could assist data providers in addressing human subjects protections and confidentiality issues for social science data and for other types of geo-referenced data.