Data Management Resources

Following the National Science Foundation (NSF) principle that data collected with public funds should be publicly available, the Directorate for Social, Behavioral, & Economic Sciences (SBE) data archiving policy requires that “grantees from all fields will develop and submit specific plans to share materials collected with NSF support, except where this is inappropriate or impossible.” The new Proposal and Award Policies and Procedures Guide now includes guidelines for the preparation of a data management plan that must supplement all NSF proposals starting January 18, 2011. A list of data management and sharing Frequently Asked Questions is also available from NSF, while an overview of specific practices can be found in ARC’s one-page primer on database management and in the first set of resources listed below. The second set of listed resources provides information on emerging opportunities, challenges, and policies.

Archiving Data

Inter-University Consortium for Political and Social Research (ICPSR)
ICPSR is an international consortium of about 700 academic institutions and research organizations that maintains a data archive of more than 500,000 files of research in the social sciences. The Consortium also provides training in data access, curation, and methods of analysis for the social science research community. ICPSR recently hosted a webinar on data management plans, and its website includes a framework and resources for researchers as well as a comprehensive Guide to Social Science Data Preparation and Archiving: Best Practice Throughout the Data Life Cycle (4th ed.).

Data Documentation Initiative (DDI)
The DDI Alliance of over 30 member organizations worldwide was founded by ICPSR in order to create international standards for describing social science data across the research lifecycle, including data conceptualization, collection, processing, distribution, analysis, and archiving. Using an XML standard, the DDI promotes interoperability, uniform variable structure, online analysis, and precise searching. The DDI website includes pages on Getting Started with DDI and DDI Best Practices.

The Dataverse Network Project
The Institute for Quantitative Social Science (IQSS) at Harvard University has made it possible through the Dataverse Network project to store and access data on a remote server. Specifically, “the Dataverse Network is an application to publish, share, reference, extract and analyze research data. It facilitates making data available to others, and allows replication of others work. Researchers and data authors get credit, publishers and distributors get credit, affiliated institutions get credit.” This free resource allows researchers to publish data and share it publicly or just with project collaborators.

UK Digital Curation Centre (DCC)
The DCC is the UK’s leading centre of expertise in digital data curation, offering expert advice and practical help for anyone who has an obligation to store, manage and protect digital data. The DCC Curation Reference Manual contains advice, in-depth information and criticism on current digital curation techniques and best practice.  It is an ongoing, community-driven project, which involves members of the DCC community suggesting topics, authoring manual installments and conducting peer reviews.

Emerging Issues

National Science Foundation
In addition to the guidelines offered in the data archiving policy, NSF provides a framework for meeting the challenges of data access and management in its 2007 report Cyberinfrastructure Vision for 21st Century Discovery. An earlier report by the National Science Board provides findings and recommendations from an analysis of the policy issues relevant to long-lived digital data collections. Other places to find resources for data management and archiving include:

National Academy of Sciences (NAS)
The National Research Council (NRC) report on Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age examines the consequences of the changes affecting research data with respect to three issues: integrity, accessibility, and stewardship. For each of these issues, the authoring committee has developed a fundamental principle that applies in all fields of research regardless of the pace or nature of technological change. The report then explores the implications of these three central principles for the various components of the research enterprise.

National Science and Technology Council (NSTC)
Empowered by an array of new digital technologies, science in the 21st century will be conducted in a fully digital world. In this world, the power of digital information to catalyze progress is limited only by the power of the human mind. The NSTC report Harnessing the Power of Digital Data for Science and Society provides strategies to ensure that digital scientific data can be reliably preserved for maximum use in catalyzing progress in science and society.

Council on Governmental Relations (COGR)
Few institutions have formal policies and procedures for access to and retention of research data. The COGR guide on Access to and Retention of Research Data: Rights and Responsibilities and its component case studies are intended to help institutions recognize situations where roles or policies need to be clarified, identify issues that may need to be addressed, and review options for defining responsibilities with respect to access to and retention of research data.