• Data are numerical quantities or other factual attributes derived from observation, experiment or calculation.
- National Research Council, 1992a. "Setting priorities for space research: Opportunities and imperatives."
• Data are facts, numbers, letters, and symbols that describe an object, idea, condition, situation, or other factors. Data in a database may be characterized as predominantly word oriented (e.g., as in a text, bibliography, directory, dictionary), numeric (e.g., properties, statistics, experimental values), image (e.g., fixed or moving video, such as a film of microbes under magnification or time-lapse photography of a flower opening), or sound (e.g., a sound recording of a tornado or a fire)... Data can also be referred to as raw, processed, or verified.
- Committee for a Study on Promoting Access to Scientific and Technical Data for the Public Interest, National Research Council. A Question of Balance: Private Rights and the Public Interest in Scientific and Technical Databases (1999). Available at: http://www.nap.edu/openbook.php?record_id=9692&page=15
• The term "data" is used in this report to refer to any information that can be stored in digital form, including text, numbers, images, video or movies, audio, software, algorithms, equations, animations, models, simulations, etc. Such data may be generated by various means including observation, computation, or experiment.
- National Science Foundation (2005). Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century. P.9. Available at: http://www.nsf.gov/pubs/2005/nsb0540/nsb0540.pdf
• Research data, unlike other types of information, is collected, observed, or created, for purposes of analysis to produce original research results.
- University of Edinburgh. How to manage research data: Defining research data.
• In the context of these Principles and Guidelines [Principles and Guidelines for Access to Research Data from Public Funding], “research data” are defined as factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research, and that are commonly accepted in the scientific community as necessary to validate research findings.
- Organisation for Economic Co-operation and Development (OECD, 2007). OECD Principles and Guidelines for Access to Research Data from Public Funding. P.13. Available at: http://www.oecd.org/dataoecd/9/61/38500813.pdf
• Data set: A logically meaningful collection or grouping of similar or related data, usually assembled as a matter of record or for research, for example, the American FactFinder Data Sets provided online by the U.S. Census Bureau or the National Elevation Dataset available from the U.S. Geological Survey. Also spelled dataset.
- Online dictionary for library and information science (ODLIS). Available at: http://www.abc-clio.com/ODLIS/odlis_A.aspx.
• A research data set constitutes a systematic, partial representation of the subject being investigated.
- Organisation for Economic Co-operation and Development (OECD, 2007). Available at: http://www.oecd.org/dataoecd/9/61/38500813.pdf.
• DOE generates scientific research data in many forms, both text and non-text. Much of the Department's text-based R&D results are readily available via OSTI databases. OSTI has broadened efforts to make non-text scientific and technical information (STI) available as well, providing access to underlying non-text data such as numeric files, computer simulations and interactive maps, as well as multimedia and scientific images.
- Department of Energy (DOE). Available at: http://www.osti.gov/data/index.shtml
• Over the life course of a survey that results in a data set – from initial conceptualization to data publication and beyond – a huge amount of metadata is typically produced. These metadata can be recorded in DDI format and re-used as the data collection, processing, tabulation, and reporting/dissemination take place.
- Arofan Gregory, Open Data Foundation (2011). The Data Documentation Initiative (DDI): An Introduction for National Statistical Institutes. Available at: http://odaf.org/papers/DDI_Intro_forNSIs.pdf
Research data can be generated for different purposes and through different processes. According to the Research Information Network, it can include the following types of data:
Metadata and documentation are different things: documentation is meant to be read by humans, whereas some metadata is designed more for machine processing than for human readability. Metadata can nevertheless be regarded as a type of documentation. Create metadata for your research data and datasets throughout your research lifecycle so that the data can be preserved and understood in the long run.
1. Consider what information is needed for the data to be read and interpreted in the future.
2. Understand your funder requirements for data documentation and metadata. Funder requirements for NSF, GBMF, IMLS, NEH, NIH and NOAA can be found at https://dmptool.org/guidance.
3. Consult available metadata standards in your field. You may refer to Common Metadata Standards and Domain Specific Metadata Standards for details.
4. Describe data and datasets created in your research lifecycle, and use software programs and tools to assist in data documentation. Assign or capture administrative, descriptive, technical, structural and preservation metadata for the data. Some potential information to document:
5. Adopt a thesaurus in your field or compile a data dictionary for your dataset.
6. Obtain persistent identifiers (e.g., DOIs) for datasets if possible to ensure the data can be found in the future.
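As a rough illustration of steps 4 to 6, the sketch below shows one way a small metadata record and data dictionary might be assembled and saved alongside a dataset. It is not based on any particular metadata standard: the function names, the field names, the grouping of fields, and the DOI "10.1234/example" are all assumptions made for illustration. In practice, the fields should follow the standard used in your field and your funder's requirements.

```python
import json


def build_metadata_record(title, creator, created, file_format, identifier=None):
    """Group a few fields under metadata categories named in step 4.

    The field names here are hypothetical, not taken from a standard.
    """
    return {
        "descriptive": {"title": title, "creator": creator},
        "administrative": {"date_created": created, "identifier": identifier},
        "technical": {"file_format": file_format},
    }


def build_data_dictionary(variables):
    """Turn (name, description, unit) tuples into data dictionary entries."""
    return [
        {"name": name, "description": description, "unit": unit}
        for name, description, unit in variables
    ]


def doi_to_url(doi):
    """Build a resolvable link for a bare DOI via the https://doi.org/ resolver."""
    return "https://doi.org/" + doi.strip()


if __name__ == "__main__":
    # All values below are placeholders, not a real dataset or DOI.
    record = build_metadata_record(
        title="Soil survey",
        creator="A. Researcher",
        created="2024-01-15",
        file_format="text/csv",
        identifier=doi_to_url("10.1234/example"),
    )
    dictionary = build_data_dictionary(
        [
            ("site_id", "Identifier of the sampling site", None),
            ("ph", "Soil pH measured in the field", "pH units"),
        ]
    )
    # Write the documentation as a JSON file stored next to the data file.
    with open("soil_survey_metadata.json", "w", encoding="utf-8") as handle:
        json.dump({"metadata": record, "data_dictionary": dictionary}, handle, indent=2)
```

Keeping this information in a plain, structured file next to the data means it can be read by people and processed by software, and it can later be mapped onto a formal metadata standard.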
For your full data management plan, please refer to the Digital Curation Centre’s Checklist for a Data Management Plan.
(Source: DMPTool: https://dmp.cdlib.org/; Digital Curation: A How-To-Do-It Manual; Digital Curation Centre: http://www.dcc.ac.uk/)