ORI Introduction to RCR: Chapter 6. Data Management Practices
Once collected, data must be properly protected. They may be needed later:
- to confirm research findings,
- to establish priority, or
- to be reanalyzed by other researchers.
Over time, data, as the currency of research, become an investment in research. If the data are not properly protected, the investment, whether public or private, could become worthless.
Data storage. The responsible handling of data begins with proper storage and protection from accidental damage, loss, or theft:
- Lab notebooks should be stored in a safe place.
- Computer files should be backed up and the backup data saved in a secure place that is physically removed from the original data.
- Samples should be appropriately saved so that they will not degrade over time.
- Care should be taken to reduce the risk of fire, flood, and other catastrophic events.
Properly store and protect your data. They are valuable.
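The advice on backing up computer files can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names `sha256_of` and `back_up_file` are hypothetical, and the backup directory is assumed to sit on a drive or network location physically separate from the original. The sketch copies a file and then compares checksums so that a corrupted copy is caught immediately.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

A verified copy on the same disk as the original would not meet the goal of physical separation, so the choice of `backup_dir` matters as much as the copy itself.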
Confidentiality. Some data are collected with the understanding that only authorized individuals will use them for specific purposes. In such cases, care must be taken to ensure that privacy agreements are honored. This is particularly true of data that contain personal information that can be linked to specific individuals. It is also true of confidential information about protected processes and materials. If a company shares confidential data about a process with a researcher prior to seeking a patent on that process, the researcher must make sure the data are kept confidential.
Data that are subject to privacy restrictions must be stored in a safe place that is accessible only to authorized personnel. Using random codes to identify individual subjects, rather than names or social security numbers, further protects private information. Access to the key that links these codes to individuals can then be restricted to provide a double layer of protection. Whatever the method used to protect private or confidential information, the researcher who collects or uses the information has the primary responsibility for its protection.
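The random-code approach described above might look like the following in Python. This is a sketch under stated assumptions: the function names, the `name` field, and the 8-character code length are all hypothetical choices made for illustration. The key table that links names to codes is returned separately from the coded records, so the two can be stored in different places with different access rights.

```python
import secrets


def assign_subject_codes(names):
    """Map each name to a unique random code that reveals nothing about the person."""
    key_table = {}
    used = set()
    for name in names:
        code = secrets.token_hex(4).upper()  # 8 random hex characters
        while code in used:  # regenerate on the rare collision
            code = secrets.token_hex(4).upper()
        used.add(code)
        key_table[name] = code
    return key_table


def de_identify(records, key_table):
    """Return copies of the records with the 'name' field replaced by a code."""
    coded = []
    for record in records:
        entry = {k: v for k, v in record.items() if k != "name"}
        entry["subject"] = key_table[record["name"]]
        coded.append(entry)
    return coded
```

Because the codes come from a random source rather than being derived from the names, someone who obtains only the coded records cannot work backward to the individuals without the key table.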
Period of retention. Data should be retained for a reasonable period of time to allow other researchers to check results or to use the data for other purposes. There is, however, no common definition of a reasonable period of time. NIH generally requires that data be retained for 3 years following the submission of the final financial report. Some government programs require retention for up to 7 years. A few universities have adopted data-retention policies that set specific time periods in the same range, that is, between 3 and 7 years. Aside from these specific guidelines, however, there is no comprehensive rule for data retention or, when called for, data destruction.
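As a small worked example of the 3-year figure mentioned above, the following Python sketch computes the earliest date on which data could be discarded, counting from the submission date of the final financial report. The function name `earliest_disposal_date` is hypothetical, and the handling of a February 29 start date (rolling forward to March 1) is an assumption; the applicable funder or institutional policy should always take precedence.

```python
from datetime import date


def earliest_disposal_date(final_report_date: date, years: int = 3) -> date:
    """Return the date `years` after the final financial report was submitted."""
    target_year = final_report_date.year + years
    try:
        return final_report_date.replace(year=target_year)
    except ValueError:
        # Only February 29 can fail here; assume the period ends on March 1.
        return date(target_year, 3, 1)
```

Passing `years=7` gives the upper end of the range cited above. The result is a minimum, not a recommendation to discard data on that date.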
It is difficult to predict when data collected in the past could be useful. When a new disease emerges, such as AIDS, researchers use stored samples and data to pinpoint first occurrences and the likely course of development of the disease. Although the original data were not stored for this purpose, they nonetheless can be useful for tracking diseases years later. Stored data are also useful for understanding social questions. The Department of Energy committee that made recommendations on appropriate compensation for improper human radiation experiments conducted during the Cold War pulled together data collected as far back as the 1950s. Researchers also cannot predict when someone will challenge their work and ask to see the original data.
Given the different reasons data could be useful over long periods of time, researchers should give some thought to retaining data longer than some minimum period required by specific regulations. How long is reasonable will vary from field to field and institution to institution. Nevertheless, it is important to have a clear retention policy that balances the best interests of society with those of the research institution and the individual researcher. Before throwing out notebooks, cleaning out files, or erasing your computer memory, give careful consideration to who might benefit from or ask to see your data in the future.