Things to Think About: Things to Think About is a new section of the ORI blog where members of the ORI staff share ideas about the responsible conduct of research.
It’s a familiar scenario for graduate students in many disciplines: your PI walks into the lab and tells you he needs a series of figures in his inbox right now. He’s up against a hard deadline and everything is riding on your data.
You drop what you are doing and begin compiling the data he requested, but which file is it? You know you need a proton and carbon NMR spectrum of compound 1, but you took spectra at each stage of your synthesis. And you performed each step of the synthesis multiple times.
Opening your data folder, you see the following file names:
Which files are you going to send your PI? You know that you should go through your notebook to see if you wrote down the file names of each spectrum you took, but do you have time? What if you did not write down all of the file names in your notebook? Should you just choose the best looking spectrum? After all, all the spectra are of the same compound. Where is the harm?
It is easy to see how poorly thought out file naming can lead to problems in publishing data. How should you be saving your files so you can find what you need when you need it? There is no single right answer, but here are a few suggestions:
- Include the date
- Correlate the name with your notebook entry to easily find more information about this data
- Keep a list of shorthand codes that simplify names so that others in your research group can decipher the file names after you move on (e.g., Compound = CMPD)
Now imagine you open your data folder to see these file names:
Here you have used a simple file naming system:
Initials_month-date-year_notebook number_notebook page-description.
With a naming system like this you can easily find information about each file in your lab notebook. Now you can check your lab notebook quickly to confirm that you are sending the right files to your PI, reducing the risk of including the wrong images in a publication.
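If you save many files, a small script can keep the pattern consistent. Here is a minimal Python sketch of the naming system above, not a prescribed tool. The function names `build_file_name` and `parse_file_name` and the example values (initials JS, notebook 2, page 45) are made up for illustration, and the sketch assumes initials and notebook numbers contain no underscores.

```python
from datetime import date


def build_file_name(initials, run_date, notebook, page, description):
    """Assemble Initials_month-date-year_notebook number_notebook page-description."""
    # Zero-padded month and day, four-digit year, e.g. 03-14-2018.
    date_part = run_date.strftime("%m-%d-%Y")
    return f"{initials}_{date_part}_{notebook}_{page}-{description}"


def parse_file_name(file_name):
    """Split a name built by build_file_name back into its parts.

    Assumes initials and notebook number contain no underscores.
    """
    initials, date_part, notebook, rest = file_name.split("_", 3)
    # The first hyphen after the page number starts the description.
    page, description = rest.split("-", 1)
    return {
        "initials": initials,
        "date": date_part,
        "notebook": notebook,
        "page": page,
        "description": description,
    }


# Hypothetical example: a proton NMR of compound 1, notebook 2, page 45.
name = build_file_name("JS", date(2018, 3, 14), 2, 45, "CMPD1-1H-NMR")
print(name)
print(parse_file_name(name))
```

Parsing the name back out tells you exactly which notebook and page to open before you send anything to your PI.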
Is this the only acceptable way to name your files? Of course not! Find a system that works for you so that you can avoid mix-ups and maintain the integrity of the scientific record.
It is better to write the dates as year-month-day (so all the 2017 files sort before the 2018 files), and be sure to always use two digits for the month and day, so the earlier single-digit months (e.g., 03) sort before the later double-digit months (e.g., 11).
The YYYY-MM-DD date format is actually an ISO standard (ISO 8601); see: https://www.iso.org/iso-8601-date-and-time-format.html
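The sorting point in these comments is easy to check for yourself. The short Python sketch below uses three made-up dates and sorts them as plain text, which is roughly how a file browser orders names, in three formats: month-day-year, unpadded year-month-day, and zero-padded YYYY-MM-DD.

```python
from datetime import date

# Three hypothetical acquisition dates, deliberately out of order.
dates = [date(2018, 3, 5), date(2017, 11, 20), date(2018, 11, 2)]

# Month-day-year: sorting as text groups by month, so years get interleaved.
mdy = sorted(d.strftime("%m-%d-%Y") for d in dates)

# Year-month-day without zero padding: "11" sorts before "3" as text.
unpadded = sorted(f"{d.year}-{d.month}-{d.day}" for d in dates)

# Zero-padded YYYY-MM-DD: text order matches chronological order.
iso = sorted(d.isoformat() for d in dates)

print(mdy)       # ['03-05-2018', '11-02-2018', '11-20-2017']
print(unpadded)  # ['2017-11-20', '2018-11-2', '2018-3-5']
print(iso)       # ['2017-11-20', '2018-03-05', '2018-11-02']
```

Only the zero-padded year-first form lists the files in the order they were actually recorded.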