Treating Images as Data
Scientific digital images are data that can be compromised by inappropriate manipulations.
- Digital images are composed of a grid of individual elements called pixels.
- Each pixel has a specific location relative to its neighbors, a scale (size in microns) determined by the instrument used to capture the image, and an intensity value based on the amount of light (energy) that was collected at that position when the image was captured.
- Intensity values are expressed as numbers, often 8-bit integers. In a three-color image, this means a number from 0 to 255 for each color: red, green and blue.
- Because images are a grid and each position in the grid has numerical values, digital images share many characteristics with a spreadsheet.
- All of the different types of image manipulations are simply mathematical functions that change the underlying numbers in the image.
- It is easy to conceptualize that if you change the numbers, you would change the image. It is important to remember that when you manipulate the image in software, you are changing the underlying numerical values.
- A careful scientist working with a spreadsheet of numerical data would document any mathematical functions that were applied to the data in a uniform manner. The scientist would be unlikely to modify the data in ways that were non-uniform.
- Careful scientists document their image manipulations for the same reasons and avoid performing manipulations that are non-uniform.
- Careful scientists are aware of the physical, electronic and software limitations of their acquisition instrument. Every technique has limitations and caveats that must be accounted for when interpreting image data.
- Acquisition settings on a particular instrument can compromise image data from the very beginning.
- Users often want to see bright images. Aggressively increasing the gain on instruments like confocal microscopes can over-saturate portions of the image. Over-saturation truncates the values of the brightest areas so that every pixel there has the maximum value (255 in an 8-bit image) and any subtle differences between those pixels disappear. When an image is over-saturated during acquisition, the subtleties of the data in those areas are lost and cannot be recovered later.
- Users are also tempted to reduce the background level settings to ensure that they acquire a clean-looking image. A real image with a uniformly black background (e.g., a confocal image) is highly unlikely. The presence of background signal is the hallmark of a real biological image.
- Post-acquisition manipulations of image data using software can also create over-saturation or abnormally clean backgrounds in an image. Users should learn how to use and interpret the intensity histogram tool to ensure that they are not over-manipulating their image data. Our eyes can perceive perhaps 30 shades of grey and at most a few thousand colors. The intensity histogram tool allows us to look at the data in a different way to ensure that we are manipulating it correctly.
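
To make the points above concrete, here is a minimal sketch in Python that treats a tiny 8-bit, single-channel image as a grid of numbers, applies a uniform gain, and then uses a simple intensity histogram to check for over-saturation. The pixel values, the gain of 2.0, and the function names (apply_gain, histogram, clipped_fraction) are all hypothetical, chosen only for illustration; they do not correspond to any particular instrument or image-analysis package.

```python
from collections import Counter

MAX_8BIT = 255  # top of the 8-bit intensity range (0 to 255)


def apply_gain(image, gain):
    """Multiply every pixel by the same gain (a uniform manipulation),
    clipping each result to the 8-bit maximum."""
    return [[min(MAX_8BIT, round(p * gain)) for p in row] for row in image]


def histogram(image):
    """Count how many pixels hold each intensity value."""
    return Counter(p for row in image for p in row)


def clipped_fraction(image):
    """Return the fraction of pixels at 0 and the fraction at 255."""
    counts = histogram(image)
    total = sum(counts.values())
    return counts[0] / total, counts[MAX_8BIT] / total


# A hypothetical 3x4 image: a grid of numbers, much like a spreadsheet.
image = [
    [12, 15, 14, 13],
    [16, 120, 180, 14],
    [13, 200, 240, 15],
]

brighter = apply_gain(image, 2.0)
print(brighter[1])                 # [32, 240, 255, 28]
print(brighter[2])                 # [26, 255, 255, 30]
print(clipped_fraction(image))     # (0.0, 0.0)
print(clipped_fraction(brighter))  # (0.0, 0.25)
```

In this sketch the original pixels 180, 200 and 240 all end up at 255 after the gain is applied, so the differences between them can no longer be recovered from the brightened image. The histogram check reports that a quarter of the pixels are saturated, even though the brightened image might simply look "better" to the eye.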