Every once in a while, PhD students and young researchers like us find ourselves needing to ask for external grey data in order to continue our research. This has always been a dilemma, and without clear rules it is extremely vague how to go about it. As open science researchers we want to make most datasets widely and easily accessible to researchers all over the world, but why would somebody give you their data for free at the risk of it being misinterpreted? Thus, in today’s brainstorming session, we invented a fictional retired professor, ‘Shommi’, to discuss the following questions and to figure out strategies for obtaining the dataset from Dr. Shommi in a legitimate and courteous way.
- What is grey data?
- Why is it desirable to see grey data published?
- What sources of grey data exist?
- Why might a data producer not publish on the data they’ve produced?
- What sort of questions can we ask of grey data?
- How do we convince producers of grey data to work with us?
Dr. Shommi is a 78-year-old retired professor. He worked in a medical school on veterinary science and entomology. Specializing in Lyme disease, he has accumulated a large amount of tick genome data from more than 15 years of observation. However, he is very stubborn and reluctant to either publish or share his grey data. What can we do?
First of all, we discussed the definition of grey data: data that has not been published, such as the underlying data behind a paper or public report, or data that is only loosely referenced in government reports and surveys. In medical trials, it’s almost an open secret that researchers sometimes selectively publish favourable data, while the whole dataset might reveal a more comprehensive picture of the effects.
Then comes the question of why we need grey data. One reason is that tax money was often spent on collecting it. Another is that long-term data changes over the years, so past observations cannot simply be collected again, which makes some grey data valuable.
Given the above concerns, what would motivate researchers to share grey data? 1. Authorship, but only if the researchers take an active role in explaining, cleaning or analyzing the data. 2. Altruism: some open scientists are always willing to share their unused data. 3. Legacy: it has often happened with retired professors that valuable datasets lay in the basement until they passed away. In this case you need to reach out to the deceased researcher’s spouse to recover the grey data.