In scientific research, data is the key component that puts researchers at the forefront of discovery: it enables them to validate hypotheses and to advance our understanding of the natural world. However, data is not gained through a foolproof process, and it is not an everlasting currency. Acquiring data is a human process that researchers follow to make sense of an imperfect world.
Even within that exploration, data can change over the course of an experiment, either through the experimentation itself or unintentionally through how the data is handled. Examples of unintended changes include duplication errors that leak into completed datasets, misrepresentation of data undergoing analysis, and incorrect conclusions drawn from degraded data.
Data is often collected through a rudimentary system of physical notebooks, customized spreadsheets, and key personnel who eventually graduate or leave the laboratory to pursue higher positions. Over a span of two to six years, the personnel in an academic laboratory will likely turn over entirely, leaving no one who was there at the inception of the lab or its research.
New personnel then have to compile data from earlier projects without any insight from the original project scientist. This leads to errors when data is keyed into spreadsheets, which in turn undermine subsequent data analysis.

Avoiding Data Rabbit Holes

Because of this decentralized approach, a dataset can end up with multiple "final" versions generated by multiple personnel over multiple years. As each subsequent researcher works from data that contains duplication errors, and therefore performs inappropriate analyses, the degradation can lead to weeks, months, or even years of sifting through mistakes and correcting previous work before new discoveries can be published.
Couple that situation with the pressure to reach research milestones and with inconsistent data handling, and p-hacking and retracted papers become a real possibility. P-hacking (also known as data dredging, data fishing, data snooping, and data butchery) involves taking a dataset and either collecting further data indiscriminately or running different analyses until otherwise statistically insignificant results become significant (P<0.05).
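To see why running many analyses matters, consider a minimal sketch of the arithmetic. It assumes every analysis tests a true null hypothesis and that the analyses are independent of one another, which is a simplification (analyses of the same dataset are usually correlated). The function names are hypothetical and exist only for illustration.

```python
import random

ALPHA = 0.05  # the significance threshold quoted above (P<0.05)


def familywise_error_rate(num_analyses, alpha=ALPHA):
    """Chance that at least one of several independent analyses of pure
    noise crosses the significance threshold: 1 - (1 - alpha)^k."""
    return 1.0 - (1.0 - alpha) ** num_analyses


def simulated_false_positive_rate(num_analyses, trials=20_000, alpha=ALPHA, seed=0):
    """Monte Carlo check of the formula. Under a true null hypothesis a
    p-value is uniformly distributed on [0, 1], so each analysis is
    modeled here as one uniform random draw (an assumption of this sketch)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        smallest_p = min(rng.random() for _ in range(num_analyses))
        if smallest_p < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(
            f"{k:>2} analyses: formula {familywise_error_rate(k):.1%}, "
            f"simulation {simulated_false_positive_rate(k):.1%}"
        )
```

Under these assumptions, a single analysis has a 5% chance of a spurious significant result, while 20 analyses of the same noise have roughly a 64% chance of producing at least one.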
It’s easy to see how one scientific team can lead an entire line of research down a rabbit hole by reaching false conclusions, wasting millions of dollars on therapeutics and theories that do not advance our understanding of the world around us.

The Data Call to Action

This is not a condemnation of the scientific process; rather, this is a call to action so that scientists think about how data gets handled throughout the entire research process. Here are some principles to follow:
  • Embrace software and systems that track data in a centralized location and that enable collaboration between multiple personnel working on analyses, which should help avoid headaches down the road.
  • Encourage a small number of senior researchers to specialize in analysis using statistical software, which can help prevent unintended acts of p-hacking.
  • Avoid data loss, which can lead to lost productivity, reliance on recovered but outdated versions of data, and inconsistent analyses that in turn lead to problems such as p-hacking.
  • Be mindful that data degrades over time, and old datasets can get inserted into final versions of data, leading to duplication errors and overall misrepresentations of experiments.
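The duplication errors mentioned in the last principle can often be caught mechanically before a dataset is declared final. The following is a minimal sketch, not a complete solution: the field names (animal_id, date, weight_g) and the sample rows are invented for illustration, and it assumes that a record is uniquely identified by a small set of key fields.

```python
from collections import defaultdict


def find_duplicate_rows(rows, key_fields):
    """Group row indices by a normalized key and return only the groups
    that contain more than one row (likely duplicate entries)."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # Normalize case and whitespace so "M-001" and "m-001 " match.
        key = tuple(str(row[field]).strip().lower() for field in key_fields)
        groups[key].append(index)
    return {key: indices for key, indices in groups.items() if len(indices) > 1}


# Made-up example rows; the third repeats the first with different casing
# and trailing whitespace, as can happen when old data is re-keyed by hand.
rows = [
    {"animal_id": "M-001", "date": "2020-03-01", "weight_g": "24.1"},
    {"animal_id": "M-002", "date": "2020-03-01", "weight_g": "22.8"},
    {"animal_id": "m-001 ", "date": "2020-03-01", "weight_g": "24.1"},
]

duplicates = find_duplicate_rows(rows, ("animal_id", "date"))
for key, indices in duplicates.items():
    print(f"Possible duplicate {key} at rows {indices}")
```

A check like this only flags candidates; whether two matching rows are a true duplicate or a legitimate repeat measurement still needs a person who knows the experiment.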
The scientific method self-corrects based on what is considered current truth, but the development of data-handling methods that help avoid negative side effects, loss of time, and duplications of effort is in the best interests of all who participate.