If you’ve read some of my editorials on scientific publishing in the past, you may have gotten the (correct) impression that I’m an avid proponent of open access. I
strongly believe it is one of the keys to the
future of scientific development and innovation. That’s why a recent study from Penn
State researchers caught my eye.
According to Rick Gilmore, associate professor of psychology at the university, data
sharing may actually play a significant role in
science’s reproducibility crisis.
For years, a popular criticism of scientific researchers has been their inability to
reproduce certain studies, underscored by high-profile paper retractions such as the memorable
STAP stem cells paper out of the Riken
Institute a few years ago. Couple that with
the life sciences industry, where some studies
have estimated that 50 percent of published data is irreproducible, and all
of a sudden it’s not a problem. It’s a crisis.
But Gilmore may have a somewhat simple approach.
According to his study, in psychological
and brain sciences at least, irreproducibility
has more to do with the complexity of managing data than with incorrect or hidden
methods and results.
Gilmore uses cognitive neuroscience as his
example. It is a computationally intensive
field that produces data files in a variety of
sizes and formats. There’s data from EEGs,
fMRIs, MRIs and CT and PET scans. Then
there’s video and audio recordings, surveys
and computer-based tasks. However, there
are relatively few organized initiatives to encourage sharing of these different file types,
nor is sharing widespread.
“Right now, data sharing is still largely
unfunded and unrewarded and is only rarely
required,” said Gilmore.
To that point, Gilmore and his co-authors
suggest requiring data sharing for federal
grant funding. Publishers of scientific journals
could also mandate the accessibility of data
as a requirement to be published. On an encouraging note, some journals have already
begun to do this.
We’ve also recently seen other positive
trends in creating a more open, and subsequently reproducible, environment. For
example, more and more researchers are
sharing not only their data but the computer
software they used
to analyze it. At the
same time, technology is continuing to improve, with developers
creating new web-based management tools
and software to help scientists work with
and share their data.
Gilmore is also the founding co-director of
the Databrary Project, which is a web-based
digital library for storing, managing, preserving, analyzing and sharing video. The project,
funded by the National Science Foundation
and the National Institutes of Health, aims
to promote data sharing, archiving and reuse
among researchers who study human development.
Another example is the website protocols.io,
an up-to-date, open-access, collaborative
repository of scientific methods and protocols. Founded almost two years ago by MIT
postdoc Lenny Teytelman, protocols.io is the
result of his own horrific research experience. He spent a year and a half conducting
research before discovering that a single step
of the fish microscopy method he was using,
which was published in Nature Methods,
was faulty—and there was nothing he could
do to warn others.
Similar tools include CiteAb, which helps
scientists identify antibodies; ChemSpider, for
chemical structures; and Access Innovations,
which scans and flags published papers that
have accidentally used misidentified cell lines.
Still, the number of open-access articles,
journals and tools pales in comparison to
their restricted counterparts.
Although Gilmore and his co-authors were
speaking about the field of neuroscience in
particular when they said the following, I
think it has applications in the broader sci-
“We think that investments in…infrastructure
will generate big payoffs,” the researchers
said. “Fostering the widespread adoption
of open, transparent and reproducible research
practices coupled with innovations in
technology that enable the large-scale analysis
of ‘big data’ will accelerate the discovery of
generalizable, robust and meaningful find-