How Research Data Management benefits everyone through open access

Research Data Management (RDM) is a systematic and planned approach to the entire life cycle of scholarly data: from collection, creation, and/or observation to documentation, storage, and sharing. All researchers engage in RDM in some capacity, but the better a project’s research data is managed, the greater the impact the project will have after the work itself has ended.

Making research data accessible to the public and other scholars increases the integrity of the research and contributes to building a greater body of knowledge, a noble cause for every scholar. “Public access is a natural continuation of an academic institution’s research mission,” says Shane Moeykens, director of Advanced Research Computing, Security, and Information Management (ARCSIM) at UMaine. “Part of that continuum is making sure it’s not just writing a report about the information, but making the information itself available to the broader community.”

Open access to research data increases the visibility of a researcher’s work, advances the discipline, and ultimately enhances public trust. In some cases, open access to research data is required. The National Science Foundation’s Proposal & Award Policies & Procedures Guide (PAPPG), for example, clearly states that “investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.” This means that researchers with NSF funding must make project materials available to others.

Research data extends beyond measurements and publications and should be thought of as everything utilized in a project. Lab notebooks, archival photos taken during a library visit, survey results, audio recordings, samples, and film clips are just some of the possibilities of what can be considered research data.

The most common way to make research outputs available is through data repositories. UMaine has two tools available: DigitalCommons, managed by Fogler Library, and Dataverse, facilitated by the Advanced Computing Group and the University of Maine System IT organization. DigitalCommons primarily aggregates publications and other research and creative products of a project. Its ease of use and aggregation save time for researchers looking for the history of research done on a certain topic. Dataverse contains the research data itself, and it is particularly useful for large projects with multiple contributors who want to keep all their work in one place. Any researcher at UMaine can seek advice from the Office of Research Administration, Fogler Library, and the Advanced Computing Group to make a plan for managing their research data when applying for funding. Ami Gaspar, an outreach specialist for the Advanced Computing Group, also runs useful seminars on research data management for researchers at any career level.

“We often think that if we publish in books or journals, we must give our intellectual property away to those publishers,” says Jen Bonnet, social science and humanities reference librarian at Fogler Library. “Authors and publishers now have more options for what they can do with their research outputs. A journal article is one output, data would be another output. Increasingly, publishers will let you make your article and the data associated with that article available in your open access institutional repository, within a certain timeframe.” If open access to research data is required by the grant funding the project, the negotiation between researcher and publisher can be easier. Some grants could also require placing research data in national or international repositories. “That doesn’t mean we’re not going to work locally with Dataverse and other institutional alternatives,” says Moeykens. “It’s really a spectrum of activity.”

While research projects will vary in what they contain and require, the overarching goal is to lower the bar and make it easier for people to gain access to the project’s information in its raw form. “Faculty strive to be in compliance today with the federal requirements, but is it as easy as it could be?” Moeykens explains. “There’s always more that can be done. Over time, there could be new portals developed, and new institutional practices around existing tools.”

Derivative research products are increasingly becoming part of federally funded projects, where the research contributes back to the public by creating a user-friendly tool that makes its data digestible. ShellGIS is an outstanding example of a derivative decision support tool, one that directly extends federally sponsored data and work to the general public. The tool was developed under Damian Brady, associate professor at the Darling Marine Center for the Sustainable Ecological Aquaculture Network (SEANET), with Meggan Dwyer as the research coordinator for the project.

Even if a project does not result in a derivative tool, other practices matter in making research data accessible. Version control and descriptive naming conventions, applied throughout a project’s duration, help track changes in a project. “There’s so much interdisciplinary and collaborative work happening on campus that being able to have an audience that’s outside your field understand at least what each of your files contains is a true kindness,” says Bonnet. “Best practices suggest that researchers be as descriptive as possible, and if possible, have a guiding document so that people going through their data files can make sense of what’s there, why it’s there, and how it connects.” For the general public, it’s important to avoid jargon in keywords and descriptions.
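
As a rough illustration of descriptive naming and a guiding document, the Python sketch below builds a file name from a project, a plain-language description, a date, and a version number, and then assembles a short README listing what each file contains. The naming pattern, the function names, and the sample project are all hypothetical; they are one possible convention, not a UMaine or Fogler Library standard.

```python
import re
from datetime import date


def slug(text):
    """Lowercase the text and collapse anything that is not a letter or digit into a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def descriptive_name(project, content, when, version, ext):
    """Build a name of the (hypothetical) form project_content_YYYY-MM-DD_vNN.ext."""
    return f"{slug(project)}_{slug(content)}_{when.isoformat()}_v{version:02d}.{ext}"


def readme_text(title, files):
    """Assemble a plain-text guiding document that describes each file in plain language."""
    lines = [title, "", "Files:"]
    for name, description in files.items():
        lines.append(f"- {name}: {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example data, for illustration only.
    name = descriptive_name("Bay Survey", "Water Temperature", date(2019, 6, 1), 2, "csv")
    print(name)
    print(readme_text("Bay Survey data", {name: "Daily water temperature readings, in degrees Celsius"}))
```

Putting the date and a zero-padded version number in the name keeps files sorted in a predictable order, and the README gives readers outside the field a jargon-free description of each file.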

File formats are also important for the long run, as proprietary formats may no longer be supported in the future. For instance, saving a data file as a CSV instead of an Excel file makes it accessible to a wider audience. Documents that would otherwise be shared as PDFs can also be published as XML or HTML so that readers will not need specialized software to read them. More on best practices for research data management can be found on Fogler Library’s extensive online resource.
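
To show what saving data in an open format can look like, here is a minimal Python sketch that writes tabular data as CSV using only the standard library. The column names and values are invented for illustration, and the function name is an assumption of this example rather than part of any UMaine tool.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Return the header and rows as CSV text that any spreadsheet or text editor can open."""
    buffer = io.StringIO()
    # A fixed line terminator keeps the output identical across operating systems.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    text = rows_to_csv(["site", "temp_c"], [["A", 12.5], ["B, north", 13.0]])
    print(text)
```

The `csv` module quotes any value that contains a comma, so a site label such as "B, north" stays in a single column when the file is opened later.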


Written by Clarisa Diaz