Boosting research and science by sharing data

AMdEX has achieved many milestones via its usecases. In a series of articles, we’ll look back on these results and showcase their user scenarios. This article highlights the Research Data Exchange (RDX). 

Universities would love to share as much research data as possible. Data sharing promotes scientific progress and transparency, and it enables replication and new analyses. AMdEX’s founding partners SURF and the University of Amsterdam jointly developed the Research Data Exchange (RDX).

The RDX allows researchers to share data in a controlled and secure manner, whilst also adhering to legal requirements and institutional policies. Freek Dijkstra, project lead Data Exchange at SURF, explains.

Freek Dijkstra

What was the objective of the usecase?

“RDX is a step towards solving the ‘open science dilemma’. This conundrum revolves around the conflicting goals of maximising data openness to foster scientific progress, and the need to navigate legal and sovereignty issues that can restrict data sharing. It touches on issues such as ownership, copyright, privacy, informed consent, purpose limitations, dual-use restrictions, and resale prohibitions. There are existing tools for secure data reuse, but they often entail tedious manual processes. We sought a way to automate these.

“Our objectives for this usecase were twofold. First, we wanted our pilot to serve as a technology showcase, illustrating the possibilities for enhancing data sharing in the research community. Second, we wanted to offer insights into the roles of data owners, particularly the distinction between the researchers who generate the datasets and the data stewards at research institutions.”

What were the biggest challenges?

“With our prototype, we split the workflow into two distinct phases: one for data owners and one for data consumers. For data owners, RDX offers a straightforward process for specifying data sharing conditions when publishing their datasets. This ensures that data sharing conditions are articulated upfront and require only a one-time setup.

“Data consumers can locate the desired dataset on existing repositories. But instead of engaging in negotiations with the data owner for access, RDX automates the enforcement of access permissions. Depending on the specified data sharing conditions, data consumers may need to demonstrate their affiliation, such as membership in a research community. They’ll also need to agree to the designated sharing terms, like non-commercial use or citation requirements.

“Once these conditions are met, data consumers gain access to download the data, or they can perform analyses within a secure environment. Consumer rights are always in accordance with the data owner’s established sharing conditions.”
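The flow described above (conditions set once by the data owner, then checked automatically for each data consumer) can be illustrated with a minimal Python sketch. This is not RDX's actual implementation: the names `SharingConditions`, `AccessRequest`, `AccessMode` and `evaluate`, as well as the example affiliation and term labels, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessMode(Enum):
    """The two kinds of access mentioned in the interview."""
    DOWNLOAD = "download"
    SECURE_ANALYSIS = "secure_analysis"


@dataclass(frozen=True)
class SharingConditions:
    """Hypothetical record of what a data owner specifies once, at publication."""
    required_affiliation: Optional[str]   # e.g. membership in a research community
    required_terms: frozenset[str]        # e.g. non-commercial use, citation
    allowed_modes: frozenset[AccessMode]  # download and/or secure analysis


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical record of what a data consumer presents when asking for access."""
    affiliations: frozenset[str]
    agreed_terms: frozenset[str]
    mode: AccessMode


def evaluate(conditions: SharingConditions, request: AccessRequest) -> tuple[bool, list[str]]:
    """Return (granted, reasons); reasons lists every unmet condition."""
    reasons: list[str] = []

    affiliation = conditions.required_affiliation
    if affiliation is not None and affiliation not in request.affiliations:
        reasons.append(f"missing affiliation: {affiliation}")

    missing_terms = conditions.required_terms - request.agreed_terms
    if missing_terms:
        reasons.append("terms not agreed: " + ", ".join(sorted(missing_terms)))

    if request.mode not in conditions.allowed_modes:
        reasons.append(f"access mode not allowed: {request.mode.value}")

    return (not reasons, reasons)


# Example: the owner allows analysis in a secure environment only.
conditions = SharingConditions(
    required_affiliation="research-community",
    required_terms=frozenset({"non-commercial", "cite-dataset"}),
    allowed_modes=frozenset({AccessMode.SECURE_ANALYSIS}),
)
request = AccessRequest(
    affiliations=frozenset({"research-community"}),
    agreed_terms=frozenset({"non-commercial", "cite-dataset"}),
    mode=AccessMode.SECURE_ANALYSIS,
)
granted, reasons = evaluate(conditions, request)
print(granted, reasons)
```

In this sketch the function returns the reasons for a refusal alongside the decision, so that each outcome could also be logged, in line with the preference for logging and monitoring mentioned in the interview.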

What were the main lessons learned?

“Both data owners and data consumers wanted to retain control. We found that robust logging and monitoring mechanisms gave them sufficient ‘security’. It was not necessary to take extreme preventive measures at the start of the process when there were solutions to verify the output, if so desired.”

What recommendations do you have for future research or experiments?

“Now that we know the basis works, we are looking at how to connect to bigger ecosystems. We want more users to test the system, so we can refine it further. I would also like ecosystems to be somewhat compatible, so that data sharing systems in academia and, for example, in logistics or urban renewal can interface with each other. But it is early in the process, and we are content for now that parties are showing an interest.”

What user scenarios did you encounter in the RDX usecase?

“The Research Institute of Child Development and Education (RICDE) at the University of Amsterdam is a data owner and publisher of datasets. One of the cases focuses specifically on sharing data from pedagogical and educational research to assess government policy. For the data owner, we offered a straightforward process for specifying data sharing conditions when publishing their datasets, making them discoverable on data repositories. This ensured that data sharing conditions were articulated upfront and required only a one-time setup.

“Data consumers can locate the desired dataset on existing repositories. Instead of engaging in negotiations with the data owner for access, RDX automates the enforcement of access permissions. Depending on the specified data sharing conditions, data consumers may need to demonstrate their affiliation, such as membership in a research community, and agree to the designated sharing terms, like non-commercial use or citation requirements. Once these conditions are met, data consumers gain access to download the data or perform analyses within a secure environment, in accordance with the data owner’s established sharing conditions.”

Text: Karina Meerman

Deliverables RDX usecase

Please visit: https://zenodo.org/record/8269273 and https://rdx.lab.surf.nl/
