The COVID-19 pandemic has seen some areas of science jettison authorship disputes, competitiveness and creaky conventions overnight. Other barriers should also fall away. Among the most important are the data-sharing restrictions that arose between researchers as a result of the General Data Protection Regulation (GDPR), a laudable European law on personal data implemented in May 2018.
Unintended consequences of the GDPR hamper researchers around the world who want to work with colleagues in the European Union. As the designated contact person for researchers affected by this regulation at the US National Institutes of Health (NIH), I have wrestled with many attempts to broker international data sharing.
The GDPR has stalled at least 40 clinical and observational studies on risk factors and exposures for cancer. The NIH’s Clinical Center in Bethesda, Maryland, is unable to secure European donor samples for experimental blood-stem-cell transplants aimed at treating otherwise intractable cancers. A 25-year-old diabetes study was derailed for 18 months. It took top-level intervention for the study to move forward, and the arrangement that resulted remains the only data-sharing agreement reached between the NIH and a European counterpart since the enactment of the GDPR.
We must urgently clarify data-sharing rules
The law aims to protect citizens’ privacy and reform how personal data are collected, handled, processed and stored. It is a landmark in regulating use and misuse of private and sensitive data — much-needed in a world where data are a valued commodity and where abuses in contexts such as elections or profiling can infringe basic liberties. As the GDPR took effect, media outlets revealed that UK political-consulting firm Cambridge Analytica had exploited personal data from Facebook users to inform election campaigns, in a prime example of the types of breach that are changing how tech companies and others manage data.
Advocacy by the European scientific community ensured that the GDPR incorporated multiple exemptions for research. But there is still no clarity around how to implement them. As one European biobanking executive put it, hard-won research concessions are now lost in translation.
The result? Many of us have spent more than two years trying to work within rules and restrictions on data transfer between would-be collaborators. The scientific community, politicians and regulators must push for clarifications that both protect privacy and let research proceed.
In late April, the European Data Protection Board announced new guidelines for data processing in the context of COVID-19. It offered needed legal assurance that scientific data fall under two of the GDPR’s exemptions: important reasons of public interest, and explicit consent. However, a durable legal basis for joint research and data sharing is still needed.
Legal uncertainties and penalties for non-compliance make European research institutes extremely hesitant to share data. What’s more, EU member states diverge in how they interpret GDPR requirements — as do separate institutions in the same member state.
As things stand, transatlantic groups such as the International Genomics of Alzheimer’s Project cannot share data in real time: they must run isolated analyses and pool results. This reduces the value of data, limits the questions that can be explored, costs more and takes longer.
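To make "run isolated analyses and pool results" concrete, here is a minimal sketch. It assumes each site shares only an effect estimate and its standard error, and that results are combined with a fixed-effect, inverse-variance weighting. That method is an assumption for illustration, not a description of how any named project works, and the function name and numbers are hypothetical.

```python
import math


def pool_fixed_effect(site_results):
    """Combine per-site (estimate, standard_error) pairs.

    Uses inverse-variance weights: sites with smaller standard
    errors count for more in the pooled estimate.
    """
    weights = [1.0 / (se ** 2) for _, se in site_results]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, site_results)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical summary statistics; no individual-level records cross the border.
eu_site = (0.30, 0.10)
us_site = (0.20, 0.05)
estimate, std_error = pool_fixed_effect([eu_site, us_site])
```

Only summary statistics are exchanged in this sketch, which is why any question that needs the combined individual-level records (for example, a new subgroup analysis) requires every site to rerun its own analysis first.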
Two practical steps could do much to ease collaborations and still protect individual data.
First, we need a way to allow data transfers between EU and non-EU public agencies and non-profit organizations conducting health research. Currently, entities exchange data under ‘standard clauses’ sanctioned by the European Commission. But this produces conflicts: public agencies and other non-EU authorities have to waive sovereignty and accept provisions that their countries’ laws prohibit. The commission should create a ‘model clause’ to facilitate publicly funded and non-profit research, while accounting for the privacy laws and obligations of other jurisdictions.
Second, we need agreement on when data are considered anonymized. Anonymized data are outside the scope of the GDPR, but EU regulatory authorities have different stances on whether ‘pseudonymized’ data qualify. Without consistency, teams are unwilling to risk non-compliance.
One way to decide whether pseudonymization represents a GDPR safe harbour is to consider not the data set, but the level of access. Typically, in pseudonymized data sets, people cannot be identified without an encryption key. Assuming other organizational safeguards are in place, if a holder does not have the key, those data should be considered anonymized in the hands of the holder.
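The access-based view can be illustrated with a short sketch. It assumes a keyed hash (HMAC-SHA256) stands in for the key-based scheme described above, and that the key never leaves the originating institute. The identifiers, field names and values are hypothetical, and the sketch does not address other routes to re-identification, such as rare combinations of attributes.

```python
import hashlib
import hmac
import secrets


def pseudonymize(participant_id: str, key: bytes) -> str:
    """Return a keyed pseudonym for a participant identifier."""
    return hmac.new(key, participant_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The key is generated and kept by the originating institute only.
key = secrets.token_bytes(32)

# Hypothetical source records.
records = [
    {"participant_id": "EU-0001", "hba1c": 6.1},
    {"participant_id": "EU-0002", "hba1c": 7.4},
]

# What the recipient receives: pseudonyms and measurements, no identifiers, no key.
shared = [
    {"pseudonym": pseudonymize(r["participant_id"], key), "hba1c": r["hba1c"]}
    for r in records
]

# Only the key holder can rebuild the link from pseudonym to participant.
relink = {pseudonymize(r["participant_id"], key): r["participant_id"] for r in records}
```

In this sketch the recipient of `shared` cannot recompute or reverse the pseudonyms without `key`, which is the situation in which the holder's copy would be treated as anonymized under the proposal.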
It is not simple to reconcile privacy with the potential benefits of data access, but there is an instructive precedent for accommodating competing values. Next year marks the 25th anniversary of the Bermuda Principles for the release of DNA data, a landmark compact that addressed the tension between protecting private investment and maximizing social benefit. Later data-sharing policies adapted to meet new privacy concerns, while advancing the agreement’s benchmark commitments to open science. With political and creative will, the GDPR could also enable data sharing while safeguarding privacy. We owe this to the thousands of volunteers who contribute their data and biosamples, expecting that doing so accelerates cures.