Data sharing: What, Why and How?
Updated: Mar 18, 2019
Data sharing in the new era of publications
Clinical trials are a crucial part of the development process for new medical products and represent a significant investment from all involved: the patients who volunteer to participate, the organizations that sponsor trials, and the researchers who conduct the studies and analyze the data. The data generated in clinical trials can be summary data (study reports, lay summaries for patients, etc.), individual patient data (raw files), or metadata (protocol, statistical analysis plan, etc.). However, much of the data generated by clinical trials is not made public or shared beyond the data holder, and significant barriers to sharing these data exist. Data sharing could advance scientific discovery and improve clinical care by maximizing the knowledge gained from data collected in trials, stimulating new ideas for research, and avoiding unnecessarily duplicative trials.
Since July 2018, the International Committee of Medical Journal Editors (ICMJE) has required manuscripts reporting clinical trial results to contain a data sharing statement before they are considered for publication in its member journals. In addition, clinical trials that begin enrolling participants on or after 1 January 2019 must include a data sharing plan in the trial’s registration. While the policy does not require data sharing, the ICMJE states that “investigators should be aware that editors may take into consideration data sharing statements when making editorial decisions”. These steps may increase data transparency; however, policies are still needed to protect the privacy and consent of participants, the validity of analyses, the investment of funders and sponsors, and the academic recognition of investigators. There is also an urgent need for operational strategies that mitigate the risks and enhance the benefits of sharing sensitive clinical trial data, especially individual participant data.
A few pharmaceutical and life science companies have started to include data sharing statements in their published manuscripts. Yet several questions remain unanswered: Who are the stakeholders responsible for data sharing and protection? What is their role in the data sharing process? How, and to what extent, should sensitive patient data be made available? Let us work through these questions and look at the best practices that both pharmaceutical companies and journal editors can adopt to comply with the ICMJE recommendation while protecting sensitive participant information.
The first question is what kind of data should be shared. As per ICMJE recommendations, all individual participant data collected during the study should be shared. However, given the practical challenges of sharing such a large volume of raw data without creating a data dump, it is recommended that authors and their sponsor companies share at least the data that underlie the results reported in the publication (text, tables, figures, appendices, etc.), after appropriate de-identification of sensitive patient information. The United States Office for Human Research Protections has indicated that, provided the recipients meet the appropriate conditions, sharing de-identified individual participant data from clinical trials does not require separate consent from trial participants. In addition, sponsor companies should share the study protocol, statistical analysis plan, informed consent form, clinical study report, and analytic code wherever applicable.
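To make the de-identification step more concrete, here is a minimal Python sketch of what it might look like for a single participant record. The column names, the list of direct identifiers, and the 10-year age banding are all assumptions chosen for illustration; they are not taken from ICMJE or any regulator, and a real dataset would need a formal de-identification review rather than a script like this.

```python
import hashlib

# Hypothetical field names; a real trial dataset will use its own.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def pseudonymize(participant_id: str, salt: str) -> str:
    """Replace a participant ID with a truncated, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + participant_id).encode("utf-8")).hexdigest()
    return digest[:12]


def deidentify(record: dict, salt: str) -> dict:
    """Return a copy of the record with direct identifiers removed,
    the participant ID pseudonymized, and exact age generalized."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["participant_id"] = pseudonymize(str(record["participant_id"]), salt)
    if "age" in out:
        # Generalize exact age into a 10-year band (an illustrative choice).
        low = (int(out["age"]) // 10) * 10
        out["age_band"] = f"{low}-{low + 9}"
        del out["age"]
    return out


record = {
    "participant_id": "P-001",
    "name": "Jane Doe",
    "email": "jane@example.org",
    "age": 47,
    "outcome": "responder",
}
print(deidentify(record, salt="keep-this-secret"))
```

The salt should stay with the data holder, so that recipients of the shared file cannot recompute the mapping back to the original participant IDs.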
The New England Journal of Medicine insists that the study protocol, redacted to protect the sponsor’s business-sensitive information, be shared as supplementary information and linked in the online version of the publication. While a few journals have definite guidelines on data sharing, many still have no specific procedures. In such cases, ICMJE recommends that the sponsor company or the authors decide the period for which the data will be available. Nevertheless, the best practice is to share the data immediately following publication. ICMJE also recommends that companies identify an individual or a third party who can be approached for data access, and provide an e-mail address or link in the publication that directs the reader to that individual or third party.
Data access can in principle be provided to anyone. In all cases, however, access should be granted only after the data access seeker (in most cases a researcher or an investigator) has stated the purpose of the request and the sponsor company has approved the methodologically sound proposal. If the sponsor company does not wish to share the data, it should provide a valid justification for rejecting the proposal. For academic publications, where such provisions may be difficult to implement, ICMJE recommends that authors decide the time frame for which they will hold the data available (usually 36 months) and state in the publication that requests made after that time frame will be served from their institution’s data warehouse, without investigator support beyond the deposited metadata.
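The elements discussed so far (what is shared, which documents accompany it, when and for how long, under what criteria, and whom to contact) can be collected into a single data sharing statement. The Python sketch below shows one possible way to structure and render such a statement. The class name, field names, default wording, and contact address are hypothetical; journals may prescribe their own format, so treat this as an illustration rather than a template endorsed by ICMJE.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSharingStatement:
    """Hypothetical container for the items a data sharing statement covers."""
    shares_ipd: bool
    data_shared: str
    documents: List[str] = field(default_factory=list)
    available_from: str = "immediately following publication"
    available_months: Optional[int] = 36  # None means no end date
    access_criteria: str = "methodologically sound proposal with a stated purpose"
    contact: str = ""

    def render(self) -> str:
        """Return the statement as plain text, one item per line."""
        if not self.shares_ipd:
            return "Individual participant data will not be shared."
        if self.available_months is None:
            period = "with no end date"
        else:
            period = f"for {self.available_months} months"
        lines = [
            f"Data shared: {self.data_shared}",
            f"Additional documents: {', '.join(self.documents) or 'none'}",
            f"Available: {self.available_from}, {period}",
            f"Access criteria: {self.access_criteria}",
            f"Contact: {self.contact}",
        ]
        return "\n".join(lines)


statement = DataSharingStatement(
    shares_ipd=True,
    data_shared="de-identified participant data underlying the reported results",
    documents=["Study protocol", "Statistical analysis plan"],
    contact="data-access@example.org",  # placeholder address
)
print(statement.render())
```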
Data sharing is a relatively new concept. Unresolved issues remain, including appropriate scholarly credit for those who share data, the resources needed for data access, the transparent processing of data requests, and data archiving. As more data access requests come in and more data are shared, understanding and collaboration among funders, ethics committees, journals, trialists, data analysts, participants, and others should improve.