The National Institutes of Health (NIH) issued the Genomic Data Sharing Policy (GDS Policy) effective January 25, 2015. This policy describes the responsibilities of investigators and institutions for the submission of human and non-human genomic data to data repositories and the secondary research use of such data as well as expectations regarding intellectual property.
The GDS Policy applies to all NIH-funded research (e.g., grants, contracts, intramural research) that generates large-scale human or non-human genomic data, regardless of the funding level, as well as to the future use of the data. Large-scale data include genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) arrays, and genome sequence, transcriptomic, epigenomic, and gene expression data.
- Type of data that will be shared (i.e., the type of genomic data, relevant associated data, and information necessary to interpret the data)
- The data repository to which the data will be submitted
- The timeline for the data to be shared
- Any limitations on the secondary research uses of the data, if the study involved human data
- Acknowledgement that the Institutional Certification will be submitted and assurance by the Institutional Review Board (IRB) that the data can be shared through NIH-designated data repositories, consistent with data sharing under the NIH GDS Policy
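As an illustration only, the elements listed above can be tracked as a simple checklist before a plan is submitted. The sketch below is not an NIH tool or format; the class name, field names, and the `missing_elements` helper are hypothetical, and it assumes that a limitations statement is needed only when the study involves human data, as the list indicates.

```python
from dataclasses import dataclass


@dataclass
class DataSharingPlan:
    """Hypothetical record of the plan elements listed above."""
    data_types: str = ""                      # type of genomic and associated data
    repository: str = ""                      # repository receiving the data
    timeline: str = ""                        # when the data will be shared
    involves_human_data: bool = False
    use_limitations: str = ""                 # may be "none" if no limits apply
    certification_acknowledged: bool = False  # Institutional Certification / IRB assurance


def missing_elements(plan: DataSharingPlan) -> list[str]:
    """Return the names of plan elements that are still blank."""
    missing = [
        name
        for name in ("data_types", "repository", "timeline")
        if not getattr(plan, name).strip()
    ]
    # Assumption: a limitations statement is only checked for human data.
    if plan.involves_human_data and not plan.use_limitations.strip():
        missing.append("use_limitations")
    if not plan.certification_acknowledged:
        missing.append("certification_acknowledged")
    return missing


# Example with placeholder values (not taken from any real study).
draft = DataSharingPlan(
    data_types="whole-genome sequence data",
    repository="example NIH-designated repository",
    timeline="placeholder timeline",
    involves_human_data=True,
)
print(missing_elements(draft))
```

A checklist of this kind only flags blank entries; whether each entry satisfies the GDS Policy still has to be judged by the investigator and the institution.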
- Genomic and phenotypic data, and any other data relevant for the study (such as exposure or disease status) will be generated and may be used for future research on any topic and shared broadly in a manner consistent with the consent and all applicable federal and state laws and regulations.
- Prior to submitting the data to an NIH-designated data repository, data will be stripped of identifiers such as name, address, account and other identification numbers and will be de-identified by standards consistent with the Common Rule. Safeguards to protect the data according to Federal standards for information protection will be implemented.
- Access to de-identified participant data will be controlled, unless participants explicitly consent to allow unrestricted access to and use of their data for any purpose.
- Because it may be possible to re-identify de-identified genomic data, even if access to data is controlled and data security standards are met, confidentiality cannot be guaranteed, and re-identified data could potentially be used to discriminate against or stigmatize participants, their families, or groups. In addition, there may be unknown risks.
- No direct benefits to participants are expected from any secondary research that may be conducted.
- Participants may withdraw consent for research use of genomic or phenotypic data at any time without penalty or loss of benefits to which the participant is otherwise entitled. In this event, data will be withdrawn from any repository, if possible, but data already distributed for research use will not be retrieved.
- The name and contact information of an individual who is affiliated with the institution, is familiar with the research, and will be available to address participant questions
- The data submission is consistent, as appropriate, with applicable national, tribal, and state laws and regulations as well as relevant institutional policies;
- Any limitations on the research use of the data, as expressed in the informed consent documents, are delineated;
- The identities of research participants will not be disclosed to NIH-designated data repositories; and
- An IRB, privacy board, and/or equivalent body, as applicable, has reviewed the investigator’s proposal for data submission and assures that:
  - The protocol for the collection of genomic and phenotypic data is consistent with 45 CFR Part 46;
  - Data submission and subsequent data sharing for research purposes are consistent with the informed consent of study participants from whom the data were obtained;
  - Consideration was given to risks to individual participants and their families associated with data submitted to NIH-designated data repositories and subsequent sharing;
  - To the extent relevant and possible, consideration was given to risks to groups or populations associated with submitting data to NIH-designated data repositories and subsequent sharing; and
  - The investigator’s plan for de-identifying datasets is consistent with the standards outlined in the GDS Policy (see section IV.C.1. in the GDS Policy).
- Detailed Data Sharing Plan
- Any known data use limitations
- Indication of whether any aggregate-level data are appropriate for general research use
- Contact information of the relevant NIH GDS Program Administrator
- Human Subjects Protocol (most current version)
- Detailed Data Sharing Plan
- Informed consent document (all versions) delineating whether participants’ individual-level data will be shared through unrestricted or controlled access repositories
- Any known data use limitations
- Indication of whether any aggregate-level data are appropriate for general research use
- Contact information of the relevant NIH GDS Program Administrator