dbGaP (public) - dbgap

Grade
The grade for the resource as automatically determined by the criteria violations.
Cannot be determined with current information
Description
A full description of the resource from the resource itself, if possible.
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in Humans. Provides authorized access to protected and raw data (e.g., Genotype-Tissue Expression (GTEx) project).
Last curated
(Optional) The ISO 8601 date of when the resource was last curated.
Unknown
Location
URL for the resource.
https://www.ncbi.nlm.nih.gov/gap
Source type
(Optional) How the resource relates to the data it contains. Current allowable entries are: "unknown", "repository", "source", "integrator", and "warehouse".
TBD
Curation status
Whether or not annotation is complete on this resource. Current allowable entries are: "complete", "incomplete", and "nonpublic".
complete
Field
The area of research for the resource.
biology
Type
The type of data the resource contains.
human
Categories
(Optional) Tags to describe the resource and its data.
genotype-phenotype
Access
(Optional) Links to the resource's data.
download
License
The license that is used by the resource. We use SPDX where we can or: "inconsistent", "public domain", "unlicensed", "all rights reserved", or "custom".
inconsistent
License type
The type of license that is being used. This will be to define compatible data pools in the future; we only use the grossest terms now. If it is not known "unknown" is used. Current possible values are: "unknown", "unlicensed", "copyleft", "permissive", "public domain", "copyright", "restrictive", or "private pool".
unknown
License location
(Optional) The link to the resource license.
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-14-124.html
Focused curation
(Optional) Setting this flag to true indicates that the licensing was combinatorially complicated enough (as is the case in some commercial licenses) that the curator chose to wear a single "hat" during the process. From the site text: "While we try to cover as much of the licensing possibilities of a data resource that we can, in a few cases we may choose a particular "hat" to wear while evaluating to prevent a combinatorial explosion, which may also reduce the clarity of our curations for the community. In these cases, we may take on the role of a (1) non-commercial (2) academic (3) group that is (4) based in the US and trying to (5) create an aggregating resource, noting that other entities may have different results in the license commentary."
false
Issues
(Optional) Structured issues with the license. For every issue discovered with a resource, there should be a corresponding item in the license-issues field that marks the /exact/ violation, along with any comments. This field can be used by resources as the first step to improvement, as well as clarify any surrounding circumstances. Any issues or thoughts about a resource that do not slot into one of the criteria violations can go into the license-commentary field.
Criteria A.1.1: Per the dbGaP data use certification, 'The terms and conditions of using dbGaP data vary by study'. All terms and conditions are to align with NIH GDS.
Criteria C.1: Cannot access all the data.
Criteria C.2: Access methods are not transparent.
Commentary
(Optional) Further commentary on the license, possibly including the thought process of the curations and things like locations of additional licenses.
• The data seems to include a large amount of identifiable information and is controlled as such.
• Unrestricted data also exists, and thus the NIH GDS data use policy is two-tiered.
• Controlled-access policy 'https://www.ncbi.nlm.nih.gov/books/NBK99227/'
• General data sharing information (see data sharing policy section) 'https://www.ncbi.nlm.nih.gov/books/NBK5296/'
• The tone and language of the NIH GDS policy seem non-uniform and read as case-by-case.
• Example policy submitted by the GTEx project 'https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?view_pdf&stacc=phs000424.v1.p1'
• NIH GDS states 'NIH expects investigators and their institutions to provide basic plans for following this Policy'; however, we assume the burden stays with the data reuser to contact the source with any questions or issues that arise affecting criteria B.
• Regarding institutional affiliation 'https://www.ncbi.nlm.nih.gov/books/NBK45308/'
• A.2: unknown license type due to 'The terms and conditions of using dbGaP data vary by study'.
• B.1: Negotiation needed for controlled-access data; license type or terms and conditions are unknown.
• B.2.2: Scoping is incomplete as license type or terms and conditions are unknown.
• D.1.2: Because the terms and conditions vary by study, cannot comment on downstream reuse/remix.
• E.1.1: Language in the NIH GDS policy may be interpreted by a non-legal professional to mean that the contents may be reused by researchers (at NIH or with institutional affiliation).
Controversial
(Optional) Marker noting that there was some extended internal discussion or controversy about the evaluation of the licensing terms. If this is marked as "true", the controversy, or a link to a permanent archive of the controversy, must be sufficiently contained in the "license-commentary" to reconstruct the issues.
false
Contacts
(Optional) Resource contact information, link, email, or whatever is public.
https://www.ncbi.nlm.nih.gov/books/NBK5296/
Grants
(Optional) Semi-structured list of supporting grants.
TBD
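Several fields above list their allowable entries, so a record like this one can be checked mechanically. The following is a minimal sketch only: the key names ("source-type", "status", "license-type") and the function `check_record` are hypothetical and are not taken from the project's actual schema or tooling. Only the sets of allowable entries are copied from the field descriptions in this record.

```python
# Allowable entries, copied from the field descriptions in this record.
# The dictionary keys are hypothetical names chosen for illustration.
ALLOWED = {
    "source-type": {"unknown", "repository", "source", "integrator", "warehouse"},
    "status": {"complete", "incomplete", "nonpublic"},
    "license-type": {
        "unknown", "unlicensed", "copyleft", "permissive",
        "public domain", "copyright", "restrictive", "private pool",
    },
}


def check_record(record):
    """Return (field, value) pairs whose value is not an allowable entry."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        # Fields missing from the record are skipped, not reported.
        if value is not None and value not in allowed:
            problems.append((field, value))
    return problems


# Values as they appear in this record.
dbgap = {"source-type": "TBD", "status": "complete", "license-type": "unknown"}
print(check_record(dbgap))
```

Under these assumptions, the check reports the "TBD" source type as a placeholder that falls outside the listed entries, while the curation status and license type pass.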

All copyrightable materials on this site are © 2019 the (Re)usable Data Project under the CC-BY 4.0 license.
The (Re)usable Data Project is funded by the National Center for Advancing Translational Sciences (NCATS) OT3 TR002019 as part of the Biomedical Data Translator project and U24TR002306 as part of the CTSA Program National Center for Data to Health (CD2H).
The (Re)usable Data Project would like to acknowledge the assistance of many more people than can be listed here. Please visit the about page for the full list.