Kyoto Encyclopedia of Genes and Genomes (KEGG), FTP - kegg-ftp

Grade
The grade for the resource as automatically determined by the criteria violations.
2
Description
A full description of the resource from the resource itself, if possible.
KEGG is an integrated database resource consisting of the seventeen main databases including systems, genomic, chemical, and health information.
Last curated
(Optional) The ISO 8601 date of when the resource was last curated.
2017-10-17
Location
URL for the resource.
http://kegg.jp
Source type
(Optional) How the resource relates to the data it contains. Current allowable entries are: "unknown", "repository", "source", and "integrator".
TBD
Curation status
Whether or not annotation is complete on this resource. Current allowable entries are: "complete", "incomplete", and "nonpublic".
complete
Field
The area of research for the resource.
biology
Type
The type of data the resource contains.
genomic resource
Categories
(Optional) Tags to describe the resource and its data.
gene-pathway association
disease-gene association
orthology
Access
(Optional) Links to the resource's data.
download
License
The license that is used by the resource. We use SPDX where we can (https://spdx.org/licenses/) or: "unknown", "public domain", "all rights reserved", or "custom".
custom
License type
The type of license that is being used. This will be used to define compatible data pools in the future; for now we use only the broadest terms. If it is not known, "TODO" is used. Current possible values are: "unknown", "copyleft", "permissive", "public domain", "copyright", "restrictive", or "closed pool".
restrictive
License location
(Optional) The link to the resource license.
http://www.kegg.jp/kegg/legal.html
Focused curation
(Optional) Setting this flag to true indicates that the licensing was combinatorially complicated enough (as is the case in some commercial licenses) that the curator chose to wear a single "hat" during the process. From the site text: "While we try to cover as much of the licensing possibilities of a data resource that we can, in a few cases we may choose a particular "hat" to wear while evaluating to prevent a combinatorial explosion, which may also reduce the clarity of our curations for the community. In these cases, we may take on the role of a (1) non-commercial (2) academic (3) group that is (4) based in the US and trying to (5) create an aggregating resource, noting that other entities may have different results in the license commentary."
false
Issues
(Optional) Structured issues with the license. For every issue discovered with a resource, there should be a corresponding item in the license-issues field that marks the /exact/ violation, along with any comments. Resources can use this field as a first step toward improvement, and it can also clarify any surrounding circumstances. Any issues or thoughts about a resource that do not slot into one of the criteria violations can go into the license-commentary field.
Criteria A.2: For our use case (see commentary), KEGG (FTP) uses custom licensing through NPO Bioinformatics Japan (https://www.bioinformatics.jp/en/keggftp.html).
Criteria B.1: According to the organizational use agreement: "Your Product or Service must not allow Your users to obtain KEGG FTP Data, except in small quantities"; many uses would require negotiation here as continuing reuse is unclear.
Criteria D.1.2: The licensing terms are too restrictive for reasonable reuse (e.g. B.1 violation above).
Criteria E.1.2: Given our interpretation of the licensing terms, there is unlikely to be much ability to freely reuse the KEGG FTP data within any class.
Commentary
(Optional) Further commentary on the license, possibly including the thought process of the curations and things like locations of additional licenses.
• KEGG (FTP) presents special problems because its licensing depends on variables such as academic/non-academic, group/individual, commercial/non-commercial, and location. While the final terms vary somewhat, for the sake of convenience we act as a non-commercial academic group based in the US wanting to access the entire database. Much of the curation was then based on the organizational use agreement: https://www.bioinformatics.jp/docs/subscription_organizational.pdf .
• Given all of the possible use cases, license verbiage, and special conditions, KEGG (FTP) was exceptionally difficult to curate.
• Academic users license through NPO Bioinformatics Japan (https://www.bioinformatics.jp/en/keggftp.html).
• Non-academic users license through Pathway Solutions (http://www.pathway.jp/).
• While there are many possible combinations, any individual or use case should have a single license that they will need to get, giving us a pass on A.1.
• While we did not go through the registration process, C.1 seems like it would be cleared by the directory layout description on http://www.kegg.jp/kegg/download/ .
• While we did not go through the registration process, C.2 seems like it would be cleared by the implied straightforward FTP access after licensing.
• Also see KEGG API and KEGG REST resources.
Controversial
(Optional) Marker noting that there was some extended internal discussion or controversy about the evaluation of the licensing terms. If this is marked as "true", the controversy, or a link to a permanent archive of the controversy, must be sufficiently contained in the "license-commentary" to reconstruct the issues.
true
Contacts
(Optional) Resource contact information, link, email, or whatever is public.
http://www.genome.jp/feedback/
Grants
(Optional) Semi-structured list of supporting grants.
TBD
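The allowable entries quoted in the field descriptions above lend themselves to a mechanical check. The following is a minimal sketch only, assuming a record is held as a plain Python dict; the key names ("source-type", "status", "license-type", "last-curated") and the check_record helper are hypothetical and are not taken from the project's actual schema or tooling.

```python
from datetime import date

# Allowable entries as quoted in the field descriptions of this record.
# The dict keys are hypothetical names chosen for this illustration.
ALLOWED = {
    "source-type": {"unknown", "repository", "source", "integrator"},
    "status": {"complete", "incomplete", "nonpublic"},
    "license-type": {
        "unknown", "copyleft", "permissive", "public domain",
        "copyright", "restrictive", "closed pool",
    },
}


def check_record(record):
    """Return a list of human-readable problems found in a record dict."""
    problems = []
    # Compare each controlled field against its allowable entries;
    # fields that are absent are skipped rather than reported.
    for key, allowed in ALLOWED.items():
        value = record.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key}: {value!r} is not an allowable entry")
    # The "Last curated" field is described as an ISO 8601 date.
    last_curated = record.get("last-curated")
    if last_curated is not None:
        try:
            date.fromisoformat(last_curated)
        except ValueError:
            problems.append(f"last-curated: {last_curated!r} is not an ISO 8601 date")
    return problems


# Values copied from this record.
record = {
    "source-type": "TBD",
    "status": "complete",
    "license-type": "restrictive",
    "last-curated": "2017-10-17",
}
problems = check_record(record)
```

Run against the values shown in this record, the sketch reports only the "TBD" placeholder under Source type, since that string is not among the listed allowable entries; the other checked fields pass.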

All copyrightable materials on this site are © 2017 the (Re)usable Data Project under the CC-BY 4.0 license.
ReusableData.org is funded by the National Center for Advancing Translational Sciences (NCATS) OT3 TR002019 as part of the Biomedical Data Translator project.
The (Re)usable Data Project would like to acknowledge the assistance of many more people than can be listed here. Please visit the about page for the full list.