About
This page accompanies a paper presented on April 22, 2015 (9:00 to 9:45 am EDT) as part of JATS-Con 2015 in the Lister Hill Auditorium at the National Library of Medicine in Bethesda, Maryland.
Title
Adapting JATS to support data citation
Authors
Daniel Mietchen, Johanna McEntyre, Jeff Beck, Chris Maloney; Force11 Data Citation Implementation Group
Abstract
Data referred to in articles is usually not cited in a consistent or structured fashion. To address this, Force11 has developed the Joint Declaration of Data Citation Principles. JATS 1.1d1 has provisions for citing articles and other sources, but does not offer straightforward ways of expressing some of the concepts needed for data citation. To facilitate the citation of data in JATS-tagged documents in a way that complies with the Joint Declaration of Data Citation Principles, the Force11 Data Citation Implementation Group held a meeting in June 2014, at which several new elements, attributes, and attribute values were proposed for addition to JATS. These have since been submitted to the JATS Standing Committee, which largely accepted them, so they are now included in the draft standard JATS 1.1d2. This talk will provide background on the decision criteria behind the proposed elements and on how they were selected for JATS 1.1d2. It will also provide suggested examples of the use of the new tags.
The full paper is available via http://www.ncbi.nlm.nih.gov/books/NBK280240/.
Formats
Quiz
Who likes standards updates? Slides 33-44 in What’s New in JATS since 1.0?
Rationale
FAIR data Guiding Principles
- Data Objects (Identifiable Data Item with Data elements + Metadata + an Identifier) should be
  - Findable
  - Accessible
  - Interoperable
  - Reusable
Data Citation Principles
- Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014 [https://www.force11.org/datacitation].
- The principles include
  - Evidence
    - In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
  - Unique Identification
    - A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
  - Access
    - Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
  - Interoperability and Flexibility
    - Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.
NIH Public Access Policy
“NIH will explore ways to advance data as a legitimate form of scholarship through data citation and other means.”
Options to extend JATS functionality
Getting new elements added to JATS itself
- NISO Access and License Indicators (ALI), available in JATS 1.1d3
A superset extension of JATS
- TaxPub
  - Catapano T. TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings 2010 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2010. Available from: http://www.ncbi.nlm.nih.gov/books/NBK47081/
  - Penev L, Catapano T, Agosti D, et al. Implementation of TaxPub, an NLM DTD extension for domain-specific markup in taxonomy, from the experience of a biodiversity publisher. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings 2012 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2012. Available from: http://www.ncbi.nlm.nih.gov/books/NBK100351/
Process
- Survey of
  - existing citation infrastructure in JATS 1.0
  - data citation practices
- Remote discussions via the Force11 Data Citation Implementation Working Group
- One-day workshop in London in June 2014
- Decision to extend JATS itself rather than create a superset extension
- Agreement reached on a set of suggestions for new elements, attributes and attribute values
- Submission of suggestions to JATS Standing Committee
- Response from JATS Standing Committee
- Incorporation into JATS 1.1d2
- Recommendation by JATS Standing Committee to NISO: adopt JATS 1.1d3 as JATS 1.1
New elements
<version>
- Similar to the existing JATS <edition> element and to the @version attribute of the <tex-math> element.
<data-title>
- Analogous to the <article-title> in a conventional citation.
- <source> can also be given, to identify the data repository.
The following example (which was added to the tag library) shows how <data-title> might be used.
<mixed-citation publication-type="data">Xu, J. <etal/> <data-title>Cross-platform ultradeep transcriptomic profiling of human reference RNA samples by RNA-Seq</data-title>. <source>Sci. Data</source> <volume>1</volume>:<elocation-id>140020</elocation-id> doi: <pub-id pub-id-type="doi" xlink:href='http://dx.doi.org/10.1038/sdata.2014.20'>10.1038/sdata.2014.20</pub-id> (<year iso-8601-date="2014">2014</year>). </mixed-citation>
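A sketch of how the new <version> element might sit alongside <data-title> and <source> in a data citation. Everything in it is a placeholder invented for illustration: the author, title, repository name, version number, and DOI (10.5555 is used here only as a dummy prefix) do not refer to a real dataset, and the xlink prefix is assumed to be declared on an ancestor element.

```xml
<!-- Hypothetical citation: all names, numbers, and identifiers are placeholders -->
<mixed-citation publication-type="data">
  <name><surname>Doe</surname><given-names>J.</given-names></name>
  <data-title>Example survey dataset</data-title>.
  Version <version>2.1</version>.
  <source>Example Data Repository</source>
  (<year iso-8601-date="2015">2015</year>).
  doi: <pub-id pub-id-type="doi"
    xlink:href="http://dx.doi.org/10.5555/example.dataset">10.5555/example.dataset</pub-id>
</mixed-citation>
```

Here <data-title> names the dataset, <version> records which release was used, and <source> names the repository that holds it.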
New attributes
@assigning-authority
- For elements <ext-link> and <pub-id>
- @pub-id-type used to be used to specify the authority; now it should be used only to specify the type of identifier
- For example, the authority behind a DOI might be given as assigning-authority="crossref"
Might-link attributes
- Many identifiers are associated with URLs, so they can be rendered as hyperlinks
- Indeed, in the linked data world, many identifiers are HTTP URIs.
- Therefore, the "might-link attributes" were added.
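To make the split between identifier type and authority concrete, the fragment below is a sketch that reuses the DOI and the PDB accession URL from the citation examples on this page. The first element keeps the type in @pub-id-type, the authority in @assigning-authority, and the resolvable link in xlink:href. The second element is an assumption about how the same pattern might look on <ext-link>; it is not copied from the tag library. The xlink prefix is assumed to be declared on an ancestor element.

```xml
<!-- Type of identifier, authority, and hyperlink are three separate attributes -->
<pub-id pub-id-type="doi" assigning-authority="crossref"
  xlink:href="http://dx.doi.org/10.1038/sdata.2014.20">10.1038/sdata.2014.20</pub-id>

<!-- Assumed usage: the same authority information carried on an external link -->
<ext-link ext-link-type="uri" assigning-authority="pdb"
  xlink:href="http://www.rcsb.org/pdb/explore/explore.do?structureId=102l">102l</ext-link>
```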
New values for attributes
@publication-type
- New value, "data", was added.
  - For "dataset, database, spreadsheet, et al."
@person-group-type
- New value, "curator", was added.
  - The Standing Committee has indicated that it will revisit this issue in light of CRediT (the Contributor Roles Taxonomy), which has just been published.
Example of the use of the "curator" value:
<mixed-citation> <person-group person-group-type='curator'> <name><surname>Frankis</surname><given-names>Michael</given-names></name> </person-group>, curator. "<data-title>Mountain bluebird</data-title>." <source>Encyclopedia of Life</source>, available from <ext-link ext-link-type='uri' xlink:href='http://eol.org/pages/1177542'>http://eol.org/pages/1177542</ext-link>. Accessed 30 Mar 2015. </mixed-citation>
@pub-id-type
- This attribute is used on <pub-id>
- Three new values were added:
  - accession - a unique identifier in many bioinformatics databases, for example, for protein or DNA sequences
  - ark - Archival Resource Key
  - handle - a Handle identifier
The following example shows how the "accession" value might be used. Note that it is accompanied by an @assigning-authority, to make clear the provenance of the identifier.
<mixed-citation publication-type='data'> <name><surname>Heinz</surname><given-names>D.W.</given-names></name>, <name><surname>Baase</surname><given-names>W.A.</given-names></name>, <etal>et al.</etal> <data-title>How amino-acid insertions are allowed in an alpha-helix of T4 lysozyme</data-title>. <source>RCSB Protein Data Bank</source>, accession <pub-id pub-id-type='accession' assigning-authority='pdb' xlink:href='http://www.rcsb.org/pdb/explore/explore.do?structureId=102l'>102l</pub-id>. <pub-id pub-id-type='doi' xlink:href='http://dx.doi.org/10.2210/pdb102l/pdb'>10.2210/pdb102l/pdb</pub-id> </mixed-citation>
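The "ark" and "handle" values can be sketched in the same way. The identifiers below are placeholders chosen for illustration and are not meant to resolve to real records; the resolver hosts n2t.net and hdl.handle.net are the ones commonly used for ARKs and Handles, and pairing them with these values is an assumption rather than something prescribed by JATS. The xlink prefix is assumed to be declared on an ancestor element.

```xml
<!-- Placeholder ARK: identifier type in @pub-id-type, resolver URL in xlink:href -->
<pub-id pub-id-type="ark"
  xlink:href="http://n2t.net/ark:/99999/fk4example">ark:/99999/fk4example</pub-id>

<!-- Placeholder Handle, tagged the same way -->
<pub-id pub-id-type="handle"
  xlink:href="http://hdl.handle.net/20.1000/100">20.1000/100</pub-id>
```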
Examples
For further examples, see our full paper.
Re-Quiz
Untagged citation:
Müller, C et al. (2005): Audio record of a 'singing iceberg' from the Weddell Sea, Antarctica. doi:10.1594/PANGAEA.339110, Supplement to: Müller, Christian; Schlindwein, Vera; Eckstaller, Alfons; Miller, Heinz (2005): Singing Icebergs. Science, 310, 12, doi:10.1126/science.1117145
Possible tagging solution:
<mixed-citation publication-type="data"> <name><surname>Müller</surname><given-names>C</given-names></name>, <etal>et al.</etal> (<year iso-8601-date="2005">2005</year>): <data-title>Audio record of a 'singing iceberg' from the Weddell Sea, Antarctica.</data-title> doi:<pub-id pub-id-type='doi' xlink:href='http://dx.doi.org/10.1594/PANGAEA.339110'>10.1594/PANGAEA.339110</pub-id> </mixed-citation>
Outlook
- JATS4R recommendations on data citation
- Outreach into the community
- Hopefully wide uptake
- Possible adjustments in response to feedback
- Adding license information to references, be they classical citations or data citations