Options for Publishing YOPP Datasets

(by SiriJodha S. Khalsa)

The Year of Polar Prediction (YOPP) encourages good data management practices among the YOPP-endorsed projects, and facilitates the documentation and discoverability of datasets through the YOPP Data Portal. This article offers guidance on the options available to researchers doing YOPP-related work for publishing their data.

Data, in the form of observations and numerical simulations, is the foundation YOPP will build upon to achieve its objectives. The outputs of the research that is done using YOPP data also need to be discoverable and accessible. In fact, most publishers now require that the data upon which a manuscript is based be openly accessible. Funding agencies are also now requiring that data generated through publicly funded research be made openly available for the purposes of reuse and reproducibility.

Publication of Research Data
The publication of research data, as a scholarly output in its own right, stems from several drivers, among them: 1) the desire of researchers to publish as many works as possible, 2) the desire of dataset creators to be given recognition for their work, and 3) the desire of repository managers to quantify the impact of the data in their archives [1]. This has led to the creation of numerous journals focusing solely on datasets, and in some instances also on experimental setup, data collection, and analysis methodologies.

Digital Object Identifier (DOI)
When publishing research results, it is important that the data used in the study is properly cited. The YOPP Data Portal provides basic guidelines for citing data in publications. Data citations aid in reproducibility, provide credit to the people and institutions who were essential for the data production, aid in tracking the use and impact of a dataset, increase the potential for finding new collaborators, and help future users learn how others have used a dataset [2]. Data citation is greatly aided by having a digital object identifier (DOI) assigned to the dataset. Many data repositories now have the capability of assigning DOIs to the datasets they curate.
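
As a rough illustration of how a DOI supports citation, the following Python sketch normalizes a DOI written in any of the forms used in the reference list of this article into a resolver URL, and assembles a citation string from it. The `DatasetRecord` class, its field names, and the citation layout are assumptions made for this example only; they are not the YOPP Data Portal's metadata schema, and a repository's own recommended citation should take precedence.

```python
from dataclasses import dataclass

# Prefixes under which a DOI commonly appears in reference lists.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


@dataclass
class DatasetRecord:
    """Hypothetical minimal citation metadata (field names are assumptions)."""
    authors: str
    year: int
    title: str
    version: str
    repository: str
    doi: str


def doi_to_url(doi: str) -> str:
    """Return an https://doi.org/ URL for a bare, 'doi:'-prefixed, or URL-form DOI."""
    cleaned = doi.strip()
    for prefix in _DOI_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # All DOI names begin with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


def format_citation(rec: DatasetRecord) -> str:
    """Assemble one possible citation layout (an assumption, not a standard)."""
    return (
        f"{rec.authors} ({rec.year}): {rec.title}, Version {rec.version}. "
        f"{rec.repository}. {doi_to_url(rec.doi)}"
    )


# Sample values taken from reference [2] of this article.
example = DatasetRecord(
    authors="ESIP Data Preservation and Stewardship Committee",
    year=2019,
    title="Data Citation Guidelines for Earth Science Data",
    version="2",
    repository="ESIP",
    doi="doi:10.6084/m9.figshare.8441816.v1",
)
print(format_citation(example))
```

Because the identifier is stored once and rendered as a resolver URL, the same record can feed both a human-readable citation and a machine-followable link.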

Publishing a Data Paper
Publishing a data article is another method of obtaining a DOI. A data paper can supply details on the collection, processing, file structure and other aspects of a dataset without going into the specifics of the scientific analysis. Material that is often relegated to the “supplementary material” of a journal article can be expanded upon and made into a separate publication. This makes it possible to establish ownership of the dataset, especially if it is required to be made open immediately after collection, ahead of research results. The Joint Declaration of Data Citation Principles [3] states that

"Data citations should be accorded the same importance in the scholarly record as citations of other research objects such as publications."

A data article should therefore be cited in every subsequent publication that makes use of the data.

Data is increasingly viewed as part of a scholarly ecosystem, which also includes software for data management and analysis, and the workflows used in the research process. The ultimate aim, which the FAIR Principles are intended to support, is to enable machines to automatically find and use data to generate new knowledge [4].

The YOPP ICO will advertise any published YOPP data articles in its newsletter and on its website. The YOPP Data Portal will display the citation for any data article describing data cataloged in the portal, provided this information is included in the metadata that the Portal harvests. Alternatively, if the metadata has been submitted to the YOPP Data Portal via the metadata collection form, the “Dataset citation” fields will need to be completed.


[1] Callaghan, S., 2019. Research Data Publication: Moving Beyond the Metaphor. Data Science Journal, 18(1), p.39. https://doi.org/10.5334/dsj-2019-039

[2] ESIP Data Preservation and Stewardship Committee (2019): Data Citation Guidelines for Earth Science Data, Version 2. ESIP. Online resource. https://doi.org/10.6084/m9.figshare.8441816.v1

[3] Data Citation Synthesis Group (2014): Joint Declaration of Data Citation Principles, Martone M. (ed.) San Diego CA: FORCE11. Online resource. https://doi.org/10.25490/a97f-egyk

[4] Wilkinson, M., et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18