
What Is Dataset Citation?
Dataset citation is the practice of referencing datasets used or generated in a research project, in the same way that you would cite a journal article or book.
It ensures that the data creators receive credit, that the dataset can be located, verified, and reused, and that your research remains transparent and reproducible.
Datasets are now recognized as first-class research outputs: they are citable, measurable, and indexed by repositories such as Zenodo, Figshare, Dryad, and the Open Science Framework (OSF).
Why Dataset Citation Matters
Reason | Description |
---|---|
Transparency | Readers can verify results using the same data. |
Reproducibility | Other researchers can replicate your work. |
Credit | Dataset creators receive formal recognition (via DOI). |
Compliance | Many journals and funders (e.g., the European Commission, NIH, UKRI) now require or strongly encourage data citation. |
Linkage | Connects data, software, and publications in the scholarly record. |
Core Principles (Joint Declaration of Data Citation Principles – JDDCP)
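- Importance: Data should be considered legitimate, citable products of research.
- Credit and Attribution: Citations should give scholarly credit and attribution to all contributors to the data.
- Evidence: Whenever a claim relies on data, the corresponding data should be cited.
- Unique Identification: A citation should include a persistent, machine-actionable, globally unique identifier (such as a DOI or Handle).
- Access: Citations should facilitate access to the data and to its associated metadata and documentation.
- Persistence: Identifiers and metadata should persist, even beyond the lifespan of the data they describe.
- Specificity and Verifiability: Citations should identify the exact dataset used, including the version or portion where relevant.
- Interoperability and Flexibility: Citation methods should accommodate differing community practices without compromising interoperability across communities.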
Basic Dataset Citation Format (APA 7th Edition)
Author(s). (Year). Title of dataset [Data set]. Publisher. DOI
Example:
Smith, J. A., & Lee, T. (2024). Global coral reef bleaching dataset (v2.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1234567
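The template above can be applied mechanically once the metadata is in hand. The following is a minimal sketch, not a full APA implementation: the `DatasetRecord` class and both function names are hypothetical, author names are assumed to be pre-formatted as "Family, I. I.", and APA rules for very long author lists are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Minimal citation metadata for one dataset (hypothetical structure)."""
    authors: List[str]            # already formatted as "Family, I. I."
    year: int
    title: str
    publisher: str                # usually the repository name
    doi: str                      # bare DOI, without the https://doi.org/ prefix
    version: Optional[str] = None


def join_authors_apa(authors: List[str]) -> str:
    """Join names as 'A', 'A, & B', or 'A, B, & C'."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", & " + authors[-1]


def format_apa(record: DatasetRecord) -> str:
    """Build 'Author(s). (Year). Title (version) [Data set]. Publisher. DOI'."""
    title = record.title
    if record.version:
        title += f" ({record.version})"
    return (
        f"{join_authors_apa(record.authors)} ({record.year}). "
        f"{title} [Data set]. {record.publisher}. "
        f"https://doi.org/{record.doi}"
    )


example = DatasetRecord(
    authors=["Smith, J. A.", "Lee, T."],
    year=2024,
    title="Global coral reef bleaching dataset",
    publisher="Zenodo",
    doi="10.5281/zenodo.1234567",
    version="v2.1",
)
print(format_apa(example))
```

The printed string reproduces the example citation above. Italics for the title are a presentation concern and are left to the reference manager or typesetting step.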
Citing Datasets in Other Styles
Style | Example |
---|---|
MLA | Smith, John A., and Tina Lee. “Global Coral Reef Bleaching Dataset (v2.1).” Zenodo, 2024, doi:10.5281/zenodo.1234567. |
Chicago | Smith, John A., and Tina Lee. Global Coral Reef Bleaching Dataset (v2.1). Zenodo, 2024. https://doi.org/10.5281/zenodo.1234567. |
Vancouver | Smith JA, Lee T. Global coral reef bleaching dataset (v2.1) [dataset]. Zenodo; 2024. Available from: https://doi.org/10.5281/zenodo.1234567 |
IEEE | J. A. Smith and T. Lee, “Global coral reef bleaching dataset (v2.1),” Zenodo, 2024. DOI: 10.5281/zenodo.1234567 |
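Rather than formatting each style by hand, a formatted citation can often be requested from the DOI resolver itself through DOI content negotiation, which DataCite and CrossRef DOIs generally support. The sketch below only illustrates the request shape; the function names are hypothetical, and the mapping from the style names in the table to CSL style identifiers is an assumption that should be checked against the CSL style repository.

```python
import urllib.request

# Assumed mapping from the style names used above to CSL style identifiers.
CSL_STYLES = {
    "APA": "apa",
    "MLA": "modern-language-association",
    "Chicago": "chicago-author-date",
    "Vancouver": "vancouver",
    "IEEE": "ieee",
}


def build_citation_request(doi: str, csl_style: str = "apa") -> urllib.request.Request:
    """Prepare a content-negotiation request asking for a formatted citation."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": f"text/x-bibliography; style={csl_style}"},
    )


def fetch_citation(doi: str, csl_style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (requires network access)."""
    request = build_citation_request(doi, csl_style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling `fetch_citation(doi, CSL_STYLES["IEEE"])` with a real dataset DOI would return the citation as formatted by the registration agency. The result depends on the metadata the repository registered, so it should still be checked against the target journal's style guide.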
How to Obtain a DOI for Your Dataset
Most open data repositories mint a DOI (Digital Object Identifier) when you publish a deposit, typically through DataCite; CrossRef DOIs are more commonly assigned to articles and other publications.
Recommended repositories:
- Zenodo – https://zenodo.org
- Figshare – https://figshare.com
- Dryad – https://datadryad.org
- Harvard Dataverse – https://dataverse.harvard.edu
- Open Science Framework (OSF) – https://osf.io
How Europub Supports Dataset Citation
Europub integrates dataset citation in:
- Journal metadata and article records
- Certificate issuance via the Certificate Management System
- Validation through CrossRef DOI linking
- Optional display on certificates for data creators and publishers
Access: https://cms.europub.co.uk
Best Practices for Authors
- Always provide the DOI or persistent link of each dataset used.
- Clearly state whether the dataset is public, restricted, or proprietary.
- Use the Data Availability Statement (DAS) section in your article.
- Avoid citing URLs that may change; use permanent identifiers.
- Acknowledge secondary data sources properly.
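To apply the advice on permanent identifiers consistently, DOIs written in different forms can be normalized to a single resolvable link before they go into a reference list. This is a rough sketch: the function names are hypothetical, and the regular expression is a commonly used approximation of modern DOI syntax, not a complete validator.

```python
import re

# Approximate pattern for modern DOIs: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly put in front of a DOI.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(value: str) -> str:
    """Strip common prefixes and return the bare DOI, or raise ValueError."""
    doi = DOI_PREFIX.sub("", value.strip())
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not recognized as a DOI: {value!r}")
    return doi


def doi_url(value: str) -> str:
    """Return the persistent https://doi.org/ link for a DOI in any common form."""
    return "https://doi.org/" + normalize_doi(value)


for raw in ("doi:10.5281/zenodo.1234567", "http://dx.doi.org/10.5281/zenodo.1234567"):
    print(doi_url(raw))
```

A repository landing-page URL such as a record page is rejected by `normalize_doi`, which is the intended behavior: the author should look up the DOI shown on that page and cite it instead.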
For Publishers and Editors
- Encourage authors to include formal dataset citations in reference lists.
- Require a Data Availability Statement (DAS).
- Use CrossRef + DataCite to interlink article DOIs with dataset DOIs.
- Reward data sharing with “Open Data Badges” or certificates.
- Include dataset citation metrics in impact evaluations.
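Interlinking article and dataset DOIs is usually done in the deposited metadata. As a sketch, the function below builds one entry in the shape used by the `relatedIdentifiers` field of DataCite metadata. The function name and the article DOI are hypothetical, and the relation types listed are only a subset of the DataCite vocabulary.

```python
from typing import Dict

# A subset of relation types defined in the DataCite metadata schema.
RELATION_TYPES = {
    "IsCitedBy", "Cites",
    "IsSupplementTo", "IsSupplementedBy",
    "IsReferencedBy", "References",
}


def related_identifier(article_doi: str, relation_type: str = "IsSupplementTo") -> Dict[str, str]:
    """Describe how a dataset relates to an article, as a relatedIdentifiers entry."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"unsupported relation type: {relation_type}")
    return {
        "relatedIdentifier": article_doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


# Hypothetical article DOI, for illustration only.
print(related_identifier("10.1234/example.2024.001"))
```

The entry is written from the dataset's point of view: "this dataset is a supplement to that article". The article's own metadata would carry the reverse relation through the publisher's DOI registration.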
Data Availability Statement Example
“The dataset supporting the findings of this study is available in Zenodo at https://doi.org/10.5281/zenodo.1234567, under a CC-BY 4.0 license.”
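Journals that handle many submissions sometimes generate such statements from structured fields. The following is a small sketch of that idea; the function name, the `access` values, and the wording of the restricted-access sentence are illustrative assumptions, not a prescribed template.

```python
def data_availability_statement(
    repository: str,
    doi: str,
    license_name: str = "",
    access: str = "public",
) -> str:
    """Compose a one-paragraph Data Availability Statement from basic fields."""
    url = f"https://doi.org/{doi}"
    if access == "public":
        sentence = (
            "The dataset supporting the findings of this study "
            f"is available in {repository} at {url}"
        )
        if license_name:
            sentence += f", under a {license_name} license"
        return sentence + "."
    if access == "restricted":
        # Illustrative wording; adapt to the journal's own policy.
        return (
            "The dataset supporting the findings of this study "
            f"is deposited in {repository} at {url}. "
            "Access is restricted; requests should be directed to the data custodian."
        )
    raise ValueError(f"unknown access level: {access}")


print(data_availability_statement("Zenodo", "10.5281/zenodo.1234567", "CC-BY 4.0"))
```

With the values shown, the output matches the example statement above. Proprietary data that cannot be deposited needs a hand-written statement explaining why.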
Recommended Tools
Tool | Purpose | Link |
---|---|---|
DataCite Commons | Search dataset DOIs and their metadata | https://commons.datacite.org |
Zenodo / Figshare | Upload & cite datasets | https://zenodo.org, https://figshare.com |
CrossRef Metadata Search | Verify DOI linking | https://search.crossref.org |
Europub CMS | Generate certificates linking DOIs & datasets | https://cms.europub.co.uk |
Ethics and Licensing
- Use open licenses (e.g., CC BY, CC0) when possible.
- Always acknowledge data creators.
- Avoid uploading sensitive or personal data without consent.
- Verify dataset integrity before reuse.
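One common way to verify integrity before reuse is to compare a downloaded file's checksum with the value the repository publishes alongside it. This is a minimal sketch, assuming the repository lists a hexadecimal checksum and names the algorithm; the function names are hypothetical.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Compute a hex digest by reading the file in chunks (safe for large files)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True when the file's digest matches the published checksum."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

A typical call would be `verify_file("reef_data.csv", published_checksum)`, where both the file name and the checksum variable are placeholders. A mismatch means the download is incomplete or the file differs from the deposited version, and it should not be used until the discrepancy is resolved.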
Frequently Asked Questions (FAQ)
Q1. Can datasets be cited like journal articles?
Yes. Datasets with DOIs are formally citable, and their citations can be tracked by services that index DOI links.
Q2. How can I check if a dataset has a DOI?
Search in DataCite Commons or CrossRef using the dataset title or author.
Q3. What if the dataset doesn’t have a DOI?
Deposit it in a repository that issues DOIs, such as Zenodo or Figshare.
Q4. Can I cite my own dataset?
Yes. Citing your own dataset is allowed and encouraged for transparency.
Q5. Are dataset citations indexed in Scopus and Web of Science?
Partially, but integration is increasing through DataCite and CrossRef links.
Q6. Should datasets be included in the reference list?
Yes. Formal citations ensure traceability and recognition.
Q7. What’s the difference between a data citation and a data availability statement?
A data citation gives credit to the data creators; a data availability statement explains where and how to access the data.
Q8. Can supplementary data be treated as datasets?
Yes, if it is stored separately with its own DOI; cite it as a dataset.
Q9. How do I verify dataset ownership?
Check repository metadata (creator ORCID, publisher, license).
Q10. How does Europub validate datasets?
Each dataset DOI is cross-checked via CrossRef or DataCite APIs before inclusion in certificates.
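The metadata checks mentioned in Q2 and Q9 can be approximated against the public DataCite REST API. The sketch below is not Europub's implementation. It assumes the JSON layout returned by the `/dois/{doi}` endpoint (a `data.attributes` object with `creators`, `publisher`, and `rightsList`), the function names are hypothetical, and the sample payload is invented for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

DATACITE_API = "https://api.datacite.org/dois/"


def datacite_url(doi: str) -> str:
    """Build the lookup URL for one DOI."""
    return DATACITE_API + urllib.parse.quote(doi, safe="/")


def fetch_record(doi: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Download the metadata record (requires network access)."""
    with urllib.request.urlopen(datacite_url(doi), timeout=timeout) as response:
        return json.load(response)


def summarize_record(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull out the fields useful for checking ownership and licensing."""
    attrs = payload["data"]["attributes"]
    creators = []
    for creator in attrs.get("creators", []):
        orcids = [
            ident.get("nameIdentifier")
            for ident in creator.get("nameIdentifiers", [])
            if ident.get("nameIdentifierScheme", "").upper() == "ORCID"
        ]
        creators.append({"name": creator.get("name", ""), "orcid": orcids[0] if orcids else None})
    publisher = attrs.get("publisher")
    if isinstance(publisher, dict):  # some responses return an object instead of a string
        publisher = publisher.get("name")
    licenses = [r.get("rightsIdentifier") or r.get("rights") for r in attrs.get("rightsList", [])]
    return {"doi": attrs.get("doi"), "creators": creators, "publisher": publisher, "licenses": licenses}


# Invented sample payload; the ORCID is a placeholder, not a real identifier.
sample = {"data": {"attributes": {
    "doi": "10.5281/zenodo.1234567",
    "creators": [{"name": "Smith, John A.", "nameIdentifiers": [
        {"nameIdentifier": "https://orcid.org/0000-0000-0000-0000", "nameIdentifierScheme": "ORCID"}]}],
    "publisher": "Zenodo",
    "rightsList": [{"rights": "Creative Commons Attribution 4.0 International",
                    "rightsIdentifier": "cc-by-4.0"}],
}}}
print(summarize_record(sample))
```

For a live check, `summarize_record(fetch_record(doi))` would return the same fields for a DOI registered with DataCite. A DOI registered elsewhere will not be found at this endpoint and needs the corresponding registration agency's lookup instead.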