CITATION GUIDE 9 MIN READ

How to cite a dataset (APA 7, MLA 9, Chicago, Harvard)

Dataset citations are about reproducibility. A reader needs the exact dataset, version, repository, creator, date, and persistent identifier. A vague URL to a data portal is not enough when the dataset can be updated, corrected, split, or replaced.

Written by Vikas Dulgunde, Software Engineer

When to use this source type

Use this source type when you cite a dataset from a repository, data archive, government portal, research project, or institutional database. Zenodo, Figshare, Dryad, ICPSR, NASA, World Bank, and government open-data portals often provide citation metadata with creators, versions, release dates, and DOIs.

Do not cite a dataset as a website when the data has its own title, version, DOI, or repository record. If you cite an analysis report that discusses data, cite the report. If you downloaded a live table, record the access date and version because the numbers may change after you submit your work.

When your analysis depends on filters, geography, date ranges, or variables, describe those choices in your methods section. The reference identifies the dataset, while your prose identifies the subset you actually used.

Quick reference table

The same source facts appear in each style, but they move around. Check the author role, date detail, title formatting, container, locator, and the one style-specific rule before you paste a citation into your reference list.

| Style | Author | Date | Title | Container | URL or locator | Style note |
| --- | --- | --- | --- | --- | --- | --- |
| APA 7 | Creator or organization. | Year or release date. | Dataset title italicized. | [Data set] label. | Repository or publisher. | DOI preferred over URL. |
| MLA 9 | Creator or organization. | Year after repository details. | Dataset title in quotation marks or italics by container. | Repository as container. | Version if available. | DOI or URL. |
| Chicago | Creator or organization. | Year after author. | Dataset title italicized or quoted by format. | Repository and version. | Publisher if separate. | DOI or URL. |
| Harvard | Creator or organization. | Year in parentheses. | Dataset title italicized. | [Data set] label where allowed. | Repository or publisher. | DOI or Available at URL. |

APA 7 walkthrough

APA 7 starts with the same basic question: who is responsible for this dataset? For a dataset, use the dataset creator, research group, or publishing organization. The date element uses the release year or version year, not the date you downloaded it unless no other date exists. The title element italicizes the dataset title and adds a data set label. The source element names the repository or publisher. Finally, the locator element uses the DOI as a persistent identifier, with URL only when no DOI exists. Work through those fields in order and the punctuation becomes much easier to control.

APA 7 expects a version number when the dataset provides one. Put the version near the title so readers can reproduce your analysis. In text, use (NASA Goddard Institute for Space Studies, 2024). If you quote directly, add the page, paragraph, timestamp, or legal pin cite required by the style. If your source is online, prefer a stable URL or DOI over a search-result link, and remove tracking parameters before you submit the reference.

NASA Goddard Institute for Space Studies. (2024). *Global temperature anomalies, 1880-2024* [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1234567
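The field order described above can be sketched in code. This is a minimal illustration, not an official APA tool: the `Dataset` class and its field names are hypothetical, and the `(Version X)` placement after the title is an assumption about how the version is usually shown. The organization name is passed in whole so that it is never inverted like a personal name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    # Hypothetical record type; field names are illustrative, not a repository schema.
    creator: str
    year: str
    title: str
    repository: str
    version: Optional[str] = None
    doi: Optional[str] = None
    url: Optional[str] = None


def apa7_reference(ds: Dataset) -> str:
    """Assemble author, date, title, source, and locator in APA 7 order."""
    title = f"*{ds.title}*"  # asterisks stand in for italics
    if ds.version:
        # Keep the version next to the title so readers can find the same release.
        title += f" (Version {ds.version})"
    # Prefer the DOI; fall back to a plain URL only when no DOI exists.
    locator = f"https://doi.org/{ds.doi}" if ds.doi else (ds.url or "")
    parts = [f"{ds.creator}.", f"({ds.year}).", f"{title} [Data set].", f"{ds.repository}."]
    if locator:
        parts.append(locator)
    return " ".join(parts)


example = Dataset(
    creator="NASA Goddard Institute for Space Studies",
    year="2024",
    title="Global temperature anomalies, 1880-2024",
    repository="Zenodo",
    doi="10.5281/zenodo.1234567",
)
print(apa7_reference(example))
```

Because the locator is chosen in one place, the DOI-over-URL rule cannot be skipped by accident when a record carries both.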

MLA 9 walkthrough

MLA 9 starts with the same basic question: who is responsible for this dataset? For a dataset, start with the dataset creator or organization. The date element uses the release year or repository date. The title element identifies the dataset as the cited work. The source element uses the repository as a container when appropriate. Finally, the locator element adds version, DOI, URL, and access date as needed. Work through those fields in order and the punctuation becomes much easier to control.

MLA dataset citations vary because repositories describe records differently. Keep creator, title, repository, version, date, and DOI visible. In text, use (NASA Goddard Institute for Space Studies). If you quote directly, add the page, paragraph, timestamp, or legal pin cite required by the style. If your source is online, prefer a stable URL or DOI over a search-result link, and remove tracking parameters before you submit the reference.

NASA Goddard Institute for Space Studies. "Global temperature anomalies, 1880-2024 [Data set]." *Zenodo*, 2024, https://doi.org/10.5281/zenodo.1234567. Accessed 15 Jan. 2025.
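When no DOI exists and a URL has to be used, removing tracking parameters can be scripted. The sketch below uses only Python's standard `urllib.parse` module. The list of tracking keys is an assumption and far from exhaustive, and `strip_tracking` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of common tracking keys; extend it for the portals you use.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid", "mc_cid", "mc_eid"}


def strip_tracking(url: str) -> str:
    """Return the URL without known tracking parameters, keeping all others."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_KEYS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_tracking("https://example.org/record/42?utm_source=news&version=2&fbclid=abc"))
```

Parameters such as `version` are kept on purpose, because they may identify the exact release you analyzed.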

Chicago walkthrough

Chicago starts with the same basic question: who is responsible for this dataset? For a dataset, use the data creator or organization in reference-list order. The date element places the year after the author. The title element uses the dataset title as the work title. The source element names the repository, publisher, and version where available. Finally, the locator element uses DOI or URL at the end. Work through those fields in order and the punctuation becomes much easier to control.

Chicago is flexible for datasets, but reproducibility is the test. Include version and access information whenever the data can change. In text, use (NASA Goddard Institute for Space Studies 2024). If you quote directly, add the page, paragraph, timestamp, or legal pin cite required by the style. If your source is online, prefer a stable URL or DOI over a search-result link, and remove tracking parameters before you submit the reference.

NASA Goddard Institute for Space Studies. 2024. "Global temperature anomalies, 1880-2024 [Data set]." Zenodo. Accessed January 15, 2025. https://doi.org/10.5281/zenodo.1234567.

Harvard walkthrough

Harvard starts with the same basic question: who is responsible for this dataset? For a dataset, use the organization or data creator responsible for the record. The date element puts the release year after the author. The title element italicizes the dataset title and can add a data set label. The source element names the repository or publisher. Finally, the locator element uses DOI where possible, otherwise Available at plus URL and Accessed date. Work through those fields in order and the punctuation becomes much easier to control.

Harvard data citations should make the exact release clear. If your data portal updates live, include the date you accessed or downloaded the data. In text, use (NASA Goddard Institute for Space Studies, 2024). If you quote directly, add the page, paragraph, timestamp, or legal pin cite required by the style. If your source is online, prefer a stable URL or DOI over a search-result link, and remove tracking parameters before you submit the reference.

NASA Goddard Institute for Space Studies (2024) *Global temperature anomalies, 1880-2024* [Data set]. [Online] Zenodo. Available at: https://doi.org/10.5281/zenodo.1234567 (Accessed: 15 January 2025).
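The in-text forms given in the four walkthroughs differ only in whether the year appears and whether a comma separates it from the author. A small sketch makes the comparison explicit. The function name and style keys are hypothetical, and the patterns cover only the basic parenthetical form with no page or other locator.

```python
def in_text(style: str, author: str, year: str) -> str:
    """Return the basic parenthetical citation for one of the four styles."""
    patterns = {
        "apa7": f"({author}, {year})",     # author, comma, year
        "mla9": f"({author})",             # author only
        "chicago": f"({author} {year})",   # author and year, no comma
        "harvard": f"({author}, {year})",  # author, comma, year
    }
    return patterns[style]


author = "NASA Goddard Institute for Space Studies"
for style in ("apa7", "mla9", "chicago", "harvard"):
    print(style, in_text(style, author, "2024"))
```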

Common mistakes for this source type

Most errors come from forcing a dataset into the wrong template. Before submitting, check these details against the source itself, not against a database preview or a copied citation.

  • Citing the data portal homepage instead of the dataset record.
  • Omitting the version or release date.
  • Using a URL when a DOI is available.
  • Citing a report about the data instead of the dataset you analyzed.
  • Failing to record the access date for live or changing data.
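Several of these checks can be automated before submission. The sketch below assumes a hypothetical record dictionary with the keys `locator`, `doi`, `version`, `release_date`, `live`, and `accessed`; none of these names come from a real repository API. Whether you cited a report instead of the dataset itself still needs a human check.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit


def check_dataset_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of likely problems; an empty list means none were detected."""
    problems = []
    locator = record.get("locator") or ""
    doi = record.get("doi") or ""
    # A URL with no path usually points at a portal homepage, not a record.
    if locator and urlsplit(locator).path in ("", "/"):
        problems.append("locator looks like a portal homepage, not a dataset record")
    if not (record.get("version") or record.get("release_date")):
        problems.append("version or release date is missing")
    if doi and doi not in locator:
        problems.append("a DOI exists but the locator does not use it")
    if record.get("live") and not record.get("accessed"):
        problems.append("live data needs an access date")
    return problems


draft = {"locator": "https://data.example.gov/", "doi": "10.1234/abcd", "live": True}
for problem in check_dataset_record(draft):
    print(problem)
```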
