Data format

This data schema is in beta and can change with little notice.

OA.Report collects data on scholarly works such as papers, preprints, datasets, and code. Some of that data is generated by OA.Works and some is reused from other open sources.

For an example of a record with most of the possible fields, go to https://api.oa.works/report/works/10.1016/j.ijid.2020.05.122
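To inspect a record programmatically, a minimal Python sketch could look like the following. It assumes the endpoint returns a JSON object over HTTPS without authentication, which this page does not state. `works_url` and `fetch_work` are hypothetical helper names, not part of any OA.Works library.

```python
import json
import urllib.request

# Base of the example URL given above.
API_BASE = "https://api.oa.works/report/works/"


def works_url(doi: str) -> str:
    """Build the record URL for a DOI, mirroring the example link."""
    return API_BASE + doi.strip()


def fetch_work(doi: str, timeout: float = 10.0) -> dict:
    """Fetch one record. Assumption: the response body is a JSON object."""
    with urllib.request.urlopen(works_url(doi), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_work("10.1016/j.ijid.2020.05.122")` would request the example record linked above. Because the schema is in beta, any code reading the result should tolerate missing or renamed keys.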

Generated data

can_archive

Boolean: True if the work can be self-archived in a repository.

Source: ShareYourPaper Permissions Updated: Daily (premium), occasionally (free)

version

String: What version of the work can be self-archived in a repository?

version: "acceptedVersion"

Values are based on the DRIVER Guidelines versioning scheme.

Source: ShareYourPaper Permissions Updated: Daily (premium), occasionally (free)

journal_oa_type

String: The journal's OA type.

Think of this as oa_status for a journal.

Values include:

  • gold: The journal's whole output is published Open Access.

  • hybrid: The journal allows some articles to be published Open Access.

  • transformative: The journal allows some articles to be published Open Access, and is listed by Coalition S as a transformative journal.

  • diamond: The journal's whole output is published Open Access, with no APC.

  • closed: The journal's output is entirely behind a paywall, or bronze.

  • not applicable: Used when the work is not in a journal (typically a preprint).

journal_oa_type: "diamond"

Source: OA.Works Updated: Daily (premium), occasionally (free)
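The values listed above can be modelled as an enumeration when consuming records. The sketch below assumes `journal_oa_type` sits at the top level of a record held as a Python dict. `JournalOAType` and `is_fully_oa_journal` are hypothetical names introduced for illustration.

```python
from enum import Enum


class JournalOAType(Enum):
    """The journal_oa_type values listed in this section."""
    GOLD = "gold"
    HYBRID = "hybrid"
    TRANSFORMATIVE = "transformative"
    DIAMOND = "diamond"
    CLOSED = "closed"
    NOT_APPLICABLE = "not applicable"


# Types whose whole output is Open Access, per the definitions above.
FULLY_OA = {JournalOAType.GOLD, JournalOAType.DIAMOND}


def is_fully_oa_journal(record: dict) -> bool:
    """True if the record's journal_oa_type is gold or diamond."""
    try:
        return JournalOAType(record.get("journal_oa_type")) in FULLY_OA
    except ValueError:
        # Missing or unlisted value. The list says "Values include",
        # so other values may exist; treat them as not fully OA here.
        return False
```

Treating unknown values as `False` rather than raising is a design choice for a beta schema. Stricter consumers may prefer to log or reject them.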

pmc_has_data_availability_statement

Boolean: true if PMC reports the article as having a data availability statement.

Source: PMC Updated: Weekly (premium), occasionally (free)

has_data_availability_statement

Boolean: true if the article has a data availability (or resource availability) statement.

Source: PMC Updated: Weekly (premium), occasionally (free)

email

String: The corresponding author's email address

email: "example@place.edu"

Most emails are encrypted unless you're logged in and viewing emails associated with your organization.

Source: OA.Works Updated: Weekly (premium)

author_email_name

String: The corresponding author's name for use in emails

author_email_name: "Dr.Who"

Source: OA.Works Updated: Weekly (premium)

crossref_is_oa

Boolean: true if Crossref data suggests the article is free to read

Source: Crossref Updated: Weekly (premium), occasionally (free)

updated

String: timestamp of when the record was last updated

updated: "1675693406601"

Source: OA.Works Updated: Weekly (premium), occasionally (free)
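The example value looks like a Unix timestamp in milliseconds stored as a string. That reading is an assumption based only on the example's shape, since the page does not state the unit. Under that assumption, a conversion sketch (with `parse_updated` as a hypothetical helper name) is:

```python
from datetime import datetime, timezone


def parse_updated(value: str) -> datetime:
    """Convert an `updated` string to a UTC datetime.

    Assumption: the value is milliseconds since the Unix epoch.
    """
    seconds, millis = divmod(int(value), 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000)


print(parse_updated("1675693406601").isoformat())
# prints 2023-02-06T14:23:26.601000+00:00
```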

Supplements

Each of these keys is found in the supplements object.

While the name suggests these are secondary, they are in fact critical to OA.Report. They were given this name because they "supplement" the open data; they have nothing to do with the supplemental information you might find in a research article.

publisher_license_best

String: The license applied to this work by the publisher as best we can determine.

publisher_license_best: "cc-by"

Source: Unpaywall, Crossref, and manual collection can be used to support this designation. Updated: Weekly (premium), occasionally (free)

repository_license_best

String: The license applied to this work by the repository as best we can determine.

repository_license_best: "cc-by"

Source: Data from Unpaywall and Europe PMC can be used to support this designation. Updated: Weekly (premium), occasionally (free)

is_preprint

Boolean: true if the article is on a preprint server

Source: OA.Works Updated: Weekly (premium)

has_preprint_copy

Boolean: true if the article has a version on a preprint server

Source: OA.Works Updated: Weekly (premium)

preprint_doi

String: The DOI of the article's preprint

preprint_doi: "10.21203/rs.3.rs-805463/v1"

Source: OA.Works Updated: Weekly (premium)

has_data_availability_statement

Boolean: true if the work has a data or resource availability statement

Source: OA.Works Updated: Weekly (premium)

has_made_data

Boolean: true if the article uses data the authors made in the process of research

Source: Dataseer Updated: As requested (premium)

has_shared_data

Boolean: true if the article shared the data in some location (e.g., in the supplements, the article itself, a data repository, or the authors' website)

Source: Dataseer Updated: As requested (premium)

has_open_data

Boolean: true if the authors shared their data and licensed it cc-by or cc-0.

Source: OA.Works. Updated: As requested (premium)

has_reused_data

Boolean: true if the work relies on data not created by the authors

Source: Dataseer Updated: As requested (premium)

has_made_code

Boolean: true if the article uses code the authors made in the process of research

Source: Dataseer Updated: As requested (premium)

has_shared_code

Boolean: true if the article shared the code in some location (e.g., in the supplements, the article itself, a data repository, or the authors' website)

Source: Dataseer Updated: As requested (premium)

has_open_code

Boolean: true if the authors shared their code and licensed it under a permissive open source licence (e.g., MIT)

Source: OA.Works. Updated: As requested (premium)
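The made, shared, and open flags for data and code describe increasing levels of openness, so they can be collapsed into one label for reporting. The sketch below assumes the supplements object is available as a flat Python dict. `sharing_status` and its labels are hypothetical and not part of OA.Report.

```python
def sharing_status(supplements: dict, kind: str = "data") -> str:
    """Summarise has_made_*, has_shared_*, has_open_* for "data" or "code"."""
    if kind not in ("data", "code"):
        raise ValueError("kind must be 'data' or 'code'")
    # Check the most open level first and fall back to weaker ones.
    if supplements.get(f"has_open_{kind}"):
        return "open"
    if supplements.get(f"has_shared_{kind}"):
        return "shared"
    if supplements.get(f"has_made_{kind}"):
        return "made, not shared"
    return "none reported"
```

For example, `sharing_status({"has_made_code": True, "has_shared_code": True}, "code")` returns `"shared"`. A missing flag is treated the same as `false`, which may hide the difference between "not assessed" and "assessed as false".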

resource_doi

String: DOI(s) found associated with the work (could be for a dataset, codebase, or something else)

Source: OA.Works. Updated: As requested (premium)

resource_licence

String: Licence(s) found associated with supporting resources (could be for a dataset, codebase, or something else)

Source: OA.Works. Updated: As requested (premium)

resource_location_name

String: location(s) of supporting resource(s)

Source: OA.Works. Updated: As requested (premium)

resource_location_url

String: URL(s) found associated with the work (could be for a dataset, codebase, or something else)

Source: OA.Works. Updated: As requested (premium)

apc_cost

String: the APC cost in USD

Source: OA.Works Updated: Weekly (premium)

invoice_date

String: Date an APC invoice was issued

Source: OA.Works Updated: Weekly (premium)

invoice_year

String: Year an APC invoice was issued

Source: OA.Works Updated: Weekly (premium)

invoice_number

String: The invoice number provided on the invoice

Source: OA.Works Updated: Weekly (premium)

Organization specific supplements

These keys also start with supplements, but they end with an organization's name or acronym to provide organization-specific data. For instance: supplements.grantid__bmgf. In the key names below, the trailing * stands for that organization suffix.
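Going by the `supplements.grantid__bmgf` example, the field name and the organization appear to be joined by a double underscore. Assuming that separator holds for every organization-specific key, a key could be split as sketched here. `split_org_key` is a hypothetical helper.

```python
from typing import Optional, Tuple

PREFIX = "supplements."
SEPARATOR = "__"  # inferred from the example supplements.grantid__bmgf


def split_org_key(key: str) -> Tuple[str, Optional[str]]:
    """Split a supplements key into (field, organization or None)."""
    name = key[len(PREFIX):] if key.startswith(PREFIX) else key
    field, sep, org = name.rpartition(SEPARATOR)
    if not sep:
        # No separator found: not an organization-specific key.
        return name, None
    return field, org


print(split_org_key("supplements.grantid__bmgf"))
# prints ('grantid', 'bmgf')
```

Splitting on the last separator keeps field names that contain single underscores, such as `is_approved_repository`, intact.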

grantid*

String: The grant ID(s) associated with the work

Source: OA.Works, Crossref Updated: Weekly (premium)

is_compliant*

Boolean: true if the work is compliant with the organization's Open Access policy

Source: OA.Works Updated: Weekly (premium)

is_covered_by_policy*

Boolean: true if the work is covered under the organization's Open Access policy

Source: OA.Works Updated: Weekly (premium)

is_new*

Boolean: true if the work has been added since the last time we sent the user a report

Source: OA.Works Updated: Weekly (premium)

program*

String: the grant program the work was supported by

Source: OA.Works Updated: Weekly (premium)

is_approved_repository*

Boolean: true if this work is deposited in an approved repository under the Open Access policy

Source: OA.Works Updated: Weekly (premium)

financial_disclosures*

Boolean: true if this work's funding statement is actually a financial disclosure

Source: OA.Works Updated: Weekly (premium)

remove*

Boolean: true if this work should be removed from an organization's results for any reason

Source: OA.Works Updated: Weekly (premium)

fundingstatement*

String: Full text of the funding statement

Source: OA.Works Updated: Weekly (premium)

Reused data

Open sources do a fantastic job of providing a lot of the core metadata used by OA.Report.

See Crossref's documentation for the following keys:

  • funder

    • name

    • award

    • DOI

  • subject

Note: Crossref is not our only source of funding data. However, it is the best source of open, structured data.

See OpenAlex's documentation for the following keys:

  • doi

  • title

  • subtitle

  • journal

  • publisher

  • issn

  • volume

  • issue

  • PMCID

  • is_retracted

  • is_paratext

  • is_oa

  • published_date

  • published_year

  • type

  • authorships

    • author

      • id

      • display_name

      • orcid

      • author_position

    • institutions

      • id

      • display_name

      • ror

      • raw_affiliation_string

  • concepts

    • display_name

    • id

    • level

    • score

To provide up-to-date results, we use equivalent Crossref data where OpenAlex data isn't yet available. In some cases, such as `PMCID`, we use other sources to provide more complete coverage.

See Unpaywall's documentation for the following keys:

  • oadoi_is_oa (see `is_oa`)

  • oa_status

  • has_repository_copy

  • has_oa_locations_embargoed

In the keys below, the location's host_type (publisher or repository) is prepended to the field name as a helpful simplification of Unpaywall's oa_locations data:

  • publisher_version

  • publisher_license

  • publisher_url_for_pdf

  • repository_version

  • repository_license

  • repository_url

  • repository_url_for_pdf

  • repository_url_in_pmc

  • best_oa_location_url

  • best_oa_location_url_for_pdf
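The host_type prefixing can be illustrated with a small Python sketch that flattens an Unpaywall-style `oa_locations` list. It illustrates the naming idea only and is not OA.Report's actual method. How OA.Report chooses between several locations of the same host type is not documented here, so the sketch simply keeps the first non-empty value per key. `flatten_oa_locations` is a hypothetical name.

```python
from typing import Any, Dict, List

# Fields of an Unpaywall OA location that get a host_type prefix here.
FIELDS = ("version", "license", "url", "url_for_pdf")


def flatten_oa_locations(oa_locations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Prefix each field with its location's host_type.

    Assumption: the first non-empty value seen for a key wins.
    """
    flat: Dict[str, Any] = {}
    for location in oa_locations:
        host_type = location.get("host_type")  # "publisher" or "repository"
        if not host_type:
            continue
        for field in FIELDS:
            key = f"{host_type}_{field}"
            if key not in flat and location.get(field) is not None:
                flat[key] = location[field]
    return flat
```

Given one publisher location licensed `cc-by` and one repository location with a PDF link, the result would contain keys such as `publisher_license` and `repository_url_for_pdf`.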
