How to write a data access statement
All research publications produced by Imperial authors must include a statement on how the underlying data can be accessed (a "data access statement"). This is in line with RCUK and individual funder policies.
A data access statement should include the following key pieces of information:
- How the data can be accessed: where it can be downloaded from or who must be contacted to request access. This should always include either a web link (a DOI or other persistent identifier if possible) or a departmental/group email address (not a personal email address).
- What conditions use of the data is subject to: whether a general licence applies to all users, or whether a data sharing agreement must be entered into before access to the data is granted.
Under some (rare) circumstances, it may be appropriate to explain that access to the data is not available at all. In this case, you must give clear and justified reasons:
"The data underlying this article are not available by agreement with our partners to protect their commercial confidentiality."
More examples of acceptable types of data access statement, depending on the type of access, are given below.
If a particularly restrictive licence is applied, or a bilateral data sharing agreement is required, this should be briefly justified in the statement.
Data access statement
Access to the data
If data (or just metadata describing it) is stored in a dedicated repository, include a link to the appropriate landing page on the repository website. If possible, use a persistent identifier, such as a DataCite DOI.
"Data underlying this article can be accessed on figshare at http://dx.doi.org/10.6084/doi.goes.here, and used under the Creative Commons Attribution licence."
Some journals allow additional information, including supporting data, to be attached to an article. This option is only suitable for small datasets, and you should be aware that some publishers unintentionally corrupt data submitted as supplementary information. Data made available through this route will usually be subject to licence conditions applied by the publisher, so check you're happy with these before making data available in this way.
"Supplementary data associated with this article can be found in the online version, at web link to online article."
Available on request
In some circumstances it may be necessary, or simply more convenient, to limit access to verified researchers by requiring people to make a specific request for access. For example, this might be required to properly control access to sensitive information in line with consent forms.
You must not require requesters to email a specific person, as the instructions must remain valid for at least 10 years and there is a risk that individuals will change jobs in that period, or simply be away when a request arrives. EPSRC in particular have explicitly stated this requirement.
Instead, set up a shared email address for your research group, or use an existing departmental address.
"Supporting data is available on request: please contact shared group or departmental email address."
If data is stored on a web page (for example a research group or personal website), include a link to that. However, note that a dedicated data repository is strongly preferred over this option.
"Data underlying this article can be accessed via the Smith Research Group website at web link."
Available from a third party
In some cases, you may not have the right to distribute the data yourself; you may have obtained the data under licence from a commercial provider, for example. You should then clearly state the source of the data, and document in the article the search parameters used to select it.
"Data underlying this article is available from Moody's Default and Recovery Database (https://www.moodys.com/)."
No new data
If no new data was collected or generated during the course of the research, state this clearly. Any third party data analysed should still be described as above.
"No new data was collected in the course of this research."
Conditions of access
If an open licence (e.g. Creative Commons, Open Data Commons) is used for the data, state this clearly and include a link to the full text of the licence.
"This data is available under an Open Data Commons Attribution Licence (ODC-BY): http://opendefinition.org/licenses/odc-by."
Data sharing agreement
If restrictions apply (such as patient confidentiality) that require a bilateral agreement to be signed between the authors and those requesting access, make this clear and include a link giving more details.
"The data used in this research was collected subject to the informed consent of the participants. Access to the data will only be granted in line with that consent, subject to approval by the project ethics board and under a formal Data Sharing Agreement. For more details on our data sharing restrictions, visit link to research group data sharing policy."
If the data is subject to an embargo, clearly state the end date of the embargo period as an absolute calendar date. The end date is not always readily apparent to readers if the embargo period is given only relative to another date, such as the date of publication of the article.
"This data is subject to an embargo, and will be released on 12 March 2016."
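If your embargo is defined relative to the publication date, the absolute release date for the statement can be worked out with a short script. The following is a minimal sketch in Python, not an official tool: the function name embargo_end, the publication date and the six-month embargo length are all hypothetical values chosen for illustration, and your funder or publisher may count the embargo period differently.

```python
import calendar
from datetime import date


def embargo_end(publication: date, months: int) -> date:
    """Return the date `months` calendar months after `publication`.

    Hypothetical helper for illustration only.
    """
    # Count months from year 0 so that rolling over a year end is handled by divmod.
    total = publication.year * 12 + (publication.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day for shorter months (e.g. 31 August + 6 months gives the last day of February).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(publication.day, last_day))


# Assumed example: article published on 12 September 2015 with a 6-month embargo.
end = embargo_end(date(2015, 9, 12), 6)
print(f"This data is subject to an embargo, and will be released on {end.day} {end:%B %Y}.")
```

With these assumed inputs the script produces the release date used in the example statement above (12 March 2016). Stating that date directly means readers do not have to look up the publication date and do the calculation themselves.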