The Pawsey Supercomputing Centre


Data Guides and Help

Introduction to Research Data Management Guides

Recognising the significance of research data and developing methods to manage its ever-increasing quantities are common challenges for researchers. Reproducibility, the ability of another researcher to replicate analyses in order to validate theories and produce complementary or entirely new science, is essential. Science is also increasingly based on computation, which makes access to the data and code needed to reproduce results increasingly important. Hence researchers should aim to make their research data FAIR (Findable, Accessible, Interoperable / Intelligible, Reusable).

To help researchers navigate the bewildering array of information on research data management and make their data ‘FAIR’, the Pawsey Supercomputing Centre has developed some guides on research data management. If you are a researcher from one of our partners, you should refer to the policies, procedures and resources provided by your local institution before starting to implement any data management. Below are links to guides from the four public WA universities (accurate as of 21st January 2015).

The Pawsey Supercomputing Centre research data management guides offer general advice only; researchers should consider how appropriate the advice is for their own research and institutional policies before implementing it. If you have any questions relating to the guides or our data services, please contact: 

We also acknowledge the support of ANDS (Australian National Data Service) in reviewing documentation and providing some of the content.

A checklist for each topic is provided for your reference. These are separate from the actual guides, which are available to download using the links below.

1. Data Ownership, Legal and Ethical

  • Is the owner / Principal Investigator of the data clearly identified?
  • Does your data contain confidential or sensitive information? If so, have you discussed data sharing with the respondents and gained consent from the people from whom you collected the data?
  • Do you need to anonymise data? (e.g. to remove identifying information or personal data)
  • If appropriate, have you gained ethics approval from your relevant committee?
  • Are there any cultural and/or commercial sensitivities to your data?
  • Have you established who owns the copyright of your data? Is there joint copyright?
  • Have you created an appropriate license for the data?


2. Data Documentation

  • Are you using standardised and consistent procedures to collect, process, check, validate and verify data?
  • Are the variable names, codes and abbreviations used explained?
  • How will you label and organise data, records and files?
  • Have you used standardised keywords / controlled vocabulary in your metadata?
  • Has a collection-level metadata record been published to a relevant discovery portal?


3. Data Storage and Sharing

  • How much data are you likely to collect during your project?
  • Where will your data be stored during / after the project?
  • What is your backup strategy?
  • Which data formats will you use? Do your formats and software enable sharing and long-term validity of data, for example by being non-proprietary or based on open standards?
  • Are there non-digital data? Where will these data be held?
  • Do you need to securely store personal or sensitive data?
  • If data are held in various places, how will you keep track of versions?
  • Who has access to which data during and after research? Are various access arrangements needed?


4. Data Publication and Re-use

  • Has a permanent identifier (e.g. DOI) been minted for the data?
  • Have you considered the costs associated with making the data accessible?
  • How long should the data be retained for?
  • Are there any restrictions on the re-use of data? Is this reflected in the license?
  • Who will retain custodianship of the data?


Sourced from: University of Essex, UK Data Archive; eResearch SA Research Data Management checklist – Theresa McGinley. 21st January 2015.