Manage your dataset

File and folder conventions.

This page shows how to organise, store and save your dataset.

The key to managing your data is to make sure it is safe, findable and reusable. Best practice is to create a hierarchical file structure: a main project folder that contains the data, metadata and project files. This adds levels of organization to your files and structures the project.

For folder titles, use whatever makes the most sense for your data. Be as descriptive and clear as possible, and name the folders as if others were going to use them.

Best practice dictates that you will also need to include descriptive materials about your project. This is more commonly known as metadata.

Create a folder for your project, then folders for your materials. Be sure to create a folder containing the metadata about your materials.
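
The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name, the project name "river_survey_2023" and the material folders "data" and "documents" are assumptions made for the example, not names this guide prescribes.

```python
from pathlib import Path
import tempfile


def create_project_structure(root, project_name, material_folders):
    """Create a main project folder, one subfolder per material type,
    and a 'metadata' subfolder alongside them."""
    project_dir = Path(root) / project_name
    for name in [*material_folders, "metadata"]:
        # parents=True also creates the project folder on the first pass
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir


if __name__ == "__main__":
    # A temporary directory keeps the demonstration from touching real files.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project_structure(
            tmp, "river_survey_2023", ["data", "documents"]
        )
        for folder in sorted(project.iterdir()):
            print(folder.name)
```

Replace the folder names with whatever best describes your own materials.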

File Organisation

As with file naming, consistency is key. Organise files in a way that makes sense within the context of your project, but that would also make sense to someone who is not intimately familiar with it.

How files are nested in directories can depend on the number of files you are working with, and on which aspect of those files is most important for analysing or re-using the information in them.

For instance, if you have hundreds of thousands of image files collected over many years from many different locations, you may want to organize first by year, then month, then location. You could also organize them entirely by date, and include the location in the filename. Alternatively, organize by location, and only include the date in the filename.
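
To make the difference between these layouts concrete, here is a minimal sketch that builds the destination path for one image under each scheme. The function names, the location "lake_eyre" and the file name "IMG_0042.jpg" are hypothetical and chosen only for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def path_by_year_month_location(taken, location, filename):
    """Organise by year, then month, then location."""
    return PurePosixPath(f"{taken:%Y}") / f"{taken:%m}" / location / filename


def path_by_date(taken, location, filename):
    """Organise entirely by date and put the location in the file name."""
    folder = PurePosixPath(f"{taken:%Y}") / f"{taken:%m}" / f"{taken:%d}"
    return folder / f"{location}_{filename}"


def path_by_location(taken, location, filename):
    """Organise by location and put the date in the file name."""
    return PurePosixPath(location) / f"{taken:%Y%m%d}_{filename}"


if __name__ == "__main__":
    taken = date(2019, 3, 7)
    for build in (path_by_year_month_location, path_by_date, path_by_location):
        print(build(taken, "lake_eyre", "IMG_0042.jpg"))
```

Whichever scheme you choose, apply the same one to every file in the project.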

If you are working on a collaborative project, make sure all collaborators are using the same principles to organize and name files!

Folder Naming Conventions

Folder naming conventions are similar to those for file naming. Folder names will generally be more descriptive and will include the name of the project, the name of each element of the project and who is responsible for the content.

Do not use spaces in folder or file names; use hyphens or underscores instead. Be sure to note the date of file creation and the dates of any changes. When using dates, use the ISO 8601 format YYYYMMDD. You only need to be as detailed as required but, as a minimum, use the year and month. For projects with lots of similar files, more detailed dates will be needed to keep track of your research.
If you are working with analysed data sets, it is very important to note whether the data is in its raw or analysed state.
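
As a rough illustration of these rules, the sketch below assembles a name from a project, an element, a raw/analysed marker and a creation date. The function name, the order of the parts and the example values are assumptions made for this example; adapt them to your own convention.

```python
import re
from datetime import date


def build_name(project, element, state, created, extension=""):
    """Build a file or folder name with no spaces, a raw/analysed marker
    and the creation date written as YYYYMMDD."""
    if state not in ("raw", "analysed"):
        raise ValueError("state must be 'raw' or 'analysed'")
    parts = [project, element, state, created.strftime("%Y%m%d")]
    # Replace any run of whitespace with a single underscore.
    cleaned = [re.sub(r"\s+", "_", part.strip()) for part in parts]
    name = "_".join(cleaned)
    return f"{name}.{extension}" if extension else name


if __name__ == "__main__":
    print(build_name("river survey", "water quality", "raw", date(2023, 5, 14), "csv"))
```

Leaving out the extension gives a name suitable for a folder.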


Metadata

The following is the ARDC definition of metadata:

What is metadata?

While generally ‘meta-data’ is summarised as ‘data about data’, what does that actually mean?

  • Metadata is information about an object or resource that describes characteristics of that object, such as content, quality, format, location and access rights
  • Metadata can be used to describe physical objects (pot shards, specimens etc.) as well as digital objects (documents, images, datasets, software etc.)
  • Metadata can take many different forms, from free text (e.g. a read-me file) to standardized, structured, machine-readable, extensible content
  • Metadata is analogous to any other form of data, in terms of how it is created, managed, linked and stored
  • Metadata is associated with the data it describes. It can be embedded within the data file, recorded in a separate text/spreadsheet file that is linked to the collection of data files it describes, or contained in a catalogue record that points to the research data collection.

(ARDC 2016, Metadata, https://www.ands.org.au/__data/assets/pdf_file/0004/728041/Metadata-Workinglevel.pdf, p. 3)

Why do we need metadata? Metadata increases the reuse and accessibility of data sets.

Why do you need metadata? You need project metadata to tell users what is in each folder, how you found the materials, when you found the materials and why these materials are useful.

What are some common metadata standards?

There are a variety of metadata schemas available to assist you in creating the metadata for your project.

The most commonly used is Dublin Core: https://www.dublincore.org/
There is also the Data Documentation Initiative (DDI), maintained by the DDI Alliance, which is aimed specifically at social science metadata: https://ddialliance.org/
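
As an example of what a structured, machine-readable record can look like, the sketch below writes a few of the fifteen Dublin Core elements as XML using Python's standard library. The wrapper element name "metadata", the function name and the example values are assumptions made for illustration; check the Dublin Core documentation for the encoding your repository expects.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_xml(fields):
    """Return an XML string with one dc: element per entry in `fields`."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper name is an assumption
    for element, value in fields.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(dublin_core_xml({
        "title": "River survey water quality data",
        "creator": "Example Researcher",
        "date": "2023-05",
    }))
```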

How should metadata be stored and saved?
The metadata for your project only needs to be saved in your project files. It can be as simple as a text document that contains the information needed to describe your project.
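
A simple text document of this kind could be produced as in the sketch below. The file name "README.txt", the function name and the field labels are hypothetical; the entries follow the questions raised earlier on this page (what is in each folder, how and when the materials were found, and why they are useful).

```python
from pathlib import Path
import tempfile


def write_readme(metadata_dir, entries):
    """Write one 'Label: text' line per entry to README.txt in metadata_dir."""
    metadata_dir = Path(metadata_dir)
    metadata_dir.mkdir(parents=True, exist_ok=True)
    readme = metadata_dir / "README.txt"
    lines = [f"{label}: {text}" for label, text in entries.items()]
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme


if __name__ == "__main__":
    # A temporary directory keeps the demonstration from touching real files.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_readme(Path(tmp) / "metadata", {
            "Project": "River survey 2023",
            "Folder data": "Water quality measurements, raw and analysed",
            "How collected": "Field sampling at fixed sites",
            "When collected": "2023-05",
            "Why useful": "Baseline for later surveys",
        })
        print(path.read_text(encoding="utf-8"))
```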