Dates of Naturalisation – Bar graph


A bar graph is useful for visualising occurrences of a certain term or event, and how they change over time. It can help quickly pinpoint standout occurrences or identify changing trends over a long period.

This recipe takes the declarations of Australian citizenship published in the Australian Government Gazettes and turns them into a bar graph that shows how many people were naturalised in each month of the year. Particularly in the late 1960s, the data shows noticeable spikes around Australia Day, a national holiday when many citizenship ceremonies are held.

Pre-Reading:

Jupyter Notebooks

Ingredients:

Datasets
Australian Government Gazettes [1832-1968]

Tools
OpenRefine
Notepad++
PHP programming language
R programming language
Jupyter Notebook
Microsoft Excel

Techniques
Creating Jupyter Notebooks
Working with Regular Expressions

Steps

  1. Extract a subset of gazette articles, limited to those from the Commonwealth Government Gazette, that have the title “Certificates of Naturalisation”. You’ll now have a JSON file of about 668 articles. [Uses Australian Government Gazettes (1832-1968), PHP]
  2. Open the JSON file in OpenRefine and convert it to CSV. Save the CSV file to your local environment. You’ll now have a CSV file with about 509 lines and 11 columns. [Uses OpenRefine]
  3. Open the CSV file in Notepad++ and run regular expressions to strip out all the article text except the name, date and suburb of each person who has been naturalised. You’ll now have a CSV file with 7,272 lines and 3 columns. [Uses Working with Regular Expressions, Notepad++]
  4. Open the CSV file in Microsoft Excel. Format the data as a table, give the columns headings and then create a pivot table that counts naturalisations by month. Choose to display that table as a bar graph. You’ll see spikes in January, September and November, with a low in May each year. [Uses Microsoft Excel]
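
If you would rather work in a Jupyter Notebook than in Notepad++ and Excel, steps 3 and 4 can be roughly sketched in Python. This is only an illustration under stated assumptions: the sample lines, their "name; suburb; day.month.year" layout and the regular expression are all invented for the example, because the real gazette text will need patterns tailored to how each notice is actually printed. The function names are hypothetical too.

```python
import re
from collections import Counter

# Invented sample lines. The "name; suburb; day.month.year" layout is an
# assumption for illustration only, not the real gazette format.
SAMPLE_TEXT = """\
Example, Person A; Suburb One; 26.1.67
Example, Person B; Suburb Two; 26.1.67
Example, Person C; Suburb Three; 14.9.67
Example, Person D; Suburb Four; 3.5.67
"""

# Assumed pattern: three semicolon-separated fields, the last one a date.
ENTRY = re.compile(
    r"^(?P<name>[^;\n]+);\s*(?P<suburb>[^;\n]+);\s*"
    r"(?P<day>\d{1,2})\.(?P<month>\d{1,2})\.(?P<year>\d{2})$",
    re.MULTILINE,
)

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def extract_entries(text):
    """Step 3: keep only name, suburb and month number for each matching line."""
    return [
        (m["name"].strip(), m["suburb"].strip(), int(m["month"]))
        for m in ENTRY.finditer(text)
    ]


def count_by_month(entries):
    """Step 4: count entries per calendar month, like a pivot table."""
    counts = Counter(month for _, _, month in entries)
    return [(MONTHS[i - 1], counts.get(i, 0)) for i in range(1, 13)]


def text_bar_graph(month_counts):
    """Draw one row of '#' marks per month as a stand-in for the Excel chart."""
    return "\n".join(f"{label} {'#' * n} {n}" for label, n in month_counts)


entries = extract_entries(SAMPLE_TEXT)
month_counts = count_by_month(entries)
print(text_bar_graph(month_counts))
```

Lines that do not match the pattern are silently skipped, so with real data it is worth comparing the number of extracted entries against the line count you expect from step 3 before trusting the graph.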

Other Research Questions

This recipe could be re-used for a number of other research questions that start with the same dataset including:

  • Origins – taking the family names generated by step 3 of this recipe and linking them to the country of origin.
  • Places where naturalised citizens settle – taking the suburbs generated by step 3 of this recipe and exporting the data to a platform like AURIN, where the geochooser tool will allow you to produce a map visualisation.
  • How naturalised citizens can affect political results – exporting the data to AURIN and combining it with the historic political polling data held by the ADA.