Georeferencing – Geocoding street addresses using AURIN
Learning curve: Intermediate
Introduction: This recipe describes how to create a geographic representation from structured text using AURIN’s PSMA geocoder tool. In this example, the addresses of three post offices in the inner north of Melbourne are referenced to geometric point features, which may then be mapped or analysed.
Working with Regular Expressions
Sample Australian street addresses
AURIN Portal (https://portal.aurin.org.au/)
1. Open the ‘Sample Australian street addresses’ dataset in Excel and view the file’s contents, noting the data is structured using two columns ‘ID’ and ‘address’
2. Review each of the addresses (rows 2-4) and check their structure. Note that Australian street addresses are typically structured hierarchically, from street level through to state, with the postcode last. If you wish to add new addresses, ensure the text is structured in this way; otherwise the geocoding tool may fail or yield erroneous results*
- Save the dataset as a CSV file. From the ‘File’ menu choose ‘Save As’, select ‘CSV (MS-DOS)’ as the file format
- Log in to the AURIN Portal. Open a browser window and navigate to https://portal.aurin.org.au/. At the prompt, enter your educational institution account credentials.
- Click on ‘Import’ in the Data menu on the right-hand side, then ‘Browse’ to your CSV file location
- Enter a descriptive Title (e.g., Post Offices) and Abstract (e.g., geocoded locations) for the dataset, choosing ‘Non-Aggregated’ for ‘Aggregation Level’ and ‘ID’ for ‘Key’
- Click ‘Add & Display’
- Once your dataset has been imported, view the resulting table shown. It is important to check if ‘Record Number’ matches the input and that all other details are correct. If the process failed, please return to Step 2.
- Navigate to the Geocoder tool (Tools → Spatial Data Manipulation → Geocoder) then choose your Address Dataset: ‘Post Offices’ with ‘Address Field’: ‘address’
- Click ‘Add and Run’
- Once your tool has run, find the output dataset (named ‘Output: geocoder-workflow XXXX’) in the Data menu and click on the table icon (located immediately to the left of the dataset’s name). Check that ‘Record Number’ matches the input.
- View the result: right-click on the spanner icon located to the right of the output dataset and click ‘Display on Map’. Select the colours you would like and click ‘Display’
- Zoom and pan the map window, locate the point features and mouse over each. You should see that each address has been referenced to a point feature
- Download the geometry by right-clicking on the spanner icon located to the right of the output dataset and clicking ‘Download CSV’ to obtain the data in text form for analysis, or ‘Download SHP’ to obtain the referenced geometry for use in mapping software like QGIS or ArcGIS.
*It is recommended that postal address identifiers be removed from street addresses. As such, the following should be replaced with a corresponding street number: PO Box, Locked Bag, Mailbox, Duplex, GPO, PMB, Private Bag. Additionally, addresses that use the following identifiers may need fine tuning to a physical street address: LOT, SE, SHOP, Building.
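The identifier check described in the note above can be done before import with a regular expression. The following is a minimal sketch, not part of the AURIN tooling: the function name `check_address`, the two pattern names, and the sample addresses are all made up for illustration, and the patterns are a simple heuristic that only covers the spellings listed in the note.

```python
import re

# Postal (non-physical) identifiers listed in the note above.
# Assumption: simple spellings only (e.g. "PO Box", not "P.O. Box").
POSTAL_PATTERN = re.compile(
    r"\b(?:GPO|PO\s*Box|Locked\s+Bag|Private\s+Bag|Mailbox|Duplex|PMB)\b",
    re.IGNORECASE,
)

# Identifiers that may need fine tuning to a physical street address.
REVIEW_PATTERN = re.compile(r"\b(?:LOT|SE|SHOP|Building)\b", re.IGNORECASE)


def check_address(address):
    """Classify an address string as 'postal', 'review' or 'ok'."""
    if POSTAL_PATTERN.search(address):
        return "postal"   # replace with a street number before geocoding
    if REVIEW_PATTERN.search(address):
        return "review"   # may need adjusting to a physical street address
    return "ok"


# Hypothetical addresses, used only to exercise the function.
samples = [
    "PO Box 12, Carlton VIC 3053",
    "Shop 3, 100 Example Street, Carlton VIC 3053",
    "100 Example Street, Carlton VIC 3053",
]
for text in samples:
    print(check_address(text), "-", text)
```

Addresses flagged as 'postal' or 'review' would be corrected by hand in the ‘address’ column before the file is saved as CSV.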
Review & Refine
- All results need to be reviewed for completeness, as the process may need to be repeated. Missing data may be a result of:
(1) the input data being incorrectly structured → Return to Step 1 of this recipe
(2) the address not existing within the geocoder’s reference data → Choose an alternative street address geocoder (note, each geocoder may have its own methodology). Visit the georeferencing recipe book and geocoder chooser for more information.
- Results may be analysed using analytical software like R, MATLAB and Excel.
Visit the spatial analysis recipe book for more information.
- Results may be mapped and presented using software like QGIS, ArcGIS and R.
Visit the cartography recipe book for more information.
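The completeness review described above can also be scripted once the output has been downloaded as CSV. The sketch below compares the keys in the input file with those in the geocoded output and lists any that are missing. It assumes the output CSV retains the ‘ID’ column chosen as the key at import, which should be confirmed against the actual download; the function names and the inline sample data are hypothetical.

```python
import csv
import io


def read_keys(csv_text, key="ID"):
    """Return the set of non-empty key values found in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key].strip() for row in reader if row.get(key)}


def missing_records(input_csv, output_csv, key="ID"):
    """Return keys present in the input but absent from the output."""
    return sorted(read_keys(input_csv, key) - read_keys(output_csv, key))


# Hypothetical stand-ins for the uploaded file and the downloaded result.
input_text = "ID,address\n1,first address\n2,second address\n3,third address\n"
output_text = "ID,address\n1,first address\n3,third address\n"

print(missing_records(input_text, output_text))
```

With real files, the text would be read from disk first, for example with `open(path, newline="", encoding="utf-8").read()`; the encoding of a CSV saved from Excel may differ and is an assumption here. Any ID reported as missing points to an address to revisit under (1) or (2) above.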
This recipe has been derived from AURIN’s Geocoder tutorial available at: