Handwriting to Visualisations – Digivol


Recipe – Diary to Map.

This recipe goes from transcribing handwritten text to representing the data visually on a map, using the diaries of a 19th-century Victorian prison governor, John Buckley Castieau (1855 – 1884). This work package will expand on the drafted recipe that follows, answering the questions listed below so that future users of the systems can follow a similar pathway.

In this example we use Digivol to transcribe the images, then pass the resulting files through a Python Jupyter Notebook that uses the spaCy module to extract named entities and the Python geocoder module to convert those entities into latitude and longitude, which can then be visualised.

Pre-Reading:

Jupyter Notebooks

Ingredients:

  1. Images of handwritten artifacts.
  2. Transcription Package – Digivol
  3. Tinker workbench + Python Jupyter Notebook
  4. Data visualiser

Steps

1. Transcription of images.

  1. Create a Digivol project by emailing the ALA team
    1. The support team can provide user guide information for administrators.
  2. Transcribe the images yourself, or ask the community to transcribe them.
  3. Export the results from Digivol. This recipe follows the ZIP file export.
  4. Extract the files and use the Task.csv file as the input file for the NER process.
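The extraction step can be sketched in Python using only the standard library. This is a minimal sketch, not part of the original recipe: the function name `extract_task_csv` and the archive name in the usage comment are hypothetical, and because the internal layout of the Digivol export is not described here, the code searches the extracted folder for Task.csv rather than assuming a fixed path.

```python
import zipfile
from pathlib import Path


def extract_task_csv(zip_path, out_dir):
    """Extract a Digivol export ZIP and return the path to Task.csv.

    The folder layout inside the ZIP is an assumption, so the file is
    located by searching the extracted tree.
    """
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)

    matches = sorted(out_dir.rglob("Task.csv"))
    if not matches:
        raise FileNotFoundError(f"No Task.csv found in {zip_path}")
    return matches[0]


# Example usage (hypothetical file names):
# task_csv = extract_task_csv("digivol_export.zip", "digivol_export")
```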

2. Named Entity Recognition

  1. Log in to the Tinker Workbench
  2. Clone the “test git repo” for this recipe ?
  3. Launch a Python Notebook
  4. Step through the “recipe” for NER.
    1. Install required Python modules
      1. spacy – for NER
      2. geocoder – for translating entities into latitude / longitude
      3. pandas – data analysis tools
      4. numpy – scientific computing tools
      5. matplotlib – graphing / plotting tools
    2. Read in the CSV file, selecting the “occurrenceRemarks” column.
    3. Extract place names, with their surrounding context, to assist with manual confirmation.
    4. Manually confirm the place names.
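The NER steps above might look roughly like the following in a notebook cell. This is a sketch under stated assumptions rather than the recipe's actual notebook: the function names are hypothetical, the `en_core_web_sm` model is an assumed choice that must be downloaded separately, and treating the labels GPE, LOC and FAC as "places" is also an assumption. Third-party imports are kept inside the functions that need them so the extraction logic can be read and tested on its own.

```python
# Entity labels treated as places (an assumption; adjust to suit the diaries).
PLACE_LABELS = {"GPE", "LOC", "FAC"}


def read_remarks(csv_path, column="occurrenceRemarks"):
    """Return the non-empty transcription texts from Task.csv."""
    import pandas as pd  # third-party

    frame = pd.read_csv(csv_path, usecols=[column])
    return frame[column].dropna().astype(str).tolist()


def extract_places(nlp, texts):
    """Collect place-like entities plus their sentence for manual review."""
    rows = []
    for index, doc in enumerate(nlp.pipe(texts)):
        for ent in doc.ents:
            if ent.label_ in PLACE_LABELS:
                rows.append({
                    "row": index,              # position in the input list
                    "place": ent.text,         # candidate place name
                    "label": ent.label_,       # spaCy entity label
                    "context": ent.sent.text.strip(),  # sentence it came from
                })
    return rows


def run_ner(csv_path):
    """Load the (assumed) English model and run the extraction."""
    import spacy  # third-party

    nlp = spacy.load("en_core_web_sm")
    return extract_places(nlp, read_remarks(csv_path))
```

The returned rows can be loaded into a pandas DataFrame and reviewed by hand, keeping only the entries that really are places before geocoding.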

3. Geo-Coding

In this recipe we use the Python geocoder module to map the extracted named entities to latitude / longitude. (Continue running through the code from step 2.)

  1. Loop through the manually confirmed place names with the geocoder module and assign a latitude / longitude to each.
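The geocoding loop could be sketched as follows. The recipe does not say which geocoding provider is used, so the OpenStreetMap provider (`geocoder.osm`) here is an assumption, as are the function names and the one-second pause between requests. The lookup function is passed in as a parameter so a different provider can be swapped in.

```python
import time


def osm_lookup(place):
    """Look up one place name; returns (lat, lng) or None if not found."""
    import geocoder  # third-party

    result = geocoder.osm(place)  # provider choice is an assumption
    if result.ok and result.latlng:
        return tuple(result.latlng)
    return None


def geocode_places(places, lookup=osm_lookup, pause=1.0):
    """Map each confirmed place name to coordinates, one lookup per name."""
    coords = {}
    for place in places:
        if place in coords:
            continue  # repeated names are only looked up once
        coords[place] = lookup(place)
        time.sleep(pause)  # be polite to the geocoding service
    return coords


# Example usage (hypothetical place list):
# coords = geocode_places(["Melbourne", "Beechworth"])
```

Names that cannot be resolved are kept with a value of None so they can be checked manually, since historical place names in the diaries may not match modern gazetteers.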

4. Visualisation
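The recipe does not yet name a data visualiser. One possible hand-off, offered only as a sketch, is to write the geocoded places to a GeoJSON file, which many mapping tools can load. The function names and the input shape (a dictionary of place name to (latitude, longitude), with None for unresolved names) are assumptions made for illustration.

```python
import json


def to_geojson(coords):
    """Build a GeoJSON FeatureCollection from {place: (lat, lng) or None}."""
    features = []
    for place, latlng in coords.items():
        if latlng is None:
            continue  # skip places that could not be geocoded
        lat, lng = latlng
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            "properties": {"name": place},
        })
    return {"type": "FeatureCollection", "features": features}


def write_geojson(coords, path):
    """Write the FeatureCollection to a file for use in a map viewer."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(to_geojson(coords), handle, indent=2)
```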

Next Steps

  1. Where does the data live?
  2. Is it published anywhere?
  3. Repository?
  4. Update / Create a new Recipe?