Jupyter Notebooks

Jupyter Notebooks allow you to write Python (or R) code interspersed with text and graphics. Notebooks are a great way to document and communicate an analysis of some data, and we have used them extensively in Tinker to illustrate kinds of analysis that require Python coding rather than an interactive tool.

Viewing Notebooks

Notebooks written by others can be viewed on the web, for example on GitHub, which renders the contents of a notebook stored in a repository. For example, see this notebook by Tim Sherratt exploring some data on Chinese naturalisations in NZ or this one by Steve Cassidy containing some lecture notes and illustrations of speech acoustics. As you will see from these examples, notebooks can contain text and images like any other web page, but importantly they also contain code that generates output as tables or graphs. The text in the notebook allows us to explain and discuss the results of running the code. Hence notebooks are a great way to tell a story with code. They are also useful for teaching some aspects of coding and for providing examples for others to work with.

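Note that notebooks stored on the web (e.g. on GitHub) will sometimes contain a saved copy of the output from the last time the code was run, which is why you can see results without running anything yourself.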
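To make this concrete, here is a minimal sketch of how saved output sits inside a notebook file. An .ipynb file is a JSON document with a list of cells, and each code cell carries an outputs list. The tiny notebook below is hand-written for illustration (it is not one of the notebooks mentioned above), and the helper name summarise_cells is just a name chosen for this example.

```python
import json

# A tiny hand-written notebook in the nbformat 4 JSON layout (illustrative only).
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 4,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
    {"cell_type": "code", "metadata": {}, "execution_count": 1,
     "source": ["1 + 1"],
     "outputs": [{"output_type": "execute_result", "execution_count": 1,
                  "metadata": {}, "data": {"text/plain": ["2"]}}]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "source": ["x = 5"], "outputs": []}
  ]
}
"""


def summarise_cells(nb):
    """Return (cell_type, has_saved_output) for every cell in a notebook dict."""
    summary = []
    for cell in nb["cells"]:
        # Only code cells have an "outputs" list; text cells never do.
        has_output = cell["cell_type"] == "code" and len(cell.get("outputs", [])) > 0
        summary.append((cell["cell_type"], has_output))
    return summary


notebook = json.loads(notebook_json)
cell_summary = summarise_cells(notebook)
for cell_type, has_output in cell_summary:
    print(cell_type, "saved output" if has_output else "no saved output")
```

The second cell has been run and its result is stored in the file, while the third has never been run, so a viewer would show it with no output.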

Running Notebooks

The GitHub viewer just displays the results of running the code somewhere else. A big advantage of a notebook is that you can run the code yourself and observe the effect of changing parts of it to slightly alter the analysis. This might range from changing the colours in a graph to swapping in your own data to analyse in the same way.
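As a rough sketch of what "changing parts of it" can look like, the cell below keeps its settings and data at the top so that a reader can edit them and run the cell again. Everything here is made up for illustration: the data, the names bar_character and word_counts, and the text-based chart (used instead of a real plotting library so the example needs nothing beyond standard Python).

```python
from statistics import mean

# Settings a reader might change before re-running the cell.
bar_character = "#"  # try "*" or "=" to change the look of the chart
word_counts = {"letters": 12, "diaries": 7, "reports": 3}  # made-up data


def text_bar_chart(counts, bar):
    """Build one line per item: a padded label, a bar of repeated characters, the value."""
    width = max(len(label) for label in counts)
    return [f"{label.ljust(width)} {bar * value} {value}" for label, value in counts.items()]


chart_lines = text_bar_chart(word_counts, bar_character)
print("\n".join(chart_lines))
print("mean:", mean(word_counts.values()))
```

Replacing word_counts with your own numbers, or changing bar_character, and running the cell again is the same kind of small edit you would make to someone else's notebook.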

To run a notebook you need a Jupyter environment.  You can install this yourself on your own computer or use one of many environments available on the web.  I’ll describe two of these here: Binder and Tinker Studio.

Binder provides a service that lets you run any collection of notebooks stored on GitHub. In some repositories you will see a "launch binder" button, such as the following example from the HASS DEVL GitHub page:

Clicking on the button will take you to the Binder site, which will build an execution environment for the notebooks and allow you to run them. It will take some time to get started, and you may see a big 404: Not Found message while it is loading, which can be confusing (don’t worry!). Once it has loaded you will see the main Notebook page, which looks like a file explorer.
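Behind the button is an ordinary link to the public Binder service that names the GitHub repository to build. As a small sketch, assuming the mybinder.org address scheme for GitHub repositories, the function below assembles such a link. The function name and the owner and repository names are hypothetical placeholders, not the HASS DEVL repository.

```python
def binder_launch_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a public GitHub repository.

    ref is a branch, tag or commit; "HEAD" asks for the default branch.
    """
    return f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"


# Hypothetical owner and repository names, for illustration only.
launch_url = binder_launch_url("example-owner", "example-notebooks")
print(launch_url)
```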

Each of the files ending in .ipynb is a Jupyter notebook file. Click on one of them to open it (e.g. Digivol NER.ipynb, which shows how to read data exported from DigiVol and run a Named Entity Recognition process on it). This will open a new tab showing the notebook itself:

A notebook is made up of cells that contain either text or code. When you start the notebook the first cell will be selected (outlined in blue above). You can usually work through the notebook by clicking the Run button in the toolbar: this runs the code in the selected cell, shows you the output, and moves on to the next cell (a text cell is simply skipped over). Sometimes running code takes a while, and in that case you’ll see a [*] next to the cell on the left. This will turn into a number, e.g. [1], when it is done, and you can hit Run again to run the next cell.

Some cells will just load modules or perform calculations in preparation for an analysis and produce no output. Other code cells will produce output in the form of text, a table or a plot.  This will appear below the cell as in this example:
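As a minimal sketch of the two kinds of cell (written for this page, not taken from the notebooks above), the first part below only loads a module and prepares some data, so running it shows nothing. The second part produces output. The sample text is made up.

```python
# Cell 1: loads a module and prepares data; running it shows nothing.
from collections import Counter

text = "the cat sat on the mat and the dog sat too"  # made-up sample text
word_counts = Counter(text.split())

# Cell 2: produces output. In a notebook the value of the last expression
# in a cell is displayed automatically; print() is used here so the same
# code also shows its result when run as a plain script.
top_words = word_counts.most_common(2)
print(top_words)  # prints [('the', 3), ('sat', 2)]
```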

As you step through the notebook with the Run button you will run each cell and (if everything works OK) you’ll see the results of the author’s analysis. You can also click on the code in a cell and modify it, which is a great way to learn how you might do your own analysis.

On Binder, there is no permanent record of your work, so any changes you make to the notebook will be lost once the Binder environment shuts down. Binder is great for exploring someone else’s notebook. As an alternative, Tinker Studio provides a way to run notebooks and keep a permanent copy of your work.

See this page for details of how to start a Jupyter Notebook on Tinker Studio. Once you have started your notebook server you’ll see a slightly different main page, with a file manager on the left side and a Launcher panel on the right:

By default you are looking at an empty directory and you need to either import some notebooks or start a new one. There’s a separate page that describes working with some sample notebooks. Here we’ll just show how to start an empty one and write your first line of code. Click on the icon under Notebook (that’s the Python icon FYI). You should see a new pane replacing the Launcher with the title Untitled.ipynb, and the same name should appear in the file list on the left. This is your first empty notebook. In the first cell you can enter some code, then click Run to see it executed. Here’s a simple example:
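Any short piece of Python will do for a first cell. The lines below are one possible sketch: type them into the empty cell and run it, and the printed text should appear directly underneath. The variable names are arbitrary.

```python
# A first cell: do a small calculation and show the result.
greeting = "Hello from my first notebook!"
seconds_per_day = 24 * 60 * 60

print(greeting)
print("Seconds in a day:", seconds_per_day)
```

Try changing the greeting or the calculation and running the cell again to see the output update.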

For more details on running notebooks see the Jupyter documentation.