Technical Walkthrough – Voyant Server

Voyant Server is a web-based text analysis and reading environment. A review of Voyant Tools can be found at https://blogs.reed.edu/ed-tech/2017/03/text-analysis-using-voyant-tools/.

Who or what’s it for

A contribution to the HASS DEVL project. Voyant Server was originally requested by eResearch South Australia from The University of Melbourne, for South Australian research.

Arrangements in place

Available for broad research access for the HASS DEVL project and SCIP. Post-project (after 2018) availability for national use is intended.

Availability and data recovery

Best-effort 24/7/365 availability of voyant.tinker.edu.au, with support during University of Melbourne business hours, between 9AM and 5PM, Monday to Friday.

Daily backups of stored data, with a recovery window of at least 30 days.

Key dates 

2017 – Early prototype provided by The University of Melbourne Social & Cultural Informatics Platform, deployed version by University of Melbourne eScholarship Research Centre.
May 2018 – discoverable on https://tinker.edu.au
2019 – sustainability at UoM, but TBC.

Location

HASS Cloud Nectar allocation through The University of Melbourne.
Puppet configuration: https://github.com/PeterTonoli/puppet-toys/blob/master/voyant/voyant-be.hasscloud.net.pp

Support contacts

Primary: scip-enquiries@unimelb.edu.au
Sysadmin: Peter Tonoli, peterct@unimelb.edu.au, 03-83448181
Software contact: As at May 2018, Stéfan Sinclair is actively working on Voyant Server. Issues may be logged on GitHub.

Voyant Tools is also on Twitter.

Dependencies

Hasscloud.net.au Puppet server for configuration. Hasscloud.net.au for monitoring and backup infrastructure.

Note that users may “Export a URL, embeddable tool, or citable reference”, in essence publishing a visualisation, so the number of external HTML links pointing at the hostname may grow this way. The approach for addressing this is TBD.

Software license

Voyant Server is open source, licensed under the GNU General Public License v3.0.

Reference materials

Documentation for Voyant may be found at http://docs.voyant-tools.org/resources/run-your-own/voyant-server/. Additional technical information, such as installing Voyant on a server, can be found at https://github.com/sgsinclair/VoyantServer/wiki.

Deployment Guide

The beta instance of Voyant used for HASSDEVL runs on Ubuntu 16.04, using Apache Tomcat 8.5 and OpenJDK 8.

First things first. Create a virtual machine under NECTAR, or on your favourite platform. Ubuntu 16.04 works well; 18.04 should also suffice if need be.

The impression that we have been given, although we have not yet had a corpus of text large enough to confirm it, is that Voyant is voracious in its resource usage. HASSDEVL has chosen to install Voyant on an m1.large instance, with 16GB of RAM and 4 VCPUs.

Once your instance is installed and running:

1. Install OpenJDK

apt-get install openjdk-8-jre-headless
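
Once the package is installed, it may be worth confirming which Java version is actually on the PATH. The following is only a sketch: java_major_version is a hypothetical helper name, and it assumes the usual first line of `java -version` output, such as `openjdk version "1.8.0_171"`.

```shell
# Hypothetical helper: extract the major Java version from the first line
# of `java -version` output, e.g. 'openjdk version "1.8.0_171"' gives 8.
java_major_version() {
    # $1: the first line printed by `java -version`
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_171
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer scheme: 11.0.2
    esac
}

# Usage (java -version writes to stderr, hence the redirect):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```

For the OpenJDK 8 package installed above, the helper would be expected to report 8.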

2. Download and install Apache Tomcat.

Browse to tomcat.apache.org

Download, and copy to your virtual machine, the latest version of Apache Tomcat 8.5.

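Unarchive Apache Tomcat into /opt/tomcat, stripping the archive's top-level directory so that webapps sits directly under /opt/tomcat

mkdir -p /opt/tomcat
tar -xvzf apache-tomcat-8.5.31.tar.gz --directory /opt/tomcat --strip-components=1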
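
The later steps assume that webapps sits directly under /opt/tomcat, so a quick layout check after unpacking can save some confusion. This is a minimal sketch; check_tomcat_home is a hypothetical helper, and it only looks for bin/catalina.sh and webapps, which ship with the standard Tomcat distribution.

```shell
# Hypothetical helper: succeed only if DIR looks like a Tomcat home,
# i.e. bin/catalina.sh and webapps/ sit directly under it.
check_tomcat_home() {
    dir=$1
    if [ -f "$dir/bin/catalina.sh" ] && [ -d "$dir/webapps" ]; then
        echo "ok: $dir looks like a Tomcat home"
    else
        echo "error: $dir/bin/catalina.sh or $dir/webapps is missing" >&2
        echo "hint: the archive may have unpacked into a nested apache-tomcat-* directory" >&2
        return 1
    fi
}

# Usage:
#   check_tomcat_home /opt/tomcat
```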

3. Download and install Voyant Server.

The latest release can be downloaded from https://github.com/sgsinclair/VoyantServer/releases/latest.

Unzip the Voyant archive into /opt/tomcat/webapps

apt-get install unzip

unzip VoyantServer2_4-M5.zip -d "/opt/tomcat/webapps"

Delete the Tomcat ROOT application

rm -rf /opt/tomcat/webapps/ROOT

Create a symlink from Tomcat's ROOT application to Voyant’s application directory

ln -s /opt/tomcat/webapps/VoyantServer2_4-M5/_app /opt/tomcat/webapps/ROOT
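
When upgrading to a newer Voyant release, the ROOT symlink has to be re-pointed. The sketch below shows one way to make the symlink step repeatable. link_voyant_root is a hypothetical helper, and it relies on the `-n` option of `ln` as provided by GNU coreutils on Ubuntu.

```shell
# Hypothetical helper: point WEBAPPS/ROOT at a Voyant _app directory.
# Safe to re-run; an existing ROOT symlink is replaced.
link_voyant_root() {
    webapps=$1   # e.g. /opt/tomcat/webapps
    app_dir=$2   # e.g. /opt/tomcat/webapps/VoyantServer2_4-M5/_app
    if [ ! -d "$app_dir" ]; then
        echo "error: $app_dir not found" >&2
        return 1
    fi
    # Refuse to touch a real ROOT directory; it should be deleted first.
    if [ -e "$webapps/ROOT" ] && [ ! -L "$webapps/ROOT" ]; then
        echo "error: $webapps/ROOT exists and is not a symlink" >&2
        return 1
    fi
    # -f replaces an existing link, -n stops ln from following it
    ln -sfn "$app_dir" "$webapps/ROOT"
}

# Usage:
#   link_voyant_root /opt/tomcat/webapps /opt/tomcat/webapps/VoyantServer2_4-M5/_app
```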

4. In order to use the virtual machine’s full complement of memory, the Voyant configuration file, /opt/tomcat/webapps/Voyant*/server-settings.txt, should be modified. The line beginning with “memory” should reflect the system’s memory, less at least 1 or 2 gigabytes reserved for Tomcat and the system. HASSDEVL has chosen 12G for a 16G system, with the entry “memory = 12000”. By default, your Voyant instance will be listening on port 8888; to use a different port, change the entry “port = 8888” to your chosen port.
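
The memory arithmetic can be sketched as follows. voyant_memory_mb is a hypothetical helper, the value is assumed to be in megabytes (as the “memory = 12000” entry suggests), and the 4096 MB reserve in the usage line is only an illustration.

```shell
# Hypothetical helper: suggest a value for the "memory" entry, in megabytes.
# $1: total system memory in kB (the MemTotal figure in /proc/meminfo)
# $2: megabytes to hold back for Tomcat and the operating system
voyant_memory_mb() {
    echo $(( $1 / 1024 - $2 ))
}

# Usage on a Linux host:
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   voyant_memory_mb "$total_kb" 4096
```

A machine reporting exactly 16 GiB (16777216 kB) with a 4096 MB reserve would give 12288, in the same region as the 12000 chosen by HASSDEVL.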

5. For persistence, data should be stored in a persistent, non-ephemeral data store, such as a mounted volume. The line beginning with “data_directory = ” should be modified to point to the persistent data store.
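
Edits to server-settings.txt can also be scripted. The sketch below assumes the file consists of simple “key = value” lines, as in the entries quoted above; set_setting is a hypothetical helper (not part of Voyant), and /mnt/voyant-data is an example path only.

```shell
# Hypothetical helper: set "key = value" in a settings file, replacing the
# existing line for that key or appending a new one if none exists.
set_setting() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        # A line such as "data_directory = ..." (spaces around "=" optional)
        $0 ~ ("^[ \t]*" k "[ \t]*=") { print k " = " v; found = 1; next }
        { print }
        END { if (!found) print k " = " v }
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"   # overwrite in place, keeping file permissions
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (example path only; substitute the actual settings file and data store):
#   set_setting /path/to/server-settings.txt data_directory /mnt/voyant-data
```

The same helper could be used for the “memory” and “port” entries. It is worth keeping a copy of the original file before scripting changes to it.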

Adding reference datasets to Voyant

Voyant Tools comes with two datasets (Shakespeare’s Plays and Austen’s Novels) attached to the “Open” menu on the Voyant Tools home page. For all other datasets, researchers first have to upload their data to Voyant before they can analyse it. It would be much more convenient for HASS researchers if data of interest were embedded in the analysis tool beforehand, rather than having to be downloaded from a source or data repository and then uploaded to Voyant.

Upload the reference dataset that you want to attach to Voyant, either by

a) Copying and pasting the text into the text box

b) Writing the URL in the text box

c) Uploading the document using the “Upload” button

  1. Once processed, the corpus ID of the uploaded document(s) will appear in the URL. Copy the ID. (The corpus ID can also be found in the trombone5_2/corpora subdirectory of your data directory, as specified above.)
  2. Open server-settings.txt in the same directory where Voyant Server resides
  3. Edit the open_menu field by adding the corpus ID and a label using the syntax corpus_Id:Label
  4. Multiple corpora can be added, separated by semicolons. For instance: Corpus_Id1:corpus_label;Corpus_Id2:Corpus_label
  5. Save the changes, then shut down and restart Voyant Server
  6. Once the server is running and the page has loaded in the browser, you can see the attached corpora by clicking the “Open” button on the main page.
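
With more than a couple of corpora, the open_menu value is easy to mistype. A small sketch of assembling it from corpus_Id:Label pairs follows. open_menu_value is a hypothetical helper, and the IDs in the usage line are placeholders, not real corpus IDs.

```shell
# Hypothetical helper: join "corpus_Id:Label" pairs with semicolons,
# rejecting any argument that lacks the ":" separator.
open_menu_value() {
    out=""
    for pair in "$@"; do
        case "$pair" in
            *:*) ;;   # looks like corpus_Id:Label, accept it
            *)
                echo "error: '$pair' is not in corpus_Id:Label form" >&2
                return 1
                ;;
        esac
        out="${out:+$out;}$pair"   # add ";" only between entries
    done
    printf '%s\n' "$out"
}

# Usage (placeholder IDs):
#   open_menu_value "abc123:Corpus One" "def456:Corpus Two"
#   prints: abc123:Corpus One;def456:Corpus Two
```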

Opening Voyant with a default dataset

Upload the dataset that you want to open Voyant with, either by

a) Copying and pasting the text into the text box

b) Writing the URL in the text box

c) Uploading the document using the “Upload” button

  1. Once processed, the corpus ID of the uploaded document(s) will appear in the URL. Copy the ID. (The corpus ID can also be found in the trombone5_2/corpora subdirectory of your data directory, as specified above.)
  2. Open server-settings.txt in the same directory where Voyant Server resides
  3. Edit the uri_path field by adding the corpus ID or a URL, in one of these formats: uri_path = /?corpus=corpus_Id or uri_path = /?input=http://cbc.ca/news/
  4. Save the changes, then shut down and restart Voyant Server
  5. The server now loads the results for the default dataset on the home page
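
Copying the corpus ID out of the browser URL can be scripted as well. The sketch below assumes the ID appears as a corpus= query parameter, matching the uri_path format above. Both function names are hypothetical, and the URL in the usage line is a placeholder.

```shell
# Hypothetical helper: extract the value of the "corpus" query parameter.
corpus_id_from_url() {
    printf '%s\n' "$1" | sed -n 's/.*[?&]corpus=\([^&#]*\).*/\1/p'
}

# Hypothetical helper: print the matching uri_path entry for a Voyant URL.
uri_path_line() {
    id=$(corpus_id_from_url "$1")
    if [ -z "$id" ]; then
        echo "error: no corpus parameter found in '$1'" >&2
        return 1
    fi
    echo "uri_path = /?corpus=$id"
}

# Usage (placeholder URL):
#   uri_path_line 'http://localhost:8888/?corpus=abc123&view=Cirrus'
#   prints: uri_path = /?corpus=abc123
```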