Alveo and Voyant


Alveo is a Virtual Laboratory for Human Communication data. As part of its services it provides a repository for language data (speech, video, text). Data hosted by Alveo can be kept private, shared with other researchers or published under an appropriate licence. Published datasets can be associated with a DOI (Digital Object Identifier), allowing them to be cited in publications. Alveo provides an API for access to data stored in the repository, and a number of tools can work directly on this data. For example, Tinker has some recipes that take data directly from the Alveo API and perform Named Entity Recognition on the text, and you can explore textual collections directly in Voyant via the Alveo platform.

Voyant is a web-based text analysis platform. Data uploaded to a Voyant Tools server can be analysed and visualised in a number of different ways to provide insight into the texts. While it is possible to upload data to a public Voyant Tools instance for analysis, the data then becomes public by default (although there are ways to protect it), and you create yet another copy of your dataset that needs to be managed.

To facilitate the analysis of text data stored on Alveo, an instance of Voyant Tools is available as part of the Alveo platform, and any text collection stored in Alveo can be made available for analysis on Voyant. The advantage of this approach is that users don’t need to create new copies of data to upload to Voyant and, importantly, collections are protected by the same user permissions as the rest of the Alveo platform. Users who have not agreed to the licence terms for a collection or been granted access to it cannot access the collection via Voyant Tools.
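
The access rule described above can be illustrated with a small sketch. This is not Alveo's implementation or its API: the `Collection` class, its fields and the `can_open_in_voyant` function are hypothetical names invented for illustration, and the sketch assumes that either accepting the licence or being granted access is enough to open a collection.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical stand-in for a text collection held in the repository."""
    name: str
    # Users who have agreed to the collection's licence terms.
    licence_accepted_by: set = field(default_factory=set)
    # Users who have been granted access directly by the owner.
    access_granted_to: set = field(default_factory=set)


def can_open_in_voyant(user: str, collection: Collection) -> bool:
    """Return True if the user may analyse the collection in Voyant.

    Assumption: the analysis tool reuses the repository's permission
    check, so no separate access list is kept for it.
    """
    return (
        user in collection.licence_accepted_by
        or user in collection.access_granted_to
    )


corpus = Collection(
    name="example-corpus",
    licence_accepted_by={"alice"},
    access_granted_to={"bob"},
)
```

Under these assumptions, `can_open_in_voyant("alice", corpus)` and `can_open_in_voyant("bob", corpus)` are both true, while a user who appears in neither set is refused. The point of the design is that the analysis tool consults the same permission data as the repository rather than holding its own copy.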