Transcription tools used in digital humanities projects are almost as varied as the projects themselves. Most are more complex than a simple free-text field for transcribers to type into. Some, such as Digivol, are well suited to fielded data. Others give volunteers formatting tools to mark up their transcribed text: Transcribe Bentham uses TEI XML tags, while WikiSource uses a special syntax called Wikitext, or wiki markup.
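As a rough illustration of what TEI-style markup makes possible, the following Python sketch reads a marked-up line and recovers its final reading text. The sample sentence is invented, the function name `reading_text` is hypothetical, and the snippet omits the TEI namespace and header that a real TEI document would carry; it only assumes the standard TEI elements `<del>` (deleted text), `<add>` (added text) and `<unclear>` (uncertain reading).

```python
import xml.etree.ElementTree as ET

# Hypothetical transcribed line using three TEI elements:
# <del> (deleted text), <add> (added text), <unclear> (uncertain reading).
# A real TEI file would also declare the TEI namespace; omitted here for brevity.
snippet = (
    "<p>The <del>greatest</del> <add>general</add> happiness "
    "is the <unclear>proper</unclear> end.</p>"
)

def reading_text(element):
    """Collect the text of an element, skipping anything inside <del>."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != "del":
            parts.append(reading_text(child))
        # Text following a child element belongs to the parent, so always keep it.
        parts.append(child.tail or "")
    return "".join(parts)

root = ET.fromstring(snippet)
# Collapse the double space left behind where the deletion was removed.
text = " ".join(reading_text(root).split())
print(text)
```

Because the deletion is encoded rather than simply left out, the same transcription can yield either the final reading, as here, or a view that shows what the author struck through.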

To analyse documents by computer, the text in those sources must first be converted to machine-readable text. For some recent printed documents this conversion can be done via optical character recognition (OCR). For historical manuscripts or handwritten documents, however, OCR does not adequately detect the text on the page. For this reason, older documents and most handwritten manuscripts have to undergo a process of manual transcription.


Some transcription tools support annotation, allowing transcribers to use wiki-style syntax to tag keywords within the text, such as the names of people, places or species. Examples include the Public Record Office Victoria (PROV) semantic wiki, WikiSource and FromThePage.
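To show how such annotations can become usable data, here is a minimal Python sketch. It assumes a double-square-bracket convention in the style of wiki links, `[[Canonical name|text as written]]`; the exact syntax differs between tools, and the function name `extract_tags` and the sample sentence are invented for illustration.

```python
import re

# Assumed wiki-style link syntax: [[Subject]] or [[Subject|text as written]].
# Real tools differ in detail; this pattern is illustrative only.
TAG_PATTERN = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]|]+))?\]\]")

def extract_tags(transcription):
    """Return (plain_text, subjects) for a transcription with wiki-style tags."""
    subjects = []

    def replace(match):
        subject = match.group(1).strip()
        # Fall back to the subject name when no display text is given.
        display = (match.group(2) or subject).strip()
        subjects.append(subject)
        return display

    plain_text = TAG_PATTERN.sub(replace, transcription)
    return plain_text, subjects

# Hypothetical sample line, not taken from any real collection.
sample = (
    "Met [[John Smith|Mr. Smith]] at [[Ballarat]] "
    "to collect [[Eucalyptus globulus|blue gum]] seed."
)
plain, tags = extract_tags(sample)
print(plain)
print(tags)
```

Separating the canonical name from the text as written lets the transcription stay faithful to the page while still linking every mention of the same person, place or species to one index entry.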

Many organisations around the world are using crowd-sourced tools to successfully transcribe handwritten material online. Some examples of transcription projects using online volunteers include:

The Tinkers team has conducted a useful evaluation of three different text transcription tools for use within the Tinker environment.

