Bigamy Dataset Description

How can this dataset be described e.g. content types and dates?       

The Prosecution Project is investigating the history of the criminal trial in Australia. As part of the project a web-based transcription tool for historical court records was created and documents are being transcribed by project team members and trained volunteers.

Case records in this data set have been entered by researchers or volunteer transcribers from scanned images of court registers in the custody of various state archives. Users should be aware that the records may contain transcription errors. Missing data (empty cell) usually indicates that information was missing in the original data source. In a small number of cases data in records have been accessed from additional sources (for example digitised newspapers or police gazettes).

The data sets available in this repository date from 1850 to 1922. The data refers to matters that were prosecuted within the respective jurisdictions of the six states as they currently exist (2018), which may be different from the jurisdiction formally in place for a particular offence in the early part of this period. For example, prosecutions for Queensland (1850-1859) and Victoria (1850-July 1851) were formally under the jurisdiction of New South Wales during those years, but have been entered for Queensland and Victoria respectively.

The dataset includes data drawn from original court registers, court calendars, trial briefs and police gazettes. At this stage, the Bigamy dataset contains records drawn from the Victorian Supreme Court only. Work is underway to expand this dataset to the other states.

Jurisdiction | Source of Information | Identified by
Victorian Supreme Court | Public Records Office Victoria (PROV – VPRS 3524) | VICSC

Where is information about this dataset currently published online?

Information about this dataset is published on the Prosecution Project website. Where possible, individual records have also been harvested via the OAI-PMH API by the originating archive or source and published on its website, e.g. the TAHO website.

The Bigamy dataset has been published to the Australian Data Archive and is available in the Australian Historical Criminal Justice Data Collection. A sample notebook using this dataset has also been created on the Tinker Studio GitHub examples page.
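
The OAI-PMH harvesting mentioned above can be illustrated with a minimal sketch that builds a `ListRecords` request URL. The verb and the `metadataPrefix`, `from` and `set` arguments come from the OAI-PMH protocol, and `oai_dc` is the metadata format every OAI-PMH repository must support. The base URL is a placeholder, because the real endpoint is not given in this description, and the helper name `list_records_url` is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real OAI-PMH base URL is not stated in this description.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access is performed)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting: records changed since this date
    if set_spec:
        params["set"] = set_spec  # optional set, if the repository defines any
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, from_date="2018-09-01")
print(url)
```

A harvester would fetch this URL, read the XML response, and repeat the request with the returned `resumptionToken` until no token remains.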

What data structures are there e.g. tabular or document?

The data is in CSV format.
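
As a minimal sketch of reading the CSV with Python's standard library, the snippet below parses a small inline sample rather than the real file. The column names follow the attribute table in this description; the sample rows and the helper name `load_rows` are illustrative assumptions, and empty cells are mapped to `None`.

```python
import csv
import io

# Illustrative sample only: two made-up rows using column names from the
# attribute table. The real export has more columns and many more rows.
SAMPLE_CSV = """trial_id,court,Def_firname,Def_surname,offence,plea,verdict
851234,Supreme Court,Ellen,Kerry,Bigamy,Not guilty,Guilty
851235,Supreme Court,,Smith,Bigamy,,
"""


def load_rows(handle):
    """Read CSV rows into dicts, turning empty cells into None."""
    reader = csv.DictReader(handle)
    return [
        {key: ((value or "").strip() or None) for key, value in row.items()}
        for row in reader
    ]


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[1]["Def_firname"], rows[0]["verdict"])
```

To read a downloaded export instead, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`; the encoding here is an assumption and may need adjusting.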

What metadata schemas are used to access the data e.g. local, DDI, ALTO or DC?


How much time (and whose) is needed to compile the dataset ready for transfer?

In order to create this original dataset, the source documents need first to be located and digitised if not already done (by either a representative of the Prosecution Project or the originating source holder). These digitised items are then loaded onto the Prosecution Project’s server (by the Prosecution Project admin staff) to be transcribed by project team members and trained volunteers.

Once the records have been transcribed from originating source materials, a project team member uses the Prosecution Project tool to audit, correct, search, collate and extract the data to a CSV file for all trials relating to Bigamy.

Once exported, the CSV files had to be combined manually, as different jurisdictions captured different metadata about a court proceeding.
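
The project combined these files by hand. As a rough illustration of how such a merge could be scripted, the sketch below keeps the union of all columns and leaves a cell empty where a file lacks a column. The function name `combine_csv` and the two sample exports are hypothetical.

```python
import csv
import io


def combine_csv(sources):
    """Merge CSV inputs whose headers differ, keeping the union of columns."""
    fieldnames = []
    rows = []
    for handle in sources:
        reader = csv.DictReader(handle)
        # Preserve first-seen column order across all inputs.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are written as empty cells
    return out.getvalue()


# Hypothetical exports from two jurisdictions with different columns.
export_a = io.StringIO("trial_id,offence,register_number\n1,Bigamy,12\n")
export_b = io.StringIO("trial_id,offence,trial_place\n2,Bigamy,Beechworth\n")

combined = combine_csv([export_a, export_b])
print(combined)
```

Writing missing columns as empty cells matches the convention used in this dataset, where an empty cell usually means the information was absent from the source.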

In all, it is estimated that 120 hours were required from a Prosecution Project administrator to curate the original dataset.

Subsequent work was done on the Prosecution Project application to enable researchers to combine data from different jurisdictions within the tool, thus reducing the number of hours required to produce these datasets.

What is the scale of the dataset e.g. file numbers and size?

The CSV file has 599 rows (as at September 2018) and is 135 KB in size. It is expected that this will increase on a yearly basis until all data is publicly available.

What metadata exists that describes the dataset?

Attribute | Type | Description | Format | Example | Null value
trial_id | numeric | This is the number of the record as entered in the Prosecution Project database. It is system generated and permanent. | number | 851234 | All records have this attribute
court | freetext | This refers to the level of court in which a defendant was prosecuted – may include supreme, circuit, district, quarter sessions, and abbreviations of these (e.g. SC) | text | Supreme Court | Empty cell
Def_firname | freetext | The given name of the defendant as recorded in a register or other source | text | Ellen | Empty cell
Def_othername | freetext | The second and any other names, including aliases, recorded for a defendant | text | Mary | Empty cell
Def_sex | freetext | The sex of the defendant as recorded in a register, or imputed by name | text | Female | Empty cell
Def_surname | freetext | The surname of the defendant as recorded in a register or other source. If the defendant has only one name it is used as a surname in this database | text | Kerry | All records have this attribute
judge | freetext | The surname of the judge as recorded in a register or other source | text | Hope | Empty cell
off_committed_date | date | The date on which a defendant was committed for trial, as recorded in a register. | dd-mmm-yyyy | 10 May 1878 | Empty cell
off_committed_place | freetext | The place (committal place, typically a suburban or town petty sessions court) at which a defendant was committed for trial, as recorded in a register. This data was generally only available in registers for Victoria and Queensland. | text | Benalla | Empty cell
WGS84 | freetext | The geolocation of the committal place, as derived from a separately maintained dataset. This has been completed only for Victoria, as committal place is missing or incomplete for other states. | numeric, structured | -38.38552, 142.23671 | Empty cell
offence | freetext | This is the main offence for which a defendant was prosecuted (committed and indicted), as recorded in the register. Some defendants faced more than one charge. | text | Bigamy | Empty cell
plea | freetext | The plea entered by the defendant in response to the offence charged. A plea of guilty resulted in an immediate conviction and subsequent sentence without a need for trial. | text | Not guilty | Empty cell
register_number | numeric | This is a case number recorded in the original register, for some jurisdictions (mainly relevant to Victoria and Western Australia records) | numeric | 1 | Empty cell
sen_duration_days | numeric | The number of days of imprisonment recorded in the sentence (usually recorded only for sentences of less than one month, e.g. 7 days) | numeric | 7 | Empty cell
sen_duration_months | numeric | The number of months of imprisonment (usually recorded only for sentences of less than two years, e.g. 18 months) | numeric | 6 | Empty cell
sen_duration_years | numeric | The number of years of imprisonment (may be recorded as a decimal fraction for sentences including years and months, e.g. three years and six months recorded as 3.5) | numeric | 3 | Empty cell
sentence | freetext | The sentence as recorded in the register (may include death sentences, sentences of imprisonment, conditional sentences, suspended sentences etc.) | text | 6 months hard labour | Empty cell
Source | freetext | A system record of the original source of the data, linked to an image of an archival source where this was available (mainly Victoria, New South Wales, Western Australia, Queensland). | text, structured | VICSC_1828_3520202_016 | Empty cell
trial_date | date | The date of trial as recorded in the register | dd-mmm-yyyy | 12 May 1878 | 1 Jan 1970
trial_place | freetext | The place of trial as recorded in the register | text | Beechworth | Empty cell
trove | url | A link to an article relating to the trial of a defendant in this case record. It accesses the article available through the National Library of Australia Trove database of digitised newspapers. | text, structured |  | Empty cell
verdict | freetext | The verdict outcome of prosecution proceedings involving the defendant. | text | Guilty | Empty cell
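
Several attributes above need light parsing before analysis. The sketch below shows one possible approach using only the standard library; the function names are hypothetical. It assumes dates appear as in the examples (`12 May 1878`, with hyphens also tolerated), that month abbreviations are English, and that `1 Jan 1970` marks a missing trial date as the table indicates. Summing the three sentence duration fields is also an assumption, since the description does not say whether they are additive when more than one is filled.

```python
from datetime import date, datetime

# Per the attribute table, a missing trial_date is exported as 1 Jan 1970.
MISSING_TRIAL_DATE = date(1970, 1, 1)


def parse_date(text):
    """Parse a date such as '12 May 1878'; return None if empty or the placeholder."""
    if not text or not text.strip():
        return None
    # %b relies on English month abbreviations (the default C locale).
    parsed = datetime.strptime(text.strip().replace("-", " "), "%d %b %Y").date()
    return None if parsed == MISSING_TRIAL_DATE else parsed


def parse_wgs84(text):
    """Split a 'latitude, longitude' string into a tuple of floats."""
    if not text or not text.strip():
        return None
    lat, lon = (float(part) for part in text.split(","))
    return lat, lon


def sentence_in_days(years, months, days):
    """Approximate sentence length in days (365-day years, 30-day months).

    Assumption: the three fields are added together when more than one is set.
    """
    total = 0.0
    for value, factor in ((years, 365), (months, 30), (days, 1)):
        if value not in (None, ""):
            total += float(value) * factor
    return total


print(parse_date("12 May 1878"))
print(parse_date("1 Jan 1970"))
print(parse_wgs84("-38.38552, 142.23671"))
print(sentence_in_days("3.5", "", ""))
```

The 365-day and 30-day conversion factors are a simplification chosen for comparing sentence lengths, not a rule taken from the registers.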