Crowdsourcing the Past

Louise Seaward, University College London

Image of Jeremy Bentham
Bentham Papers, UCL Special Collections, Box 173, fol. 35. Image courtesy of UCL Special Collections.

The English philosopher Jeremy Bentham (1748 – 1832) was both inspired by and contributed to the intellectual currents of the Enlightenment, which included an emphasis on reason, progress and fraternity.  I think he would be fascinated to see how the Bentham Project at University College London is combining public engagement with cutting-edge technology to further the study of his life and philosophy.

In 2016, I became the coordinator of Transcribe Bentham: an online scholarly crowdsourcing initiative that was launched by the Bentham Project in 2010.  Volunteers from around the world visit our virtual Transcription Desk to explore and transcribe papers relating to Bentham’s huge range of interests which encompass ethics, politics, economics, crime and punishment, religion and education.

The task of transcribing Bentham is not for the faint-hearted – volunteers are required to decipher Bentham’s tricky handwriting and also encode their transcripts in Text Encoding Initiative (TEI) compliant XML to ensure that they are machine-readable. The transcripts produced by volunteers feed directly into the work of the Bentham Project, where researchers are tasked with producing the definitive scholarly edition of The Collected Works of Jeremy Bentham.
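To give a rough sense of what "machine-readable" means here, the sketch below parses a tiny TEI-style fragment with Python's standard library and pulls out a clean reading text. The element names (del, add, unclear, lb) are standard TEI, but the sample sentence and the function name reading_text are invented for illustration, and a real TEI document would also carry a header and an XML namespace that this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for illustration; not a real transcript.
SAMPLE = (
    "<p>A <del>smal</del> <add>short</add> example of an "
    "<unclear>encoded</unclear> line<lb/>continued on the next line.</p>"
)


def reading_text(elem):
    """Collect the text of an element, skipping deletions.

    Line breaks (lb) become spaces; every other element contributes
    its own text. Text following a child element (its tail) is kept.
    """
    parts = []
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        if child.tag == "lb":
            parts.append(" ")
        elif child.tag != "del":
            parts.append(reading_text(child))
        if child.tail:
            parts.append(child.tail)
    return "".join(parts)


root = ET.fromstring(SAMPLE)

# Collapse any doubled spaces left behind by removed deletions.
plain = " ".join(reading_text(root).split())

# Count the words the transcriber flagged as uncertain.
unclear_count = len(root.findall(".//unclear"))

print(plain)
print("uncertain readings:", unclear_count)
```

Because deletions, additions and doubtful readings are tagged rather than described in prose, the same transcript can yield a clean reading text, a list of uncertain words, or a full record of the author's revisions.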

Our volunteers are making a contribution to academic research, as well as helping to preserve and promote access to these important documents. At the latest count, they have transcribed more than 20,000 pages of Bentham’s writings at a high level of accuracy.  This is a huge contribution to the future of Bentham scholarship.

Transcribing Bentham can also be rewarding in other ways – our volunteers tell us that they enjoy learning about Bentham’s philosophy, improving their palaeography skills and spending time on an unusual hobby. The Bentham Project has benefited hugely from its decision to collaborate with the public on the transcription of Bentham’s manuscripts and it is a privilege to work with such an amazing group of people.

Manuscript text
Bentham Papers, UCL Special Collections, Box 27, fol. 36a. Image courtesy of UCL Special Collections.

In the latest phase of our initiative, we are exploring the most recent advances in machine learning.  The Bentham Project is one of the partners in READ, an EU-funded project that is focused on transforming access to archival material through the development and dissemination of Handwritten Text Recognition technology.

This technology is freely available in the Transkribus platform, where users can train algorithms to recognise and search large historical collections written in single or multiple hands. We have had significant success in training models to recognise the writing of Bentham’s secretaries and we are now focused on improving the automated recognition of the most difficult writing in the Bentham collection – those pages written by Bentham himself in his later years when his sight was fading (see more in a recent blog post).

We are not trying to replace our volunteers with computers. Instead, we plan to integrate Handwritten Text Recognition into our crowdsourcing platform. In a new version of Transcribe Bentham, volunteers will be able to check and correct computer-generated transcripts or request suggested readings of words that are difficult to decipher. This should make transcription less daunting for new volunteers, as well as offering help to experienced transcribers should they need it.

After nearly eight years at the forefront of scholarly crowdsourcing, Transcribe Bentham is still going strong.  We recently celebrated the complete digitisation of the central collections of Bentham’s papers, held at UCL and The British Library.  This comprises some 95,000 digital images – so there is still much more material for our volunteers to transcribe!

I would encourage anyone and everyone to take a look at our Transcription Desk and have a go at transcribing Bentham – you might enjoy it!  I also welcome all enquiries from potential volunteers, as well as other institutions and projects interested in crowdsourcing or Handwritten Text Recognition – it would be great to hear from you!

If you want to get involved, you can contact Transcribe Bentham via their website. And, if you’ve enjoyed reading this, why not check out Louise’s recent blog on ‘Bentham vs the Computer’.
