Text Mining in the Old Bailey

Exploring the proceedings of the Old Bailey

The Old Bailey Online provides the full proceedings of the Old Bailey between 1674 and 1913, giving access to information from nearly 200,000 trials, free to use for non-commercial purposes. This is a wealth of text that can be mined for text analysis, and it will allow us to discover more about the history of crime and justice in London.

As part of the module this week, we were asked to explore the proceedings of the Old Bailey using the Old Bailey Online search and API, and to analyse and visualise text extracted from the Old Bailey Online using Voyant Tools. At first I wasn’t sure what I should search for, but then I recalled a recent podcast I had listened to about the Lady Juliana, a ship that transported female convicts from Britain to Australia in the late 1700s, and thought it would be interesting to find out more about the female convicts whose sentence included transportation to Australia. Recalling the stories of convicts being sentenced for relatively minor crimes, such as stealing handkerchiefs and loaves of bread, I wanted to find out more about the women and girls who were sentenced to transportation after being found guilty of shoplifting.


Changes to blog design

As I mentioned in one of my first posts, the initial design for this blog excluded some functions that I wanted to include later. Namely, I thought that it would not be all that helpful to include lists of categories and tags for exploring the posts when there were very few posts published, nor would it have been helpful to have a monthly archive when I had only been posting for a short time.

Now that I’ve written a handful of articles, I’ve added widgets that allow readers to explore the blog posts by category and tag. I’ve also taken the time to clean up the categories and tags assigned to each article. Initially it wasn’t easy to know which categories and tags would be most useful, but as the blog has started to take shape, it’s become easier to categorise articles appropriately.

I still have not included a monthly archive, but if I continue to write more articles, I may add one to allow readers to view articles from a particular time.

Making sense of all those words!

Text analysis

This week in our DITA module, we were introduced to the topic of text analysis, the quantitative analysis of a text or group of aggregated texts. Text analysis can be considered a form of distant reading. Instead of a close reading of a passage or text to find meaning, distant reading looks for patterns across multiple texts to understand their meaning in context.

Text analysis can include searching for keywords or passages, identifying the number of times a word appears in the text, uncovering the context in which words appear and the concordance between different words.
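To make a couple of these operations concrete, here is a minimal sketch in Python of counting how often each word appears and showing the words around a keyword. It is only an illustration under simple assumptions: the function names (`tokenize`, `word_frequencies`, `keyword_in_context`) are hypothetical, the sample sentence is made up rather than taken from a real text, and the tokenisation is deliberately crude.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how many times each word appears in the text."""
    return Counter(tokenize(text))


def keyword_in_context(text, keyword, window=3):
    """Return (left, keyword, right) tuples for every occurrence of keyword,
    with up to `window` words of context on each side."""
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), token, " ".join(right)))
    return hits


# Made-up sample text, used purely for illustration.
sample = (
    "The prisoner was indicted for stealing a handkerchief. "
    "The handkerchief was found in the prisoner's pocket."
)

frequencies = word_frequencies(sample)
contexts = keyword_in_context(sample, "handkerchief", window=2)

print(frequencies.most_common(3))
for left, word, right in contexts:
    print(f"{left} [{word}] {right}")
```

Tools such as Voyant do far more than this, but the underlying idea of counting words and lining up their surrounding context is the same.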

Although it can be conducted manually, computer-assisted text analysis makes it possible to analyse large collections of text very quickly, undertake much more complex searches and also easily visualise the results to aid understanding.

Combining text analysis with the analysis of metadata can reveal patterns across time or geographical locations (some great examples are shown in this paper).

Exploring text analysis using Wordle

To explore text analysis, I created a data set to try out with a couple of different online text analysis tools. I used Altmetric to construct a list of 1,094 articles taken from the 20 top-ranking journals in Library and Information Science that had been mentioned online in news, blogs or social media in the past year.

One of the most straightforward forms of text analysis and visualisation is the word cloud. A word cloud visualises the frequency with which words are mentioned in a string of text. Words mentioned more frequently appear larger, while words mentioned less frequently appear smaller.
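As a rough sketch of that idea, the Python below turns word counts into font sizes using simple linear scaling. This is an assumption made for illustration, not a description of how any particular word cloud tool sizes its words; the function name `font_sizes`, the size range and the example counts are all hypothetical.

```python
from collections import Counter


def font_sizes(counts, min_size=10, max_size=48):
    """Map each word's count to a font size between min_size and max_size.

    The least frequent word gets min_size, the most frequent gets max_size,
    and everything else is scaled linearly in between.
    """
    if not counts:
        return {}
    lowest = min(counts.values())
    highest = max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, n in counts.items():
        if span == 0:
            # Every word is equally frequent, so give them all the same size.
            sizes[word] = max_size
        else:
            scaled = (n - lowest) / span
            sizes[word] = round(min_size + scaled * (max_size - min_size))
    return sizes


# Made-up counts, used purely for illustration.
counts = Counter({"library": 12, "information": 8, "data": 4})
sizes = font_sizes(counts)

for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size}pt")
```

A real word cloud also has to decide where to place each word and which very common words to leave out, which this sketch ignores.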

Wordle is a simple online tool for creating word clouds. You paste in the text you wish to analyse and press a button, and it creates a word cloud that you can then edit and reformat.

Altmetric word cloud
