Multiple Indemnity

[posted by Gavin Robinson, 10:04 am, 20 July 2010]

As part of the research for my book (saying that still feels a bit weird, but I’m sure I’ll get used to it) I’m going through indemnity cases in class SP 24 in the UK National Archives (aka the PRO). The Indemnity Committee was set up by parliament in 1647 to protect soldiers and officials from prosecution for actions that they had carried out under the authority of parliament, such as requisitioning things for the army or arresting royalists. It also dealt with disputes over sequestered rents and debts, and helped to enforce parliament’s order that apprentices who joined the army should be allowed to count military service towards their term of apprenticeship. If someone was prosecuted in court for acts which were covered by the Indemnity Ordinance (and many were despite the Ordinance banning people from bringing cases of this kind) the defendant could send a petition to the Indemnity Committee asking for protection.

In SP 24 there are 58 boxes of petitions and other papers relating to cases, such as depositions and lists of expenses. Unlike some classes these are quite well sorted: papers relating to each case are grouped together and sorted in roughly alphabetical order of the plaintiff’s name (although confusingly the plaintiff in an indemnity case is the defendant in the corresponding criminal prosecution). I’m particularly interested in cases relating to horse requisitioning. According to Ian Gentles, about 30% of the military cases involve horses, although from what I’ve seen so far military cases seem to be a minority as many cases are disputes between civilians over payment of rents and debts due to sequestered estates.

It usually takes me less than an hour to skim through a box, look at the first petition in each case to see if it’s about horses, and photograph the relevant cases. Sometimes I get cases that look interesting for other reasons, but I try not to wander too far off topic too often.

Since I’m photographing these papers for my research, and since the National Archives allow document images to be uploaded to Flickr, that’s just what I’m doing. I’m also putting transcripts or summaries of the documents, along with links to the images, on the Your Archives wiki. You can see what I’ve done so far, and follow my progress in future, via a Flickr collection and Your Archives category.

So far I’ve uploaded cases from the first 2 boxes. I have another 16 boxes ready to be uploaded, but I’m working on some Python scripts to automate the process. The trial run on the first two boxes proved that doing it all manually is quite labour intensive. First I copied the image files from my camera and sorted them into directories for each box. The directory structure is based on the archival reference, so there’s a directory called “SP 24” with sub-directories called “30”, “31” etc. Then I went into each of these directories and made sub-directories for each case, so it looks like this:

  • SP 24
    • 30
      • 1 Abeary vs Windebanke
      • 1 Adams vs Haughton
      • 2 Alford vs King
      • etc
    • 31

And the path to a particular case would be:

SP 24/30/2 Alford vs King

Which looks quite similar to the archival reference.

The numbers at the start of the case name are the part number (each box usually contains three folders called part 1, part 2 and part 3 but I decided not to make directories for these). Up to here it has to be done manually as arranging cases into directories involves looking at the documents to see where a new case begins and to check the names. But from here a lot of it can be automated.
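Because the directory names follow a fixed pattern, a script can recover the box number, part number and party names from the path alone. The following is a minimal Python 3 sketch of that idea, not the actual script: the function name `find_cases`, the dictionary keys and the assumption that every case directory is named "<part> <plaintiff> vs <defendant>" are all illustrative.

```python
import re
from pathlib import Path

# Assumed naming pattern for case directories: "<part> <plaintiff> vs <defendant>"
CASE_NAME = re.compile(r"^(?P<part>\d+) (?P<plaintiff>.+?) vs (?P<defendant>.+)$")


def find_cases(root):
    """Return one dict per case directory found under root (e.g. "SP 24")."""
    root = Path(root)
    cases = []
    # First level below the root: one directory per box ("30", "31", ...)
    for box in sorted(p for p in root.iterdir() if p.is_dir()):
        # Second level: one directory per case
        for case_dir in sorted(p for p in box.iterdir() if p.is_dir()):
            match = CASE_NAME.match(case_dir.name)
            if match is None:
                continue  # skip directories that do not follow the pattern
            cases.append({
                # Mirrors the archival reference, e.g. "SP 24/30/2 Alford vs King"
                "reference": f"{root.name}/{box.name}/{case_dir.name}",
                "box": box.name,
                "part": int(match["part"]),
                "plaintiff": match["plaintiff"],
                "defendant": match["defendant"],
                "photos": sorted(f.name for f in case_dir.iterdir() if f.is_file()),
            })
    return cases
```

Calling `find_cases("SP 24")` from the directory that contains "SP 24" would give one entry per case, with the "reference" value doubling as a candidate photoset name.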

Each directory containing one case needs to have its own photoset on Flickr. I used Postr to upload one case at a time and then used Desktop Flickr Organizer to create a set and add photos to it (I got both of these applications from the Ubuntu repository – if you’re on Windows then… stop using Windows!). Then I used the Organizr on the Flickr website to drag each set into the “SP 24 Indemnity Cases” collection.

Once the Flickr photos and sets were in place I went to the web page for each set, manually created a Zotero item for the case, and attached a link to the page. Finally I created a Your Archives page for each case and attached a link to it in Zotero. Each page uses a template that I made for indemnity cases, which gives some basic information in a standardized form and includes a link to the relevant Flickr set.

Doing all this manually for each case is tedious and slow, which is why I want scripts to take over. What I want them to do is:

  1. Upload photos from multiple directories
  2. Create a separate photoset for each directory, with a name based on the directory name and path
  3. Get the ID of each set and write the IDs and names to a CSV file
  4. (At this point I’ll manually edit the CSV file to add data that will be needed for Your Archives and Zotero and which can only be got by looking at the document images, eg full names of plaintiffs and defendants, date of the petition, summary of the case, categories/tags)
  5. Use the data from the CSV file to construct a wiki page with the correct template and upload to Your Archives through the MediaWiki API
  6. Export an XML file which can be imported into Zotero
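Steps 3 and 4 hinge on a CSV file that the script writes and a person then fills in. Here is a rough sketch of how that hand-over might work using only Python's standard `csv` module. The column names in `FIELDS` and the function names are assumptions made for illustration; the real file could be laid out quite differently.

```python
import csv

# Hypothetical column layout: the first two columns are filled in by the
# script, the rest are left blank for manual editing (step 4).
FIELDS = ["set_id", "set_name", "plaintiff", "defendant",
          "petition_date", "summary", "tags"]


def write_sets_csv(path, sets):
    """Write (set_id, set_name) pairs, leaving the other columns empty."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
        writer.writeheader()
        for set_id, set_name in sets:
            writer.writerow({"set_id": set_id, "set_name": set_name})


def read_sets_csv(path):
    """Read the (manually edited) file back as a list of dicts, one per case."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Keeping the set ID in the first column means the later steps can always link a row back to its Flickr set, whatever gets typed into the other columns.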

So far I’ve written a Flickr upload script which does the first three steps and more or less works. Rather than working directly with the Flickr API I’m using the Python Flickr API library, which makes things very easy. It provides a FlickrAPI class with methods to handle API calls and authentication. Before using it you have to go to the App Garden and request an API key, but that doesn’t take long to do. App pages can be kept private, which is what I’m doing in this case as I don’t really have the time or skills to make my scripts fit for public consumption.

The next step is to add error handling, as the script only works as long as nothing goes wrong, and in the real world lots of things could go wrong. The library raises an exception if it gets an error response from the API, so until I add some exception handling the script just stops on an error. The script will also need to keep track of what has and hasn’t been done (photos uploaded, sets created, photos added to sets) so that I can run it again if anything was left undone without it repeating anything that has already been done.

One annoying thing about Flickr’s public API is that it provides no way to create a collection or add sets to a collection. I assumed I’d be able to automate that part of the process but it looks like I’ll still have to do it manually.
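One possible shape for the error handling and progress tracking is a small log file that records each completed step, so that a second run skips whatever is already done. The sketch below is only an illustration of that pattern and deliberately avoids calling the Flickr library: `run_once`, `load_log` and the log format are invented names, and `func` stands in for whatever call does the real work (for example an upload that returns a photo ID). It assumes each step returns a string such as an ID.

```python
import json
from pathlib import Path


def load_log(log_path):
    """Return {(action, key): result} for every step already recorded."""
    path = Path(log_path)
    done = {}
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    action, key, result = json.loads(line)
                    done[(action, key)] = result
    return done


def run_once(log_path, action, key, func):
    """Run func() unless (action, key) is already logged.

    Returns the result (fresh or stored), or None if the step failed,
    so the caller can skip any steps that depend on it.
    """
    done = load_log(log_path)
    if (action, key) in done:
        return done[(action, key)]
    try:
        result = func()
    except Exception as error:  # a real script would catch the library's own error class
        print(f"FAILED {action} {key}: {error}")
        return None
    # Only record the step once it has actually succeeded
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps([action, key, result]) + "\n")
    return result
```

A run might wrap every upload in something like `run_once("progress.log", "upload", photo_path, do_upload)`, where `do_upload` is a hypothetical function that performs the upload. Because failures are reported rather than fatal, the script can carry on with the next photo and be run again later to pick up what was missed.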

For step 5 I’ll be using the Pywikipediabot library. I’ve already done some simple tests on a local MediaWiki installation and it seems quite easy to create a page. Once I’ve finished the script and thoroughly tested it I can ask for a bot account on Your Archives. Step 6 will involve learning a bit more about Zotero RDF. The easiest way to find out how to generate the right code is to export some similar existing items and look at the results.
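Before anything is sent to the wiki, each CSV row has to be turned into page text that calls the indemnity case template. A minimal sketch of that step follows. The template name "Indemnity case", its parameter names and the semicolon-separated "tags" column are all assumptions, since the real template is not described here; only the general `{{Template|name=value}}` and `[[Category:...]]` wikitext syntax is standard MediaWiki.

```python
def build_case_page(row, template="Indemnity case"):
    """Return wikitext that calls a (hypothetical) template for one case.

    row is a dict such as one produced by csv.DictReader.
    """
    fields = ["plaintiff", "defendant", "petition_date", "summary", "set_id"]
    lines = ["{{" + template]
    for name in fields:
        value = (row.get(name) or "").strip()  # tolerate missing or empty cells
        lines.append(f"|{name}={value}")
    lines.append("}}")
    # One category link per semicolon-separated tag
    for tag in (row.get("tags") or "").split(";"):
        if tag.strip():
            lines.append(f"[[Category:{tag.strip()}]]")
    return "\n".join(lines)
```

The resulting string is what would be handed to the bot library as the page text. Note that this sketch does not escape pipe characters, so a summary containing "|" would need extra handling before it could be used as a template parameter.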

So just because I’m writing a monograph it doesn’t mean I’ve abandoned digital history. I’ll still be using lots of digital tricks in the background, but they won’t necessarily be obvious in the text of the book. New technology is certainly making my research quicker and cheaper than it used to be. The stuff that I’ve written about above isn’t exactly revolutionary: it saves labour but it doesn’t offer new insights that couldn’t have been found before. But later in the project I’m planning to do some text mining which I hope will show me things that I couldn’t otherwise have found. I’ll also be revisiting phonetic algorithms for place name identification. And if I can’t think of anything else to blog about, there are likely to be some interesting stories in the indemnity cases.
