Digital Things

Another not quite proper post - just a round-up of some things I’ve been doing.

The most important thing is that I’ve more or less finished the switch to Zotero. I managed to fix the bug in the MODS translator, which allowed me to import the 1,000+ records (along with associated notes) from my old database without any trouble. That success encouraged me to have a go at writing an Adlib XML translator so I could scrape records from the RHS Bibliography. It actually wasn’t as difficult as I’d expected. I managed to get a working demo but then I gave up because the XML that the RHS site outputs isn’t very good. First, Adlib XML isn’t as detailed as MODS XML, and second, the RHS people don’t seem to have applied the tags very consistently. That means that any records scraped into Zotero would still need quite a bit of manual adjustment. Today I tried getting some new records from the RHS without a scraper, using the links to COPAC and getCopy which appear on most records. Although this was slower than scraping the records directly off the page, it worked reasonably well. Books are no problem because they can nearly always be found on COPAC with one click. Journal articles are more hit and miss. Sometimes getCopy leads to a page that Zotero can scrape, sometimes it doesn’t. Essays in collections are the worst as they have to be entered manually. Today’s test was just a simple keyword search for “animals” which only returned 250 results. Over the rest of the week I need to find everything I can about the causes and outbreak of the English Civil War. I already have several hundred ECW-related works on file from my PhD, but there will still be a lot of stuff which wasn’t relevant to that which I need to track down now.

Meanwhile over at Early Modern Notes, Sharon noted the death of bookmarks. I still use bookmarks a lot more than some people, but it is true that I’m using them less than I used to. RSS has played a big part in this decline. I use WizzRSS to subscribe to the blogs that I read regularly. Zotero is also taking over from bookmarks as it’s a much more powerful way of keeping track of webpages - you can keep a snapshot of the page (or several snapshots taken at different times), tag it, add it to collections, attach notes, relate it to other items.

I also got excited about the release of CommentPress, a Wordpress theme which allows paragraph level comments. One thing I’d like to use this for is putting my PhD thesis online. I could just let people download a PDF, but apart from giving readers the chance to comment on it, I’d like to comment on it myself first! It might also be useful as a feedback mechanism for the digital edition of Sandall’s history of 5th Lincs that I’m working on. I really want some way for readers to be able to suggest corrections, and something like CommentPress would be easier than programming something myself. So I downloaded it to try it out on my local server setup, but I couldn’t get it to work! It might be something to do with Windows, so tomorrow I’ll try to run it on my web host, which is on Linux.

I’ve had more success with Python. Getting to grips with it has been on my to-do list for a long time, but I finally got round to downloading and installing the Python interpreter. I haven’t done much with it yet, but it looks like a good language. I used to be prejudiced against it because it doesn’t have curly braces (which are the mark of a “proper” programming language!) but its syntax is actually more concise without them, and nothing like the horrors of Visual Basic. I should be having lots of Python-based fun over the coming weeks.

Digital History — posted by Gavin Robinson, 8:55 pm, 31 July 2007

No Comments

Google Trench Maps

I’ve just been playing around with the new “My Maps” feature on Google Maps. There are lots of other things I should be doing, but when I saw this post at Mercurius Politicus I just had to try it for myself. So I got out a trench map and came up with this map showing where my great-grandfather was captured by the Germans in December 1916 (I wrote about that in more detail here and here). We’re lucky that the incident was recorded in enough detail to reconstruct it reasonably well. It’s impossible to say exactly where the fight took place, but from the battalion war diary we can narrow it down to a relatively small area (the stretch of road highlighted in green on the map).

My Maps is obviously a very exciting development. It means that anyone can create custom maps with a few clicks rather than having to learn the Google Maps API. It took me less than an hour to make the map. The interface is so intuitive that I didn’t need any instructions; I just got on with it. Most of the time was spent trying to trace the trench lines more or less correctly. The German trenches were easy because their front line is still visible on the satellite photo, and the Z redoubt is a nice distinctive feature. The British trenches were more difficult because they don’t seem to coincide with any visible features. The lines I’ve drawn are only approximate and don’t capture all the twists and turns of the trenches but they give a reasonably good impression of the position.

One improvement that I’d like to see is the ability to place a grid over the map, move it, and change the size of the squares. That would help with tracing lines which don’t follow present day features visible on the map. It’s possible to do this with the line drawing tool but it’s a bit tricky. An automatic grid would make life much easier. Also a tool for measuring distances would be very useful - I found myself holding a ruler up to the screen! - and more fine control over scaling so that it’s easier to get the scale to coincide with a paper map. What would be really good is if someone made a map which overlaid the entire trench map grid onto France and Flanders…

Digital History, History, Military, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 6:37 pm, 27 July 2007

5 Comments

Back to the World Wars

I’m trying to get some “proper” English Civil War related work done this week, but at the weekend I did some more First World War stuff. In April I posted about World War I on Flickr, when I uploaded my great-grandfather’s photos from Cottbus PoW camp. Now I’ve added his letters, and another photo which I got from ebay. Although he isn’t on it, it was taken in the theatre at Cottbus and one of the men has the same “Bing Bong Boys” navy outfit:

[Photo: April2007-001]

I’ve now put each letter/postcard in its own set to make the link between the front and back of the same document more explicit. The sets are then arranged into collections. Some people on the Great War Forum were able to help me locate Cottbus Camp No. I, so now most of the photos have been placed on the map.

I also discovered that another Wenham brother might have died in the Great War. I don’t know why I hadn’t ever looked for Wenhams on CWGC before, but I found a Charles Wenham who could well be one of William’s brothers. Some of the evidence is circumstantial and I need to do more digging to be sure, but the epistemic probabilities are quite high. So far it looks like he joined 10th Lincolnshire Regt (Grimsby Chums), served overseas, was wounded and sent back to England but died of his wounds. Unlike the soldiers who died overseas, his body was brought home and buried in Cleethorpes cemetery. Again the Great War Forum has been a great help, and you can see more details on this thread.

And with regard to the other World War, I played some more of Brothers In Arms: Earned In Blood. I was still a bit curious about the post-Hill 30 storyline, but so far it’s been quite boring, and I gave up when I got into a silly tank level that’s suspiciously similar to the silly tank level in Road To Hill 30 that I complained about before. But there are more trees this time…

Digital History, History, Military, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 3:53 pm, 23 July 2007

No Comments

We Are The MODS

I haven’t mentioned Zotero for a long time. I was really excited when I first heard about it, and tentatively started using it last year, but then I accidentally wiped my Firefox profile and lost all the stuff I’d put in. It wasn’t much - mainly books from EEBO and notes for my posts on cavalry charges - but after that I got out of the habit. Now I need to manage the bibliographies for some articles I’m working on, so I’ve decided to start using Zotero properly. That involves importing over 1,000 records from my old database (which I wrote about here). I decided to use MODS XML as an intermediate format as Zotero can import and export MODS, and it’s also used as an intermediate format by bibutils. So far I’ve written a PHP script to pull records out of the MySQL database and display them as MODS XML. This bit went smoothly, but while I was testing it I found what I think is a bug in the Zotero MODS translator (read all about it on the Zotero forum). Until that’s sorted out I can’t do the import unless I want to spend a lot of extra time manually changing creator types from author to editor.
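A rough sketch of the database-to-MODS step, in Python rather than PHP, for anyone who wants to see the shape of it. This is not the script described above: the record dictionary, its keys and the function name are all hypothetical, and only a handful of MODS elements are shown. The part that matters for the author/editor problem is the roleTerm element, which is where the creator type is recorded.

```python
import xml.etree.ElementTree as ET

# The MODS version 3 namespace; registering it with an empty prefix makes
# ElementTree write it as the default namespace when serialising.
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("", MODS_NS)


def q(tag):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{MODS_NS}}}{tag}"


def record_to_mods(record):
    """Build a <mods> element from one record.

    `record` is a hypothetical dict with 'title', 'year' and 'creators'
    keys; a real database row would need mapping to this shape first.
    """
    mods = ET.Element(q("mods"))

    title_info = ET.SubElement(mods, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = record["title"]

    for creator in record["creators"]:
        name = ET.SubElement(mods, q("name"), type="personal")
        ET.SubElement(name, q("namePart")).text = creator["name"]
        role = ET.SubElement(name, q("role"))
        role_term = ET.SubElement(role, q("roleTerm"), type="text")
        role_term.text = creator["role"]  # e.g. "author" or "editor"

    origin = ET.SubElement(mods, q("originInfo"))
    ET.SubElement(origin, q("dateIssued")).text = record["year"]
    return mods


if __name__ == "__main__":
    # Made-up example record, purely for illustration.
    example = {
        "title": "An Example Collection of Essays",
        "year": "1999",
        "creators": [{"name": "Example, Editor", "role": "editor"}],
    }
    print(ET.tostring(record_to_mods(example), encoding="unicode"))
```

Several records would normally be wrapped in a modsCollection element before import; that is left out here to keep the sketch short.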

I also need to think about adding new records. I’m trying to get on top of the debates about the causes and outbreak of the English Civil War, something which I’ve previously tried to avoid. Some of the literature is already in my old database from my PhD research but I need to find more. The most obvious place to look is the RHS Bibliography of British and Irish History as this is a more or less complete database of academic works with good search facilities (including subject headings and dates covered). Another potential advantage is the option to select records from the search results and display them as XML. The problem here is that they’ve chosen Adlib XML, which doesn’t seem to be very well supported outside the proprietary Adlib software. There isn’t a Zotero translator for it yet and I’m not really capable of writing one myself - if I couldn’t fix the bug in the MODS translator then it’s unlikely that I’d be able to adapt it to handle Adlib instead. What I might be able to do is write some XSLT to transform Adlib XML into MODS XML, which I can then import into Zotero. I’m not sure if it’s worth doing this. In practice most records in the RHS database are only a couple of clicks away from a record which Zotero can scrape. All records have a link to COPAC, which is fine for scraping books. Journal articles have a link to GetCopy, which usually leads to a record that can be scraped. Essay collections are a potential problem because the RHS has a separate record for each essay but there are no links to any other pages with these details, as COPAC and other sites only list the volumes as a whole. So it’s a choice between entering these manually or getting to grips with XSLT (without the benefit of oXygen).

However I do it, I should have time to write about something interesting once I get it out of the way…

Digital History — posted by Gavin Robinson, 1:59 pm, 20 July 2007

2 Comments

New Blogs

Dan Todman linked to Plugstreet, a new blog about a First World War battlefield archaeology project.

Not to be outdone, I’ve just discovered a new early-modern history blog. Mercurius Politicus, named after a 17th century newsbook, focuses on England in the Commonwealth period. The anonymous (as far as I can tell) blogger is starting a Masters degree on the period in October, so the blog should be worth following, especially if you have an interest in the civil wars and interregnum.

Blogging, English Civil War, History, Military, World War 1 — posted by Gavin Robinson, 8:59 am, 18 July 2007

No Comments

Unexpected Progress

It’s been a long time since I wrote anything about my First World War digitization projects, but I now have some progress to report: today I published an interim version of Sandall’s History of 5th Lincolnshire Regiment. It’s still a work in progress, and there’s a lot more to be done, but you can see it here. It’s just a plain HTML version (and not strictly valid HTML), and the whole text is on one page (at least it makes it easy to search the whole text with your browser’s Find feature!), there’s no name linkage yet, no page images online, and no mechanism for submitting corrections. However, even in this form it should be useful to people who are researching the battalion and can’t get hold of the original book. More details on what I’ve done and how I’ve done it below.


Digital History, History, Military, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 5:35 pm, 11 July 2007

10 Comments

Fourth Military History Carnival posted

The fourth edition of the Military History Carnival is now up at Battlefield Biker, remembering lots of July anniversaries. Thanks to TJ for doing a great job of hosting this edition.

The next edition is at American Presidents Blog on 16th August. E-mail submissions to £coppertop67£@£hotmail.com£ (without the GBP£ signs) or use the submission form.

And I really need a host for September, so if anyone is interested please get in touch. You don’t have to be a military history blogger (however you define that) and you don’t need previous experience of hosting a carnival or any special technical skills. All you need is enthusiasm and some spare time.

Blogging, History, Military — posted by Gavin Robinson, 12:31 pm, 8 July 2007

1 Comment

Information vs Meaning: A False Dichotomy?

In a few previous posts I’ve stressed the difference between information and meaning (which I picked up from Claude Shannon, the father of information theory) and some of its implications. For example, in this post I pointed out that Shannon’s separation of meaning and information is compatible with structuralist and post-structuralist theories which maintain that there is no inherent meaning in the text. (I’ve also had to deal with it in the course of digitizing a book - see here). Work on Artificial Intelligence has tended to reinforce this distinction: computers are very good at processing information but not very good at understanding meaning.

But last week Bill Turkel wrote a post which turned my understanding of the meaning/information dichotomy on its head. This isn’t such a new development as it’s following on from a post he wrote in March 2006, and that was inspired by an article by Rudi Cilibrasi and Paul Vitányi published in 2005. There’s a lot of mathematical stuff about compression algorithms which I can’t claim to understand, but the schwerpunkt is that without understanding anything about meaning, computers can compare similarities in the information content of texts and cluster them accordingly. The result is patterns that make sense to humans who can understand the meaning of the text. Bill’s example used entries from the Dictionary of Canadian Biography, finding geographical and chronological clusters of entries.
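The measure at the heart of the Cilibrasi and Vitányi article is the normalized compression distance (NCD): compress two texts separately and together, and see how much the combined version saves. If C(x) is the compressed size of x, then NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)). Here is a minimal sketch in Python. It is not Bill’s code, and zlib is only a convenient stand-in for whichever compressor a real experiment would use; the function names are made up for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    Values near 0 suggest the texts share a lot of information; values
    near 1 suggest they share very little. Nothing here looks at meaning.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # the two texts compressed together
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(texts):
    """Pairwise NCD for a list of texts, ready to hand to a clustering tool."""
    return [[ncd(a, b) for b in texts] for a in texts]
```

Feeding a matrix like this to any standard clustering routine is what produces the groupings. Very short texts give noisy results because the compressor’s overhead swamps the signal, so it works better on whole entries than on single sentences.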

Despite the attention grabbing title of my post, the distinction between information and meaning isn’t a false one. However, these experiments show that in practice the relationship between information and meaning within the context of a particular linguistic/cultural system is not as arbitrary and unpredictable as theorizing might suggest. Does this mean that structuralism could make a comeback against post-structuralism? Or do we need to move beyond both of those things and find a new way to think about text? Whatever the implications for theory, this is an exciting development which promises to be very useful in practice.

Digital History, History — posted by Gavin Robinson, 2:33 pm, 2 July 2007

No Comments

Call For Submissions: Military History Carnival

This month’s Military History Carnival is early, taking place on Sunday 8th July at Battlefield Biker. Send submissions to $tj$@$battlefieldbiker$.$com$ (but remove the dollar signs!) or use the submission form.

Also I need a host for September. Leave a comment or e-mail me if you’re interested. You don’t have to be a military history blogger. Anyone who’s interested can have a go.

Blogging, History, Military — posted by Gavin Robinson, 10:58 am, 2 July 2007

No Comments