14th Military History Carnival

This is the 14th Military History Carnival, with a special theme of Contested Boundaries. Today is also the day that Bloggers Unite encourages bloggers to write about human rights (hat tip: Mark Stoneman). I might post something on that theme later today if I have time (and I probably won’t have time), but this carnival edition gives plenty of attention to human rights issues.

(more…)

Blogging, Early Modern, English Civil War, History, Military, World War I On Web 2.0 — posted by Gavin Robinson, 1:02 pm, 15 May 2008

12 Comments

Great War Archive update

Yesterday I tried uploading some material to the Great War Archive (which I previously posted about here). I’m pleased to say that it was very easy to do and that the site works very well. It took me less than one hour to upload about 27 items, so about 2 minutes per item, but that would vary depending on how many pages each item has. These were all letters and postcards with only two images per item. Most of the time was spent waiting for the files to upload, which depends on the speed of your connection (my ADSL is 8Mb downstream but only 500Kb upstream). Although there are several pages to click through during the submission process they all load very quickly, and there is an option to remember your personal details so you only have to enter them once.

There’s surprisingly little opportunity to enter structured metadata, but I think the idea is to make the submission process as easy as possible for people with no technical skills. This is likely to be a big advantage - I’ve previously mentioned that the UK National Archives wiki Your Archives requires an unusual combination of skills and experience which probably limits the number of people who can contribute. The important thing with the Great War Archive is to get hold of previously unseen material and make it accessible to the public (access to the archive will definitely be free for everyone). This means not making too many demands on the people who hold this material. It’s important to recognise that even uploading photos can be difficult for some people - many new users on the Great War Forum have problems with this, although that’s partly down to the 100K file size limit. The GWA allows each file to be up to 25MB, which should mean that contributors don’t have to worry about resizing or compressing images.

The submission form asks for as much information as possible in a human readable form. It will then be down to the project staff to convert this into structured metadata. It looks like they have the time, budget and expertise to do this - project director Stuart Lee said in a comment on my previous post that 60% of the timetable is devoted to cataloguing, and that the Centre for First World War Studies is involved in the project. The result should be very different from Ancestry’s sloppy indexing of service records. Now we’ll just have to wait until November to see how it turns out.

Digital History, World War I On Web 2.0 — posted by Gavin Robinson, 11:23 am, 9 March 2008

2 Comments

Great War Digital Archive

Today the Great War Archive opened for submissions. This is a very big and very innovative project started by Oxford University to collect digital facsimiles of documents, photographs, recordings, and artefacts relating to the First World War (it seems to be primarily about the UK but they haven’t explicitly mentioned any geographical limits) from private individuals. This means that lots of family collections which were previously unknown and inaccessible will be made available to the public (and since the terms for contributors state that material will only be used for educational non-commercial purposes I’d hope that access is going to be free). Anyone can contribute material by uploading it through the project’s website, and there will also be special events where people who don’t have the IT skills or equipment can bring items along to have them digitized.

This is a really exciting project, and I hope it all goes well. We’ll be contributing the Wenham letters that I’ve been working on (although I’m still planning to put TEI transcripts on my own site eventually, along with all the same kind of record linkage that I’ve done with Sandall’s history), so I’ll soon be able to report on how easy it is to upload stuff and what kind of metadata they collect.

If everything goes to plan the archive will be open to viewers from 11th November 2008.

Digital History, World War I On Web 2.0 — posted by Gavin Robinson, 5:39 pm, 3 March 2008

11 Comments

Digital Express

Having decided to leave my 5th Lincolnshire First World War project for a while, I got an offer I couldn’t refuse: someone from the Great War Forum sent me a transcript of the battalion’s medal citations from the regimental archive so that I could publish them on my site and link them in to the index of people that I’d created for the book. The document contains information that can’t be found elsewhere, as although awards of the Military Medal were listed in the London Gazette, full citations were not normally published. There are also three awards not mentioned in Sandall’s list, and citations for 10 people who were recommended for awards but turned down.

I received the list as a Word file with no semantic markup on Wednesday morning, started working on it on Thursday morning, and published it on the web this afternoon. It looks very basic but it’s not bad for two days, and it’s all linked in to the index of people for Sandall’s book. First of all I copied the text into jEdit and used Find and Replace to insert some basic TEI XML markup. Then I pasted it into a new TEI document in oXygen. With the automatic validation it was easy to track down and correct errors in the markup, so by lunchtime I had a completely valid TEI file. In the afternoon I spent about 3 or 4 hours on linking records by inserting key attributes into <persName> tags. In most cases I already had the keys that I used for linking names in Sandall, but sometimes I had to change them in the light of new evidence from the citations, such as full names of people who I previously only knew by their initials. This also allowed me to clear up some ambiguities. This morning I finished the linkage by creating new keys for the 13 people not mentioned by Sandall, then got started on writing some XSLT. That was easy as I could copy or adapt a lot of the code from the style sheet for Sandall. As well as generating the HTML version of the citations, this XSLT generates an extra JSON file which is imported into the Sandall index of people to allow linking the citations. Again this only required some minor adjustments to the Exhibit page. After some testing and corrections I had a live site up this afternoon.
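The linkage step described above (reusing existing keys and spotting people who need new ones) can be sketched in Python. This is only an illustration under stated assumptions, not the workflow actually used for the site: the sample names, the keys, and the function name check_linkage are all invented, and the real work was done by hand in oXygen.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample data: the names and keys are placeholders, not real citations.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p><persName key="Doe, John">Pte. J. Doe</persName> carried messages under fire.</p>
<p><persName key="Roe, Richard Henry">Sgt. R. H. Roe</persName> led a patrol.</p>
<p><persName>Cpl. Nemo</persName> was recommended but not awarded.</p>
</body></text></TEI>"""


def check_linkage(root, known_keys):
    """Return (keys missing from the existing index, names with no key yet)."""
    new_keys, untagged = set(), []
    for el in root.iter(f"{{{TEI_NS}}}persName"):
        key = el.get("key")
        if not key:
            # No key attribute at all: this name still needs linking.
            untagged.append("".join(el.itertext()).strip())
        elif key not in known_keys:
            # Key is present but the existing index has no such person.
            new_keys.add(key)
    return sorted(new_keys), untagged


root = ET.fromstring(SAMPLE)
# Pretend the existing index of people only knows about one person.
new_keys, untagged = check_linkage(root, {"Doe, John"})
print("Keys to add to the index:", new_keys)
print("Names still without a key:", untagged)
```

A check like this would only flag candidates. Deciding whether a new key really is a new person, or an existing person under fuller initials, is still a judgment made against the sources.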

This demonstrates the potential value of the techniques I’ve been using for marking up texts, but it also raises some problems for digital history. I decided to trust a transcript from a random person off the internet. I have no way of knowing how accurate the transcript is, or even if the source document really exists! It could be Hugh Trevor-Roper and the “Hitler Diaries” all over again. Therefore I’m going to think more carefully before putting myself in this situation again. There’s also a possibility that I’ve miscalculated the copyright situation. Based on internal evidence and comparison with other documents my best guess is that the list was created by the army and is therefore under Crown Copyright (and being unpublished and available for inspection in a public record repository should come under waiver of Crown Copyright), but without seeing the original it’s hard to be sure. I might be wrong, and even if I’m right the holders of the manuscript might not agree. So technology makes some things easier, but there are other problems that it can’t solve.

Digital History, History, Military, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 8:00 pm, 8 February 2008

No Comments

Sandall: The End of the Beginning

Having made good progress with my project to digitize Sandall’s History of 5th Lincolnshire Regiment in the last month I’m going to leave it for a while. This month I haven’t read any books or articles, haven’t written anything other than blog posts and computer code, and have only occasionally thought about historiography and theory. I kind of like it like that but I have other things to get on with now.

I’ve made some small changes since the last post. Dates now have tool tips, so if you hover over them you can see the full date. The place name index is a bit more user-friendly. I’ve replaced the hash values with query strings in the incoming links so that the Exhibit page filters the list down to the place passed in the query instead of displaying a box with the details. This means that you just have to click on “Map” to go straight to map view with only that place displayed. Once you’re there you can easily take the filter off again to see all the other places. The map view is also zoomed out further by default so that you can see Britain and Egypt. That means that you have to zoom in a long way to get to France and Flanders but I think it’s less confusing than not being able to see Grimsby or Alexandria unless you zoom out.
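As a rough illustration of the switch from hash values to query strings, here is how such incoming links could be built and read back in Python. The page name places.html and the parameter name place are assumptions made for the example, not the names used on the site, and the real filtering happens in the Exhibit page's JavaScript rather than in Python.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def place_link(index_page, place):
    """Build a link that passes the place name in the query string."""
    return f"{index_page}?{urlencode({'place': place})}"


def place_from_link(url):
    """Recover the place name from a link built by place_link (None if absent)."""
    return parse_qs(urlsplit(url).query).get("place", [None])[0]


# Hypothetical page and parameter names, for illustration only.
link = place_link("places.html", "Saint-Quentin, Aisne")
print(link)
print(place_from_link(link))
```

Because urlencode escapes spaces and commas, the regularized name can be passed through the URL unchanged and recovered exactly on the other side.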

So the site is now in a satisfactory condition with lots of cool features, and now that I’ve worked out how to do everything I could probably get another book to the same stage within a few weeks. But there are still lots of features that could, and probably should, be added. See below for more details. (more…)

Digital History, Sandall 5th Lincs, World War I On Web 2.0 — posted by Gavin Robinson, 4:14 pm, 1 February 2008

No Comments

Places

Following on from adding an interactive index of people to my digital edition of Sandall’s history of 5th Lincs, I’ve now added a similar feature for place names. It works in exactly the same way as the person index, but it also has a map view. Again this uses the Exhibit API, which makes it very easy to mash up data with Google Maps without even having to know anything about the Google Maps API. The map view is a bit slower than the normal view, especially if the list isn’t filtered, but that’s an inherent limitation of using maps.

One of the many cool things about the map is that it strikingly illustrates the Allied advances in the last months of the First World War. If you go into the map view and click “The Beginning of the Great Advance” on the list of chapters, you’ll see the battalion holding the line in Flanders, then moving behind the lines for rest near Amiens, then moving up to the front line at Saint-Quentin. Then click on each of the following chapters in turn and watch the markers surge forward as 46th Division breaks through the Hindenburg Line and pushes towards Belgium.

Adding the place index was mostly similar to adding the person index: I added a unique id to each <placeName> tag using a Python script, pulled out the place names into an SQLite database, identified/disambiguated them and added a regularized name, then used another Python script to pull the regularized names out of the database and put them into the key attributes in the XML file. Identifying the places was easier than identifying people, and took a couple of days, although there are a few that I couldn’t find. As with people I added some code to the XSLT to generate a JSON file of all the places. Then following the map view tutorial I used the Exhibit API to pull latitude and longitude co-ordinates from Google Maps and put them into another JSON file. This turned out to be a bit unreliable as about 10 per cent of the places had their co-ordinates missing. It seems to be random, as running the script again with the same set of data produced a similar error rate but with different places. I had to take the missing places from the output file, put them into another input file and run the script over them again, which produced a similar 10 per cent error rate, but the remaining few co-ordinates could be put in manually. Once I had a JSON file with all the correct geocodes it was easy to copy code from the tutorial to add a map view to the Exhibit page. In a few cases it turned out that Google had given me the wrong co-ordinates. Mostly this was because there are two or more places with the same name and it had picked the wrong one. I thought I’d put in enough information from my manual searches to disambiguate them but it seems that the results of a Google Map search can be a bit unpredictable, and don’t necessarily give you the full address of a place.
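The round trip described above (ids onto the tags, names into SQLite, regularized names back into key attributes) might look something like the following. It is a minimal sketch rather than the original scripts, which are not shown in the post: the id scheme (pl1, pl2, ...), the table layout, the function names, and the sample text are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample text; the real edition is a full TEI document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p>The battalion moved from <placeName>Grimsby</placeName> to
<placeName>Alexandria</placeName>, then back to <placeName>Grimsby</placeName>.</p>
</body></text></TEI>"""


def add_place_ids(root):
    """Give every <placeName> a sequential xml:id and return (id, text) pairs."""
    rows = []
    for n, el in enumerate(root.iter(f"{{{TEI_NS}}}placeName"), start=1):
        place_id = f"pl{n}"  # hypothetical id scheme
        el.set(XML_ID, place_id)
        rows.append((place_id, "".join(el.itertext()).strip()))
    return rows


def store_places(conn, rows):
    """Load the raw names into a table with an empty column for regularized names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS places "
        "(id TEXT PRIMARY KEY, raw TEXT, regular TEXT)"
    )
    conn.executemany("INSERT INTO places (id, raw) VALUES (?, ?)", rows)
    conn.commit()


def apply_keys(root, conn):
    """Copy regularized names from the database into key attributes."""
    query = "SELECT id, regular FROM places WHERE regular IS NOT NULL"
    keys = dict(conn.execute(query))
    for el in root.iter(f"{{{TEI_NS}}}placeName"):
        key = keys.get(el.get(XML_ID))
        if key:
            el.set("key", key)


root = ET.fromstring(SAMPLE)
conn = sqlite3.connect(":memory:")
rows = add_place_ids(root)
store_places(conn, rows)
# Stand-in for the manual identification step, done by hand in the database.
conn.execute("UPDATE places SET regular = 'Grimsby, Lincolnshire' WHERE raw = 'Grimsby'")
apply_keys(root, conn)
```

Keeping the manual identification in the database rather than in the XML means one decision about a place name can be applied to every occurrence in a single pass.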

I’ve now done most of what I planned to do in this phase. There are still some features that could be added, especially a feedback mechanism, but I’ll be giving this project a rest soon so I can do some English Civil War work.

Digital History, History, Military, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 12:01 pm, 28 January 2008

No Comments

Marking Up Names: Part 3

My digital edition of Sandall’s History of 5th Lincolnshire Regiment now has a new improved index of people. This uses the Exhibit API to make an interactive list which can be filtered, sorted, and searched. Exhibit provides features that would normally need a database driven back-end but it’s all done on the client side using JavaScript. The two disadvantages of this are that it doesn’t scale up very far, and that it isn’t very Google friendly. In this case there’s no problem because there are only ever going to be 350 records in the list, and there is no unique content on this page - it’s just an index to point users to other pages, which are Google friendly.

I’ve also made every occurrence of a name in the text into a link which points to the index. My worries about illegal characters in id attributes turned out to be unfounded. With Exhibit I can use the standardized names from the TEI @key attribute as hashes to make permalinks to individual records. Clicking on the link takes you to the index and displays a dialog box with all of that person’s details, including links back to every mention in the text. The dialog box is also displayed by clicking on a person’s name on the index page. I just need to work out a way to display it without having to reload the page.

Exhibit is really easy to use and makes it possible to add some fairly advanced features with surprisingly little effort. It took some searching, copying examples, trial and error, and asking on the mailing list before I worked out how to do everything, but as the project is documented by a wiki I’ve been able to update it whenever I find out how to do something that isn’t already explained there. The JSON data file for my index page is generated automatically by XSLT which loops through every <persName> and <rs> tag in the TEI document, and pulls out extra details (date of death, links to medal cards and CWGC) from another XML file.
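The site generates its data file with XSLT, but the same loop is easy to sketch in Python for anyone who wants to see the shape of the output. This is an illustration only: the sample names and keys are invented, the mentions property and the function name build_items are assumptions, and the extra details from the second XML file are left out. The one thing taken from Exhibit itself is the top-level "items" list whose entries carry a "label" and a "type".

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"
NAME_TAGS = (f"{{{TEI_NS}}}persName", f"{{{TEI_NS}}}rs")

# Invented sample: two chapters (divs) mentioning two made-up people.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div n="mobilization"><p><persName key="Doe, John">Pte. Doe</persName> enlisted.</p></div>
<div n="egypt"><p><rs key="Doe, John">the runner</rs> and
<persName key="Roe, Richard">Sgt. Roe</persName> landed.</p></div>
</body></text></TEI>"""


def build_items(root):
    """Collect one record per key, listing the chapters that mention it.

    Assumes chapter divs are not nested inside one another.
    """
    mentions = defaultdict(list)
    for div in root.iter(f"{{{TEI_NS}}}div"):
        for el in div.iter():
            if el.tag in NAME_TAGS and el.get("key"):
                mentions[el.get("key")].append(div.get("n"))
    return [
        {"label": key, "type": "Person", "mentions": sorted(set(chapters))}
        for key, chapters in sorted(mentions.items())
    ]


root = ET.fromstring(SAMPLE)
items = build_items(root)
print(json.dumps({"items": items}, indent=2))
```

Using the key as the label means both a <persName> and an <rs> that refer to the same person collapse into a single record, which is what lets the index link back to every mention.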

Now that person names are more or less fully implemented, it’s time to move on to place names. These should be easier to disambiguate, and with Exhibit I can do some even cooler things with them, such as generating a Google map.

Digital History, Sandall 5th Lincs, World War I On Web 2.0 — posted by Gavin Robinson, 12:52 pm, 23 January 2008

3 Comments

Marking Up Names: Part 2

My digital edition of Sandall’s History of 1/5th Lincolnshire Regiment now has a new index of people. In my last post I described how names were marked up in the text. This post is about how I linked them together.

(more…)

Digital History, History, Military, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 3:01 pm, 19 January 2008

No Comments

Marking Up Names: Part 1

On to the next stage of digitizing Sandall’s History of 5th Lincolnshire Regiment. Having marked up the structure of the text and written XSLT to split the book into several HTML pages with working internal links, I could move on to Phase 2: marking up names, dates, and abbreviations.

(more…)

Digital History, Sandall 5th Lincs, World War I On Web 2.0 — posted by Gavin Robinson, 3:55 pm, 15 January 2008

No Comments

Sandall Update

I’ve now uploaded a new version of Sandall’s history of 5th Lincs with each chapter on a separate page. I thought splitting the pages and getting the internal links to work would be difficult but it turned out to be easier than I expected, although it involves some quite complicated XPath expressions. I’ve also uploaded the new XSLT to show how I did it. This could probably be better in some ways but for now I’m just pleased that it works. While doing this I decided to change the n attribute of the chapter divs from a number to a slug that could be used to make a Google friendly permalink.
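For readers unfamiliar with slugs, here is one common way to derive them from chapter titles, sketched in Python. The post does not say how the slugs on the site were chosen (they may well have been typed by hand), so the function slugify and its rules are assumptions.

```python
import re
import unicodedata


def slugify(title):
    """Turn a chapter title into a lowercase, hyphen-separated slug."""
    # Strip accents by decomposing characters and dropping non-ASCII marks.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


print(slugify("The Beginning of the Great Advance"))
```

A slug like this can go straight into the n attribute and into the output file name, so the permalink stays readable and stable even if chapters are later reordered.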

Now I’m waiting for Google to re-index the site so that the custom search actually works. Meanwhile I’ve started tagging people, places, dates and abbreviations. More on that when I’ve finished. I’m also increasingly confident that the photos are in the public domain (see these guidelines, which make things a bit clearer, if they’re right), so they’ll probably be added soon.

Digital History, History, Military, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 7:32 pm, 9 January 2008

2 Comments
