You’re del.icio.us

[posted by Gavin Robinson, 5:00 pm, 29 January 2008]

I’ve just finished clearing out my bookmarks folder and putting my links onto del.icio.us. I don’t know why I haven’t done this before as it’s much better than keeping over 300 links in my Firefox bookmarks. From now on bookmarks are only for the few sites that I access most regularly. Zotero is for pages that need to be dealt with in more detail, with snapshots, notes, and annotations, or which need to be kept together with bibliographies for projects that I’m working on. And everything else goes on del.icio.us. While I was rearranging everything I took the opportunity to add some of the best sites to History Nexus.

Some other cool things:

Operator is a Firefox plugin which detects and displays Microformats. Microformats are a simple way of embedding metadata in web pages using only HTML.

Firebug is another Firefox plugin, a bit like the Web Developer toolbar but much more powerful. It lets you inspect the code of a webpage with expanding and collapsing tags, highlights the current element on the page, displays all CSS styles which apply to an element, debugs JavaScript, and even lets you rewrite the code on the fly! I’ve found it very useful for developing Exhibit pages, and it would also make it a lot easier to design or modify WordPress themes.

Yahoo Pipes is a set of tools for data mining and mashups. It’s kind of what I was wishing Google would do in a previous post. It looks complicated, but still easier than programming from scratch, and very powerful. I’ll be trying it out whenever I get time.

Places

[posted by Gavin Robinson, 12:01 pm, 28 January 2008]

Following on from adding an interactive index of people to my digital edition of Sandall’s history of 5th Lincs, I’ve now added a similar feature for place names. It works in exactly the same way as the person index, but it also has a map view. Again this uses the Exhibit API, which makes it very easy to mash up data with Google Maps without even having to know anything about the Google Maps API. The map view is a bit slower than the normal view, especially if the list isn’t filtered, but that’s an inherent limitation of using maps.

One of the many cool things about the map is that it strikingly illustrates the allied advances in the last months of the First World War. If you go into the map view and click “The Beginning of the Great Advance” on the list of chapters, you’ll see the battalion holding the line in Flanders, then moving behind the lines for rest near Amiens, then moving up to the front line at Saint-Quentin. Then click on each of the following chapters in turn and watch the markers surge forward as 46th Division breaks through the Hindenburg Line and pushes towards Belgium.

Adding the place index was mostly similar to adding the person index: I added a unique id to each <placeName> tag using a Python script, pulled out the place names into an SQLite database, identified/disambiguated them and added a regularized name, then used another Python script to pull the regularized names out of the database and put them into the key attributes in the XML file. Identifying the places was easier than identifying people, and took a couple of days, although there are a few that I couldn’t find. As with people, I added some code to the XSLT to generate a JSON file of all the places.

Then, following the map view tutorial, I used the Exhibit API to pull latitude and longitude co-ordinates from Google Maps and put them into another JSON file. This turned out to be a bit unreliable as about 10 per cent of the places had their co-ordinates missing. It seems to be random, as running the script again with the same set of data produced a similar error rate but with different places. I had to take the missing places from the output file, put them into another input file and run the script over them again, which produced a similar 10 per cent error rate, but the remaining few co-ordinates could be put in manually.

Once I had a JSON file with all the correct geocodes it was easy to copy code from the tutorial to add a map view to the Exhibit page. In a few cases it turned out that Google had given me the wrong co-ordinates. Mostly this was because there are two or more places with the same name and it had picked the wrong one. I thought I’d put in enough information from my manual searches to disambiguate them but it seems that the results of a Google Maps search can be a bit unpredictable, and don’t necessarily give you the full address of a place.
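The re-run step for missing co-ordinates can be sketched roughly as follows. This is only an illustration in Python, not the script actually used: the `geocode` callable, the `make_flaky_geocoder` stand-in and the sample co-ordinates (approximate values) are all assumptions made for the example. The idea is simply to keep whatever was found on each pass and feed only the failures back in, leaving any stubborn leftovers for manual entry.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Coords = Tuple[float, float]  # (latitude, longitude)


def geocode_with_retries(
    names: List[str],
    geocode: Callable[[str], Optional[Coords]],
    max_passes: int = 3,
) -> Tuple[Dict[str, Coords], List[str]]:
    """Geocode every name, re-running only the failures on each pass.

    `geocode` is a hypothetical lookup that returns None when the service
    gives back no co-ordinates. Returns (found, still_missing).
    """
    found: Dict[str, Coords] = {}
    pending = list(names)
    for _ in range(max_passes):
        if not pending:
            break
        missing = []
        for name in pending:
            coords = geocode(name)
            if coords is None:
                missing.append(name)  # try again on the next pass
            else:
                found[name] = coords
        pending = missing
    return found, pending


def make_flaky_geocoder(
    table: Dict[str, Coords], fail_first: Set[str]
) -> Callable[[str], Optional[Coords]]:
    """Stand-in geocoder for the demo: fails once for names in fail_first."""
    already_failed: Set[str] = set()

    def lookup(name: str) -> Optional[Coords]:
        if name in fail_first and name not in already_failed:
            already_failed.add(name)
            return None
        return table.get(name)

    return lookup


if __name__ == "__main__":
    # Approximate co-ordinates, for illustration only.
    table = {"Amiens": (49.89, 2.30), "Saint-Quentin": (49.85, 3.29)}
    lookup = make_flaky_geocoder(table, fail_first={"Amiens"})
    found, missing = geocode_with_retries(
        ["Amiens", "Saint-Quentin", "Unidentified"], lookup
    )
    print(found)
    print("enter manually:", missing)
```

Capping the number of passes matters because a place the service genuinely doesn’t know will never succeed however many times it is retried.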

I’ve now done most of what I planned to do in this phase. There are still some features that could be added, especially a feedback mechanism, but I’ll be giving this project a rest soon so I can do some English Civil War work.

Marking Up Names: Part 3

[posted by Gavin Robinson, 12:52 pm, 23 January 2008]

My digital edition of Sandall’s History of 5th Lincolnshire Regiment now has a new improved index of people. This uses the Exhibit API to make an interactive list which can be filtered, sorted, and searched. Exhibit provides features that would normally need a database-driven back-end but it’s all done on the client side using JavaScript. The two disadvantages of this are that it doesn’t scale up very far, and that it isn’t very Google friendly. In this case there’s no problem because there are only ever going to be 350 records in the list, and there is no unique content on this page – it’s just an index to point users to other pages, which are Google friendly.

I’ve also made every occurrence of a name in the text into a link which points to the index. My worries about illegal characters in id attributes turned out to be unfounded. With Exhibit I can use the standardized names from the TEI @key attribute as hashes to make permalinks to individual records. Clicking on the link takes you to the index and displays a dialog box with all of that person’s details, including links back to every mention in the text. The dialog box is also displayed by clicking on a person’s name on the index page. I just need to work out a way to display it without having to reload the page.

Exhibit is really easy to use and makes it possible to add some fairly advanced features with surprisingly little effort. It took some searching, copying examples, trial and error, and asking on the mailing list before I worked out how to do everything, but as the project is documented by a wiki I’ve been able to update it whenever I find out how to do something that isn’t already explained there. The JSON data file for my index page is generated automatically by XSLT which loops through every <persName> and <rs> tag in the TEI document, and pulls out extra details (date of death, links to medal cards and CWGC) from another XML file.
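The real data file is produced by XSLT, but the same loop can be sketched in Python for anyone who finds that easier to follow. Treat this as a stand-in under stated assumptions: the sample markup, the names in it, the use of `xml:id` for the unique ids and the `mentions` property are all invented for illustration. The only thing taken from Exhibit itself is the general shape of its JSON data: an object with an `items` list whose entries carry a `label` and a `type`.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample markup; the names and ids are placeholders.
SAMPLE = """<text>
  <p><persName xml:id="n1" key="Smith, John">Pte. Smith</persName> and
     <rs xml:id="n2" key="Smith, John">the runner</rs> met
     <persName xml:id="n3" key="Jones, Arthur">Lt. Jones</persName>.</p>
</text>"""

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_items(root):
    """Group every <persName> and <rs> by its key attribute.

    Returns a dict in the {"items": [...]} shape that Exhibit loads,
    with one item per regularized name.
    """
    people = {}
    for el in root.iter():
        if el.tag not in ("persName", "rs"):
            continue
        key = el.get("key")
        if not key:
            continue  # not yet identified, so leave it out of the index
        item = people.setdefault(
            key, {"label": key, "type": "Person", "mentions": []}
        )
        item["mentions"].append(el.get(XML_ID))
    return {"items": list(people.values())}


if __name__ == "__main__":
    print(json.dumps(build_items(ET.fromstring(SAMPLE)), indent=2))
```

Grouping on the key rather than on the text of the tag is what lets different forms of a name, or an indirect reference in an `<rs>` tag, end up in the same record.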

Now that person names are more or less fully implemented, it’s time to move on to place names. These should be easier to disambiguate, and with Exhibit I can do some even cooler things with them, such as generating a Google map.

More hosts needed for MHC

[posted by Gavin Robinson, 1:18 pm, 22 January 2008]

We need hosts for the Military History Carnival from March 2008 onwards. The carnival usually takes place around the middle of the month but the exact date is up to the host. All you need to be a host is a blog, some spare time, and some enthusiasm. Your blog doesn’t have to be primarily about military history.

The carnival aims to be as inclusive as possible. Weapons, tactics, strategy, uniforms, insignia, equipment etc are all interesting and important, and so are relationships between war and society, culture, race, gender, sexuality, disability, and the non-human. The only limit is that history is defined as anything up to the end of the 20th century. For more information see the Notes for Hosts.

If you’re interested please leave a comment here or on the Military History Carnival page or e-mail me using my mailform.

New spam filter

[posted by Gavin Robinson, 7:55 pm, 20 January 2008]

I’ve just installed Joe Tan’s Simple Spam Filter plugin. This is an extra line of defence which works alongside Akismet. The best thing is that it uses reCAPTCHA, but only for comments that are flagged as spam by the simple spam filter itself or by Akismet. That means that most legitimate comments will get through without having to jump through any hoops, but if the filters wrongly flag your comment as spam you get the chance to enter a CAPTCHA code (and reCAPTCHA offers an audio alternative if you can’t read the text). If you get it right the comment appears straight away. If you get it wrong the comment is automatically deleted. This should mean that I won’t have to waste any more time looking for false positives, and most commenters won’t have to waste any time solving CAPTCHAs.

[Edit: to make it work properly you might need to download and install a new version of Akismet. You need at least version 2.1.2 (the latest is 2.1.3) but as I haven't upgraded to WordPress 2.3 I still had an old version which I needed to replace manually]
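The decision flow described above boils down to a few lines. This Python sketch is only a model of the behaviour as described in this post, not the plugin’s actual code; the function and parameter names are made up for the example.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PUBLISHED = "published"
    DELETED = "deleted"


def handle_comment(
    flagged_by_simple_filter: bool,
    flagged_by_akismet: bool,
    solve_captcha: Callable[[], bool],
) -> Outcome:
    """Model of the flow: only flagged comments ever see a CAPTCHA."""
    if not (flagged_by_simple_filter or flagged_by_akismet):
        # Most legitimate comments take this path with no extra step.
        return Outcome.PUBLISHED
    # Flagged by either filter: the commenter gets one chance at a CAPTCHA.
    return Outcome.PUBLISHED if solve_captcha() else Outcome.DELETED


if __name__ == "__main__":
    print(handle_comment(False, False, lambda: False))
    print(handle_comment(True, False, lambda: True))
    print(handle_comment(False, True, lambda: False))
```

Passing the CAPTCHA check in as a callable means it is only invoked for flagged comments, which mirrors the point that unflagged commenters never see it.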

Marking Up Names: Part 2

[posted by Gavin Robinson, 3:01 pm, 19 January 2008]

My digital edition of Sandall’s History of 1/5th Lincolnshire Regiment now has a new index of people. In my last post I described how names were marked up in the text. This post is about how I linked them together.

(more…)

Marking Up Names: Part 1

[posted by Gavin Robinson, 3:55 pm, 15 January 2008]

On to the next stage of digitizing Sandall’s History of 5th Lincolnshire Regiment. Having marked up the structure of the text and written XSLT to split the book into several HTML pages with working internal links, I could move on to Phase 2: marking up names, dates, and abbreviations.

(more…)

Sandall Update

[posted by Gavin Robinson, 7:32 pm, 9 January 2008]

I’ve now uploaded a new version of Sandall’s history of 5th Lincs with each chapter on a separate page. I thought splitting the pages and getting the internal links to work would be difficult but it turned out to be easier than I expected, although it involves some quite complicated XPath expressions. I’ve also uploaded the new XSLT to show how I did it. This could probably be better in some ways but for now I’m just pleased that it works. While doing this I decided to change the n attribute of the chapter divs from a number to a slug that could be used to make a Google friendly permalink.
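For anyone wondering what a slug looks like in practice, here is a minimal Python sketch of turning a chapter title into one. The `slugify` function and its exact rules (lower case, hyphens between words, accents dropped) are assumptions for the sake of the example, not necessarily the rules used for the n attributes in this edition.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a chapter title into a lower-case, hyphen-separated slug."""
    # Drop accents by decomposing characters and discarding non-ASCII parts.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify("The Beginning of the Great Advance"))
    # -> the-beginning-of-the-great-advance
```

A slug like this stays readable in a URL and doesn’t change if chapters are renumbered, which a plain number would.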

Now I’m waiting for Google to re-index the site so that the custom search actually works. Meanwhile I’ve started tagging people, places, dates and abbreviations. More on that when I’ve finished. I’m also increasingly confident that the photos are in the public domain (see these guidelines, which make things a bit clearer, if they’re right), so they’ll probably be added soon.

Military History Carnival posted

[posted by Gavin Robinson, 10:52 am, 7 January 2008]

The 10th Military History Carnival is now up at Walking the Berkshires. It’s a special Flashman themed edition. Thanks to Tim for doing such a great job.

The next edition will be at Battlefield Biker on 17th February. E-mail submissions to tj$at$battlefieldbiker$dot$com or use the carnival submission form.

We need hosts for March 2008 onwards so if you’d like to give it a go please get in touch.

More progress with Sandall

[posted by Gavin Robinson, 3:26 pm, 5 January 2008]

My project to digitize T. E. Sandall’s history of the 1/5th Lincolnshire regiment in the First World War has made very good progress this week. I’ve now uploaded a new HTML version. This features links to page images and a working index: if you click on a page number in the index it takes you to the corresponding part of the text. The whole book is still on one page as I haven’t worked out how to split it yet but it’s an improvement over the previous interim version. Below are more details of what I’ve done and how I’ve done it.

(more…)
