XML Tagging: Phase 1

Having proofread and corrected the digital text captured from Sandall’s history of 1/5th Lincolnshire (corrected to an adequate standard anyway — I can’t claim that it’s perfect), I was ready to start inserting XML tags. The first phase of markup involves the use of TEI XML tags to describe the basic structure of the text. There was nothing too difficult here, and a lot of it could be done automatically rather than reading through the text and manually inserting tags at every feature. Before I started I had to decide which tags to use and where to use them, then make sure I applied them consistently. This post gives more details of the tags I used, what I used them for, and how I got them into the text with minimal effort.
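
As a rough illustration of how basic structural tags might be inserted automatically rather than by hand, here is a minimal Python sketch. It is not necessarily the method used for this project. It assumes the proofread text is plain text with paragraphs separated by blank lines, and the function names `wrap_paragraphs` and `wrap_chapter` are hypothetical. The elements `<div>`, `<head>` and `<p>` are standard TEI structural elements, but the choice of `type="chapter"` is an assumption.

```python
import re
from xml.sax.saxutils import escape


def wrap_paragraphs(plain_text):
    """Wrap each blank-line-separated block of text in a TEI <p> element."""
    blocks = re.split(r"\n\s*\n", plain_text.strip())
    paragraphs = []
    for block in blocks:
        # Join hard-wrapped lines into a single line of text.
        text = " ".join(block.split())
        if not text:
            continue  # skip empty blocks (e.g. empty input)
        # escape() converts &, < and > so the output stays well-formed XML.
        paragraphs.append(f"<p>{escape(text)}</p>")
    return "\n".join(paragraphs)


def wrap_chapter(heading, plain_text):
    """Wrap a heading and its paragraphs in a TEI <div> (assumed type: chapter)."""
    body = wrap_paragraphs(plain_text)
    return f'<div type="chapter">\n<head>{escape(heading)}</head>\n{body}\n</div>'


if __name__ == "__main__":
    sample = "First line\ncontinues here.\n\nSecond paragraph."
    print(wrap_chapter("Chapter I", sample))
```

A script like this only handles the regular, predictable features. Anything irregular still needs checking by eye, which is why deciding on the tags and applying them consistently comes first.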

Digital History, History, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 1:04 pm, 26 February 2007

No Comments

Proofreading

In my last project update I described how I used FineReader to OCR the text of Sandall’s History of 5th Lincolnshire Regiment. Since then I’ve manually proofread the text and inserted some basic XML markup. Proofing and basic tagging have given me a more detailed understanding of the text and the features in it, and I’ve been noting potential issues as I go. I’ll post more about how I’m using XML later, but this post is a more detailed description of the process of proofreading.

Digital History, History, Sandall 5th Lincs, World War 1, World War I On Web 2.0 — posted by Gavin Robinson, 12:01 pm, 23 February 2007

No Comments

To go off and things

I’m planning to change web hosts soon, so this site might be down for a while until DNS records get updated. Once that’s out of the way I’ll be making an announcement about the military history carnival thing (but first I have to make some arbitrary decisions).

This weekend there will be an early-modern edition of Carnivalesque at The Long Eighteenth. Submit posts on anything to do with the early-modern period to carrieshanafelt[at]gmail[dot]com or use the submission form. This reminds me that it’s far too long since I posted anything early-modern. From reading my recent posts it would be hard to guess that I’m an English Civil War specialist! That’s partly because the civil war stuff I’m writing at the moment is secret — I don’t like to give away too much about projects that I’m working on for academic publications or conferences.

And Brett at Airminded has tagged me for a meme…

Blogging — posted by Gavin Robinson, 7:51 pm, 21 February 2007

1 Comment

Digital History Project: Update

Another project update. Things have been slightly delayed because I have an article to rewrite (which means I’m slightly closer to getting published) but I’ve still been making some progress. This weekend I’ll be proofreading Sandall’s book. When that’s done I’ll be able to export the text and start tagging it with XML. But first I’ve been looking through the TEI guidelines, picking out the tags I think I’ll need, and working out how I’m going to use them. This is crucial because there are often different ways to mark up the same text and it’s important to be consistent. It’s also important to apply only tags which will actually be useful to users, because there’s an awful lot of potential to waste time marking up text in microscopic detail that no-one has any use for. As I do the proofreading I’ll also be looking at the structure of the text and the features in it that will need marking up, and revising the provisional tagging guidelines if necessary. Once I’m happy with the tag set and the guidelines for using it I’ll post it all (but be warned: it won’t be very interesting!). Even then I expect to run into some unforeseen situations once I start trying to insert the tags.

Digital History, History, Sandall 5th Lincs, World War I On Web 2.0 — posted by Gavin Robinson, 7:24 pm, 15 February 2007

7 Comments

Military History Carnival?

I’m thinking about organising a military history themed blog carnival, because as far as I can tell there isn’t one. If there is one and I’ve missed it, please someone tell me! There are lots of bloggers writing about war, armed forces, and related topics, so it would be good to bring them all together and showcase the best posts. I can’t do it on my own though. Every carnival needs hosts, contributors, and readers. Leave a comment if you’re interested, if you have any suggestions, or even if you think it’s a bad idea.

[Obviously I'm only doing this now because I've just got feedback on an article and have to rewrite it. You're not alone, Esther!]

If it goes ahead, I’ll host the first one myself but after that I’ll need a constant supply of hosts. Host blogs won’t have to be exclusively or mainly about military history. Anyone who’s interested can have a go.

Once a month is probably the optimum frequency. I’m thinking probably around the first weekend of each month but that could change.

I want to avoid polemic about current affairs, especially Iraq, so I’d like to arbitrarily define history as anything that happened more than 10 years ago. This might be contentious, so any counter arguments would be welcome.

Military will be defined as broadly as possible. It includes all levels of armed conflict, with no rigid definition of what is and isn’t a war. At the risk of offending Latin purists, military will include navies and air forces as well as armies.

Within these limits anything goes. I don’t want any artificial division between academic and non-academic, amateur and professional, or traditional and new. Weapons, tactics, strategy, uniforms, insignia, equipment etc are all interesting and important, and so are relationships between war and society, culture, race, gender, sexuality, disability, and the non-human. Preparations for and aftermaths of wars are as significant as the wars themselves. Representations of war in literature, films, TV, games etc are just as valid objects of study as empirical evidence of reality.

The object is neither to glorify nor condemn war, but to see it as an integral part of history which needs to be better understood.

So, any comments, questions, suggestions, criticisms?

Blogging, History, Military — posted by Gavin Robinson, 11:37 am, 10 February 2007

27 Comments

Digital History Projects: OCR

Now that I’ve got all the theoretical agonising out of the way, I can actually do something about digitizing the text. This week I’m carrying out OCR and proofreading on the text of Sandall’s History of 5th Battalion the Lincolnshire Regiment. As soon as I got to work I encountered issues that I hadn’t thought of, and found that subjective decisions had to be made even earlier than I’d anticipated. This just shows that the only way to learn how to do something is to do it.

Digital History, History, Sandall 5th Lincs, World War I On Web 2.0 — posted by Gavin Robinson, 8:11 pm, 7 February 2007

No Comments

Text Theories: Meaning

In my previous post about theories of digital text, I used Shannon’s communication theory to divide text into information and meaning, and then talked exclusively about text as information: a sequence of characters selected from a finite set. That allowed me to concentrate on one part of the problem, while excluding the more difficult problems associated with meaning. In this post, I’ll be trying to tackle some of the problems of meaning, while still trying to avoid as many as I can. I will also continue to avoid offering concrete definitions of “text” and “a text”, mainly because I haven’t found any satisfactory definitions yet, but I won’t be able to avoid using the word “text”.
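
To make the "text as information" view concrete, here is a minimal Python sketch that treats a text purely as a sequence of characters drawn from a finite set and estimates its information content in bits per character. This is only a first-order estimate based on the character frequencies observed in the sample, not Shannon's full treatment, and the function name `character_entropy` is hypothetical.

```python
import math
from collections import Counter


def character_entropy(text):
    """Estimate bits per character from observed character frequencies."""
    total = len(text)
    if total == 0:
        return 0.0
    counts = Counter(text)
    # H = -sum(p * log2(p)) over each distinct character's relative frequency p.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for sample in ["aaaa", "abab", "abcd"]:
        print(sample, character_entropy(sample))
```

Nothing in this calculation depends on what the characters mean, which is exactly why meaning has to be tackled separately.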

Digital History, History, Sandall 5th Lincs, Theory, World War I On Web 2.0 — posted by Gavin Robinson, 4:49 pm, 5 February 2007

No Comments

The Exhibitionist

I was going to write some more about theories of text today, but I started playing with Exhibit instead (I found out about it from Public Historian). It’s a simple web API which gives you some of the features of a database-driven site without having to use a database or any PHP/MySQL code (or the Microsoft equivalent). All you need is one HTML file for the page, and one JavaScript file to store the data. It sounds too good to be true, doesn’t it?

Digital History, English Civil War, History — posted by Gavin Robinson, 4:50 pm, 3 February 2007

1 Comment

Text Theories: Information

As the next stage of my Digital History Projects I’ve been doing background reading and thinking about the theory of text. This week I’ve read Schreibman, Siemens, and Unsworth, A Companion to Digital Humanities (2004); Burnard, O’Brien, O’Keeffe, and Unsworth, Electronic Textual Editing (2006); Susan Hockey, Electronic Texts in the Humanities (2000); and C. E. Shannon, ‘A Mathematical Theory of Communication’ (1948). I can’t say that I understood everything (especially Shannon’s equations and Jerome McGann’s pretentious jargon) but it’s given me a lot to think about, and things are nowhere near as simple as I first assumed.

Digital History, History, Sandall 5th Lincs, Theory, World War I On Web 2.0 — posted by Gavin Robinson, 5:07 pm, 2 February 2007

1 Comment