XML Tagging: Phase 1

[posted by Gavin Robinson, 1:04 pm, 26 February 2007]

Having proofread and corrected the digital text captured from Sandall’s history of 1/5th Lincolnshire (to an adequate standard anyway; I can’t claim that it’s perfect), I was ready to start inserting XML tags. The first phase of markup involves the use of TEI XML tags to describe the basic structure of the text. There was nothing too difficult here, and a lot of it could be done automatically rather than by reading through the text and manually inserting tags at every feature. Before I started I had to decide which tags to use and where to use them, then make sure I applied them consistently. This post gives more details of the tags I used, what I used them for, and how I got them into the text with minimal effort.
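As a rough illustration of the kind of automatic structural tagging described above, here is a minimal Python sketch. It is not the method actually used for this project, which this excerpt does not show. It assumes the proofread text is plain text with paragraphs separated by blank lines, and the function name `wrap_paragraphs` is made up for the example. Each paragraph is wrapped in a `<p>` element (the paragraph tag in TEI), with reserved XML characters escaped.

```python
import re
from xml.sax.saxutils import escape


def wrap_paragraphs(plain_text: str) -> str:
    """Wrap each blank-line-separated paragraph in a <p> element.

    Assumption: paragraphs in the source text are separated by one or
    more blank lines. This is a hypothetical helper, not project code.
    """
    # Split wherever a blank (or whitespace-only) line occurs.
    blocks = re.split(r"\n\s*\n", plain_text.strip())
    paragraphs = []
    for block in blocks:
        # Collapse line breaks inside a paragraph into single spaces.
        text = " ".join(block.split())
        if text:
            # escape() converts &, < and > so the output stays valid XML.
            paragraphs.append(f"<p>{escape(text)}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "First line\nwrapped.\n\nSecond & last."
    print(wrap_paragraphs(sample))
```

A pass like this only covers the most regular features. Anything that depends on judgement, such as where a chapter or section really begins, would still need checking by eye.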

Text Theories: Meaning

[posted by Gavin Robinson, 4:49 pm, 5 February 2007]

In my previous post about theories of digital text, I used Shannon’s communication theory to divide text into information and meaning, and then talked exclusively about text as information: a sequence of characters selected from a finite set. That allowed me to concentrate on one part of the problem, while excluding the more difficult problems associated with meaning. In this post, I’ll be trying to tackle some of the problems of meaning, while still trying to avoid as many as I can. I will also continue to avoid offering concrete definitions of “text” and “a text”, mainly because I haven’t found any satisfactory definitions yet, but I won’t be able to avoid using the word “text”.

Text Theories: Information

[posted by Gavin Robinson, 5:07 pm, 2 February 2007]

As the next stage of my Digital History Projects I’ve been doing background reading and thinking about the theory of text. This week I’ve read Schreibman, Siemens, and Unsworth, A Companion to Digital Humanities (2004); Burnard, O’Brien O’Keeffe, and Unsworth, Electronic Textual Editing (2006); Susan Hockey, Electronic Texts in the Humanities (2000); and C. E. Shannon, ‘A Mathematical Theory of Communication’ (1948). I can’t say that I understood everything (especially Shannon’s equations and Jerome McGann’s pretentious jargon) but the reading has given me a lot to think about, and things are nowhere near as simple as I first assumed.

Digital History Projects: Planning

[posted by Gavin Robinson, 7:30 pm, 10 January 2007]

In my New Year post, I mentioned that I’m thinking about carrying out a couple of digital history projects in connection with my First World War research. These projects are very small and should be relatively easy to carry out on my own, but there will almost certainly be challenges. Overcoming these will give me more experience of carrying out a digital history project (this is starting to sound like a job application again!), and produce useful resources. After that, I can move on to consider some more advanced issues, such as collaborating with other people, and dealing with seventeenth-century manuscripts. To make the experience even more useful, I’m trying to blog it as I go. This post is an outline of my plans so far. Now that I’ve published my plans I’ll have to carry them out!
