[posted by Gavin Robinson, 2:33 pm, 2 July 2007]
In a few previous posts I’ve stressed the difference between information and meaning (which I picked up from Claude Shannon, the father of information theory) and some of its implications. For example, in this post I pointed out that Shannon’s separation of meaning and information is compatible with structuralist and post-structuralist theories which maintain that there is no inherent meaning in the text. (I’ve also had to deal with it in the course of digitizing a book: see here.) Work on Artificial Intelligence has tended to reinforce this distinction: computers are very good at processing information but not very good at understanding meaning.
But last week Bill Turkel wrote a post which turned my understanding of the meaning/information dichotomy on its head. This isn’t an entirely new development: it follows on from a post he wrote in March 2006, which was in turn inspired by an article by Rudi Cilibrasi and Paul Vitányi published in 2005. There’s a lot of mathematical stuff about compression algorithms which I can’t claim to understand, but the main point is that, without understanding anything about meaning, computers can measure how similar texts are in their information content and cluster them accordingly. The resulting patterns make sense to humans who can understand the meaning of the texts. Bill’s example used entries from the Dictionary of Canadian Biography, finding geographical and chronological clusters of entries.
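For readers who would rather see the idea than the mathematics, here is a minimal sketch of a compression-based similarity measure in Python. It follows the normalized compression distance described by Cilibrasi and Vitányi, but it is not Bill’s code: the choice of zlib as the compressor, the function names, and the three toy sentences are all assumptions made for illustration, and none of the text comes from the actual dictionary entries.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Number of bytes zlib needs to store the data (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    Values near 0 suggest the texts share a lot of information;
    values near 1 suggest they share very little. With a real
    compressor the result is only approximate.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # size of the two texts compressed together
    return (cxy - min(cx, cy)) / max(cx, cy)


# Invented sample texts, for illustration only.
fur_a = ("The fur trader travelled by canoe from Montreal to the trading "
         "post, where he exchanged goods for beaver pelts each spring.")
fur_b = ("Each spring the fur trader returned to Montreal by canoe, his "
         "boats loaded with beaver pelts from the northern trading post.")
law_c = ("The judge was appointed to the provincial court in 1867 and "
         "later published several essays on constitutional questions.")

if __name__ == "__main__":
    for name, other in (("fur_a", fur_a), ("fur_b", fur_b), ("law_c", law_c)):
        print(f"ncd(fur_a, {name}) = {ncd(fur_a, other):.3f}")
```

The compressor never looks at what the words mean. It only notices that compressing two texts together takes less space when they repeat each other’s strings. Passages this short give noisy results, so treat the printed numbers as a demonstration of the mechanism, not as evidence; a real experiment would compute the distance for every pair of full entries and hand the resulting matrix to a clustering routine.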
Despite the attention-grabbing title of my post, the distinction between information and meaning isn’t a false one. However, these experiments show that in practice the relationship between information and meaning within the context of a particular linguistic/cultural system is not as arbitrary and unpredictable as theorizing might suggest. Does this mean that structuralism could make a comeback against post-structuralism? Or do we need to move beyond both of those things and find a new way to think about text? Whatever the implications for theory, this is an exciting development which promises to be very useful in practice.
[posted by Gavin Robinson, 4:49 pm, 5 February 2007]
In my previous post about theories of digital text, I used Shannon’s communication theory to divide text into information and meaning, and then talked exclusively about text as information: a sequence of characters selected from a finite set. That allowed me to concentrate on one part of the problem, while excluding the more difficult problems associated with meaning. In this post, I’ll be trying to tackle some of the problems of meaning, while still trying to avoid as many as I can. I will also continue to avoid offering concrete definitions of “text” and “a text”, mainly because I haven’t found any satisfactory definitions yet, but I won’t be able to avoid using the word “text”.
[posted by Gavin Robinson, 5:07 pm, 2 February 2007]
As the next stage of my Digital History Projects I’ve been doing background reading and thinking about the theory of text. This week I’ve read Schreibman, Siemens, and Unsworth, A Companion to Digital Humanities (2004); Burnard, O’Brien O’Keeffe, and Unsworth, Electronic Textual Editing (2006); Susan Hockey, Electronic Texts in the Humanities (2000); and C. E. Shannon, ‘A Mathematical Theory of Communication’ (1948). I can’t say that I understood everything (especially Shannon’s equations and Jerome McGann’s pretentious jargon) but it’s given me a lot to think about, and things are nowhere near as simple as I first assumed.
[posted by Gavin Robinson, 8:14 pm, 23 January 2007]
Oh no! Bill Turkel has tagged me for a meme! Is this the end of civilisation as we know it? When I started this weblog I was determined to stick to substantial original content. There would be no room for memes or other self-indulgent timewasting — I already have a LiveJournal for that. However, Bill managed to turn this particular meme into some interesting analysis of memetics and the blogosphere. That’s inspired me to move even further away from the original meme and post some random thoughts about memes. I won’t be tagging anyone at the end, because I hope to demonstrate that history bloggers don’t need to tag each other.
[posted by Gavin Robinson, 8:50 pm, 21 November 2006]
Over the last 10 years or so, technology has brought huge changes to historical research and opened up new possibilities. Computers have solved some old problems, but also created some new ones. Meanwhile there has been an increasing focus on the problems of epistemology: what can we know about the past and how can we know it? The debate has mostly been about the relationship between textual sources and the reality of the past. Even if you reject theory and take a purely empirical view of what the sources can tell us, there are some potential problems with the transmission of the information that they contain.