<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Investigations of a Dog &#187; meaning</title>
	<atom:link href="http://www.investigations.4-lom.com/tag/meaning/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.investigations.4-lom.com</link>
	<description>Failing better at understanding the past</description>
	<lastBuildDate>Sun, 05 Feb 2012 09:18:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Information vs Meaning: A False Dichotomy?</title>
		<link>http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/</link>
		<comments>http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/#comments</comments>
		<pubDate>Mon, 02 Jul 2007 14:33:23 +0000</pubDate>
		<dc:creator>Gavin Robinson</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[digital history]]></category>
		<category><![CDATA[information theory]]></category>
		<category><![CDATA[meaning]]></category>

		<guid isPermaLink="false">http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Information+vs+Meaning%3A+A+False+Dichotomy%3F&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-07-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/&amp;rft.language=English"></span>
In a few previous posts I&#8217;ve stressed the difference between information and meaning (which I picked up from Claude Shannon, the father of information theory) and some of its implications. For example, in this post I pointed out that Shannon&#8217;s separation of meaning and information is compatible with structuralist and post-structuralist theories which maintain that [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Information+vs+Meaning%3A+A+False+Dichotomy%3F&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-07-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/&amp;rft.language=English"></span>
<p>In a few previous posts I&#8217;ve stressed the difference between information and meaning (which I picked up from Claude Shannon, the father of information theory) and some of its implications. For example, in <a href="http://www.investigations.4-lom.com/2007/06/08/science-friction/">this post</a> I pointed out that Shannon&#8217;s separation of meaning and information is compatible with structuralist and post-structuralist theories which maintain that there is no inherent meaning in the text. (I&#8217;ve also had to deal with it in the course of digitizing a book &#8211; see <a href="http://www.investigations.4-lom.com/2007/02/02/text-theories-information/">here</a>). Work on Artificial Intelligence has tended to reinforce this distinction: computers are very good at processing information but not very good at understanding meaning.</p>
<p>But last week <a href="http://digitalhistoryhacks.blogspot.com/2007/06/clustering-with-compression.html">Bill Turkel</a> wrote a post which turned my understanding of the meaning/information dichotomy on its head. This isn&#8217;t an entirely new development: it follows on from a post he wrote in March 2006, which was in turn inspired by an article by Rudi Cilibrasi and Paul Vitányi published in 2005. There&#8217;s a lot of mathematical stuff about compression algorithms which I can&#8217;t claim to understand, but the schwerpunkt is that without understanding anything about meaning, computers can compare similarities in the information content of texts and cluster them accordingly. The result is patterns that make sense to humans who <em>can</em> understand the meaning of the text. Bill&#8217;s example used entries from the <em>Dictionary of Canadian Biography</em>, finding geographical and chronological clusters of entries.</p>
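<p>To make the idea concrete, here is a minimal sketch of a compression-based similarity score. It assumes the measure in question is the normalized compression distance from Cilibrasi and Vitányi&#8217;s work, and it uses Python&#8217;s zlib as a stand-in compressor; the function names and the three sample entries are invented for illustration and are not taken from Bill&#8217;s code or data.</p>

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower means more shared patterns."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # cost of compressing both texts together
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample texts, invented for illustration only.
entries = {
    "trader_a": b"Fur trader, born at Montreal, who wintered on the Saskatchewan for the North West Company.",
    "trader_b": b"Fur trader, born at Quebec, who wintered on the Athabasca for the North West Company.",
    "bishop": b"Anglican bishop and educator who founded a college and sat on the legislative council.",
}

# Print the distance for every pair of entries.
names = sorted(entries)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: {ncd(entries[a], entries[b]):.3f}")
```

<p>Nothing in the code knows what a fur trader or a bishop is: it only measures how much smaller two texts become when compressed together rather than separately. A clustering step (not shown) would then group entries by these pairwise distances.</p>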
<p>Despite the attention-grabbing title of my post, the distinction between information and meaning isn&#8217;t a false one. However, these experiments show that in practice the relationship between information and meaning within the context of a particular linguistic/cultural system is not as arbitrary and unpredictable as theorizing might suggest. Does this mean that structuralism could make a comeback against post-structuralism? Or do we need to move beyond both of those things and find a new way to think about text? Whatever the implications for theory, this is an exciting development which promises to be very useful in practice.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.investigations.4-lom.com/2007/07/02/information-vs-meaning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>That would be an ecumenical matter</title>
		<link>http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/</link>
		<comments>http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/#comments</comments>
		<pubDate>Tue, 12 Jun 2007 13:48:32 +0000</pubDate>
		<dc:creator>Gavin Robinson</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[epistemology]]></category>
		<category><![CDATA[historiography]]></category>
		<category><![CDATA[meaning]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[theory]]></category>

		<guid isPermaLink="false">http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=That+would+be+an+ecumenical+matter&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-06-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/&amp;rft.language=English"></span>
Last week I posted some thoughts in response to the discussions at A Historian&#8217;s Craft and Civil War Memory about history and philosophy. In that post I took some of the philosophical problems that affect history and tried to restate them in scientific terms. As Brett pointed out, this really amounted to stating the obvious [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=That+would+be+an+ecumenical+matter&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-06-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/&amp;rft.language=English"></span>
<p>Last week I posted some thoughts in response to the discussions at <a href="http://idlethink.wordpress.com/2007/05/31/excuse-me-your-linguistic-bias-is-showing/">A Historian&#8217;s Craft</a> and <a href="http://civilwarmemory.typepad.com/civil_war_memory/2007/06/do_historians_n_2.html">Civil War Memory</a> about history and philosophy. In <a href="http://www.investigations.4-lom.com/2007/06/08/science-friction/">that post</a> I took some of the philosophical problems that affect history and tried to restate them in scientific terms. As <a href="http://www.investigations.4-lom.com/2007/06/08/science-friction/#comment-4371">Brett pointed out</a>, this really amounted to stating the obvious in fairly uncontroversial terms, but I think that was worth doing in order to bypass the unproductive hostility between both extremes in the postmodernism wars (although the extent to which those extremes even exist is debatable). Whether the major problems we face as historians are philosophical, scientific, or a bit of both, the question remains: how much time should we spend thinking about these problems? In this post I&#8217;ll be discussing that question, but I have to warn you in advance that I can&#8217;t answer it. So there might not be much point reading any further…</p>
<p><span id="more-91"></span>First of all I have to say that I agree completely with Kevin Levin that as historians we can&#8217;t solve these problems ourselves. If cognitive scientists can&#8217;t work out what meaning is and where it comes from, then we have no chance. But while it would be naive and arrogant to attempt to find solutions to these problems, it would also be naive and arrogant to ignore them completely. I think we need to know something about the nature and extent of problems such as meaning so that we know how they affect our work. But how much is enough? Getting good at history is hard enough without also having to know about philosophy, linguistics, cognitive science or whatever. Maybe blogs can help here, because they provide an easy way to find out about unfamiliar areas. For example, I&#8217;ve relied quite heavily on <a href="http://scienceblogs.com/mixingmemory/">Mixing Memory</a> and <a href="http://ebbolles.typepad.com/babels_dawn/">Babel&#8217;s Dawn</a> for science, and <a href="http://www.thevalve.org/">The Valve</a> for philosophy and literary criticism. But how do we know if we can trust them if we don&#8217;t already know what they&#8217;re telling us? Could this also be a problem with peer reviewed publications which are outside our field?</p>
<p>Is it really worth trying to find out about problems which haven&#8217;t been solved yet and which we can&#8217;t possibly solve ourselves? The problem of meaning has major implications for history, but the jury is still out. It might stay out forever, or maybe just a lifetime. Right now we don&#8217;t even know where the answer is going to come from let alone what it will be. (If I had to make a wild speculation I&#8217;d guess that sooner or later the cognitive sciences will crack the meaning problem and that the answer will be equally uncomfortable for both empirical historians and post-structuralist theorists, but anyway&#8230;) Under these circumstances, we can&#8217;t confidently take any position, whether empirical or theoretical. We might all be wrong. Post-structuralist thought is valuable in that it reminds us that meaning is not straightforward, but that is hardly the last word. I found Elizabeth Clark&#8217;s <em>History, Theory, Text</em> quite disappointing because she promised that post-structuralism offered exciting new opportunities for medieval historians, but failed to deliver. Most of the book is a teleological triumphal progress towards post-structuralism in which she sneers at various historians and philosophers for not being post-structuralist enough. There&#8217;s far too little discussion of what post-structuralism can actually do for the historian. The way I see it, post-structuralism is a problem not a solution. I don&#8217;t want to ignore that problem, but I don&#8217;t want to admit defeat and stop writing on the grounds that people could just as easily find interesting meaning in words randomly generated by a computer.</p>
<p>(Somewhere in that paragraph I changed from first person plural to first person singular. Even I&#8217;m not sure what the significance of that is!)</p>
<p>So I believe that there are major problems confronting history, I realise that I can&#8217;t solve those problems, but I don&#8217;t want to ignore them either. Can I work around them in order to minimize their impact on my work? If the central problem is meaning then probably not. I used to think that digitization offered a way out of this dilemma because you could concentrate on transcribing documents without having to make any assumptions about what they mean: concentrate on information and exclude meaning. Now that I&#8217;ve tried digitizing text for myself I can see that it&#8217;s not as simple as that (see my theoretical agonizing <a href="http://www.investigations.4-lom.com/2007/02/02/text-theories-information/">here</a> and <a href="http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/">here</a>). Although meaning intrudes into every stage of the digitization process the problem is perhaps more manageable than it would be in literary criticism. Identifying the character string &#8220;Lt. R. E. W. Sandall&#8221; as a name and rank seems less problematic than interpreting the meaning of a poem (unless it&#8217;s by Jessie Pope maybe…). Once I&#8217;d established that my editing decisions were arbitrary, I had no problem getting on with it. I have a lot of sympathy for Kevin&#8217;s point that theory doesn&#8217;t seem to matter so much when you&#8217;re actually doing history. But am I deluding myself there? Or am I wasting time on unproductive thinking when I could be doing?</p>
<p>If you&#8217;re not convinced that what you&#8217;re doing is right, how do you motivate yourself to do the work? Historical research involves a lot of difficult and tedious work. You need a strong commitment to get through it. The possibility that my work could be proved completely worthless by new developments in a different discipline isn&#8217;t stopping me from doing history. There isn&#8217;t really anything new here. It&#8217;s always been accepted by most historians that future research could prove their own work wrong. New sources or new interpretations could easily overturn your conclusions. Historians have usually been able to carry on doing what they do rather than giving up in despair at the thought that they might be wrong.</p>
<p>The hypothetical extreme empiricists would be offended at the hypothetical extreme postmodernist&#8217;s suggestion that empiricism is just an arbitrary culturally constructed paradigm. I don&#8217;t have any problem with that suggestion. My &#8220;proper&#8221; work (ie my PhD thesis, my forthcoming article, and other projects that I&#8217;m working on) is mostly within the empirical paradigm. The rules and values of that paradigm are arbitrary, but that doesn&#8217;t automatically make it worthless. As Brett pointed out in response to my post last week, science is an arbitrary system constructed by human language and culture, but it&#8217;s a useful one which can predict or change the future. Empirical history can&#8217;t do that, but I still like it. Maybe that&#8217;s a lame justification, but it&#8217;s honest. I can&#8217;t make a strong case for any kind of history being really important, but I know that history is what I want to do. If I like it, and if there&#8217;s a paradigm that values my work, is that all I need? I also think that empirical research teaches valuable skills. Some of these are transferable to other careers (eg using databases, analytical thinking, project management) while others are more specialised (eg palaeography, Latin). All of them are more valuable than ever in the age of digital history. We need people with these skills and familiarity with historical documents to work on digitization projects. You can only get good at these things through years of practical experience, not by reading Derrida. However, digitization projects also require familiarity with theories of text &#8211; structuralism, post-structuralism, and information theory are all highly relevant here.</p>
<p>Although I like working in the empirical paradigm, I also like to look outside it. Right now there still seems to be a big gap between my empirical and theoretical interests. Will I ever be able to bring them together, or are they incommensurable? If they are incommensurable, is being able to think about them both at the same time a strength or a weakness? I don&#8217;t want to get too attached to one way of doing things. I want to be as versatile as possible, but will that just make me a jack of all trades and master of none? There&#8217;s a serious danger of creating the appearance of being theoretically aware by lazily dropping the right buzzwords but not really understanding the ideas behind them. The phrase &#8220;truth effect&#8221; can create its own truth effect. In a comment to my previous post I mentioned <a href="http://scienceblogs.com/mixingmemory/2007/06/the_brain_makes_it_better.php">this experiment</a> which provides empirical proof of the truth effect: non-experts are more likely to accept bad explanations of psychological phenomena if they include irrelevant neuroscience terminology. Now I&#8217;m wondering how terminology affects perceptions of historical writing. I&#8217;d like to see more experiments here, but do psychologists consider history interesting and important enough to be the object of their study?</p>
<p>Thinking too much is bad, but so is not thinking enough. I don&#8217;t think there&#8217;s a single point in the middle that&#8217;s exactly right. Different amounts of thinking suit different people. Ultimately everyone needs to make their own decisions. And so I&#8217;ve written nearly 1,500 words without really saying anything. Does that make me a philosopher? No, just a blogger.</p>
<ol>
<li>Elizabeth A. Clark, <span style="font-style: italic">History, Theory, Text</span> (Harvard UP: Cambridge, MA, 2004). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A0674015843&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=History%2C%20Theory%2C%20Text%3A%20Historians%20and%20the%20Linguistic%20Turn&amp;rft.place=Cambridge%2C%20MA&amp;rft.publisher=Harvard%20UP&amp;rft.aufirst=Elizabeth%20A.&amp;rft.aulast=Clark&amp;rft.au=Elizabeth%20A.%20Clark&amp;rft.date=2004&amp;rft.isbn=0674015843"></span></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.investigations.4-lom.com/2007/06/12/ecumenical-matter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Text Theories: Meaning</title>
		<link>http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/</link>
		<comments>http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/#comments</comments>
		<pubDate>Mon, 05 Feb 2007 16:49:51 +0000</pubDate>
		<dc:creator>Gavin Robinson</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[digital history]]></category>
		<category><![CDATA[information theory]]></category>
		<category><![CDATA[meaning]]></category>
		<category><![CDATA[tei]]></category>
		<category><![CDATA[theory]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Text+Theories%3A+Meaning&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-02-05&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/&amp;rft.language=English"></span>
In my previous post about theories of digital text, I used Shannon&#8217;s communication theory to divide text into information and meaning, and then talked exclusively about text as information: a sequence of characters selected from a finite set. That allowed me to concentrate on one part of the problem, while excluding the more difficult problems [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Text+Theories%3A+Meaning&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-02-05&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/&amp;rft.language=English"></span>
<p>In my previous post about <a href="http://www.investigations.4-lom.com/2007/02/02/text-theories-information/" title="Investigations of a Dog: Text Theories: Information">theories of digital text</a>, I used Shannon&#8217;s communication theory to divide text into information and meaning, and then talked exclusively about text as information: a sequence of characters selected from a finite set. That allowed me to concentrate on one part of the problem, while excluding the more difficult problems associated with meaning. In this post, I&#8217;ll be trying to tackle some of the problems of meaning, while still trying to avoid as many as I can. I will also continue to avoid offering concrete definitions of &#8220;text&#8221; and &#8220;a text&#8221;, mainly because I haven&#8217;t found any satisfactory definitions yet, but I won&#8217;t be able to avoid using the word &#8220;text&#8221;.</p>
<p><span id="more-54"></span></p>
<p>When scanning, OCR, and proofreading are complete (meaning you&#8217;ve gone as far as you can/will go with it &#8211; last time I suggested that proofing can never be truly complete) you are left with one or more plain text files which contain reasonably accurate information. There is likely to be some noise in the form of wrongly transcribed characters, but it should be expected that you have selected methods which result in a level of accuracy that is acceptable to your project, and that you have assurance checks in place to be able to determine that the work does meet your minimum requirements. The information you have in the text file is a sequence of characters which more or less matches the sequence of characters in the book. What, if anything, should you do next?</p>
<p>You could just put the text file on the internet as it is and let users worry about what it means. This is the approach taken by <a href="http://www.gutenberg.org/wiki/Main_Page" title="Project Gutenberg">Project Gutenberg</a>. Their texts are all made available as plain text files, with some also available as HTML. This approach is based on the assumption that digitized texts should conform to the lowest common denominator, and that any additional markup might reduce cross compatibility and make the files inaccessible in the future. I don&#8217;t entirely agree with this view. TEI XML is a widespread standard and looks like it will remain so for a long time. XML files are not a proprietary format. In terms of file systems they are no different from plain text files: both can be read and edited by any text editing software on any platform. The XML Document Object Model should make it easy to update tags in the future, and if XML ever turns out to be totally useless, then you can at least use find and replace to strip it out automatically, leaving you with the original text and no markup. This is not meant to be a criticism of Project Gutenberg. They are doing valuable work in making public domain works more widely accessible, and in developing tools and procedures for collaborative work. Digitizing and proofreading text is necessary before any markup can be added. Project Gutenberg stops before the markup stage, but there&#8217;s nothing to stop other people from taking PG text files and adding advanced markup.</p>
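<p>As a rough illustration of how cheaply markup can be stripped out again, the sketch below uses Python&#8217;s standard xml.etree.ElementTree parser rather than literal find and replace. The function name and the sample fragment are hypothetical, and the fragment is a simplified stand-in, not a valid TEI document.</p>

```python
import xml.etree.ElementTree as ET


def strip_markup(xml_source: str) -> str:
    """Return only the character data of an XML document, in document order."""
    root = ET.fromstring(xml_source)
    # itertext() yields the text inside and between elements, skipping the tags.
    return "".join(root.itertext())


# Hypothetical fragment, invented for illustration only.
sample = "<p>Lt. <name>R. E. W. Sandall</name> wrote this.</p>"
print(strip_markup(sample))
```

<p>Using a parser instead of a search pattern means that entity references are decoded and nested tags are handled without any special cases, but either route recovers the plain sequence of characters.</p>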
<p>Adding markup necessarily involves meaning to a certain extent. Even Project Gutenberg, which aims only at producing plain text editions, isn&#8217;t just transmitting the sequence of characters from printed book to ASCII codes. Some characters, such as page numbers and running heads, are omitted. This is a subjective decision about which information to include and exclude in the digital edition, based on what is most likely to be useful to readers. Therefore, there has to be some kind of judgement about what the information <em>means</em>.</p>
<p>While digital text offers more flexibility than printed text, the role of the editor is just as crucial as ever. As <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/vanhoutte.xml" title="Electronic Textual Editing">Edward Vanhoutte</a> says: &#8220;The editor is always present in the organization of the material and the transcription of source documents&#8221;. Marking up the basic structure of a document according to established standards like TEI might seem unproblematic, but Susan Hockey points out that even this is an act of interpretation (p. 48). For example, replacing line break characters with paragraph tags makes an assumption about the meaning of line breaks. Hockey cites Huitfeldt&#8217;s observation that there are no objective facts about a text (p. 47). Adding TEI XML tags to a text file is imposing an arbitrary taxonomy. Nevertheless, all language is an arbitrary taxonomy. As long as we recognise that nothing actually <em>is</em> what it&#8217;s called, those taxonomies can be useful. The taxonomy you choose has to be relevant to your objectives, and therefore you have to know why you are digitizing a text, who the target audience is, and what they are likely to want from it. This is crucial, because there is no perfect way of digitizing a text. It also helps if your taxonomy ties into a system which is widely used and understood, and which does not vary unpredictably. TEI is a good starting point because it&#8217;s widely used, and while flexible enough to accommodate many different purposes is fixed enough to prevent too much random slippage.</p>
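<p>The line break example can be sketched in a few lines of Python. The interpretive assumption is written directly into the code: a blank line is taken to mean a paragraph break, and a single line break is taken to be mere wrapping. The function name is hypothetical, and a different text might call for a different rule.</p>

```python
import re
from xml.sax.saxutils import escape


def paragraphs_to_tags(plain_text: str) -> str:
    """Wrap blank-line-separated blocks of text in <p> tags."""
    # Assumption: one or more blank lines separate paragraphs.
    blocks = [b for b in re.split(r"\n\s*\n", plain_text) if b.strip()]
    # Assumption: line breaks inside a block carry no meaning, so collapse them.
    return "\n".join(f"<p>{escape(' '.join(b.split()))}</p>" for b in blocks)


print(paragraphs_to_tags("First line\nstill first.\n\nSecond & last."))
```

<p>For verse, an address block, or a field postcard, the second assumption would throw away exactly the information an editor might want to keep, which is why even this step counts as interpretation.</p>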
<p>In the interests of preventing slippage and maintaining cross-compatibility, it is vital to apply tags consistently. <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/flanders.xml" title="Electronic Textual Editing">Julia Flanders</a> points out that this is easier said than done because of the complexity and flexibility of TEI: the same feature could be marked up several different ways. Again this is a problem of meaning: how do you interpret the meaning of a sequence of characters, and how do you fit that interpretation into your arbitrary taxonomy? Flanders emphasises the importance of documenting procedures and modifying documentation in the light of management decisions on difficult interpretations or previously unknown features. The <a href="http://crimpleb.group.shef.ac.uk/" title="Central Criminal Court/Plebeian Lives projects">Old Bailey project</a> has taken an innovative approach to this, using a wiki to co-ordinate XML tagging. In a collaborative project, regular assurance checks are necessary to make sure that all team members are following the documentation consistently, and that the documentation is adequate.</p>
<p>While marking up the basic structure of a text (paragraphs, chapters, headings) must be recognised as an act of interpretation and arbitrary classification, it should be relatively unproblematic in practice. This is particularly true of the book I&#8217;ll be working on first: a very conventional regimental history published in Britain in the 20th century. Colonel Sandall is hardly Ezra Pound or Jack Kerouac! The structure of a normal book can be seen in structuralist terms: although it&#8217;s an arbitrary system which doesn&#8217;t necessarily have a fixed relationship with reality, it&#8217;s fixed in relation to itself. Most people understand this system, which is much less complex than the whole of a language system, and publishers help to enforce conformity in printed works. Manuscripts are more problematic as they don&#8217;t necessarily have such rigid conventions. Interpreting the structure of William Wenham&#8217;s letters will be more difficult than interpreting the structure of Sandall&#8217;s book, but at least there are some established conventions of letter writing (again we&#8217;re not dealing with a modernist stream-of-consciousness here), and field postcards have their own basic structure.</p>
<p>The next stage of markup involves picking out dates, and names of people, places, and organizations, and therefore more subjective interpretation of meanings. At this stage no claims will be made about who or what the names signify. It will only be necessary to decide whether or not a sequence of characters represents a proper noun. Fortunately there is an established convention in English printed books that proper nouns are distinguished by a capital letter. This might even allow a certain amount of automation of picking out names, although the potential for confusion with capital letters at the beginnings of sentences will probably make a lot of human intervention necessary. <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/lavagnino.xml" title="Electronic Textual Editing">John Lavagnino</a> points out that names are not always easy to define and delimit. In Sandall&#8217;s book, names will often be accompanied by ranks, which makes them easier to spot.</p>
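<p>A minimal sketch of the capital letter heuristic, assuming plain English prose: it collects runs of capitalized words as candidate names and flags those that begin a sentence, since these are the ones most likely to need a human decision. The pattern and function name are hypothetical, and the pattern deliberately ignores initials and rank abbreviations.</p>

```python
import re

# One or more consecutive words that each start with a capital letter.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")


def candidate_names(text: str) -> list[tuple[str, bool]]:
    """Return (candidate, sentence_initial) pairs for capitalized word runs."""
    results = []
    for match in CANDIDATE.finditer(text):
        before = text[:match.start()].rstrip()
        # A capital at the start of a sentence is not evidence of a proper noun.
        sentence_initial = before == "" or before[-1] in ".!?"
        results.append((match.group(0), sentence_initial))
    return results


for name, needs_review in candidate_names(
    "The battalion reached Ypres in April. Sandall was there."
):
    print(name, "(check by hand)" if needs_review else "")
```

<p>In this invented sentence both &#8220;The&#8221; and &#8220;Sandall&#8221; are flagged for review, which shows why the heuristic can narrow the search but cannot replace an editor.</p>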
<p>The third stage of markup is potentially the most contentious because record linkage involves making epistemological claims about the identities of the people referred to by the names. The first question is whether the same name refers to the same person when it occurs in different places within the text. In doubtful cases the book&#8217;s index might help to disambiguate two people with the same name. A different rank doesn&#8217;t necessarily indicate a different person, because ranks can change. At this stage I will be attempting to reconstruct the author&#8217;s understanding of who is who, which means confronting the major problem of author&#8217;s intentions. This doesn&#8217;t mean that I can remain neutral or that my assumptions won&#8217;t influence the record linkage. Linkage at this level will be determined by my own subjective interpretation of what I think the author meant. I will have to assume that there is some consistent logic to what he wrote, but that can&#8217;t necessarily be proved from within the text.</p>
<p>What about outside the text? Linking the text to other records would add value for users. If identifications can be corroborated from other sources, then my judgements might be more secure. However, this also involves making more ambitious claims about meaning. How do I know that the same sequence of characters in two different texts means the same thing? Ultimately I don&#8217;t. Record linkage is an empirical technique which can&#8217;t necessarily be justified to post-structuralists, but I don&#8217;t necessarily have to justify it to post-structuralists.</p>
<p>Once again the important thing is the purpose of the project and the needs and expectations of its target audience. The main value of Sandall&#8217;s book is to amateur researchers who want to know more about specific individual soldiers or officers, or about what the battalion was doing at a particular time. These people are unlikely to be impressed by agonising about meaning, intentions, and epistemology. Their methodology will most likely be traditional empiricism. This is not to say that they will be naive &#8211; they can recognise that some sources are more reliable than others and that different people have different interpretations of what happened &#8211; but ultimately what they really care about will probably be &#8220;the facts&#8221; of what really happened in the past. I don&#8217;t intend to challenge those beliefs, but conversely my project doesn&#8217;t depend on them either. While record linkage requires claims about the meaning of information and the relationship between different texts, it does not necessarily involve any claims about the relationship between text and reality.</p>
<p>To a certain extent I hope that I can let users make up their own minds about the meaning of the text, and that if they disagree with an editorial decision they can either ignore it or save their own personal copy which they can edit to their own specifications. TEI XML adds a layer of meaning to the text, but doesn&#8217;t change the underlying information, unlike a database where the information has to be cut up and rearranged to fit into an arbitrary taxonomy. <a href="http://nora.lis.uiuc.edu/xtf/view?docId=blackwell/9781405103213/9781405103213.xml&amp;chunk.id=ss1-3-5&amp;toc.depth=1&amp;toc.id=ss1-3-5&amp;brand=default" title="Companion to Digital Humanities">Allen Renear</a>: &#8220;One might say that the TEI is an agreement about how to express disagreement&#8221;. <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/flanders.xml" title="Electronic Textual Editing">Julia Flanders</a> reminds us that editorial responsibility should not be offloaded onto the reader. The problems of digital text make the editor more, not less, important.</p>
<p>I hope I&#8217;ve demonstrated that nothing about editing digital texts is simple. Over the last two weeks I&#8217;ve become aware of more problems than I imagined when I set out, but it&#8217;s been very useful to think about these issues more clearly. Even if I can&#8217;t solve every problem, I can at least avoid some of them, and minimise the impact of others. Above all, these projects are intended to be educational, and I&#8217;m certainly learning a lot from them. Now I&#8217;m nearly ready to start creating the digital texts themselves.</p>
<h3>Bibliography</h3>
<ol>
<li>Lou Burnard, John Unsworth, and Katherine O&#8217;Brien O&#8217;Keeffe, <span style="font-style:italic;">Electronic Textual Editing with CDROM</span> (Modern Language Association of America, September 2006). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A0873529715&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Electronic%20Textual%20Editing%20with%20CDROM&amp;rft.publisher=Modern%20Language%20Association%20of%20America&amp;rft.edition=Pap%2FCdr&amp;rft.aufirst=Lou&amp;rft.aulast=Burnard&amp;rft.au=Lou%20Burnard&amp;rft.au=John%20Unsworth&amp;rft.au=Katherine%20O'Brien%20O'Keeffe&amp;rft.date=2006-09-30&amp;rft.pages=419&amp;rft.isbn=0873529715"></span></li>
<li>Susan M. Hockey, <span style="font-style:italic;">Electronic Texts in the Humanities</span> (Oxford University Press, November 2000). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A0198711948&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Electronic%20Texts%20in%20the%20Humanities%3A%20Principles%20and%20Practice&amp;rft.publisher=Oxford%20University%20Press&amp;rft.aufirst=Susan%20M.&amp;rft.aulast=Hockey&amp;rft.au=Susan%20M.%20Hockey&amp;rft.date=2000-11-23&amp;rft.pages=228&amp;rft.isbn=0198711948"></span></li>
<li>Ray Siemens, John Unsworth, and Susan Schreibman, <span style="font-style:italic;">Companion to Digital Humanities (Blackwell Companions to Literature and Culture)</span> (Blackwell Publishing Professional: Oxford, December 2004). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A1405103213&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Companion%20to%20Digital%20Humanities%20(Blackwell%20Companions%20to%20Literature%20and%20Culture)&amp;rft.place=Oxford&amp;rft.publisher=Blackwell%20Publishing%20Professional&amp;rft.edition=Hardcover&amp;rft.series=Blackwell%20Companions%20to%20Literature%20and%20Culture&amp;rft.aufirst=Ray&amp;rft.aulast=Siemens&amp;rft.au=Ray%20Siemens&amp;rft.au=John%20Unsworth&amp;rft.au=Susan%20Schreibman&amp;rft.date=2004-12-12&amp;rft.isbn=1405103213"></span></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.investigations.4-lom.com/2007/02/05/text-theories-meaning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Text Theories: Information</title>
		<link>http://www.investigations.4-lom.com/2007/02/02/text-theories-information/</link>
		<comments>http://www.investigations.4-lom.com/2007/02/02/text-theories-information/#comments</comments>
		<pubDate>Fri, 02 Feb 2007 17:07:38 +0000</pubDate>
		<dc:creator>Gavin Robinson</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[digital history]]></category>
		<category><![CDATA[information theory]]></category>
		<category><![CDATA[meaning]]></category>
		<category><![CDATA[tei]]></category>
		<category><![CDATA[theory]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://www.investigations.4-lom.com/2007/02/02/text-theories-information/</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Text+Theories%3A+Information&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-02-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/02/02/text-theories-information/&amp;rft.language=English"></span>
As the next stage of my Digital History Projects I&#8217;ve been doing background reading and thinking about the theory of text. This week I&#8217;ve read Schreibman, Siemens, and Unsworth, A Companion To Digital Humanities (2004); Burnard, O&#8217;Brien O&#8217;Keeffe, and Unsworth, Electronic Textual Editing (2006); Susan Hockey, Electronic Texts in the Humanities (2000); and C. E. [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Text+Theories%3A+Information&amp;rft.aulast=Robinson&amp;rft.aufirst=Gavin&amp;rft.subject=History&amp;rft.source=Investigations+of+a+Dog&amp;rft.date=2007-02-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.investigations.4-lom.com/2007/02/02/text-theories-information/&amp;rft.language=English"></span>
<p>As the next stage of my Digital History Projects I&#8217;ve been doing background reading and thinking about the theory of text. This week I&#8217;ve read Schreibman, Siemens, and Unsworth, <a href="http://www.digitalhumanities.org/companion/" title="A Companion To Digital Humanities">A Companion To Digital Humanities</a> (2004); Burnard, O&#8217;Brien O&#8217;Keeffe, and Unsworth, <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/index.xml" title="Electronic Textual Editing">Electronic Textual Editing</a> (2006); Susan Hockey, <em>Electronic Texts in the Humanities</em> (2000); and C. E. Shannon, &#8216;A Mathematical Theory of Communication&#8217; (1948). I can&#8217;t say that I understood everything (especially Shannon&#8217;s equations and Jerome McGann&#8217;s pretentious jargon) but it&#8217;s given me a lot to think about, and things are nowhere near as simple as I first assumed.</p>
<p><span id="more-52"></span></p>
<p>What is text? What is <em>a</em> text? It turns out that there are no easy answers to these questions. While I was right to think that digitization avoids some of the epistemological problems of history, allowing readers to make their own decisions about the relationship between text and reality, digital text presents plenty of new problems which could be equally intractable. A text is not necessarily the same thing as a book or an article or a play. Things get really complicated when there are differing versions of a text, as is often the case with medieval manuscripts. Should we classify them as the same text with some differences, or different texts with some similarities? The separation of information and meaning is an important concept which can allow us to think more clearly about what we&#8217;re doing, but in practice, the separation is not necessarily easy to make. This is how Shannon introduced the idea:</p>
<blockquote><p>The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages.</p></blockquote>
<p>By Shannon&#8217;s definition, the sequence of characters contained in a book can be considered to be information. We can select this message and attempt to reproduce it exactly without having to worry about meaning. At the very least, we can fall back on structuralism, since an alphabet is a fixed arbitrary system in which the characters are identified by the differences between them. There is no fixed relationship between a character and the sound it represents (for example, characters in the Latin alphabet can be pronounced differently in English, French and German). The same character might be represented in different ways. In modern print there are different typefaces which can be used to represent the same characters. In early-modern handwriting letter forms are often very different from modern forms, and the same character might have different forms in the same word (especially <em>s</em>). This fits in with Saussure&#8217;s distinction between <em>langue</em> and <em>parole</em>: the size, weight, font, and even form of a character might vary, but it can still be identified as the same character in relation to the system it comes from. [EDIT: the proper words for this are substantives and accidentals] This is not to say that <em>parole</em> is unimportant. Typography can have a significant effect on how text is perceived and understood, just as regional accents can signify group identities and influence how speech is understood. However, it is useful to be aware of these distinctions. Susan Hockey points out that computers force us to concentrate on what we are doing and why (p. 3). They also force us to analyse everything more systematically rather than assume that anything &#8220;just is&#8221;. I&#8217;ve already had to break text down into information and meaning, with meaning further broken down into equivalents of <em>langue</em> and <em>parole</em>, and I&#8217;m still a long way from having any idea of what text &#8220;is&#8221;.</p>
<p>In theory it should be easy to transmit a sequence of characters, even a long sequence such as a book. In practice, getting an accurate electronic transcript of printed text is one of the biggest problems for digital humanities projects. Whether using OCR or human double-keying, getting acceptable accuracy is difficult and expensive, and perfection seems unattainable. This is surprising when you consider that printed characters and ASCII codes seem to meet Shannon&#8217;s definition of a discrete channel: &#8220;Generally, a discrete channel will mean a system whereby a sequence of choices from a finite set of elementary symbols can be transmitted from one point to another.&#8221; So what&#8217;s the problem?</p>
<p>Shannon&#8217;s model gives us five parts of the communication system:</p>
<ol>
<li>Information source</li>
<li>Transmitter</li>
<li>Channel</li>
<li>Receiver</li>
<li>Destination</li>
</ol>
<p>The transmitter converts the message from the information source into a signal, and the receiver converts it back into a message which can be understood by the destination (usually a person). Shannon&#8217;s theory is mainly concerned with maintaining the integrity of a signal in the channel by calculating how much redundancy is required for a given level of noise. In terms of digitization projects, this is all about the electronic working of the computer and its peripherals. Thanks to the application of Shannon&#8217;s theory, we can usually be sure that when we press the &#8220;a&#8221; key on the keyboard, the &#8220;a&#8221; character will appear on the screen (the keyboard can be seen as the transmitter, and the screen as the receiver).</p>
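<p>The trade-off between redundancy and noise can be illustrated with a deliberately crude scheme. The sketch below is not Shannon&#8217;s own construction: it assumes a simple repetition code, in which each bit is sent an odd number of times and the receiver takes a majority vote, and it assumes that each copy is corrupted independently with the same probability. The function name and the 1% noise level are hypothetical choices for illustration.</p>

```python
from math import comb


def majority_error(flip_probability, copies):
    """Chance that a majority of independent copies of one bit are corrupted.

    Assumes `copies` is odd and that each copy is flipped independently
    with probability `flip_probability`.
    """
    needed = copies // 2 + 1  # smallest number of bad copies that wins the vote
    return sum(
        comb(copies, k) * flip_probability**k * (1 - flip_probability) ** (copies - k)
        for k in range(needed, copies + 1)
    )


# Hypothetical channel that corrupts 1 bit in 100.
for copies in (1, 3, 5):
    print(copies, majority_error(0.01, copies))
```

<p>Each extra pair of copies makes a wrong result much less likely, but at the cost of sending several times as much data. Practical error-correcting codes achieve the same protection far more efficiently.</p>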
<p>With double keying, the real problem is what happens between the source and the transmitter. Shannon wasn&#8217;t too worried about this, implicitly assuming that the person at the source selected the message they wanted to select. Even if they didn&#8217;t, he points out that the redundancy of the English language is about 50%, meaning that roughly half of what we write is determined by the structure of the language rather than freely chosen, so a message with a substantial proportion of wrong or missing characters will often still be intelligible to the recipient. Academic projects demand far more than mere intelligibility, and also need to preserve mistakes from the original text, which makes things more complicated.</p>
<p>We could perhaps see the keyer as another communication system, which introduces its own noise by misreading or mistyping the characters. According to Shannon, a noisy channel can carry a message with an arbitrarily small error rate, provided that it&#8217;s transmitted slowly enough (below the capacity of the channel) and with sufficient redundancy. This applies to typists as well as it applies to telegraph wires. Typing very slowly and carefully will reduce the number of mistakes you make. Having more people rekeying the same text will reduce the overall number of errors. In practice there will always be a probability, however small, that a mistake can be missed. It&#8217;s also likely that people will make similar mistakes in reading and typing, rather than introducing completely random errors (I don&#8217;t know if any cognitive psychologists have done any experiments on this, but it would be interesting to see if there&#8217;s any empirical evidence to back up this suspicion).</p>
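<p>The comparison step in double keying can be sketched in a few lines. This is a minimal illustration using <code>SequenceMatcher</code> from Python&#8217;s standard <code>difflib</code> module, not a description of how any particular keying service works. The function name and the two sample keyings are hypothetical.</p>

```python
from difflib import SequenceMatcher


def keying_disagreements(first_keying, second_keying):
    """Return (position, text_a, text_b) for each span where two keyings differ."""
    matcher = SequenceMatcher(None, first_keying, second_keying, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record where the first keying diverges and what each keyer typed.
            disagreements.append((i1, first_keying[i1:i2], second_keying[j1:j2]))
    return disagreements


# Two hypothetical keyings of the same printed line, each with one slip.
keyer_a = "The battalion marched to the front on the 6th of May."
keyer_b = "The battalion marched to the frcnt on the 5th of May."

for position, text_a, text_b in keying_disagreements(keyer_a, keyer_b):
    print(position, repr(text_a), repr(text_b))
```

<p>The comparison only flags places where the keyers disagree; it cannot say which of them is right, so a human still has to check the source. It also passes over any place where both keyers made the same mistake, which is why similar rather than random errors are the real danger.</p>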
<p>If time and money are unlimited it should be possible to make transcription errors negligible by employing large numbers of typists and making them type very slowly and carefully. However, we all know that major digital humanities projects don&#8217;t have unlimited time and money. Getting the right balance is important, as is having realistic expectations. Are the demands of digitization projects too high for the available techniques? Are time and budget considerations pushing text keying beyond its limits and making errors inevitable?</p>
<p>OCR is attractive because it offers the possibility of automating text capture, bypassing the expense and unreliability of humans. However, <a href="http://chnm.gmu.edu/digitalhistory/digitizing/4.php" title="Digital History">Cohen and Rosenzweig</a> cite studies which show that when the time and cost of proofreading and correcting OCR text are taken into account, double keying works out more cost-effective as well as more accurate. This is because computers are much worse at recognizing characters than humans are. You can scan a document at 300 dpi, and those dots will appear in the same sequence on the screen. Perfect transmission, or near enough. But when the computer tries to select a message from those dots as a &#8220;sequence of choices from a finite set of elementary symbols&#8221; things often go wrong. This is immensely frustrating, because to a human it seems like such a simple task. We can hope that advances in Artificial Intelligence will eventually lead to reliable OCR, but it&#8217;s not going to be an easy problem to solve. (The ultimate proof of the unreliability of OCR is that the online version of the <a href="http://www.digitalhumanities.org/companion/" title="A Companion To Digital Humanities">Companion To Digital Humanities</a> is full of scannos!)</p>
<p>As it is, OCR text needs to be proofread by at least one human. <a href="http://www.pgdp.net/c/" title="Distributed Proofreaders">Distributed Proofreaders</a> now use three rounds of proofing (followed by two rounds of formatting). Because of the &#8220;open source&#8221; nature of the project, which is run by unpaid volunteers, time and cost don&#8217;t need to be considered at all. A text is ready when it&#8217;s ready, and nobody has to pay for it. This makes triple proofing more feasible than in a funded project. However, it might also be the case that more proofing is required because the proofreaders themselves are an unknown quantity. As I haven&#8217;t qualified for round two yet, I don&#8217;t yet know how much time round two proofers spend correcting errors introduced by less experienced proofers in round one. Radical trust is feasible provided you get a critical mass of responsible users (which DP appears to have attained) and offers some interesting possibilities. Large numbers of unpaid volunteers doing small amounts of work very carefully might overcome some of the problems of big digitization projects, although it might also bring problems of its own.</p>
<p>Allowing users to supply corrections after publication could also help to increase the accuracy of transcriptions. This is even more radical than the DP model, and gives some traditional-minded people the fear. <a href="http://www.tei-c.org.uk/Activities/ETE/Preview/eggert.xml" title="Electronic Textual Editing">Berrie, Eggert, Tiffin, and Barwell</a> take a very traditional view of authentication which is based on the assumption that editors can make a text perfect and that it will deteriorate if not controlled (even more depressing is their emphasis on defending copyright). In the light of everything I&#8217;ve discussed so far, I suggest that the opposite might be true: it is impossible for any individual editor or team of editors to produce a perfect text, and the more people who are involved in correcting errors, the more accurate the transcription is likely to be. Wikipedia shows that with a critical mass of committed and responsible users, even deliberate vandalism can be overcome. This is not to say that every electronic text has to be, or can be, as open as Wikipedia. The most obvious problem is getting that critical mass of users, and this will be more difficult for more esoteric projects in which fewer people are likely to take an interest. At the very least, there should be some mechanism for users to suggest corrections, even if these have to be reviewed before being implemented. For example, the Old Bailey Proceedings has a form for submitting errors.</p>
<p>My current projects are small enough that I can take a lot of time and care over them, but I also want to develop techniques that will scale up, otherwise the experience will be of limited value. The relatively small amount of text to be dealt with means that in absolute terms there are likely to be few errors. It&#8217;s when you scale things up to millions of words that even a small probability of error per character can lead to a huge number of errors.</p>
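<p>The arithmetic of scaling is simple enough to sketch. The figures below are assumptions chosen for illustration, not measurements from any real project: an error rate of one character in 2,000, and three hypothetical text sizes.</p>

```python
def expected_errors(character_count, error_rate):
    """Expected number of wrong characters at a given per-character error rate."""
    return character_count * error_rate


# Assumed error rate: one wrong character in every 2,000 keyed.
ASSUMED_ERROR_RATE = 1 / 2000

# Hypothetical sizes: a short pamphlet, a single book, and a large corpus.
for label, characters in [
    ("pamphlet", 50_000),
    ("book", 500_000),
    ("corpus", 50_000_000),
]:
    print(label, round(expected_errors(characters, ASSUMED_ERROR_RATE)))
```

<p>The error rate is identical in all three cases, but the number of errors left for someone to find and fix grows in direct proportion to the size of the text.</p>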
<p>So far I&#8217;ve only considered characters as information, and haven&#8217;t got any closer to defining what text &#8220;is&#8221;. For the purposes of digitizing a book, I can avoid that question by setting out my aim as transcribing all of the characters in a particular book. Even though the definition of a book is at least slightly less problematic than the definition of a text, there&#8217;s more to a book than a sequence of characters. I&#8217;m choosing to represent one aspect of the book while discarding others, such as ink, paper, and binding. This is an arbitrary choice. Partly it&#8217;s because of the impossibility of representing the book as a complete physical object within a digital computer. Perfect information of that kind would need to go down to the level of atoms, and we would need some mechanism for reconstructing objects from the information contained in the computer. This is getting into the realms of alchemy, and clearly isn&#8217;t possible with any current technology.</p>
<p>But above all, any digitization project needs to look to the requirements of its intended users. From this point of view, it&#8217;s the information contained in a book, the sequence of characters making up the message, which potentially has the most value for readers. &#8220;Frequently the messages have meaning&#8221;. In the next part I&#8217;ll be going beyond information and looking at the even greater problems associated with meaning.</p>
<h3>Bibliography</h3>
<ol>
<li>Lou Burnard, John Unsworth, and Katherine O&#8217;Brien O&#8217;Keeffe, <span style="font-style:italic;">Electronic Textual Editing with CDROM</span> (Modern Language Association of America, September 2006). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A0873529715&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Electronic%20Textual%20Editing%20with%20CDROM&amp;rft.publisher=Modern%20Language%20Association%20of%20America&amp;rft.edition=Pap%2FCdr&amp;rft.aufirst=Lou&amp;rft.aulast=Burnard&amp;rft.au=Lou%20Burnard&amp;rft.au=John%20Unsworth&amp;rft.au=Katherine%20O'Brien%20O'Keeffe&amp;rft.date=2006-09-30&amp;rft.pages=419&amp;rft.isbn=0873529715"></span></li>
<li>Susan M. Hockey, <span style="font-style:italic;">Electronic Texts in the Humanities</span> (Oxford University Press, November 2000). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A0198711948&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Electronic%20Texts%20in%20the%20Humanities%3A%20Principles%20and%20Practice&amp;rft.publisher=Oxford%20University%20Press&amp;rft.aufirst=Susan%20M.&amp;rft.aulast=Hockey&amp;rft.au=Susan%20M.%20Hockey&amp;rft.date=2000-11-23&amp;rft.pages=228&amp;rft.isbn=0198711948"></span></li>
<li>C E Shannon, &#8216;A mathematical theory of communication&#8217;, <span style="font-style:italic;">Bell System Technical Journal</span>, 27 (1948), pp. 379-423, 623-656. <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=A%20mathematical%20theory%20of%20communication&amp;rft.jtitle=Bell%20System%20Technical%20Journal&amp;rft.volume=27&amp;rft.aufirst=C%20E&amp;rft.aulast=Shannon&amp;rft.au=C%20E%20Shannon&amp;rft.date=1948&amp;rft.pages=379-423%2C%20623-656"></span></li>
<li>Ray Siemens, John Unsworth, and Susan Schreibman, <span style="font-style:italic;">Companion to Digital Humanities (Blackwell Companions to Literature and Culture)</span> (Blackwell Publishing Professional: Oxford, December 2004). <span class="Z3988" title="url_ver=Z39.88-2004&amp;ctx_ver=Z39.88-2004&amp;rft_id=urn%3Aisbn%3A1405103213&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=book&amp;rft.btitle=Companion%20to%20Digital%20Humanities%20(Blackwell%20Companions%20to%20Literature%20and%20Culture)&amp;rft.place=Oxford&amp;rft.publisher=Blackwell%20Publishing%20Professional&amp;rft.edition=Hardcover&amp;rft.series=Blackwell%20Companions%20to%20Literature%20and%20Culture&amp;rft.aufirst=Ray&amp;rft.aulast=Siemens&amp;rft.au=Ray%20Siemens&amp;rft.au=John%20Unsworth&amp;rft.au=Susan%20Schreibman&amp;rft.date=2004-12-12&amp;rft.isbn=1405103213"></span></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.investigations.4-lom.com/2007/02/02/text-theories-information/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

