CIRCA:Community Portal

Renear, Allen H. “Text Encoding.” A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth. Oxford: Blackwell, 2004. Reviewed by Joseph Dung

In his article “Text Encoding” (2004), Renear provides valuable insight into the historical and theoretical context needed to understand “both contemporary text encoding practices and the various ongoing debates that surround those practices.” He expounds on the theoretical frameworks that guided the development of markup techniques and systems. He points out that traditional humanities computing was concerned mostly with literary and linguistic analysis, whereas text encoding extends more widely to include new cultural products such as “new media”. After presenting a brief history of markup languages, he delineates the advantages of the dominant markup model to have emerged: descriptive markup. Even as Renear discusses the descriptive model, however, he shows how naturally it fits the view of text as an “Ordered Hierarchy of Content Objects” (OHCO), an intuitive view of text structure that he challenges, with a range of axiomatic arguments, in his other article, “Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies”.

Markup languages grew up as a modern system for annotating text for further processing by machines, and they were initially procedural: formatting commands and instructions were embedded in the text itself. Descriptive markup emerged alongside SGML and had the advantage of labelling the parts of a document in a way authors could understand. Underlying this concept is the idea that markup should focus on the structural aspects of a document and leave the visual presentation of that structure to the interpreter. The way descriptive models work (identifying words, lines, passages, paragraphs, headings) naturally fits the way humans understand text.
Descriptive markup labels also function like assembler mnemonics (or macros): an abbreviation can stand for, and be expanded into, a longer string, which makes it possible to define global variables that change particular aspects of a document without affecting the rest of it. He compares and contrasts the descriptive and procedural models, giving greater credence and weight to the former.
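
The separation of structure from presentation described above can be illustrated with a minimal Python sketch. It is not drawn from Renear's article: the element names (`poem`, `title`, `line`), the two style tables, and the `render` function are all hypothetical, chosen only to show one descriptively encoded text being presented in two ways without the text itself changing.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: each element says what a part of the text *is*,
# not how it should look. The element names are illustrative assumptions.
DOCUMENT = """<poem>
  <title>Example Poem</title>
  <line>First line of verse</line>
  <line>Second line of verse</line>
</poem>"""

# Two hypothetical "stylesheets". Each maps a descriptive label to a
# presentation rule, much as a macro name expands to formatting commands.
PLAIN_STYLE = {
    "title": lambda text: text.upper(),
    "line": lambda text: "    " + text,
}
HTML_STYLE = {
    "title": lambda text: f"<h1>{text}</h1>",
    "line": lambda text: f"<p>{text}</p>",
}


def render(xml_source, style):
    """Apply a style table to every child element of the document root."""
    root = ET.fromstring(xml_source)
    return [style[child.tag](child.text) for child in root]


if __name__ == "__main__":
    print("\n".join(render(DOCUMENT, PLAIN_STYLE)))
    print("\n".join(render(DOCUMENT, HTML_STYLE)))
```

Changing a single entry in a style table alters every element carrying that label, which is the "global" effect the macro comparison points to. A procedurally marked-up text would instead need each embedded formatting command edited by hand.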

While giving background on the creation of SGML, XML and the TEI, he also takes time to clarify the field's confusing terminology and the ambiguities around certain names: SGML is not a markup language in the traditional sense but a meta-language, one that provides the basic elements authors need to build their own markup languages. SGML provides a means by which specific “grammars” can be built for any range of documents. Renear posits that SGML is useful beyond the formatting of documents: it can also serve for data interchange.
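
The idea of a meta-language that lets authors define their own “grammars” can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not SGML itself: the `MEMO_GRAMMAR` table and the `violations` and `is_valid` functions are hypothetical, and a real SGML or XML Document Type Definition also constrains the order and number of child elements, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# A toy "grammar" for a hypothetical memo document type: each element
# name maps to the set of child elements it is allowed to contain.
MEMO_GRAMMAR = {
    "memo": {"to", "from", "body"},
    "to": set(),
    "from": set(),
    "body": {"para"},
    "para": set(),
}


def violations(element, grammar):
    """Return messages for every element the grammar does not allow."""
    if element.tag not in grammar:
        return [f"unknown element <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag not in grammar[element.tag]:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            problems.extend(violations(child, grammar))
    return problems


def is_valid(xml_source, grammar, root_name):
    """Check that a document has the expected root and obeys the grammar."""
    root = ET.fromstring(xml_source)
    return root.tag == root_name and not violations(root, grammar)


if __name__ == "__main__":
    good = "<memo><to>A</to><from>B</from><body><para>Hi</para></body></memo>"
    bad = "<memo><to>A</to><body><title>Oops</title></body></memo>"
    print(is_valid(good, MEMO_GRAMMAR, "memo"))
    print(violations(ET.fromstring(bad), MEMO_GRAMMAR))
```

The same checking code works for any document type: swapping in a different grammar table defines a different markup language, which is the sense in which the rules sit one level above any particular vocabulary.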

At this point Renear acknowledges how SGML’s adoption gradually suffered, first in the publishing world, when WYSIWYG word processors appeared and offered an even richer visual metaphor for text processing than macros did, and then with the creation of HTML. HTML, even though it initially lacked a Document Type Definition and had, in Renear’s own words, “an impoverished element set”, was a simpler and more forgiving markup language. Curiously, HTML also combined descriptive and procedural models in its syntax, a mix of different text encoding approaches. It is quite possible that the visual metaphors of WYSIWYG text processing and the non-linear nature of hypertext in HTML strongly influenced Renear’s assertion that the descriptive OHCO model of text structure was deficient, and that a newer, more encompassing theory of text structure was needed both to understand text encoding and to create new techniques for it.
