
Every writing technology has advantages and drawbacks. The papyrus codex, for example, was (for its time) user-friendly and easy to search, because its architecture was rather similar to that of the world wide web — composed of a large number of short pages, rather than a continuous scroll.

It did, on the other hand, have drawbacks for copying, which was very labour-intensive, and also for permanent archiving, due to its fragility, which may be seen from this illustration of Charles Hedrick working on the Nag Hammadi codices.

[Image: Charles Hedrick working on the Nag Hammadi codices]

Similar factors still apply in the modern world. Information flow is optimised when the message doesn't depend entirely on the medium, but can be translated across a variety of vehicles. For this reason, we are composing the Cambridge Greek Lexicon using XML technology (the letters stand for 'extensible markup language'). This means that our pages are not just formatted for appearance, as with word-processing software, but are also typeset for publication, and configured for online display and searching in electronic editions.

Here is a comparison of the two systems. First, a page of the lexicon composed using a word-processing program looks something like this:

[Image: a page of the lexicon composed in a word-processing program (p472word)]

This electronic typing standardises and preserves formatting quite well. However, extensive proof-reading has to be undertaken, and, in order to be translated into other media, the structure underlying the formatting also needs to be recorded. For example, the plain-text passages sometimes express the definition of the headword, but when bracketed, they may express an introductory or following explanatory remark, or encyclopaedic information, and these need to be identified if the lexicon is to be searched.

Such information can be preserved using XML 'tagging' (XML is a close relative of the HTML which is used to format web pages). The basis of the system is an extended use of tags (labels inside angle brackets), which in HTML are used to mark format: instead of producing a bold section of text by changing the font style, the passage is simply enclosed within "<b>" tags.

In XML, the tags can define structure as well as format, and we can configure our own tags, so we can mark the headword or lemma by enclosing it in a specific tag. We can stipulate that this tag always marks the headword (a structural function), and that the text inside it is always in a bold Greek font (a formatting requirement). Similarly, we can tag the inflection, dialect forms, principal parts, definitions, and contextual information.
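The pairing of a structural tag with a formatting rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HL (headword lemma), PS (part of speech) and VE (verb entry) are tag names the lexicon uses, but the style table, the render function and the sample entry are hypothetical and are not the project's actual stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical style table: each structural tag maps to a print format.
# The tag names come from the lexicon; the style values are invented.
STYLES = {
    "HL": ("<b>", "</b>"),   # headword is always bold
    "PS": ("<i>", "</i>"),   # part-of-speech label in italics
}

def render(element):
    """Render an element and its children, applying the style for each tag."""
    open_mark, close_mark = STYLES.get(element.tag, ("", ""))
    parts = [open_mark, element.text or ""]
    for child in element:
        parts.append(render(child))
        parts.append(child.tail or "")   # text between this child and the next
    parts.append(close_mark)
    return "".join(parts)

# Invented sample entry, for illustration only.
entry = ET.fromstring("<VE><HL>bainw</HL> <PS>vb.</PS></VE>")
print(render(entry))
```

The writer only ever marks the headword as a headword; the decision that headwords are bold lives in one place and can be changed for every entry at once.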

We have found that 100 different tags suffice to cover every type of entry in the lexicon. Here is the start of the page shown above, now marked up in XML: 

[Image: the start of the same page marked up in XML (p472xml)]

At first glance, this may look rather forbidding, but it soon becomes as natural as setting styles in a 'Word' document. We select the tags as we write. For example, the first entry, for libazomai, is enclosed in tags marked 'VE', because it is a 'verb entry'. Within that 'wrapper', there is a 'verbal head group' (vHG), which contains the lemma (HL), the part of speech label (PS), and the etymology (Ety). These elements may contain others within them, in a hierarchical structure. And some of the elements are primarily there to facilitate searching: for example, inside the etymology tag, the related word libas is enclosed in 'Ref' tags, which indicate that it refers to another headword in the lexicon.

The definitions and translations appear inside 'S1' tags, and there are also 'S2' tags (not shown here) for subsections illustrating nuances of meaning. Within these 'S' elements are many others marking authors and contextual information, such as the subjects and objects which a verb takes, examples of nouns qualified by an adjective, or verbs modified by an adverb.
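To make the nesting concrete, here is a minimal Python sketch that reads an entry built from the tags just described (VE, vHG, HL, PS, Ety, Ref, S1). The sample markup is an assumption reconstructed from the description in the text; the real entry carries many more elements, and the part-of-speech label and definition text here are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified shape of a verb entry; not the lexicon's actual markup.
SAMPLE = """
<VE>
  <vHG>
    <HL>libazomai</HL>
    <PS>vb.</PS>
    <Ety><Ref>libas</Ref></Ety>
  </vHG>
  <S1>(placeholder definition text)</S1>
</VE>
""".strip()

entry = ET.fromstring(SAMPLE)

# The lemma sits inside the verbal head group.
lemma = entry.findtext("vHG/HL")

# Every Ref element points at another headword, wherever it is nested.
cross_refs = [ref.text for ref in entry.iter("Ref")]

# Top-level sense sections.
senses = [s.text for s in entry.findall("S1")]

print(lemma, cross_refs, senses)
```

Because each piece of information has its own element, a program can pick out the lemma or the cross-references without guessing from fonts or brackets.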

This level of precision means that we can immediately translate the page into print-quality format, producing a PDF page.

This gives us an accurate picture of the finished product, enabling us to identify typing errors and unwanted variations of style and content while we are writing. There are other advantages too: because we have organised the tags within a specific structure, we are encouraged to be consistent in the way we write each entry, and so we can maintain a 'house style'.

The final step is to combine these individual pages into a single paginated PDF document to produce the final typeset copy.

We may sum up the advantages of XML authoring under five headings:

1: An integrated, flexible writing and publishing environment

We can cope with any technical problems which might arise as we proceed, and produce precise formatting for the typesetters, so the task of proof-reading will be greatly helped.

2: A consistent writing style

Inconsistencies are almost unavoidable in typed copy, and especially when articles are written by more than one person. For example, when citing Euripides Antiope, LSJ refers to "Antiop.iv B, [line number] A" and also "Antiop.iv B line ... Arn." and sometimes "Antiop.iv B line ... Arnim", or else "Antiop.p.21 A" or "Antiop.B 58 p.21 A". All these citations refer to the same fragment (fr.10 in Page's Select Papyri). Consistency could have been maintained if the authors of LSJ had been able to compare all their citations easily.

XML allows us to apply maximal constraints to entries, and so enables all the members of the editorial team to maintain consistent style and format.
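A constraint of this kind can be illustrated with a small Python check. It is a hand-written stand-in, assuming a rule that a verbal head group must contain HL, PS and Ety in that order; the lexicon's real constraints are not given in this article and would normally be expressed in a schema rather than in code like this.

```python
import xml.etree.ElementTree as ET

# Assumed rule, for illustration: children of vHG in this exact order.
REQUIRED_ORDER = ["HL", "PS", "Ety"]

def check_head_group(entry):
    """Return a list of problems found in the entry's vHG element."""
    head = entry.find("vHG")
    if head is None:
        return ["missing vHG"]
    found = [child.tag for child in head]
    if found != REQUIRED_ORDER:
        return [f"expected {REQUIRED_ORDER}, found {found}"]
    return []

# Invented entries: one follows the assumed rule, one does not.
good = ET.fromstring("<VE><vHG><HL>a</HL><PS>vb.</PS><Ety>b</Ety></vHG></VE>")
bad = ET.fromstring("<VE><vHG><PS>vb.</PS><HL>a</HL></vHG></VE>")

print(check_head_group(good))
print(check_head_group(bad))
```

An entry that departs from the agreed structure is reported as soon as it is written, whoever on the team wrote it.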

3: A structure which reflects our methodology

Our aim has been to create structures which impose constraints on the writing, yet remain flexible enough to contain the range of information which we may wish to enter. We achieve this most importantly through the innovation of using dedicated structures for each part of speech. This enables us to maintain a balance between extended definitions, translation glosses, and contextual and encyclopaedic information, so that we are helped to write the last entries in the same style as we wrote the first.

4: A product which is translatable across publishing media

There will be an electronic edition on the Perseus site. The system means that it can be easily and accurately searched. Dictionaries which are tagged after they were written necessarily contain fewer tagged elements than ours (as they were not composed with such a precise structure), so fewer types of search are possible. A reader of our lexicon can, for example, see how vocabulary changes across the range of Ancient and Koiné Greek, because we mark usage in a corpus of 70 authors, from Homer to Plutarch, and so we can compare word frequency in different writers. And our system will also be linked to other Perseus databases, to images as well as to texts.
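The kind of search that author tagging makes possible might look like the following Python sketch. The element names Lexicon and Au, the author abbreviations and the sample headwords are all hypothetical; the article does not say what the author tag is called or how the electronic edition is queried.

```python
import xml.etree.ElementTree as ET

# Invented miniature lexicon; "Au" is an assumed name for the author tag.
LEXICON = """
<Lexicon>
  <VE><vHG><HL>word1</HL></vHG><S1><Au>Hom.</Au> <Au>Plu.</Au></S1></VE>
  <VE><vHG><HL>word2</HL></vHG><S1><Au>Hom.</Au></S1></VE>
</Lexicon>
""".strip()

def headwords_by_author(root):
    """Map each author to the headwords whose entries cite that author."""
    index = {}
    for entry in root.findall("VE"):
        lemma = entry.findtext("vHG/HL")
        # A set ensures each author is counted once per entry.
        for author in {au.text for au in entry.iter("Au")}:
            index.setdefault(author, []).append(lemma)
    return index

index = headwords_by_author(ET.fromstring(LEXICON))
for author, words in sorted(index.items()):
    print(author, len(words), words)
```

Counting the headwords listed under each author gives a rough picture of how vocabulary is distributed across writers.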

5: Better-organised material

XML releases us from the constraints on space of the printed book. Most usefully, our 'annotation' element allows us to incorporate editorial notes in each entry, for reference during the writing, and as a permanent archive of our research.

And cross-reference elements enable us to perform electronic searches during the writing and proof-reading stages. That has a number of advantages:

(a) We shall be able to group related words together, so we can easily compare all words sharing the same stem, and write the entry for a simple form before dealing with its derivatives. It is useful to compare the entries for all the compounds from the verb bainw (go), which can take the preverbs ana-, anti-, apo-, dia-, eis-, ek-, epi-, kata-, exana-, meta-, para-, peri-, poti-, pro-, pros-, sum-, huper-, and hupo-, rather than only treating them in alphabetical order.

(b) We can investigate the range of meanings of the prefixes themselves, across the different primary forms (as in the derivatives of bainw listed above), and compare this with their uses as independent prepositions and adverbs.

(c) We can incorporate cultural information. For example, colour terms constitute a group which is currently the subject of considerable semantic interest, and XML tagging enables us to study them not only in their primary forms, but also in compounds, where they may appear as stems, as in akro-kelainiown, with black surface; dia-melainw, become quite dark; hupo-glaukos, somewhat grey; and huperuthros, rather red. They also appear as prefixes combined with a noun like aspis ('shield'): we find leuk-aspis ('white-shielded'), phoinik-aspis ('red-shielded'), chalk-aspis ('bronze-shielded'), and chrus-aspis ('gold-shielded').
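Grouping compounds with their simple form, as described in (a) and (b), can be sketched in Python. The preverb list is the one given above for bainw; the headword list is invented, and the matching is plain string comparison on transliterated forms, which is an assumption made for brevity. It ignores elision and assimilation, and it is not the cross-reference mechanism the lexicon itself uses.

```python
# Preverbs listed in the text for compounds of bainw.
PREVERBS = [
    "ana", "anti", "apo", "dia", "eis", "ek", "epi", "kata", "exana",
    "meta", "para", "peri", "poti", "pro", "pros", "sum", "huper", "hupo",
]

def compounds_of(simplex, headwords):
    """Return sorted (preverb, headword) pairs formed as preverb + simplex."""
    pairs = []
    for word in headwords:
        if word.endswith(simplex):
            prefix = word[: len(word) - len(simplex)]
            if prefix in PREVERBS:   # the bare simplex has an empty prefix
                pairs.append((prefix, word))
    return sorted(pairs)

# Invented sample of transliterated headwords.
headwords = ["anabainw", "bainw", "katabainw", "libazomai",
             "probainw", "prosbainw"]

for preverb, word in compounds_of("bainw", headwords):
    print(preverb, word)
```

The same pairs, read the other way round, give every verb a particular preverb attaches to, which is the comparison described in (b).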

Attention to these details of word formation enables the writers to compose more precise definitions, which in turn can help the student gain a deeper understanding of Greek word meaning. Electronic searching during the writing process will help us produce a more consistent, coherent, and consequently more useful lexicon.

The XML environment is a little more challenging for the writers, because we have to become accustomed to manipulating the tags. However, it also saves effort, as the text formatting is largely automated: we don't need to select bold or italic fonts, or to insert brackets, or even section numbers, because all that is done automatically.
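Automatic section numbering is easy to picture with a short Python sketch. It assumes that the top-level senses are S1 elements directly inside the entry and that they are numbered 1, 2, 3 in document order; the numbering style the lexicon actually prints is not stated in this article.

```python
import xml.etree.ElementTree as ET

def numbered_senses(entry):
    """Return the entry's S1 texts with section numbers added automatically."""
    return [
        f"{number}. {sense.text}"
        for number, sense in enumerate(entry.findall("S1"), start=1)
    ]

# Invented entry with two senses; the writer types no numbers at all.
entry = ET.fromstring("<VE><S1>first sense</S1><S1>second sense</S1></VE>")

for line in numbered_senses(entry):
    print(line)
```

If a sense is inserted or removed, the numbers of the remaining sections adjust the next time the page is produced.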

And the advantages are that mistakes and inconsistencies can be avoided, the writing and publication processes are integrated, and the usefulness of the lexicon can be maximised and extended in the future, as new ways of integrating verbal and visual information are discovered. We believe that this will help students to explore the richness of the Ancient Greek vocabulary in the most effective way possible.

 
