In extending the work of the Digital Tolkien Project to The History of Middle-earth, we encounter additional challenges because we are dealing neither with a cohesive published text nor even (at least outside volumes 6–9) with drafts of a single canonical text. One might think the solution is merely to model multiple versions of texts, but this is not straightforward either. Firstly, the manuscripts and typescripts are often heavily emended, in many cases probably in more than one pass, and so the combinatorics of different readings often make the notion of a clean version of a text impossible. Secondly, and perhaps even more relevant, we do not have access to the actual manuscripts and typescripts. Rather, the texts we have are mediated through Christopher Tolkien. On the one hand this is a blessing: decades of painstaking scholarship have already been done for us. On the other hand, we must reconcile ourselves to the fact that, for now at least, the principal objects of our study are artifacts produced by Christopher Tolkien.
And so we are developing a data model that enables us to describe and annotate the unpublished texts of Tolkien’s legendarium in a way that gives primacy to Christopher’s presentation of them. It will attempt to provide a logical structure for the texts beyond the physical constraints of the volumes of The History of Middle-earth but in a way that builds on the remarkable scholarship already done in those books. Core to this is the notion of a P-text.
Text Types
P-text is the text given within The History of Middle-earth in a larger typeface. One can think of P as standing for “primary” (or “privileged” or “prima facie” or even “Professor”). It is Christopher’s diplomatic transcription of a manuscript or typescript, incorporating his decisions about which emendations to include and which spellings or punctuation to correct or normalize. By taking P-text as a foundation, we avoid second-guessing Christopher’s interpretation. And by designating sections of The History of Middle-earth as P-text, we can move forward with citation systems, search, text analysis, etc. within those parts while deferring until later the complexities of other parts of the books.
That said, we have a tentative taxonomy of other text types as outlined below:
| Whose words | Grouping | Text type | Code |
| Not Tolkien’s words | | Notes and Commentary by Christopher | N |
| Not Tolkien’s words | | Other person’s words | O |
| Tolkien’s words | Legendarium | Primary texts in larger font | P |
| Tolkien’s words | Legendarium, small font | Quotes of P-Text by Christopher | Q |
| Tolkien’s words | Legendarium, small font | References to other texts, e.g. Published Silmarillion | R |
| Tolkien’s words | Legendarium, new / variant material | Secondary texts | S |
| Tolkien’s words | Legendarium, new / variant material | Text variants and fragments | T |
| Tolkien’s words | | Unconnected writings, e.g. Letters | U |
The words of Christopher (other than external quotations) are designated as N-text. The words of all other people are designated as O-text.
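As a rough illustration only, the single-letter codes could be held in a small enumeration. This is a sketch rather than the project's actual implementation: the names `TextType` and `tolkiens_words` are hypothetical, and the one derived property simply encodes the top-level split in the taxonomy (N and O are not Tolkien's words; P through U are).

```python
from enum import Enum


class TextType(Enum):
    """Hypothetical enumeration of the tentative text-type codes."""

    N = "Notes and commentary by Christopher"
    O = "Other person's words"
    P = "Primary texts in larger font"
    Q = "Quotes of P-text by Christopher"
    R = "References to other texts"
    S = "Secondary texts"
    T = "Text variants and fragments"
    U = "Unconnected writings"

    @property
    def tolkiens_words(self) -> bool:
        # N-text and O-text are the only types that are not Tolkien's words.
        return self not in (TextType.N, TextType.O)


# Codes of the text types that carry Tolkien's own words.
tolkien_codes = [t.name for t in TextType if t.tolkiens_words]
```

Looking a type up by its letter, as in `TextType["Q"]`, keeps annotations compact while the longer description stays available through `.value`.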
The primary distinction between Q-text and R-text is whether Christopher is quoting P-text (in the same or another HoMe volume) or not. Note that both are normally in quotation marks or presented as an indented block quote. If, in a note or commentary, Christopher mentions a name from the P-text without using quotation marks, that is just considered part of the N-text. If he mentions a word, such as a name, from a non-P-text that is not otherwise used in any other texts, it might be designated as a T-text so we have some indication that the word does appear in Tolkien’s writings (which would not otherwise be clear if it were just considered N-text).
The distinction between S-text and T-text is probably the most nuanced and there are a number of tests we have so far found useful in determining which text type to use. If the text is separate and standalone without clear connexion to a P-text, it is S-text. This includes things like poems or notes on separate sheets of paper. If the text is a substitution for part of a P-text (whether written before or after) and/or is clearly anchored to part of a P-text, it is T-text. This includes things like emendations and earlier drafts presented in direct relation to a P-text rather than as a P-text in its own right. A T-text in many cases may be a single word or phrase.
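The tests described above can be sketched as a small helper. This is an assumption-laden illustration, not a rule the project has published: the `Passage` fields and the function `suggest_s_or_t` are hypothetical names, the three flags stand for judgements a human annotator would make, and conflicting or absent signals deliberately return `None` so that the case is left for discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Passage:
    """Hypothetical record of an annotator's judgements about a passage."""

    standalone: bool          # separate, with no clear connexion to a P-text
    substitutes_p_text: bool  # replaces part of a P-text (earlier or later)
    anchored_to_p_text: bool  # clearly anchored to part of a P-text


def suggest_s_or_t(passage: Passage) -> Optional[str]:
    """Suggest "S" or "T", or None when the tests do not settle it."""
    t_signal = passage.substitutes_p_text or passage.anchored_to_p_text
    if passage.standalone and not t_signal:
        return "S"  # e.g. a poem or note on a separate sheet
    if t_signal and not passage.standalone:
        return "T"  # e.g. an emendation or an earlier draft of a phrase
    return None  # edge case: needs editorial judgement


poem_on_separate_sheet = Passage(True, False, False)
emended_phrase = Passage(False, True, True)
```

Returning `None` rather than forcing a choice reflects that the system is still being developed and that edge cases are expected.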
See Text Type Examples but note that we are still developing this system and welcome feedback as well as edge-cases that might need further discussion.
Structure Levels
All of this is a way of modelling the contents of the History of Middle-earth volumes. It is what could be referred to as the surface structure.
Separate from this is the modelling of the underlying documents: the manuscripts and typescripts. Where on the page is a particular bit of text found? Are the manuscripts written in pencil or ink? Are they notebooks or loose sheets? Is the writing hasty, is it a fair copy, or is it calligraphic? What typewriter was used for the typescripts? What ribbon? What type of paper was used? When was the text written? How was it emended? As discussed above, we do not have direct access to the documents but we can still capture Christopher’s descriptions. All this could be referred to as the documentary structure.
Then we have the relationship between texts (e.g. that is a later version of this). We are using the term work to refer to the thing that distinct texts are different versions of. For example, we consider the Music of the Ainur / Ainulindalë to be a single work with various versions in various documents, expressed in the History of Middle-earth as various texts of type P, S, and T. This could be referred to as the work structure.
Finally, there is the notion of a story above the level of individual works. The ‘Túrinsaga’, for example, is told over multiple works (tales, poems, annals, etc.). This could be referred to as the story structure.
And so we have:
| Surface Structure |
| Documentary Structure |
| Work Structure |
| Story Structure |
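One way to picture how the four levels relate is as nested records. The following is a minimal sketch under stated assumptions, not the project's schema: the class names, field names, and the placeholder labels and description in the example are all hypothetical, and only the work title comes from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Document:
    """Documentary structure: a manuscript or typescript as described by Christopher."""

    description: str  # e.g. medium, hand, paper (free text in this sketch)


@dataclass
class Text:
    """Surface structure: a stretch of text as printed in a HoMe volume."""

    label: str                           # hypothetical identifier
    text_type: str                       # one of the codes N, O, P, Q, R, S, T, U
    document: Optional[Document] = None  # underlying document, if described


@dataclass
class Work:
    """Work structure: the thing that distinct texts are versions of."""

    title: str
    texts: List[Text] = field(default_factory=list)


@dataclass
class Story:
    """Story structure: a narrative told across several works."""

    name: str
    works: List[Work] = field(default_factory=list)


def text_types_in(work: Work) -> List[str]:
    """Return the distinct text-type codes through which a work is expressed."""
    return sorted({text.text_type for text in work.texts})


# Placeholder data: the labels and description are invented for illustration.
ainulindale = Work(
    "Music of the Ainur / Ainulindalë",
    [
        Text("placeholder-1", "P", Document("placeholder description")),
        Text("placeholder-2", "T"),
        Text("placeholder-3", "P"),
    ],
)
```

Keeping `Document` optional on `Text` reflects that the documents themselves are known only through Christopher's descriptions, which may be absent for a given text.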