Subject and object: national biography online

Philip Carter, Oxford Dictionary of National Biography

This paper briefly considers some of the challenges, but also the resulting opportunities and potential, of online resources for the writing and imminent reading of the country's largest collaborative research project in the humanities: the Oxford Dictionary of National Biography (hereafter Oxford DNB). A twelve-year research project based at Oxford University, the Oxford DNB will be published in full by its principal funder, Oxford University Press, in print and online in September 2004. The project's purpose since 1992 has been to rewrite the existing Dictionary of National Biography - an A-Z of British-born historical figures, or figures influential in British history (all deceased) - first published between 1885 and 1901 and subsequently extended in the twentieth century in decennial and quinquennial supplements to 1990.(1)

The process of rewriting since 1992 has encompassed three key aims. First, the revision of existing articles or the provision of new entries for all subjects included in volumes published between 1885 and 1990 (some 36,000 lives, of which approximately two-thirds are fully rewritten memoirs).

Second, the addition of approximately 14,000 new subjects who have come to light through recent scholarship; who are considered worthy of inclusion as befits modern research interests; or who have died between the publication of the final 'old' DNB volume in 1990 and 31 December 2000, the cut-off date for inclusion in the 2004 edition. The full complement for the Oxford DNB is therefore 50,000 main or full subject biographies, detailing the lives of 55,200 individuals (5,200 subjects being co- or subsidiary subjects not meriting an entry in their own right). All 55,200 subjects carry tagged records of personal information, detailing name and standard factual components as well as, where available, records relating to a subject's education, religious affiliation and residences, which will be searchable electronically. In addition to a biographical text, subject entries will include listings of primary and secondary material used for compiling the article (501,500 citations in total) and, when applicable, records of paper, sound and film archives and of likenesses relating to the subject; where known, information is also provided on an individual's wealth at death as calculated by probate or other sources. One in five articles will also carry a portrait illustration of the subject.

Finally, rewriting means the publication of the complete dictionary in print (approximately sixty-two million words across 60,000 pages in sixty volumes) as well as online to individual or institutional subscribers. It is this latter format on which the following comments focus, with particular attention paid to the challenge of putting the dictionary online and to a realistic assessment of the resulting potential when such challenges are dealt with satisfactorily.

I

In the context of a conference on digitization, the Oxford DNB online can be considered both as subject and object of the themes under consideration. By subject I mean that, as a ten-year research project, comprising 500 academic advisers and editors and 10,000 contributors worldwide, the dictionary has itself made extensive use of digitized resources (CD-ROM and Internet) during the planning, research and writing phases. In this role editors and contributors have been presented with challenges which will be common to other individual or collaborative scholars, relating principally to issues of integrity, accuracy and known scope of digitized resources or guides. However, several points specific to the project are also worthy of note. The decade between the new dictionary's origin in 1992 and the completion of the research phase in 2002 has seen a dramatic growth in the availability of, and scholars' willingness and capacity to use, dedicated digitized resources, as well as in communication and information facilities provided by the Internet. As a result, the Oxford DNB has grown up with an enormous expansion in resource opportunities which, as for academia generally, have become integral to its research process. On a practical level, it is inconceivable that an international project of this scale could have been completed on time without such communication modes (an observation which makes more considerable the publishing achievement of the late Victorian dictionary).

The accessibility of such resources has also influenced, to a degree, the subject content of the final edition - especially in the past five years - as editors, external advisers and academic authors gained better access to and familiarity with online resources. The facility to gather information on relatively obscure subjects, or to locate authors unknown to even the best-informed adviser, has undoubtedly made it easier to broaden the dictionary in terms of its coverage of less prominent, but often locally significant or culturally representative figures, as well as in the geographic and professional range of its contributors. Equally, it is highly probable (if unquantifiable) that the opportunities for electronic communication have influenced the content and tone of a number of entries before delivery. This said, if the low number of postings on academic 'message boards' is a reliable guide, it is certainly true that the potential for such exchange, and for academic collaboration in general, is far from being realized - at least by a generation of scholars more familiar with the educative function of the common rather than the chat room.

The second approach to the online Oxford DNB is as an object of this conference, that is, as a forthcoming dataset on the British past (principally of newly written secondary source material) open to the same scrutiny as other resources discussed in other papers. As a national and international project (as well as a commercial concern for OUP), it is a basic requirement that the Oxford DNB, and especially its online version, be accessible to the widest possible readership comprising academics, private researchers, pre-university students, school teachers and children, archivists, curators and journalists. Yet breadth brings challenges if the supposed accessibility - a central rationale for online providers - is to be realized. A broad range of online users guarantees not just different requirements of the data but also differing levels or forms of information with which readers begin searches for a particular subject. In accordance with the project's principle of reader inclusivity, the online resource is therefore required to treat all starting points as equivalent and deserving of an equally helpful response in retrieving targeted subjects by name.

Lexicographical rules demand that, on the printed page, each subject requires a single core location within the 50,000-strong sequence. The wife of Edward VIII, for example, is entered - according to the dictionary's general practice for such subjects - under her final married name as 'Windsor, (Bessie) Wallis'. But a wide-ranging readership ensures that such subjects will also be best, or perhaps only, known by alternatives variant to the imposed dictionary name: where dictionaries require and create logical structures and rules, history and its practitioners work by a more fluid, but equally valid, set of personally familiar practices. Thus individual readers may know of Edward VIII's wife only as 'Wallis Simpson', 'Simpson', 'Mrs Simpson', 'the duchess of Windsor', 'Lady Windsor' or by her maiden name, Warfield, or first married name, Spencer - a practice which applies equally to the many subjects whose names can be variously spelt or who are known, or first encountered in other primary or secondary sources, by alternative or colloquial names: among others, 'Dr Johnson', 'Capability Brown', 'the Black Prince', 'Bonnie Prince Charlie' and 'Scott of the Antarctic'. Variations between a core/dictionary name and reader-generated names 'as known' are compounded by specific issues such as the complex naming policy of the British peerage in which title name (where different) supersedes surname after ennoblement. It is reasonable to assume that among modern British, and certainly among non-UK students, knowledge of this system and the relationship of subjects' surnames to peerage names is in decline. To most of us this poses few difficulties in contemporary life. But for many students of the British past, in which the peerage system played an integral part in political and social culture, it is a more significant gap in knowledge.

Solutions to these issues of searching and retrieving are not, of course, specific to online resources. In print the connection between, for example, the politician and author 'Philip Dormer Stanhope' (1694-1773) and the 'fourth earl of Chesterfield' (his title from 1726) is certainly possible via a comprehensive indexing system of cross-reference entries. But there remain practical implications in linking 'C' to 'S' in a sixty-volume dictionary, implications that are increasingly unwelcome to readers who have elsewhere come to expect immediate and virtual connections whereby background research or prior knowledge is nullified by technical process.

Online the gap between (dictionary) core and (reader) names 'as known' can be bridged far more efficiently. Appreciative of multiple access points (whether born of familiarity or changing patterns of knowledge), the Oxford DNB has created individual 'profile sheets' for its 55,200 full and co-subjects. In addition to further standard factual information, these profiles store all elements of a subject's personal and/or title name (variously forename, surname, alternative name, replacement name, toponym, pseudonym, performing name, name in religion and/or foretitle, title rank and peerage name). The sheets, independent of the article text and unseen by the online reader, serve as an index for all people-based searches in the dictionary, permitting considerable variations in nomenclature to be recorded without the excessive intrusion in the article text required if such searches were wholly text-based.

In practice the link between a subject's (multiple) profile names and his or her dictionary name produces the same result regardless of whether students of eighteenth-century politics begin a search with the term 'Stanhope', 'Dormer Stanhope', 'Chesterfield', 'earl of Chesterfield', 'fourth earl of Chesterfield' or 'Lord Chesterfield' (since 'Lord' is parsed with the six hereditary peerage ranks). Having linked search with result, the remaining challenge is to ensure a suitably close match between (personal) starting point and (dictionary) result since, for the reader, success is wholly determined by the transparency or relevance of the resulting article or list of articles with which he or she is presented. Without such a match online users quickly conclude that they are too inexperienced to use the technology, are working with defective searching apparatus, or come to see themselves more as victims than beneficiaries of a restrictive technical process. In each instance, online readers are prone to lose confidence and trust - a central issue for the success of digitized records that will be touched on again in conclusion.
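
The profile-sheet lookup described above can be pictured as a many-to-one index from name variants to dictionary names. The Python sketch below is illustrative only: the field names, the shortened dictionary names, the variant lists and the set of words ignored during matching are all assumptions made for the example, not the Oxford DNB's actual data model or parsing rules.

```python
import re
from typing import Dict, List, Set

# Hypothetical profile sheets: each stores the name elements by which a
# subject may be known, alongside the single dictionary (core) name.
PROFILES = [
    {
        "dictionary_name": "Stanhope, Philip Dormer",
        "variants": ["Stanhope", "Dormer Stanhope",
                     "Philip Dormer Stanhope", "Chesterfield"],
    },
    {
        "dictionary_name": "Windsor, (Bessie) Wallis",
        "variants": ["Windsor", "Wallis Simpson", "Simpson",
                     "Warfield", "Spencer"],
    },
]

# Assumed list of words ignored when matching: courtesy forms, rank words,
# ordinals and connectives. A real system would need a fuller list.
NOISE_WORDS = {
    "the", "of", "lord", "lady", "mrs", "duke", "duchess", "marquess",
    "earl", "countess", "viscount", "baron",
    "first", "second", "third", "fourth",
}


def normalise(term: str) -> str:
    """Lower-case a search term and drop the assumed noise words."""
    tokens = re.findall(r"[a-z]+", term.lower())
    return " ".join(t for t in tokens if t not in NOISE_WORDS)


def build_index(profiles: List[dict]) -> Dict[str, Set[str]]:
    """Map every normalised variant to the dictionary names that carry it."""
    index: Dict[str, Set[str]] = {}
    for profile in profiles:
        for variant in profile["variants"]:
            key = normalise(variant)
            index.setdefault(key, set()).add(profile["dictionary_name"])
    return index


def search(index: Dict[str, Set[str]], term: str) -> List[str]:
    """Return the dictionary names matching a reader's starting point."""
    return sorted(index.get(normalise(term), set()))


INDEX = build_index(PROFILES)

if __name__ == "__main__":
    for term in ["Stanhope", "earl of Chesterfield",
                 "fourth earl of Chesterfield", "Lord Chesterfield",
                 "Mrs Simpson", "the duchess of Windsor"]:
        print(f"{term!r} -> {search(INDEX, term)}")
```

Because the index is held apart from the article text, new variants can be added to a profile without touching the biography itself, which is the separation the profile sheets are said to provide. In a full dataset a common surname would return a list of subjects rather than a single entry, which is why the sketch maps each variant to a set of names.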

II

At this simple level, digitization serves a now standard function of accessing information - here biographies of designated subjects - which might still be found, in time, via scrutiny of the print version. But digitized records are also valuable in a second form of searching in which the user's starting point is not a known individual but a connecting theme which an, as yet, unknown number of subjects share. The result is the production of groups or sets of subjects impossible for even the most diligent book reader to create, with the potential to connect individuals in original ways, and so to suggest new perspectives on specific research questions.

Such connections are best viewed as an increasingly detailed hierarchy of searches in which individual dictionary subjects are selected and arranged according to their relationship to particular conditions. Of these the most obvious are time-centred enquiries. Thus a quick search of the online dictionary reveals that, of 55,200 subjects, seventy-seven were born on 7 July (the date of the IHR conference), a set which brings together Margaret, countess of Cumberland (b. 1560) and James Drummond, Jacobite duke of Perth (b. 1648) with the Irish carpet manufacturer, James Templeton (b. 1802), the historian Steven Runciman (b. 1903) and the actor Jon Pertwee (b. 1919). Regrettably 7 July is, among our dataset at least, a more common date to die (111 subjects); individuals in this category include Edward I (1307), the politician and dramatist Richard Sheridan (1816), cocoa manufacturer Joseph Fry (1919) and the performer Jimmy Edwards (1988). And although we may be forgiven for thinking otherwise, George Orwell's is not the only anniversary to mark in 2003. A further search groups fellow centenarians who include well known figures such as Amy Johnson and Malcolm Muggeridge alongside less obvious, but perhaps now more familiar subjects, including the bookmaker William Hill, the clothes retailer Hugh Fraser, or the Italian restaurateur Frank Berni, founder of the eponymous Inn.

On their own, such sets are perhaps of little academic interest beyond something akin to a 'strange bed-fellows' approach to history. However, they become more valuable when time-based searches identify dates of collective activity (the sinking of the Titanic (14/15 April 1912) or the first day of the battle of the Somme (1 July 1916), for example), or are combined with additional spatial considerations. A search for residents of Bloomsbury's Gower Street in 1900, for instance, reveals the relative geographic proximity of those, like Millicent Garrett Fawcett (1847-1929), commemorated by blue plaques to others, including the photographer Alice Mary Hughes (1857-1939), the organic chemist John Norman Collie (1859-1942) and the women's activist Mary Kathleen Lyttelton (1856-1907). A similar survey of students educated at University College London in 1853 brings together the future chief rabbi, Hermann Adler (1839-1911), the theatre impresario Richard D'Oyly Carte (1844-1901) and the potter and novelist William de Morgan (1839-1917). The result is a snapshot of the personnel of a particular place, institution or organization, with potential research opportunities for those interested both in personal relationships and in the history of the unifying structure. Differently configured (that is, by word/text searching of articles rather than of subject profiles), this slicing of the dataset also allows for the clustering of subjects associated with shared events, organizations or people - the Great Exhibition, the Black Watch regiment, the Suez crisis or Winston Churchill, for example - and the creation of a picture composed of multiple narratives in which the shared place, person or experience features only as part of another's life story.

Consideration of shared issues of time and place can be refined further via searches which incorporate additional shared characteristics. From these it may be possible to identify patterns of human movement or development over time: whether physical - for example, the number of seventeenth-century traders born in Lowland Scotland who died in London; cultural - the number of subjects who, baptized Anglican, later converted to Roman Catholicism in the nineteen-hundreds; or economic - people dying with more than £5,000 as organized by profession between 1500 and 1700. Furthermore, by extending searches to include (where records exist) subjects' parents, spouses or siblings, there is considerable future potential for developing this latter type of 'social scientific' examination into an extensive (if never nationally 'representative') dataset providing information on, for example, changing patterns in causes of death, life expectancy, average family size or factors conditioning the naming of children.
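
Searches of this layered kind amount to applying several conditions at once to the tagged subject records and then, where useful, grouping the results. The Python sketch below shows one way this might look. The record fields, the helper functions and the four placeholder records are invented for illustration and do not describe real dictionary subjects or the dictionary's actual search engine; 'seventeenth-century' is taken, as an assumption, to mean a death date between 1601 and 1700.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Subject:
    """Hypothetical tagged record for one dictionary subject."""
    name: str
    occupation: str
    birth_place: str
    death_year: int
    death_place: str
    wealth_at_death: Optional[int] = None  # pounds; None if no record survives


# Placeholder records, invented purely to exercise the queries below.
SUBJECTS = [
    Subject("Placeholder A", "trader", "Lowland Scotland", 1672, "London", 6000),
    Subject("Placeholder B", "trader", "Lowland Scotland", 1690, "Edinburgh"),
    Subject("Placeholder C", "clergyman", "London", 1640, "London", 900),
    Subject("Placeholder D", "trader", "Lowland Scotland", 1760, "London", 12000),
]

Condition = Callable[[Subject], bool]


def born_in(place: str) -> Condition:
    return lambda s: s.birth_place == place


def died_in(place: str) -> Condition:
    return lambda s: s.death_place == place


def died_between(start: int, end: int) -> Condition:
    return lambda s: start <= s.death_year <= end


def occupation_is(occupation: str) -> Condition:
    return lambda s: s.occupation == occupation


def wealth_over(amount: int) -> Condition:
    # Subjects without a probate record cannot satisfy the condition.
    return lambda s: s.wealth_at_death is not None and s.wealth_at_death > amount


def select(subjects: Iterable[Subject], *conditions: Condition) -> List[Subject]:
    """Keep only the subjects that satisfy every condition supplied."""
    return [s for s in subjects if all(c(s) for c in conditions)]


# 'Physical' movement: traders born in Lowland Scotland who died in London.
movers = select(
    SUBJECTS,
    occupation_is("trader"),
    born_in("Lowland Scotland"),
    died_in("London"),
    died_between(1601, 1700),
)

# 'Economic' pattern: wealth over 5,000 at death, 1500-1700, by profession.
wealthy = select(SUBJECTS, wealth_over(5000), died_between(1500, 1700))
by_profession = Counter(s.occupation for s in wealthy)

if __name__ == "__main__":
    print([s.name for s in movers])
    print(dict(by_profession))
```

Each condition is a small, independent test, so further facets (religious affiliation, education, residence) could be added without altering the selection step. As the surrounding discussion stresses, sets produced in this way suggest rather than establish historical connections, and they can only be as complete as the surviving records behind each field.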

In a recent review, coincidentally of new memoirs on Orwell, the critic Terry Eagleton reiterated the common suspicion of biography as of questionable value to history, and to the humanities more generally, referring to its 'usual...defect of sacrificing the wood for the trees'.(2) Viewed as a collection of 50,000 discrete lives the Oxford DNB certainly has the potential to play to this tendency. Moreover, it is likely that the majority of users, academic and other, will regularly use the dictionary for material relating to a specific individual, or to place an aspect of life in wider biographical context. In this capacity the potential of online functionality is, as suggested, quicker and more efficient access to a target subject, and from there to relevant life information.

But online, the dictionary also becomes a resource capable of at least suggesting connections or collective endeavours which place individual subjects in once-lived relationships whether structured around geographical, institutional or other cultural factors. In so doing, cross-subject searching goes some way to remind us that, in the humanities at least, woods are predominantly made up of people and are consequently defined precisely by the intimate physical or cultural connections that can exist between such individuals.

The searches sketched here relate only to the resources contained within the Oxford DNB's own dataset. However, just as the dictionary as 'subject' has made use of online resources, so as 'object' it will also point readers to external web links dedicated to specific elements of an individual's biography. Initially these connections will be to additional sources of information relating to the subject: for example, the National Portrait Gallery's collection of portrait images; archive holdings via The National Archives; or to alternative memoirs of the same, or related, subjects provided in other online national biographies, presently the New Zealand (NZDNB) and American (ANB) dictionaries.

In practice this means that readers of the article on the eighteenth-century prime minister Sir Robert Walpole will also be able to view caricature images referred to in the text's discussion of opposition politics, or to compare US and UK authors' handling of the life of the patriot or traitor, Benedict Arnold (1741-1801). Future links are planned, providing further information on individuals who, although discussed in articles, are beyond the dictionary's remit as subjects (for example, non-British sculptors covered by the Grove Dictionary of Art), or on contextual events and places (such as the 'Civil War in Oxford' available from the Victoria County History via the IHR's 'British History Online').

Further links are possible to more explicitly interactive sites offering biographical information in alternative media. These, by complementing (or questioning) the dictionary's text-based memoir, have the potential to 'bring a subject to life' through sound recordings and documentary film archives provided by the British Library's National Sound Archive and the British Universities Film and Video Council, or by linking to other authoritative sites, of which one example would be the Freud Museum, London, that offer their own biographical information in virtual and material forms.

Increased digitization of manuscript holdings by record offices also holds open a further opportunity for readers to click through from material quoted in an Oxford DNB article to an archive's website containing the original source in full. And what applies for dictionary subjects will, in time, also be possible for their biographers, with links from an article to library catalogues of a scholar's publications or to digitized publications themselves (via resources such as JSTOR) for further information or research. How far such creative developments 'transform' scholarship remains to be seen. What they appear to promise are new opportunities to witness the individual historian's craft through insight into the use (or manipulation) of primary sources; to gain access to related web resources; and, by this mixing of text-, sound- and image-based media, to make biography (and therefore history) a little more personable and, perhaps, meaningful to a range of audiences.

III

These are some of the opportunities, realized, planned or predicted, which digitization may afford readers of the Oxford DNB online. But with digitization seeming to lead towards increased interconnection of web-based resources, it is worth concluding with a note on several broader challenges faced by all who publish, and use, humanities reference material online.

While we are all more web literate, it is reasonable to suggest that most of us are still relatively more comfortable, or at least less sceptical, when presented with printed books or journals. Notwithstanding recent technologies, paper remains our accepted currency of academic excellence, bolstered as it is by familiar and intelligible indices of quality in the shape of publisher, production values, or personal endorsements and reviews which (with the exception of H-Net and the IHR's Reviews in History) themselves remain a predominantly paper-based form. A first challenge, therefore, is how best to develop the perceived integrity and value of wholly online publications which, especially in the case of e-monographs, lack the accompanying material support of - as for the Oxford DNB, History of Parliament or VCH - history and reputation, an established name, a respected publisher and/or university affiliation and, when needed, the physical presence of volumes on shelves.

A second challenge also relates to our relationship to books in which we expect, and usually find, established publishing conventions - contents, footnotes and index - which underpin and assist the personal search for information. This emphasis on the personal is important but potentially at risk in those online resources in which too much agency is removed and too much context lost either by interposing excessive stages between search and result, or by appearing to undermine readers whose personal experience is seemingly conditioned less by individual actions than by the technology with which they are now forced to work.

It is no surprise that resources which are deceptively simple to use, are readily explicable and employ common formats, prove easier to trust, with trust a key element in realizing the opportunities under discussion at this conference. Thus, in the current undoubtedly exciting climate, in which numerous bodies compete to reveal selective areas of the web, it seems important that online providers remember that it is better to do a few things well than it is to offer an increasingly unwieldy, and unrealistic, series of unrealizable opportunities. Academic research is a lengthy and careful process in which material is steadily accumulated and evaluated. Quality is invariably superior to quantity, although currently digitization is undoubtedly better at scale than at determining and regulating value. Future humanities research should surely promote a combination of online resources with the deep-seated 'cultural practice' of established library or archive-based work in which the former prompts not replaces the latter.(3)

Likewise, it would be wrong for online publishers to see the considerable benefits of new media as always superior to traditional research methodologies based on the relatively more liberated and entertaining activity of, say, book browsing - a successful and proven feature of one medium that can be easily and productively incorporated online alongside more disciplined forms of online access. A third challenge, therefore, is to proceed cautiously in a medium often given to hyperbole (or 'cyberbole' to adopt Woolgar's neologism). This, of course, is a caution which applies just as much to the prompts and possibilities offered by the Oxford DNB online as it does to other sources. Thus, to generate a set of people who lived in Gower Street in 1900, studied at UCL in the eighteen-fifties, or based their careers around a particular person, is to suggest but not to confirm new connections which additional research may show never to have existed, or to be of no scholarly significance should it be possible to establish a human relationship.

To suggest caution and greater context poses a final challenge for humanities reference works online, although this is one faced by readers and critics as well as by publishers and academic editors. Traditionally works like the Oxford DNB, VCH or History of Parliament have been judged as authorities and have come accordingly to serve as the 'last word' on a particular subject. However, online environments inevitably question such absolute values, either by allowing for the quicker detection of error, the bringing of virtual repositories into closer proximity with equivalents or alternatives, or through the conspicuous act of updating and correcting digital resources. To maximize the potential for such projects is therefore to request a more pluralistic and, ultimately, realistic view of what these resources should be: simultaneously name and date authority, source of hard fact, interpretative monograph, selective bibliography or archive repository, identifier of possible human networks, facilitator of serendipity and web conduit to complementary or 'rival' sites. In short, they should be seen as efficient modes of access to new writing and, from this, as pointers to further study within a combination of established and innovative research strategies.

July 2003

Notes

  1. For further information on the publishing history and aims of the DNB project 1885-1990 and 1992-2004, see H. C. G. Matthew, Leslie Stephen and the New Dictionary of National Biography (1997); B. Harrison and R. Faber, 'The Dictionary of National Biography: a publishing history', in Lives in Print: Biography and the Book Trade from the Middle Ages to the 21st Century, ed. G. Mandelbrote and others (2002); also www.oup.com/oxforddnb.
  2. 'Reach-me-down romantic', London Review of Books (19 June 2003), p. 9.
  3. C. Crook and P. Light, 'Virtual society and the cultural practice of study', in Virtual Society? Technology, Cyberbole, Reality, ed. S. Woolgar (Oxford, 2002). Similar caution runs through Roy Rosenzweig's recent essay, 'Scarcity or abundance? Preserving the past in a digital era', American Historical Review, cviii (2003), 735-62; and Digital Academe: the New Media and Institutions of Higher Education and Learning, ed. W. Dutton and B. Loader (2002): 'a knowledge society could be misleading if policy makers conclude that ICTs create knowledge, rather than reconfigure access to knowledge and expertise' (p. xix).