Any thoughts on working with ‘original’ vs. digitised historical documents?

by mo11y_mi11ion5

I am a digitisation volunteer at a Local Studies archive (in the UK, hence the ‘s’ instead of ‘z’) and also an anthropology student researching the material culture and sensory experience of books/documents/texts and digital surrogates. Regarding the user experience of ‘original’ vs. digitised historical documents: Can a digital copy stand in for an original? If your answer is no, a digital surrogate can’t always do as an original, what is missing from the experience of working with a document that has been digitised? Anything else you may like to offer on the topic? Cheers

caffarelli

Well what a nice discussion question! I’m a digital archivist (in the coloniez with the zees) so I put digitized and born-digital documents online, and I think about this a lot! Since you’re not coming from an archival background you may not know it, but the archival science term for this idea that the original format has a value is the item’s “intrinsic value.” Reading up on that might help with your paper.

(Note that all that follows is about digital presentation and not digital preservation, which is a different bag of rats, and on the back end of our servers and about checksums and stuff and pretty boring. The files you the user see on a “digital archives” are not our real preservation files.)
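For anyone curious what the "checksums and stuff" side looks like, here is a minimal sketch of a fixity check in Python. It assumes SHA-256 as the hash, and the function names are hypothetical; it is an illustration of the general idea, not a description of any particular archive's workflow.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large scans do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_checksum):
    """True if the file still matches the checksum recorded at ingest."""
    return sha256_of_file(path) == recorded_checksum
```

The idea is that a checksum is recorded when the preservation file is first stored and recomputed later; a mismatch means the file has changed since then.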

My work ranking when putting things online is accessibility, then security, then authenticity. Accessibility absolutely comes first, because my prime consideration is making the items useful to researchers, or else I’m wasting my time. Another problem is that, while we get to interact with on-site researchers extensively, with online the website is likely to be the ONLY interaction we will have with the user of the digital documents, so it has to be highly functional and self-evident. (For instance, one of our digital collections got highlighted on a prominent blog a while back; we got about 500 hits that week on the collection, but a total of 2 emails asking for more information.) So the pressure’s on to “do it right” when you do it digital.

Some of the big questions when I’m trying to put a collection online are:

Do I have copyright permissions to put this online, or is it in the public domain? (the big unfun one)
How can I best convey how these records were created and used without “framing” the document too much (i.e. interpreting it, which is the job of historians and not archivists)?
How do I make it clear you can and should contact the archives directly for more information?
What’s the best way to present the metadata?
How do I make it clear how to cite these correctly in a school paper?
What file formats are easiest for researchers to use?

Security is the second one, and it’s not fun, but basically we need to ensure that our materials won’t show up in a publication without us knowing about it. That means when I prepare a public copy of a scanned image I have to strike a delicate balance between clear enough for researchers and too low in quality to print. So I hand pick the resolution for each image to make sure it’s clear enough to examine but too low to print, and apply a watermark of our name in an area that will not be annoying to a researcher but will make it unsuitable for publication. (Now, some archives go overboard and put watermarks in rude places like on top of faces, and some places are more laissez-faire and put unmarked high res stuff online for free. Different strokes!)
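The "clear enough to examine but too low to print" trade-off comes down to simple arithmetic, sketched below in Python. The function names are hypothetical, and the 300 pixels-per-inch figure is only a common rule of thumb for print quality, assumed here for illustration; in practice the resolution is picked by eye for each image.

```python
def access_copy_size(width, height, max_long_edge):
    """Scale pixel dimensions so the longer edge is at most max_long_edge,
    keeping the aspect ratio. Images already small enough are left alone."""
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    scale = max_long_edge / long_edge
    return max(1, round(width * scale)), max(1, round(height * scale))


def print_size_inches(width, height, ppi=300):
    """Largest print size (in inches) at the assumed pixels-per-inch."""
    return width / ppi, height / ppi
```

For example, a 6000 x 4000 pixel scan reduced with access_copy_size(6000, 4000, 1200) comes out at 1200 x 800 pixels, which at the assumed 300 ppi would only print about 4 inches wide: legible on a screen, but small on a page.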

Third, we get to authenticity. (Poor authenticity!) I frankly rank this one dead last because I just don’t think digitized materials should be in that business. I think the gift of digitization is getting these materials to people who previously couldn’t use them because of distance, disability (usually blindness), or because the materials are too fragile. An archives is an archives, not a museum, and our primary purpose is to facilitate historical research and not do public history. The primary value of a record is the information it contains, not its status as an artifact. But I do put some mind to it -- I don’t edit photographs or documents typically, although I occasionally increase the contrast a little to make a faded document readable or something mild like that, and if I do that I’ll always put the original scan next to the altered one so the researcher can see what I’ve done.

So, at the end of the day, does a scan replace an original? No it doesn’t. What’s missing from the experience? Hard to put it in words! I’d call it “that Antiques Roadshow appeal.” Everyone likes old things! There is nothing like seeing a college freshman get all jazzed up (but try to act cool and hide it) when they get to put on the white gloves to handle a pile of 100-year-old photographs; it’s really special. A scan just can’t get you that authentic interaction with physical history. But hey, does it really need to? Digitized and analog materials can co-exist; the fact that the researcher across the world can use the materials now doesn’t cheapen the experience for the on-site researcher.

You can see a good example of this authenticity vs. accessibility debate going on between archive.org and gutenberg.org actually. Gutenberg is the original deal, and their founder Michael Hart was heavy on accessibility: he wanted ALL books in plain text that you could download quickly on a bad connection, no pictures, no fuss, no muss. Gutenberg’s ebooks are usually much better proofread and you can put them on your ereader quickly and without errors, or a screenreader can use them. Archive.org is going for the authentic feel, and as a result has a beautiful book-flipping thing but the accessible side of it is pretty much garbage.

Wow that’s a lot of words. Oh well. Let me know if this is useful to your paper!

Little_Noodles

The value of archival material is in the information it contains. Whether or not a digital copy can stand in for the original depends both on how well the digital copy is presented and what kind of information the researcher is looking for.

If I'm just looking for names and dates in a 20th century document originally printed on standard office paper, a competently done digital archive will probably do just fine. In fact, if it's searchable, it might even be a better resource than the original.

If I want to access archival material for information that digitization does a poor job of capturing, however (textures, glossiness, multi-dimensional material), then the digital surrogate is a poor substitute.

However, I wonder if this archival perspective that you're getting is maybe not the best (or not the full) range of what you're looking for. Your post says that you're asking about user experience, not archival understanding.

A great many users seem to place an awful lot of emotional value on original materials and the tactile experience of touching old stuff. From an archival perspective, it seems to me like a fetishization of historical documents that doesn't really enhance their understanding of the material. But that doesn't change the fact that they feel like they're genuinely engaged with the document in a way that they can't do with a surrogate copy.

If you haven't cross posted in one of reddit's popular history subs, you should consider it. Posting to this forum, which has a lot of working historians and historian/archivists, is going to get you answers from people that have handled much more archival material than the typical user. I have a feeling that you'll get very different answers from different sets of users.

EDIT: If you feel like further complicating your research (and who doesn't love doing that?), you may want to also consider the use of reference copies of original material. In some cases, a disinclination to use digital surrogates may have less to do with a desire for the original than it does with an overall preference for non-digital materials. A user may not want to read a digital copy of a lengthy 18th century text, especially if the digital interface is not well developed. But they might be perfectly well satisfied with a recently created reference copy of the material.

rosemary85

As Oliebonk says, digitised versions are almost never a complete copy of the paper one. The lack of completeness manifests itself in different ways depending on field, though.

Here's an example from mine. When hellenists go to look up a text these days, as often as not they'll consult the TLG digitised archive instead of a hardcopy critical edition. The key word in that sentence is "critical": an electronic copy is no critical edition.

To illustrate the difference, here's what I get if I go to the TLG and look up Iliad book 5, line 1:

Ἔνθ’ αὖ Τυδεΐδῃ Διομήδεϊ Παλλὰς Ἀθήνη

Here's what I find if I go to the standard hardcopy critical edition and look up the same line:

ἔνθ’ αὖ Τυδείδηι Διομήδεϊ Παλλὰς Ἀθήνη

E 1 [Ammon.] Diff. 170; 1a ApS 47.10, 69.8; 1b Epm. ad A 30b; 1 (Τυδ.-)-2a Max. Tyr. 8.5; ...

E 1 Διομήδεϊ sic (-εϊ) 1 16 572 982 Epm. disertim Ω

You almost certainly can't read either of these; don't worry, I can. You don't need to be able to read them to get the point of the contrast I'm drawing. Briefly, in the critical edition the second line is a list of references to and quotations of the text in later ancient authors; the third line is a list of notes on orthography and textual variations between the various surviving ancient manuscripts.

This difference isn't a failing of the electronic medium per se, of course. It's a human failure. Just like the ordering and contextual problems that Oliebonk describes. There's nothing to stop people from producing critical editions in a digital format; there's no widely-agreed system for transferring the information you'd get in a critical edition to a digital format, but that's a matter of time. But the point is that no one's done it yet. As long as this failure in functionality, imagination, and precision is with us, it'd be daft for a researcher to try to switch over completely, because the hardcopies are where the research tools are at. And this failure is likely to hang around for a long time yet: the TLG has been going since the 1970s, but the service provider has never shown the slightest interest in trying to rectify the problems.

LordSariel

Call me old-fashioned, but I enjoy original documents. I am more productive in a work space with my colleagues, in a cozy library/archive. Ideally one with copious natural light. Feeling the original paper adds a tangible affirmation of my work. And looking through the whole file, and reaching the end, helps assure me that I missed nothing.

When viewing digital files on a computer or tablet, I tend to be lazier, less enthused, and easily distracted. I also get confused when documents end suddenly or don't paint a full picture - I automatically think something happened in the digitization process and my file is not complete.

In short, it's a highly personal decision for me in favor of original, hard-copy sources. It is an inconvenience in terms of travel and time, but the experience is more enjoyable to me professionally. As another person mentioned, the original order of documents (and boxes/collections) further helps frame my research/conclusions.

bettinafairchild

First I suggest you check out Nicholson Baker's essay, "Discards", which is a reflection upon the digitization of card catalogues, way back in 1994. He covers a lot of the issues that have become so evident as more documents have been digitized.

I work with original documents in private collections, so I get to spend a lot of time with them. I think digitizing a document is an example of the ways people prioritize vision over all of the other senses. Digitization only covers one sense, and only one dimension. That actually does convey most of the information one usually wants, such as what the image or the words are. But other information is lost that might be important at some point, such as the depth of the letters, the quality of the paper, the feel of the paper, and such. This is particularly important if something is hard to read. Looking at different angles can help to elucidate some of the text. If there is a palimpsest, then that information is lost. If one is wondering when the text was written, the paper can provide clues. It is also harder to detect a forgery from a digitized document. For most situations, though, digitized documents will do.

Oliebonk

A few thoughts: As a researcher I noticed that the order of documents in the original file is important to understand how the information is formed and what the administrative process was. Even if the order is changed, you will have a better chance to reconstruct it when the original source is available. The order of the original documents gives you essential clues on what the sources once reflected. You lose that context when only using a digitised source. I was part of a project to digitise an archive and usually, if there were a few versions of a document, they only digitised the most recent one. So if you research digitised sources you must be aware of the context of the original sources and related sources and documents. A digitised source is almost never a complete mirror image of the paper one. So researching only a digitised source is not advisable...

farquier

I'd like to add comments on working with 'digitised' artworks, either manuscripts or paintings. In both cases, there are things that you really can't get across in a digital reproduction. With paintings, there's a gap both in terms of color and scale and also to some extent paint texture: certain artworks tend to be glossier or rougher and that in turn shapes their qualities. Take for example Google Art Project's image of Bellini's painting of St. Francis. http://www.google.com/culturalinstitute/asset-viewer/st-francis-in-the-desert/egGQB5gOZujX4g?projectId=art-project

Now I should say this is a very, very good reproduction of the painting, probably the best possible. But having had the good fortune to look at this painting in the original, there are still things that you can't really get across. The depth of the color, for lack of a better word, does not really come through in the reproduction, and nor does the slight darkness in the tones. Nor for that matter does the glossiness of the painting; in the flesh, so to speak, it has a very slight sheen and a very even surface to it. These are all kind of nitpicky things to pick up on, but they do matter to our understanding of Bellini's methods and how they compare to other artists of his time, and the kinds of effects he was aiming at.

With manuscripts, the big difference is really that you're looking at two-dimensional photos of an object that is made to be held, and that matters a lot. Some publications or digital reproductions will only show the paintings and decorations in the manuscript, which makes it much harder to assess where they fit in with the page layout. Even when whole pages are shown, though, it's very hard just from looking at photographs to get a feel for the scale of the book in your hands or how the images fit together as part of a book (even with things like "turn the pages" it doesn't give a sense of how the images are spaced out physically, and you can't really do that until you've picked up the book and gone through it). And there are aspects of the book's life history you really can't see from photographs. The condition of the book and the paintings is often a good deal less obvious (looking at one book that I'd previously only seen photos of allowed me to notice some smudges in the painting, some of which I think might even have been deliberate damage). You can't really tell things about how it's put together or even if it was altered (rebound, cut down, bound with other books, repaired, etc.) at some point, and you can't really pick up on details about changes in manuscript hand unless your publication shows all pages, which not all do. Some of this will be described if you have a good catalog of the book, but not all will, and if you don't have access to a good descriptive catalog of the book it will not be obvious in the slightest if you are looking at a photo on a museum website.

HallenbeckJoe

I'm exclusively working with digitized books, magazines and newspapers from the 19th century. I do like actually seeing/touching/smelling original documents, but I'd never voluntarily give up the conveniences digitized materials can provide. Reading them on my phone, searching through them on my laptop, checking a quote, looking up certain words or clusters of words (OCR has made progress) is invaluable to me and my style of researching. I guess books and newspapers are a special case though. For other documents, the archive can provide much more context and help to the historian. Nevertheless, I love Google, archive.org, ProQuest and all other newspaper databases that make my research possible. I pity all the historians who did not have these tools available just 10-20 years ago.

OleWorm64

For historians analyzing texts as objects, just having the digital copy may not be enough. One thing I might point out is that the non-writing parts of the documents may not show through in the digitization process. Watermarks can indicate the origins of that paper and can inform economic analysis and information/idea flow of a time period and place. Stains of various kinds may not be easily distinguished in a digital copy. Those things give a real-ness to the document and, through chemical analysis, yield specifics about the circumstances under which the document was written.