How great of a blow to historians is the migration of letters and journals to the digital realm, where most of them are inaccessible to historians (locked email accounts, hard drives, etc)?

by RusticBohemian

I just finished "Lincoln in the Bardo," probably about a third of which consists of historical written accounts from Lincoln's contemporaries.

Although many of these accounts disagree on particulars, I was blown away by the quality and quantity of the testimony. Many of these men and women wrote as well as modern journalists/novelists, giving rich descriptions of what things looked and felt like, what people said, etc.

I wonder to what extent many of these personal accounts have vanished from the modern era, and to what extent they've been replaced by accessible material that's equally or more valuable.

People rarely write letters today, and journaling isn't as common as it once was. Emails are certainly a thing, and people write things and leave them on hard drives. But email accounts have passwords that heirs might not know, and hard drives get scrapped when computers are thrown out. Someone is unlikely to find their great-grandfather's emails the way they once might have found his letters in the attic.

On the other hand, the proliferation of photos and videos must be giving historians a lot to work with, and people might leave accounts in public places on the internet.

So what's the consensus on how grievous a blow the digitization of material has been to historians?

dhowlett1692

This is a question that a lot of digital historians have grappled with over the past twenty years. Our profession requires primary sources, and until recently, most sources were originally on paper or had some other tangible form, like physical artifacts (oral traditions being an exception, but those also require a person to tell them). We've entered an age where most sources about people's lives are born digital: created digitally, living digitally, and generally staying digital. Your text messages, emails, Word documents, Tweets, Facebook statuses, etc. exist, but with limited exceptions, these sources exist only on a device. Whether it's a cell phone, laptop, iPad, the Cloud, or a USB drive, some physical device contains the data that produce these born-digital sources.

Historical sources depend a lot on luck. In order for a 17th-century diary to survive to the 21st century: 1. You needed to be able to write it, 2. Family members had to be able to save it, 3. Archives needed to care about saving it, 4. Nothing could go wrong: fires, floods, toddlers with crayons, etc. Other materials, like a ship's manifest, depended on weather conditions. If a ship sank, guess what probably didn't survive. It's amazing how much survived, but we know the survival of sources leaves a lot of biases intact. Enslaved people rarely had the opportunity to preserve their own thoughts and experiences compared to enslavers with the power and means to construct their written legacy.

If we think about today's digital sources, access to digital formats allows a lot more people to create primary sources- texting, posting, uploading photographs, etc. There are still limitations based on the cost of devices and access to the Internet, but there are billions of people with digital footprints. If you want to see what you emailed someone last week or last year or five years ago, you can open Gmail and find it. Our lives are heavily documented.

However, does this mean historians are entering an age of practically unlimited sources? Maybe? Roy Rosenzweig's article "Scarcity or Abundance?", Ian Milligan's book History in the Age of Abundance?, and Trevor Owens' book The Theory and Craft of Digital Preservation dive into these challenges. Historians will either have an unimaginable record to dig through with the abundance of digital sources or we will have practically nothing.

Consider what it means to preserve a digital thing. The Wayback Machine preserves websites. Hard drives save digital files. Email accounts let you archive messages. But what does that mean? Website maintenance ceases when no one updates the website. Hard drives require a device that reads them. Email accounts need passwords. However we access digital materials, that access will require upkeep. If Facebook or Twitter shut down, how much of their content will really be saved?

Adobe Flash is a great example of obsolescence. The digital files that made up Flash games may still exist, but the Flash software needed to run them is no longer supported. If Microsoft shut down, eventually Word and PowerPoint software would be outdated and unable to run on later devices. My parents saved a bunch of their early 2000s tax returns on floppy disks, so I hope the IRS doesn't need to see them. My laptop doesn't have a CD drive, so my saved Age of Empires games are lost. Technology advances, and older methods of storing information disappear.

It is possible to save hardware to keep access. But if I keep a VCR so I can rewatch a VHS tape, how long will that VCR work before it needs a replacement part? Will computers always be able to open an MP3 file?

As more and more sources become mostly digital or exclusively digital, what happens in 10 years? 20 years? 100 years? How long will The New York Times offer a print version? What about publications that are exclusively digital? Punchbowl emails a digital newsletter rather than selling a paper at a newsstand. As more potential historical sources become digital, historians may enjoy the wealth of perspectives available through online blogs and YouTube videos. But any digital format can easily disappear. And if the digital world becomes lost, historians will face a real scarcity of sources.

Consider your digital footprint in a day, for example: 8 emails, 30 text messages, 2 phone calls, 1 memo on your phone, 10 Tweets/retweets, 15 selfies and cat pictures, and an online shopping receipt. All that information could paint a fascinating and vivid picture of a day in your life. Now tell the same story without any digital sources: maybe you wrote down a grocery list on a Post-it? There might not be much else to go on. Historians don't know which way we'll end up.

historianLA

Some thoughts that build on areas not covered by the excellent answer by /u/dhowlett1692:

Quotidian epistolary records (everyday letters) have been historically quite rare. For the period I study (1500s Latin America) we have almost none, maybe a couple dozen to a hundred. The bulk of what we have globally certainly clusters in the 1800s-1900s. But that is historically an outlier. So in some sense the loss of such records due to the digital turn is more a return to normal than anything else. Also, extant physical letters are themselves at risk, because most people who wrote letters wrote them on acidic paper, which will eventually deteriorate into nothing.
This problem is also more a loss of information about comparatively affluent people than about those who are less affluent. For historians of the disadvantaged (because of race, class, religion, whatever) there has always been less material available, so it does not change the status quo much. Although the digital turn has had some democratizing effect on access to technology and information, the poorest folks are producing fewer digital documents and, more importantly, storing fewer digital documents than more affluent people.

As a historian I don't want to see any sources lost, but I am not quite as concerned about the loss as others in part because I am used to studying a period where we muddle through with fewer documents.

ComplexComedian

I will speak from personal experience. I am in the second year of my PhD. My master's project looked at rhetoric around so-called vaccine alternatives. In that project, I used a variety of internet sources and blogs (albeit without using tools to clean and process my data, so it was time-consuming). My PhD dissertation explores the Canadian anti-vaccine movement from the 1980s on. I have found the internet invaluable to the project. Getting access to the voluminous information generated by the anti-vaccine activists of Canada could only happen with digital tools. It also allows for some exploration of transnational trends and the place of Canadian anti-vaccination within a global context that would previously have been difficult to trace.

I think that Digital History opens up a lot of interesting questions and answers about the internet itself and how the internet has changed communication. And in the case of my PhD project, I hope to understand the creation of the Canadian anti-vaccine movement and its evolution as the internet opened up new avenues of communication. In fact, while my research is in its early stages, I find that the biases of digital communication (it was adopted by and accessible to middle- and upper-class Canadians first; see Milligan, Age of Abundance) have a lot to do with the white middle-class face of the movement.

The questions I can ask (and indeed the questions I need to answer to get a satisfying picture of the movement) require a lot of digital sources. I am learning to code and am deeply indebted to the existing repositories for interacting with the archived web, which is stored in ARC files (1996-2008) and WARC files (2008-). I will be conducting network analysis and topic modelling as part of my use of digital tools, but tools are only as good as the sources you bring to bear and the methods you use to deal with gaps.

The biggest hurdle in DH of the early internet, and in manipulating the data at scale, is creating collections from archived material that fit your needs. The Internet Archive has a subscription-based archival tool, Archive-It, but a collection starts when you (or some other person) create it. There isn't a collection for anti-vaccine material from 1996-. With Covid, many collections have been started, but these will leave out valuable information about what preceded 2020. Creating a collection by pulling old ARCs and WARCs is coding- and resource-intensive. Storing the data is hard to do. As a PhD student, I find these problems hard to solve. I am extremely lucky that my advisors have partnered with Archives Unleashed Cohorts, or I would likely have been at a loss for how to do what I wanted to do and knew would be necessary for a discussion that probes internet sources.

It is my hope that my work with the Internet Archive to create a "post-hoc" web collection will result in the development of tools that make it easier for all scholars to go through the same process. Digital history is going to make grants all the more important to generating adequate scholarly analyses, and I hope to lower that barrier, if only by a small amount. There will be problems created by what sources are left behind (but what era hasn't had these issues?), but also new questions and new answers to be probed. Certain vlogs and blogs are still being created and archived by organizations. There will be questions to ask about why this blog, and not others. How does the ad model shape the writing of these documents? There will also be questions about whether it is ethical to read some of these personal moments. Despite your worries about personal moments being lost, many are being captured, often without consent, and may be used while a person is still alive. How we engage with these ethical issues is extensively discussed in Ian Milligan's work on GeoCities (an early social media platform). Similar questions may be asked of archived Twitter posts. While there will be issues to work out, I think that the earlier tools of historians can be successfully brought to bear to interrogate born-digital sources.