How do historians of relatively modern eras sift through the ever-increasing glut of available primary sources? How have the Internet and its massive data archives changed (or how will they change) the academic study of history?

by chocolate_on_toast

Historians of ancient and early eras often have to pore over the few surviving texts or artifacts from that era, examining and re-examining the source to squeeze every bit of information or context from it, and building on other historians' analyses of the same source.

But as the era in question gets closer to the modern day, there's an obvious and natural increase in the number of surviving primary sources. r/askhistorians accepts questions on topics at least 20 years old, so the study of 'history' comes right up into living memory, and into the age of the Internet. Now historians can access the forum posts and blogs of everyday people discussing events as they happened, on a potentially massive scale.

So how does this sudden huge increase in the availability of sources affect the academic techniques of historians? Are the recommended methods for selecting which sources to use and assessing their validity (in terms of whether the author is a respected authority on the subject, an 'everyman', or sowing deliberate misinformation, etc.) different when there is an excess of information, compared with the study of eras when information is scarce?

Are there any historians studying Internet-Age topics who could give their experiences of sifting through millions of sources and how they might expect the academic techniques of history to change in response to this?

Thank you!

Bernardito

There is no doubt that digital history is impacting, and will continue to impact, the academic study of history.

You appear to be more interested in born-digital material (primary sources created, 'born', online) than what historians usually think of when they think about digital history (digitized primary sources), which is why I will focus on the former, but I feel like it is important to emphasize just how vital the increasing digitization of source material has been to historians all around the world. Where you once had to travel and spend quite a bit of money to access archival material, you can now access a great deal from the comfort of your own home without any time limits and without having to fumble with cameras or your phone. For younger generations of historians, this means being able to research subjects that could easily have been out of reach for an undergraduate thesis, such as non-European history. The Internet has made it possible for historians to also reach out to the general public and talk about their research in a more direct way, whether on here (through our wonderful flairs and AMA sessions, as well as our recent conference) or on social media like Twitter and Facebook.

Returning to the question of born-digital source material, it is indeed a challenge. I am one of those historians whose current source base is entirely made up of born-digital source material: YouTube comments, forum posts, Reddit threads. I have sifted through hundreds of forum posts and thousands of YouTube comments in order to discover the breadth of online racism and how it relates to the historical memory of the First World War.

So how did I come to decide on the specific forums, videos, and subreddits that I am sifting through? As with any historical research subject, a historian has to decide what question they are trying to answer. The historical question will in turn help to determine what source material and methodology the historian will use. Let's use the question I am researching right now: "Why do people question the presence of people of color in historical representations?"

It is evident that this is a very broad question. Let's focus it a little by anchoring it in a digital source: a video game. In 2016, there was a widespread backlash against the inclusion of non-white soldiers in the First World War video game Battlefield 1. Why did this happen? With a focused subject, we can now move on to the born-digital sources. The choice of a video game means that we are looking at a specific community. Naturally, the discussions held on Reddit (first on /r/Gaming and later /r/Battlefield_One) immediately become valuable alongside the official Battlefield 1 forum on Battlefield.com. Although these offered a fair mix of invested players and casual players, I felt like I still had to broaden the field. In order to gauge the initial reactions, I turned to YouTube and the comments on the incredibly popular official release trailer for the game.

The method of analysis was based on a qualitative approach, the same one I have used for non-digital research. The assessment of the validity of the sources was similarly based on my training as a historian, but it was also very much determined by the question I had asked myself at the outset. I am explicitly looking for opinions by strangers online who are first reacting to and later playing the game. Since the discussions refer to a visual and interactive medium, it's made very clear in most comments that the users have played the game (since they are referring to specifics) or have seen promotional material (in the case of the early reactions).

In my case, I would not say that my methodology or overall approach to my digital research has been different from what I have previously carried out in archives. Many of the tools, methodologies and theories remain the same. Yet at the same time, I am aware that this will differ from one historian to another. Digital history is a considerably broader field than I have described here and includes not only digital resources but also software, not to mention big data. Since I specifically do not have much experience with that part of it, I feel like it is not my place to talk in-depth about it. However, I hope that my personal experience of digital history has been helpful in some way. I now need to return to writing my book!

Sources:

Anna Nilsson Hammar, "Digital History," Scandia 81:1 (2015), 99-110.

Kenneth Nyberg, Om digital historia (2013).

Jessica Parland-von Essen and Kenneth Nyberg (eds.), Historia i en digital värld (2014).

.... and of course, my upcoming book, White Mythic Space: Racism, the First World War, and Battlefield 1 (forthcoming).

evil_deed_blues

Plenty of questions about the value and use of internet sources revolve around (even if they're not entirely reducible to!) broader methodological questions for different historical subfields, even for born-digital materials. /u/Bernardito raises some good points about source use and public history - archives are digitized, it's easier to communicate and articulate findings (even at the turn of the century academic/personal blogs are wonderful for this).

I'll speak very briefly on intellectual history, which I'm a bit more familiar with. A history of something like neoliberalism is not a history of neoliberal thinkers tout court, but demands analysis of the ever-evolving ideas, institutions, and cultures rooted in expanding, remodelling in the image of, and sustaining markets and competition. But what can we use to examine this? There was a pretty heated (not just by historian-blog standards) discussion surrounding the proper subject of intellectual history a while back, spawned by a post on an intellectual history of 'irrational thought' - intellectual historians largely have not seriously, substantively engaged with the work of authors like L. Ron Hubbard, given their internal inconsistencies, bad faith (at times) and messiness. Their influence is undeniable - but historians like Nils Gilman asserted that a different field like cultural or social history was better suited to the task. In Gilman's words:

I started asking, rhetorically, whether it would be appropriate to write an intellectual history of, say, Glenn Beck’s half-baked rantings. When I got some affirmatives to that, I went yet further, and asked, How about an intellectual history of Honey Boo-Boo’s utterances? Yes? Then how about an intellectual history of moans by porn stars? All verily texts, no?

Not all of the historians in this exchange agreed with Gilman's claim that objects for intellectual study should meet the following criteria:

(1) "“carefully wrought” by its author(s); that is, it should be something that was self-consciously produced to be read with care." (i.e., administrative records and telephone books would not count)

(2) "the text needs to be one that situates itself within a tradition of similar texts. Most such texts will usually signal this intertextuality in relatively overt ways, that is, by referencing or commenting on other, similar text."

Or even if they did, they disagreed that the examples Gilman cited as patently ridiculous were unworthy of historical study, particularly since political texts necessarily situate themselves amongst wider discourses and claims. (There's another response here from Edward Blum - the comments sections of all 3 linked articles are worth a look.) You can see what I'm getting at with internet sources as a result - brief internet comments and snippets might be something intellectual historians have difficulty (if any interest in the first place) incorporating into their work, even if it's undeniable they reflect, produce and articulate particular social and cultural milieus. I'm not ruling out the possibility that a book will one day be subtitled "an intellectual history of Donald Trump" (if my search skills are accurate, this comment should be only the third time on the internet this string of words appears), but perhaps an intellectual history of his tweets would be harder to make cohere. But a social/cultural history? Certainly!