What Can Computer Algorithms Tell Us About Literature?

The fourth floor of the English department contains a peculiar research lab. It is a space where “tokenize” and “Tolkien” can occur in the same sentence, and where “word vectors” and “Wordsworth” can appear in the same presentation.

This is the Stanford Literary Lab. Housed within one of the humanities havens on campus, the Literary Lab sits, quite literally, at the intersection of English Literature and Computer Science. Directed by Professor Mark Algee-Hewitt, the Literary Lab describes itself as a “research collective that applies computational criticism, in all its forms, to the study of literature.”¹ Here, volumes of fiction are fed into machine learning algorithms, novels are processed by topic-modeling systems, and the interwoven words of poets are spliced from their carefully formed phrases and mapped into vector space.

While the number-crunching language models in Computer Science classrooms draw no puzzled looks, the literature-ingesting algorithms in English departments often do. But before looking at “computational criticism in all its forms,” let us back up for a moment.² How does computational literary criticism differ from existing literary analysis?

Traditionally, scholars approach literary analysis through close reading. By studying the form and construction of literary texts, literary scholars reveal themes and motifs that are often invisible to the casual reader. Computational literary criticism (also known as computational literary studies) provides a modern-day alternative to traditional forms of literary analysis. It proposes to supplement the human reader by offloading certain text-processing tasks to computers. As Professor Nan Z. Da describes, computational criticism “usually entails feeding bodies of text into computer programs to yield quantitative results, which are then used to make arguments about literary form, style, content, or history.”³

Seemingly blasphemous to the literary purist and plain bewildering to the literary agnostic, computational literary criticism raises the critical question: Why? What can computer algorithms reveal about Romantic poetry and the Victorian novel that a human reader cannot? What can we learn about literature when we transform qualitative accounts of human experience into quantitative charts, graphs, and models?

The metaphors computational literary critics often employ are the telescope and the microscope.⁴ ⁵ Like a telescope, computers can examine a galaxy of texts while the human reader is necessarily limited to a smaller selection of “star” canonical works. Like a microscope, computers can trace stylistic minutiae across the vast span of literary production. Bypassing problems of scale, computational critics can zero in “on microlevel linguistic features […] that map directly over macrolevel phenomena.”⁶ While a human reader can offer interpretations of select novels, computers can explore larger literary archives and reveal broad aesthetic and cultural patterns throughout literary history.

In “Loudness in the Novel,” for instance, Holst Katsma traces the percentage frequencies of speaking verbs (e.g., “said,” “replied”) across 19th-century British fiction and weights them by their loudness.⁷ (The speaking verb “shouted,” for example, is louder than “whispered.”) Observing a “century long diminuendo” in the British novels,⁸ Katsma parallels this literary phenomenon with the cultural changes of a modernizing world: a world which turned from “rage and wonder” toward “boredom, depression, nostalgia, and anxiety” (Fisher,⁹ qtd. in Katsma); a world which, once perilous and miraculous, was becoming increasingly “quotidian.”¹⁰ The 19th century and its microlevel literary features, as Katsma argues, map onto larger cultural trends.

Computational power allows Katsma to sift through a century-long corpus for speaking verbs. The telescopic breadth of his corpus and the microscopic scale of his syntactical interests lend themselves well to computational analysis. In his research, Katsma turns text into digits, and the novel into material that can be parsed as well as read. As he states in his paper’s conclusion: “the main revelation is the discovery that loudness is perceivable and measurable within the novel. […] Written language codifies loudness; the word becomes its own type of gramophone-record; and the text preserves variations in loudness over time.”¹¹ Codified prose and computer code: in today’s technological era, written language can be both a measurable object and the program that does the measuring.
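As a rough illustration of the kind of measurement described above, the following Python sketch counts speaking verbs in a passage and averages their loudness. The verb list, the numeric loudness weights, and the function name are hypothetical choices made for this example; they are not Katsma's actual lexicon or scale.

```python
import re
from collections import Counter

# Hypothetical loudness weights (1 = quiet, 3 = loud).
# These values are assumptions for illustration, not Katsma's actual scale.
LOUDNESS = {
    "whispered": 1,
    "murmured": 1,
    "said": 2,
    "replied": 2,
    "shouted": 3,
    "cried": 3,
}


def loudness_profile(text):
    """Return (percent frequency of speaking verbs, mean loudness) for a text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Count only the tokens that appear in the speaking-verb list.
    counts = Counter(w for w in words if w in LOUDNESS)
    total = sum(counts.values())
    if not words or total == 0:
        return 0.0, None
    percent = 100.0 * total / len(words)
    # Weight each verb's count by its loudness, then average.
    mean = sum(LOUDNESS[verb] * n for verb, n in counts.items()) / total
    return percent, mean


sample = '"Run!" she shouted. "Where?" he whispered. "Anywhere," she said.'
percent, mean = loudness_profile(sample)
```

Run over novels grouped by decade, a function of this shape would yield one loudness figure per period, which is the sort of series in which a gradual quieting could become visible.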

With recent leaps in Computer Science, computational literary criticism has taken on new forms and moved beyond the world of word frequency analysis. For instance, Martin Paul Eve uses information retrieval techniques to identify anachronistic language in David Mitchell’s Cloud Atlas, a novel in which chapters are set in different time periods.¹² By writing a Python script to query and cross-check word origins, Eve is able to explore the relationship between historical fiction and historical accuracy in Mitchell’s novel.¹³ ¹⁴ At the Literary Lab, the Microgenres project uses machine learning to “identify points at which authors incorporate the language and style of other contemporary disciplines into their narratives.”¹⁵

While technologists may delight in transforming words to numbers with their aptly named word2vec algorithm, literary scholars are divided on the value of computational techniques in their field. Skeptics have pointed to the poor statistical rigor and the lackluster results that certain projects have produced.¹⁶

Like certain engineers who offer up Rube Goldberg solutions for simple problems, computational literary critics have taken some puzzling approaches to literary research. New York Times writer Kathryn Schulz recalls a computational finding which “defines ‘protagonist’ as ‘the character that minimized the sum of the distances to all other vertices.’”¹⁷ Her reaction: “Huh? O.K., he means the protagonist is the character with the smallest average degree of separation from the others, ‘the center of the network.’ So guess who’s the protagonist of Hamlet? Right: Hamlet. Duh.”¹⁸
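The definition Schulz quotes can be made concrete with a short sketch. Assuming a small, made-up character network for Hamlet (the edges below are illustrative and are not taken from the study she discusses), the code uses breadth-first search to find the character whose summed distance to everyone else is smallest. The names NETWORK, distance_sum, and protagonist are invented for this example.

```python
from collections import deque

# Illustrative, hand-made network: an edge means two characters interact.
# These edges are an assumption for the example, not data from the study.
NETWORK = {
    "Hamlet": ["Horatio", "Claudius", "Gertrude", "Ophelia", "Laertes"],
    "Horatio": ["Hamlet", "Marcellus"],
    "Claudius": ["Hamlet", "Gertrude", "Laertes"],
    "Gertrude": ["Hamlet", "Claudius"],
    "Ophelia": ["Hamlet", "Laertes"],
    "Laertes": ["Hamlet", "Claudius", "Ophelia"],
    "Marcellus": ["Horatio"],
}


def distance_sum(graph, start):
    """Sum of shortest-path lengths from start to every reachable character."""
    dist = {start: 0}
    queue = deque([start])
    # Breadth-first search visits characters in order of increasing distance.
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return sum(dist.values())


def protagonist(graph):
    """Character minimizing the distance sum (assumes a connected network)."""
    return min(graph, key=lambda name: distance_sum(graph, name))


center = protagonist(NETWORK)
```

On this toy network the sketch arrives at the answer Schulz anticipates: Hamlet sits one step from nearly everyone, so his distance sum is the lowest.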

However, while these “duh”-inspiring results may be easy to dismiss, the Literary Lab’s director Mark Algee-Hewitt pushes back against this reaction: “As a field, we’ve been practicing [literary criticism] for a century or so, we do have good theories on the history of literature, and some of those turn out to be absolutely right. But even in those cases, it’s good to confirm things by a completely different methodology.”¹⁹ While the literary field may be used to prizing original and maverick ideas, the scientific community values replicable claims. As an interdisciplinary field, computational literary criticism can be valuable even when it reiterates past literary findings.

In other cases, Algee-Hewitt has found that computational research findings are only obvious to literary scholars in retrospect. In one project, the Literary Lab team asked a room of poetry scholars to guess the most frequent poetic meters across different points in time. They then presented the results produced by computational techniques. The reaction? “Oh yeah, of course! It’s iambic pentameter followed by iambic tetrameter.” But, as Algee-Hewitt recounts with a chuckle, these seemingly obvious results were not what the scholars had originally guessed.²⁰

Computational literary criticism has also demonstrated its ability to surprise the literary field. For instance, literary theorists and philosophers have long mused over the distinctions between fiction and non-fiction. Are there truly semantic and linguistic differences between these two genres? While some theorists have leaned towards the negative, recent computational analyses suggest otherwise. The Microgenres project at the Literary Lab trained an algorithmic model to classify texts by genre based on grammatical features (e.g., percentage of verbs, length of sentences).²¹ It found that there are not only different writing styles between novels and other prose, but also recognizable distinctions between writings from various fields, such as natural history and anthropology.²² While theoreticians may broadly claim that there perhaps is “no feature of style or syntax that marks off a work as fictional rather than non-fictional,” a computer is able to investigate this “perhaps” through practical research.²³
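To give a sense of what “grammatical features” might look like in practice, here is a minimal Python sketch that computes two of the features mentioned above for a short passage. It is not the Microgenres pipeline: the tiny verb list is a hypothetical stand-in for a real part-of-speech tagger, and the function name is invented for this example.

```python
import re

# Tiny hypothetical verb list standing in for a real part-of-speech tagger.
VERBS = {"is", "was", "walked", "said", "grows", "measured", "found"}


def grammatical_features(text):
    """Return mean sentence length (in words) and percentage of verbs."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {"mean_sentence_length": 0.0, "percent_verbs": 0.0}
    verb_count = sum(1 for w in words if w.lower() in VERBS)
    return {
        "mean_sentence_length": len(words) / len(sentences),
        "percent_verbs": 100.0 * verb_count / len(words),
    }


sample = "She walked home. The plant grows slowly in shade."
features = grammatical_features(sample)
```

Numbers like these, computed for thousands of passages with known genres, are the kind of input a classification model could be trained on.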

In recent years, English departments have been adapting to a computational future. On the faculty side, nearly a quarter of Stanford’s English professors are associated with the Literary Lab.²⁴ Their projects range from training algorithms to recognize poetic meters to mapping the “typicality” of a novel.²⁵ In 2015, Stanford’s English department launched a Digital Humanities minor, which offers English courses that cover sentiment analysis, word embeddings, and the basics of programming in R.²⁶

Computational literary criticism extends beyond Silicon Valley universities. Harvard’s English PhD requires reading fluency in two foreign languages from Latin, Ancient Greek, Old English, French, German, Spanish, and Italian. Today, it also accepts “computer languages, if deem[ed] … relevant and appropriate to a student’s program of study.”²⁷ In Victoria, Canada, the Digital Humanities Summer Institute trains current and budding scholars in technical skills through courses such as “Introduction to Javascript and Data Visualization” and “Stylometry with R: Computer-Assisted Analysis of Literary Texts.”²⁸

While computers may not yet have the ability to interpret literature, they are nonetheless becoming more and more relevant in a field which at first glance seems the antithesis of all things computational. For centuries, we, with the human eye, have gained insights (and pleasure!) from key pieces of poetry and prose. Now, paired with computational spectacles, we might wonder: what can and can’t we see when we peer into the smallest and largest patterns of literature? When we examine the particulate levels of the novel? When we connect the statistical dots and trace trends that map across our literary skies like constellations?