Scale and Method: A Reply to Jeremy Rosen
It's always a pleasure to have a thorough response to one's work, particularly when there is much in it with which to agree. Jeremy Rosen's reply to my "Contemporary Fiction by the Numbers" is such a response, helpfully amplifying several important points from the original piece. To these I will add only a handful of clarifications and differences of emphasis.
First, though, a restatement of what I was doing in my essay. The piece had two aims, namely to advocate for the addition of computational methods to our critical repertoire and to give a sample of recent computational work of the sort I find useful. I mention these goals up front because I think some of Rosen's criticisms follow from the failure (mine, to be sure) to specify exactly what my essay was and was not doing and arguing. So to be clear: it was an argument for methodological expansion, especially for those of us working with contemporary sources, and a high-level synopsis of the results of that expansion.
Rosen raises three main points, considered here in roughly the order in which his essay presents them.
1. It is possible to misinterpret data.
On this point I agree entirely. It is not only possible but inevitable that we will make mistakes as we produce and interpret quantitative results. If what's wanted is a method that's proof against error, we have never found one, and I'm sure we never will. To this I would add only that the same is true of the methods we've long employed, which are certainly riddled with errors of fact, logic, and interpretation. Neither I nor anyone else argues that newer methods should be held to a lower standard. But neither should new techniques be required to meet standards that we quite rightly do not demand of our existing work.
But here I am perhaps letting myself off too easily. Rosen's complaint isn't only that quantitative results can be misinterpreted but that I did, in fact, misinterpret some of the data I presented. Specifically, he claims that the unexpectedly "diversely outward looking" landscape of American fiction circa 1850 that I detected by way of geolocation extraction collapses on closer inspection into both logical error and simple confirmation bias. But the issue is largely explained by a difference of understanding of the phrase "diversely outward looking," which Rosen reads as a synonym for "broadminded" but which I used without the implied moral valence to mean, well, outward looking in diverse ways.
In support of his claim, Rosen, made of sterner stuff than I, reads Joseph Cobb's Mississippi Scenes, in which he finds references to slaves and slavery in connection with Africa, concluding that because these scenes describe American issues, their references to Africa are either misleading or irrelevant. On this point I'm afraid we simply disagree, and for two reasons. First, the ways we use place to frame issues of origin or belonging and to construct related metaphors matter deeply. If slaves are described as sons of Africa rather than as sons of the South, for example, that's an important part of how the institution of slavery, domestic race relations, and the United States' place in the world were and are constructed. It doesn't mean, of course, that every mention of Africa (or anywhere else) indicates a deep investment in that location, but patterns of usage and attention do matter. If forty percent of the named locations used in U.S. fiction of the mid-nineteenth century fall outside the United States (as is indeed the case), that fact strikes me as tremendously significant, not because each and every occurrence is especially important, but because such pervasive use of foreign places tells us something new and unexpected about the imaginative geography of the period.
This in turn raises my second point, namely that what's interesting about corpus-level work is what it reveals about exactly these broad patterns and their evolution over time. If we really want or need to understand Cobb in detail, we should read Cobb. Doing so will tell us far more about Mississippi Scenes than any computational work likely ever will. If we want to understand place in the literary production of the mid-nineteenth century, on the other hand, we have a choice: we can do it relatively quickly through a combination of text mining, selective close reading, and historical and theoretical contextualization (not all of which was possible in my few hundred words of synopsis), or we can spend most of our careers repeating our approach to Cobb across the remaining texts written in 1851 and hope that someone else will take up the project for 1852, 1853, 1854, and so on. Each approach will be better on some issues and weaker on others. The point isn't that one or the other of these choices is the right one, but that there is indeed a choice (or, if you prefer, a matter of emphasis and extent) involved, one with what should be obvious benefits and costs on both sides.
2. Datasets are not neutral.
This is a more interesting point and one with which I again agree entirely. The data we produce and the conclusions we draw through computationally assisted criticism are certainly shaped by our selection of source texts, which are in turn inevitably constrained by material, legal, professional, intellectual, and historical circumstances. We should and do bear these issues in mind as we pursue our research, and we should understand that we are indeed often blind to our blind spots. But of course the same thing is true of more conventional literary research, which suffers not only from essentially the same issues of access, collection, and historical accident, but also from the profound limitations of scope imposed by the slow pace of reading. This is another way of saying that problems of canon formation continue to plague traditional literary scholarship in ways that we have not always fully appreciated. As I observed in my essay, our difficulty is especially acute in the contemporary case, where production outstrips consumption at such a rate that we necessarily know almost nothing about what's being written around us.¹ Large-scale computational work is motivated in part by the desire to have more information about a larger range of literature than would otherwise be possible.
Concerning the Wright bibliography used in my own work, Rosen says little beyond observing that it was assembled some time ago. Fortunately, Wright's task of cataloging American fiction was relatively straightforward. I'm sure he missed volumes, especially those not held by academic libraries, which in turn likely underrepresent marginal writers. I'm also sure that there are items in his bibliography that we might want to classify as other than fiction. But his work has the merit of a more or less unified, consistently applied standard, and there's no reason to believe it skews in any particularly problematic direction for the task at hand. Rosen's reservation applies more directly to a project like Franco Moretti's historical mapping of genres, which relies on periodizations drawn mostly from single sources in the professional literature. But still, isn't the published, professional, peer-reviewed literature supposed to be exactly the sort of thing you can more or less trust, especially when you're working outside your own core area (as you often will be if you're doing broad historicist work)? I'm not saying we shouldn't ask questions about our sources, but if you consistently can't trust the literature, isn't that a pretty damning assessment not of new methods but of our old ones?
This is probably as good a place as any to say a word about the "methodological absolutism" of which Rosen accuses me. I confess that I find this charge baffling and certainly unsupported by anything I wrote. As I have suggested both in my original essay and in my remarks here - and as Rosen himself observes - it's clear that computational and conventional literary critical methods complement one another in very fruitful ways. That said, I've also tried to be honest about the fact that trade-offs are certainly involved in emphasizing one or the other set of methods. I know of no one on the digital side of things who has ever suggested otherwise, though I often hear charges to the contrary leveled by conventional critics who would prefer to ignore computational methods entirely (see Katie Trumpener's "Paratext and Genre System: A Response to Franco Moretti" in Critical Inquiry, for example, or her follow-up comments in the Chronicle of Higher Education).
3. The cultural turn is a mistake.
Rosen's third major point - that we should restrict our attention to a handful of texts rather than to large groups of them - is the only one with which I find myself unable to agree. "Why," he asks, "should we think determining the character of 'contemporary cultural production as a whole' is desirable?" As far as I can tell, Rosen is in earnest when he poses this question. I'm not sure, given the course of criticism over the last couple of generations, that it requires much of an answer, but I'll make two observations about what does and does not follow from the project of large-scale cultural criticism.
First, Rosen suggests that either my proposed methods or cultural criticism more generally (or both) demand that we treat all cultural objects equally. This is a charge of epistemic relativism long familiar from the culture wars of the eighties and nineties; it's been refuted often and well and certainly needn't be rehashed here. Let me only observe that there's nothing in the methods I've discussed that requires the texts in question to be given equal weight. It would be easy, for example, to weight word or location frequencies by the number of copies of each book sold, or by the number of times each book is cited in the MLA bibliography, or by some more subjective assessment of the "importance" of each. In some cases and for some research questions we'll want to do this. For others, we won't. The distinction turns largely on whether we're treating the texts in question primarily as symptoms of their cultural situation of production or as interventions in the course of literary history. We should reiterate, however, particularly for a contemporarist audience, that it's often difficult to identify new work that is or will be particularly important in even the slightly longer cultural run, and so we'll probably want to err on the side of modesty concerning our ability to discern any strong and enduring version of importance.
Second and more broadly, Rosen concludes that because it is as a practical matter difficult to work with large swaths of cultural products and because it's hard to say exactly what does and does not count as a legitimate object of cultural study, we ought not even to try. I disagree. Rosen claims that to ask large-scale questions is to pursue a Casaubonian key to all mythologies. Again, I disagree. The issue isn't one of producing a theory of everything; it's a matter of the kind of claims we're after. Think of this problem by analogy to map making. I'm not opposed to 7.5-minute topographic quadrangles, but I'd like to have some sense of where those maps fit on a larger, necessarily coarser representation of (cultural) space. At the moment, our large-scale maps are primarily matters of guesswork; I'd like to see this change.
Moreover, while it's certainly correct that computational work calls direct attention to a set of issues concerning the outer edges of cultural production that we've long been able (mostly) to ignore, it's not as though we haven't in fact had these problems. We just haven't needed to address them because other, more immediate difficulties have gotten in the way. Our inability to deal with the mass of unpublished or marginally published work, for instance, has only seemed trivial because we were so far from addressing any meaningful portion of conventionally published books. If we didn't have much to say about the broad relationship between different aesthetic modes, it was because we had our hands entirely full with a small subset of books or plays or films or songs or paintings or . . . . We've made a virtue of our necessarily narrow focus and we've pushed it a long way, but we shouldn't forget that it was indeed a necessity to begin with.
With all of this said, it's clear that Rosen and I agree in more ways and in more areas than we disagree, and I thank him for the opportunity to reflect further on an important set of issues. The field is changing quickly these days; let us continue both to explore the best ways in which to ask and answer the questions that have long motivated us and to pursue new problems that are only now becoming legible.
1. For more on the problem of canons in connection with digital methods, see my "Canons, Close Reading, and the Evolution of Method," in Debates in the Digital Humanities, ed. Matthew Gold (Minneapolis: University of Minnesota Press, 2012), 249-58.