Concepts, Models, and Experiments

Text Analysis

Natalie M. Houston

Department of English | University of Massachusetts Lowell

Please visit the final version of Digital Pedagogy in the Humanities, where you can read the revised keywords and create your own collections of artifacts.



Text analysis is fundamental to humanities scholarship and teaching because readers are always already analyzing text, whether unconsciously or with intention. Readers analyze and understand aspects of a text’s bibliographic and visual signification through paratextual, somatic, material, and institutional encounters with the text, long before reading a word. Readers analyze a text’s linguistic codes of syntax and semantics through a variety of cultural, disciplinary, and subjective frameworks and filters. Regardless of the time period, language, or form of the text, or the questions that motivate our approach, humanists frequently:

  • select or collect texts in order to explore a hypothesis;
  • look for patterns (of words, ideas, symbols, rhetorical or formal structures, etc.) within an individual text and/or within sets of texts;
  • discover relationships (of development, dependence, seriality, association, intention, allusion, intertextuality, etc.) between parts of texts, whole texts, or sets of texts;
  • interpret the significance of these patterns, relationships, and texts;
  • develop arguments for the larger significance of these interpretations.

In humanities research, these steps are often iterative and recursive and are rarely labeled as hypothesis, data collection, experimentation, analysis, and argument. Instead, all of these things are called reading. This conflation of very different activities under one word has heightened recent debates between data-driven approaches to large-scale analysis, what Franco Moretti has termed distant reading, and the traditional formalist and hermeneutic approach called literary close reading (Moretti; Trumpener; Goodwin and Holbo). If reading is often hailed as a specific kind of pleasurable, human activity, the term text analysis may seem in contrast to emphasize statistical approaches to quantifiable aspects of language (Hoover; Jockers 25). The specific disciplinary and institutional histories of computer-assisted text analysis, humanities computing, and computational linguistics variously intersect with and diverge from those of literary studies more generally (Rockwell; Jockers; Ramsay 2011; Bonelli).

But other scholars have argued that computational analysis merely makes explicit the codes and rules already embedded in the nature of textuality itself. Michael Witmore explains:

I would argue that a text is a text because it is massively addressable at different levels of scale. Addressable here means that one can query a position within the text at a certain level of abstraction.

Such abstractions include words, characters, themes, or phrases within a particular text, but also the broader categories of form, genre, book or work. Witmore emphasizes that this is true of all texts, not merely those that have been recently digitized: “addressability as such: this is a condition rather than a technology, action, or event.” Texts are, and have always been, open to multiple methods of analysis. Digitization and computational tools only make it easier to explore different levels of address, from the usage of specific words to features of genre or references between different works:

  • large-scale digitization changes our access both to specific texts and to new quantities of texts;
  • relational databases and full-text search expand the kinds of research queries that can be pursued;
  • new media forms and new interfaces transform how we understand and perform acts of reading;
  • the widespread availability of computational power and storage offers new ways of curating, displaying, and using collections of texts for human or machine analysis;
  • tools for data visualization and multimodal composition offer new ways of exploring texts and building arguments.

Not only might the objects of humanist study be seen as always addressable, but its methods of analysis can also be understood as already aligned with computation. Stephen Ramsay points out that “critical reading practices already contain elements of the algorithmic” because critical interpretation “relies on a heuristic of radical transformation. The critic . . . puts forth not the text, but a new text in which the data has been paraphrased, elaborated, selected, truncated, and transduced” (Ramsay 2011, 16). Digital technologies can be used to expand the scale of traditional methods (and thereby transform them) or to open entirely new modes and possibilities for text analysis.

The artifacts presented here represent a broad spectrum of approaches to teaching text analysis, which I have organized into four categories: rethinking the digital, text analysis tools and methods, textual editing as text analysis, and communicating text analysis digitally. I have focused on assignments aimed at the undergraduate classroom, but many of these could be adapted for other levels. More specialized resources and syllabi for text analysis in courses involving programming languages can be found under Further Resources.

Rethinking the Digital

Digital Pedagogy Unplugged



Fyfe provocatively asks, “Can there be a digital pedagogy without computers?” and offers several examples of assignments that treat “the ‘digital’ in the non-electronic senses of that word: something to get your hands on, to deal with in dynamic units, to manipulate creatively.” Rethinking digital pedagogy in this way not only allows students and instructors with varied access to electronic technologies to explore new kinds of assignments, but it also creates useful linkages between thinking about the materiality of print artifacts and that of digital texts. For example, Fyfe imagines a curatorial assignment where students gather, remix, and analyze physical artifacts rather than images on a screen. Textual annotation performed manually with colored pens activates the pattern matching skills of the human brain in ways analogous to the discovery of statistical patterns through data visualization. Such assignments could be scaffolded with digital assignments that use computational tools to emphasize shared methodological and theoretical principles.

Indexing In Memoriam Assignment



The origin of humanities computing is usually dated to 1949, when Father Roberto Busa began working with IBM computers to produce a concordance to the works of St Thomas Aquinas (Hockey). Of course, concordances and indexes long predate electronic computers, and, as Geoffrey Rockwell suggests, are premised upon hermeneutical assumptions of coherence and generative rule-bound procedures (Rockwell 211). The index is thus another example of “digital” or “hands-on” technology that expands beyond the electronic. Buurma’s assignment asks students to create an index to Tennyson’s In Memoriam or to use an existing index to create a new edition of the poem, foregrounding how informational technologies like the index create, constrain, or complicate the interpretation of literary works.
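
For readers who want to see how rule-bound a concordance is, the following is a minimal sketch in Python of a keyword-in-context listing. It is an illustration only, not part of Buurma's assignment: the function name kwic, the context width, and the choice of sample lines (the opening of the Prologue to In Memoriam) are assumptions made for this example.

```python
import re

def kwic(text, keyword, width=3):
    """Return keyword-in-context hits: `width` words on either side of each match."""
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits

# Opening lines of the Prologue to Tennyson's In Memoriam (public domain).
sample = (
    "Strong Son of God, immortal Love, "
    "Whom we, that have not seen thy face, "
    "By faith, and faith alone, embrace, "
    "Believing where we cannot prove"
)

for left, word, right in kwic(sample, "faith"):
    print(f"{left:>25} [{word}] {right}")
```

Even this small procedure embeds interpretive decisions, such as what counts as a word and how much context is enough, which is the kind of question the indexing assignment brings to the surface.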

Text Analysis Tools and Methods

Distant Reading Duffy



This assignment, which uses the Voyant online suite of text analysis tools (http://voyant-tools.org/), foregrounds the difference between traditional close reading of a small number of texts and more distant reading of a larger set of texts. In presenting the assignment, Croxall emphasizes the ludic tradition in text analysis, exemplified by the work of Jerome McGann, Geoffrey Rockwell, and Stephen Ramsay, reminding students that:

we might not learn anything earth shattering—or even anything—by taking this approach. That’s okay. We are, to a certain extent, just screwing around. We’re operating here under the principle of experimentation that has guided our class. (cf. McGann and Samuels, Rockwell, and Ramsay 2014).

Allowing and encouraging an experimental attitude is important in introducing students to tools that help them see textual patterns in new ways. This assignment also asks students to contribute to the work (now conducted over several years by different iterations of Croxall’s course) of transcribing the texts for digital analysis. Making the labor of text preparation and cleaning evident to students demystifies the processes of text analysis and opens up conversations about textual transmission more generally.
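
The kind of pattern that word-frequency tools surface can be approximated in a few lines of Python. The sketch below is not Voyant's implementation and is not drawn from Croxall's assignment; the two short "poems" are hypothetical placeholders, and the function names are assumptions chosen for illustration. It counts how often each word occurs per 1,000 tokens so that texts of different lengths can be compared.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def relative_frequencies(texts):
    """Map each title to a Counter of word -> occurrences per 1,000 tokens."""
    table = {}
    for title, text in texts.items():
        tokens = tokenize(text)
        counts = Counter(tokens)
        table[title] = Counter(
            {word: 1000 * n / len(tokens) for word, n in counts.items()}
        )
    return table

# Hypothetical placeholder corpus; a class would load its own transcriptions.
corpus = {
    "poem_a": "the sea the sea and the silver moon",
    "poem_b": "a red rose and a white rose in the garden",
}

freqs = relative_frequencies(corpus)
for title, table in freqs.items():
    print(title, table.most_common(3))
```

The tokenizing step is where transcription and cleaning decisions show up: a stray typo or inconsistent spelling in the source text becomes a separate "word" in the counts.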

Team Project Description for English 203 (Hamlet in the Humanities Lab)



This assignment sets up a two-phase group project in which students first learn one of five text analysis tools (WordHoard among them) by applying it to one scene of Hamlet. In the second phase, the teams are re-formed to include students with expertise in each of the five tools, and each team is assigned an act of the play to analyze. By transforming students into experts who contribute specific knowledge to the team’s project, Ullyot’s assignment helps them develop their skills by teaching each other. In a related paper, Ullyot highlights “curiosity, resourcefulness, provisionality” as hallmarks of DH scholarship, and suggests that:

Openness about our own learning through algorithmic processes models this openness for our students. When I blog about my research, and raise it in class, I pose more questions than I answer. I openly tell my students that I rely on their experience with these 5 tools to decide how best to combine them for my own work.

This willingness to share the roles of student and teacher is a hallmark of playful digital pedagogy.

Topic Modeling Assignment



As Elijah Meeks and Scott Weingart suggest in their introduction to a special issue of Journal of Digital Humanities, “Topic modeling could stand in as a synecdoche of digital humanities” because of its algorithmic complexity and potential obscurity:

It is distant reading in the most pure sense: focused on corpora and not individual texts, treating the works themselves as unceremonious “buckets of words,” and providing seductive but obscure results in the forms of easily interpreted (and manipulated) “topics.”

Swafford’s two-part topic modeling assignment for undergraduates gives a clear explanation of the several steps required to prepare textual data, import it into the graphical user interface tool for MALLET, and explore the results. This assignment requires students to try the topic modeling process with different parameters and to assess the results of their experiments. This assignment helps students learn a specific method of text analysis as well as skills in visualizing and interpreting its results in relation to the historical and literary topics of the course.
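
To make the "buckets of words" idea and the role of parameters less opaque, here is a toy sketch in Python of a sampling-based topic model (a collapsed Gibbs sampler for LDA). It is a teaching illustration under stated assumptions, not MALLET's code and not part of Swafford's assignment: the function name, the default parameter values, and the four tiny documents are all hypothetical.

```python
import random
from collections import Counter

def topic_model(docs, num_topics=2, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [Counter() for _ in docs]                # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight each topic by how well it fits this document and word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Report up to five of the most frequent words assigned to each topic.
    return [[w for w, n in tw.most_common(5) if n > 0] for tw in topic_word]

# Hypothetical miniature corpus, already tokenized and cleaned.
docs = [
    "sea ship sail wave sea storm".split(),
    "ship wave storm sail sea".split(),
    "rose garden bloom rose petal".split(),
    "garden petal bloom rose garden".split(),
]
for k, words in enumerate(topic_model(docs, num_topics=2)):
    print(f"topic {k}: {' '.join(words)}")
```

Changing num_topics, iterations, or seed and rerunning the sketch gives a small-scale version of the experiment the assignment asks for: the resulting "topics" shift with the parameters, so they need to be assessed rather than simply accepted.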

Textual Editing as Text Analysis

Digital Close Reading: TEI for Teaching Poetic Vocabularies



Singer’s article examines the utility of the TEI (Text Encoding Initiative) XML markup protocols as a method for analyzing and describing poetic texts, focusing on her experience teaching TEI encoding to an undergraduate senior seminar. Singer presents text encoding not merely as a means to producing an end result, such as a digital edition, but as “a dynamic, hands-on method for self-conscious, unhurried reading.” Singer’s approach to using TEI in the classroom empowered her students to critically debate the subjectivity of critical interpretation. Her essay includes discussion not only of the pedagogical approach and assignments she used, but also of student papers written after completing the encoding unit. As Singer suggests, teaching methods like TEI encoding can serve two purposes: equipping students with practical project-based skills and exposing the interpretive choices that are at the heart of textual editing and text encoding.
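
A brief sketch may help show how encoding turns interpretive choices into something that can be queried. The fragment below is a simplified, TEI-style encoding (no header or namespace) of three lines from Wordsworth, read with Python's standard XML library. It is not taken from Singer's article: the elements lg, l, and seg are TEI elements, but the type values on seg are hypothetical labels of the sort a class might agree on, and each one is open to debate.

```python
import xml.etree.ElementTree as ET

# Simplified, TEI-style fragment (no header or namespace). The <seg> type
# values are hypothetical classroom labels, not values defined by the TEI.
encoded = """
<lg type="stanza">
  <l n="1">I wandered <seg type="simile">lonely as a cloud</seg></l>
  <l n="2">That floats on high o'er <seg type="landscape">vales and hills</seg>,</l>
  <l n="3">When all at once I saw a <seg type="personification">crowd</seg>,</l>
</lg>
"""

stanza = ET.fromstring(encoded)

def segments_by_type(root):
    """Group the text of every <seg> element under its type attribute."""
    groups = {}
    for line in root.iter("l"):
        for seg in line.iter("seg"):
            groups.setdefault(seg.get("type"), []).append((line.get("n"), seg.text))
    return groups

for label, hits in segments_by_type(stanza).items():
    print(label, hits)
```

Whether "crowd" should be tagged as personification at all is exactly the sort of disagreement that the encoding makes explicit and discussable.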

Digital Annotation Project



  • Artifact Type: assignment
  • Source URL: NA
  • Creator and Affiliation: Katherine Malone (South Dakota State University)

This sequence of interrelated assignments guides students to work individually and in groups to create a critical edition of a text for use by future classes of their peers. Students define key terms to be annotated, research topics in digitized eighteenth- and nineteenth-century periodicals, and add research-based annotations to the text using A.nnotate.com (http://a.nnotate.com/). Exposing students to primary research with digitized materials deepens the context for their understanding of the text. Asking students to participate in the process of annotating a text in a collaborative digital environment reveals the research and editorial decisions that lie behind any classroom text. This clearly structured assignment could be adapted for a wide variety of literary or historical texts.

Juxta Commons Revision/Collation Assignment



Text analysis tools of all sorts can be useful for the process of composition as well. In this assignment, and in a related MLA talk published on his blog, Walsh repurposes the scholarly method of collation, the comparing of multiple copies or witnesses of a text, for the teaching of writing. Walsh’s assignment teaches students to use Juxta Commons, an online collation tool, to compare multiple drafts of one paragraph from their own essay. By analyzing the graphical display indicating the multiple changes between versions, student writers can arrive at a better understanding of what kinds of specific changes alter the writing’s focus or tone. On his blog, Walsh also describes an exercise in which a group of students each write out a new version of a draft sentence during a writing workshop in a shared Google Document. These exercises make visible the many different choices available to a writer, and Walsh suggests that the practice “trains students to internalize the practice of collation and reflect on the interpretive possibilities offered by such differences.”
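
The underlying operation of collation, aligning two versions and listing where they differ, can be sketched with Python's standard difflib module. This is a stand-in for illustration and does not reproduce Juxta Commons or Walsh's materials; the function name collate and the two draft sentences are hypothetical.

```python
import difflib

def collate(draft_a, draft_b):
    """List the word-level changes needed to turn draft_a into draft_b."""
    a, b = draft_a.split(), draft_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(a[a1:a2]), " ".join(b[b1:b2])))
    return changes

# Two hypothetical drafts of the same sentence from a student essay.
first = "The novel shows that industry is bad for workers."
second = "The novel argues that industrial labor dehumanizes workers."

for op, old, new in collate(first, second):
    print(f"{op:8} {old!r} -> {new!r}")
```

Listing the changes this way separates a shift in claim strength ("shows" to "argues") from a shift in diction, which is the kind of distinction the drafting exercise invites students to notice.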

Communicating Text Analysis Digitally

Digital Poster on Hard Times



The digital environment offers new ways for students to communicate their analytic arguments about texts. This assignment neatly combines literary and cultural analysis of characters and physical space with rhetorical analysis both of the novel and of the student’s own work. By requiring students to create a digital poster to present their arguments about Dickens’s rhetorical strategies in the novel, Hunter asks them to reflect on the different affordances of the digital medium as compared with a print poster. Sequencing the poster assignment with drafts and peer review sessions means that students take it as seriously as a form for argument as they do essays in traditional formats. Because the results of computational text analysis are often best presented in graphs, the digital poster assignment could be usefully combined with other text analysis assignments included in this section.

Image and Sound Interpretation: Wilde, “The Harlot’s House”



Dierkes-Thrun describes this assignment as “an exercise that calls for a creative visceral and sensual, rather than rational and verbal, interpretation.” This project invokes image, video, and sound as primary modes of interpretation, rather than as supplements to a written text. Having students and website visitors outside the class contribute multimedia responses to a particular poem transforms the course blog into a collaborative intertextual display which can then itself become the object of further investigation and analysis.

Fyfe, Paul. “How to Not Read a Victorian Novel.” Journal of Victorian Culture 16.1 (Spring 2011): 84-88.

Jockers, Matthew. Text Analysis with R for Students of Literature. Cham, Switzerland: Springer, 2014.

Selected Syllabi for Courses Including Computational Text Analysis. files/text-analysis-syllabi.md

Sinclair, Stéfan and Geoffrey Rockwell. “Teaching Computer-Assisted Text Analysis.” Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. Open Book Publishers, 2012. Kindle file.

Weingart, Scott. “Topic Modeling for Humanists: A Guided Tour.” The Scottbot Irregular. 25 July 2012. Web. Accessed 30 Mar. 2015.


Bonelli, Elena Tognini. “Theoretical Overview of the Evolution of Corpus Linguistics.” The Routledge Handbook of Corpus Linguistics. London: Routledge, 2010. Print.

Buurma, Rachel Sagner. “Indexing In Memoriam Assignment.” 13 Nov. 2014. Web. Accessed 30 Mar. 2015.

Fyfe, Paul. “Digital Pedagogy Unplugged.” Digital Humanities Quarterly (DHQ) 5.3 (2011). Web. Accessed 30 Mar. 2015.

Fyfe, Paul. “How to Not Read a Victorian Novel.” Journal of Victorian Culture 16.1 (Spring 2011): 84-88. Print.

Goodwin, Jonathan and John Holbo, eds. Reading Graphs, Maps, and Trees: Responses to Franco Moretti. Parlor Press, 2011. Web. Accessed 30 Mar. 2015.

Hockey, Susan. “The History of Humanities Computing.” A Companion to Digital Humanities. Ed. Susan Schreibman, Ray Siemens, and John Unsworth. Oxford: Blackwell, 2004. Online edition. Web. Accessed 30 Mar. 2015.

Hoover, David L. “Textual Analysis.” Literary Studies in the Digital Age: An Evolving Anthology. Modern Language Association. Web. Accessed 30 Mar. 2015.

Jockers, Matthew. Macroanalysis: Digital Methods & Literary History. Urbana: U of Illinois P, 2013. Print.

McGann, Jerome and Lisa Samuels. “Deformance and Interpretation.” New Literary History 30.1 (Winter 1999): 25-56. Print.

Meeks, Elijah and Scott B. Weingart. “The Digital Humanities Contribution to Topic Modeling.” Journal of Digital Humanities 2.1 (Winter 2012). Web. Accessed 30 Mar. 2015.

Moretti, Franco. Distant Reading. London: Verso, 2013. Print.

Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” Pastplay: Teaching and Learning History with Technology. Ann Arbor, MI: U of Michigan P, 2014. 111-120. Web. Accessed 30 Mar. 2015.

Ramsay, Stephen. Reading Machines: Toward an Algorithmic Criticism. Urbana: U of Illinois P, 2011. Print.

Rockwell, Geoffrey. “What is Text Analysis, Really?” Literary and Linguistic Computing 18.2 (2003): 209-219. Print.

Sinclair, Stéfan and Geoffrey Rockwell. “Teaching Computer-Assisted Text Analysis.” Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. Open Book Publishers, 2012. Kindle file.

Trumpener, Katie. “Paratext and Genre System: A Response to Franco Moretti.” Critical Inquiry 36 (Autumn 2009): 159-171. Print.

Walsh, Brandon. “Collation and Writing Pedagogy.” Brandon Walsh. 17 Jan. 2015. Web. Accessed 30 Mar. 2015.

Weingart, Scott. “Topic Modeling for Humanists: A Guided Tour.” The Scottbot Irregular. 25 July 2012. Web. Accessed 30 Mar. 2015.

Witmore, Michael. “Text: A Massively Addressable Object.” Debates in the Digital Humanities. Ed. Matthew K. Gold. Open access edition. Web. Accessed 30 Mar. 2015.

