Blog: on “distant reading”

Franco Moretti, a philologist and literary scholar, is generally believed to have coined the term “distant reading” around 2000, when he published a piece in one of the academic journals. “Distant reading” is an antonym of “close reading.” If you need a very brief definition of both phrases that underlines their difference, it is probably safe to say that “close reading” means one deals directly with a book: one reads it attentively (closely) to figure out the meanings of a literary text, define its metaphors, unearth hidden linguistic riddles, or decipher the key idea, all while working on, or with, the text itself. “Distant reading,” on the other hand, can be defined as a process of working with a text without reading it. You don’t need to read the actual text; all kinds of digital toolkits and software will do that for you.

After Moretti published his initial article, he kept developing his ideas, first and foremost in Graphs, Maps, Trees and, a few years later, in Distant Reading. These works have been regarded by some as path-breaking and have been widely used and discussed in scholarly fields other than DH. In addition to Moretti’s books, another significant volume on “distant reading” is Macroanalysis by Matthew Jockers. On the other hand, there are scholars who have a different take on the history and timeline of “distant reading.” They question whether it began just some two decades ago and argue that earlier models of “distant reading” were created in the past, though of course not yet called by that name.

In his piece, Ted Underwood discusses the history and trajectory of the term “distant reading” and, while relying on some previously published scholarship, poses questions about when the studies of “distant reading” really began. He also asks what the parallel, related fields were, and mentions concepts spearheaded by others: “textual interpretation (reading)”; “sociology of literature”; or “cultural analytics.” Underwood points out that Moretti’s works are important, “not because they invented the idea of macroscopic literary inquiry, but because they galvanized an existing project by infusing it with a new sense of possibility and a new polemical rationale.” Indeed, Moretti’s concepts and approaches to the history of the British novel and its classification and division into subgenres, in one of the chapters in Graphs, Maps, Trees, seemed novel. The question some scholars keep asking concerns Moretti’s dataset: the details about the datasets (most likely the most important part of any DH project), such as their origin and completeness, were not shared or revealed. In general, it seems that the question of researching, aggregating, composing, editing, and sharing datasets is yet another fundamental point as we discuss “distant reading.” What also seems especially appealing is that the whole concept of “distant reading” is being discussed from various standpoints. It’s interesting to observe that its prehistory may date to before the term was actually coined, and that this, too, is being debated, which only means that the field keeps breathing and is far from being fully understood.

Text Analysis Praxis Overview

As you know, your second Praxis Project is due next week at class time. All students have chosen this option, so we’ll have a nice cluster of projects to talk about. First, here’s the procedure with a few tips:

  • First, read this overview of text-mining.
  • Second, choose a text or set of texts (you might start with a pre-prepped corpus like the CCC corpus we looked at earlier or the EEBO corpus Witmore co-created and discusses), and explore with Voyant, Google Ngram Viewer, JSTOR Text Analyzer, Bookworm, MALLET, or another text-mining tool.
  • Third, explore! Even more than with the mapping project, this can be an exercise in playing around with a tool or tools and reflecting on “what happens” rather than the production of some kind of finished “project”: if you don’t believe me, look at the blog posts from prior students below.
  • Fourth, blog about your experiences. Here are some examples to guide you from prior students in 700:

Post Workshop Blog: Asynchronous Interaction in Course and Platform Design

Yesterday I attended the Asynchronous Interaction in Course Design workshop, facilitated by Seth Graves, a Carnegie Educational Technology Fellow at the GC. The workshop explored integrations of asynchronous engagement into online and offline course design. While the workshop was geared towards educators in undergraduate and graduate teaching and students who have engaged in online learning, I attended in the hope of learning about strategies that I could share with colleagues in my professional role. I do not teach; however, I work for a nonprofit organization that develops and offers music theory curriculum and courses focused on Conjunto music to our community. These music theory courses were scheduled to launch when the pandemic first hit, so the courses were reformatted to be delivered entirely online. With this unexpected teaching format, we have encountered some frustrations, so I plan to share some of the ideas and methods I learned in this workshop with my team as we continue to develop our music education programs online and, eventually, in person.

Much of the workshop focused on methods of increasing thoughtful engagement in an online classroom setting; however, many of the ideas can be applied to offline, in-person classroom settings as well. Some of the most helpful takeaways I noted are below:

Low-Stakes Communication:

  • Generate asynchronous check-in spaces and deadlines to foster an environment in which students can freely share ideas, feedback, and questions about each other’s ideas and work in a mostly informal setting, for example, a forum, shared document, or discussion board. Some useful tools are Slack, Google Docs, and Discord.
  • Introduce flexible modalities through which students can engage. Some plugins like Talk and Comment and tools such as VoiceThread allow students to share video and audio responses with each other.

Readiness:

  • Ensure that students have all of the information and tools needed to participate in the class. Consider providing tutorials and helpful documentation if there is a learning curve with any of the integrations used in the classroom.
  • Provide appropriate contact information if students should reach out to specific personnel for technical assistance.

Break Up The Content:

  • Outline the material in sections to avoid presenting an overwhelming amount of information to students at one time. Depending on the course platform, the instructor may be able to break up the content into smaller sections, for example, a different WordPress page for every week or section of content. The workshop facilitator, Seth, shared his own example, which I found very helpful: as a student, it certainly feels easily digestible and not overwhelming to see the content broken up into different pages on a course website. I also want to call attention to the clarification and outlining of all tools needed for the course, an example of ensuring students have all of the information needed to participate in an asynchronous class.
https://blogs.baruch.cuny.edu/graves2150summer2020/?page_id=233

Interrelating Course Material:

  • Blog post assignments about the material (e.g., our posts in this course!).
  • Brief comment-based prompts surrounding the class content.

Additionally, we discussed some common concerns of educators in online teaching environments. Most of the attendees shared frustrations surrounding assessment in asynchronous work. Below are some helpful tips provided by the workshop facilitator, Seth:

  • Organize how you will assess your students by the stakes of the assignment.
  • Avoid over-grading. Was it more important that the student practice?
  • Avoid over-commenting. Can you organize your feedback into key ideas or revision tasks?
  • Try rubrics. How can a rubric help you provide the right balance of feedback while helping with your time management?
  • Try peer review. How can you incentivize quality feedback? (try this: give them guided peer review instructions)
  • Seconding this: Offer short comment-based prompts about the content they just read.

Lastly, I wanted to share some of the other course platforms we discussed. Below are some of the platforms and tools most commonly used among the workshop attendees, including myself (we use Google Classroom in my professional work):

Course Platforms:

  • Hosting course site: WordPress (ex: CUNY Academic Commons), Blackboard, Google Classroom
  • Hosting longform student writing: WordPress, Manifold

Understanding the “Back-End”

One of the hallmarks of Johanna Drucker’s scholarship is an interest in how knowledge is produced. In SpecLab, her experiments at the titular laboratory centered on “reconceptualization of premises and parameters, not a reassessment of means and outcomes.” This sentiment gives a generous acknowledgment of the benefits computer-assisted quantitative methods have brought to humanities research, while also recognizing the “invisible” constraints orthodox methodologies inherit from their disciplines and industries of origin (Drucker, 27). Recent publications build on some of her earlier experiments to demonstrate that what is presented to us as self-explanatory artifacts actually requires more thought, context, and interpretation than we’re trained to deploy in our everyday encounters with, say, data visualizations. Along similar lines, this week’s reading by Drucker draws our attention to the hidden costs of the transition away from physical media to digital media for academic publication. One small nuance present in Pixel Dust caused me to reflect on why I’m prone to accepting Drucker’s lines of argument.

Drucker demonstrates an understanding of relevant digital technologies outside of an academic context, one that views the “back-end” as crucial to the front end. The thoroughness of her comprehension emerges in the specificity of one example she uses when describing some misunderstandings of the transition to digital media for academic publication. The company named in “the misperception…that everything digital is available on Google” strikes a chord with me given the origin of PageRank, the key algorithm in that company’s competitive advantage in the 1990s tech ecosystem. I would argue that Drucker knowingly references Google because of the influence on this algorithm of citation analysis, which grew out of the search for a quantitative method of evaluating the importance of published scientific articles. As Google describes the logic:

PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites (citation courtesy of the Wayback Machine)
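To make the citation-style logic in that description concrete, here is a minimal sketch of the idea in Python. It is a textbook-style power iteration, not Google's production system; the function name, the toy link graph, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate importance scores for pages in a small link graph.

    links: dict mapping each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not official values.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score, split evenly, to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy "web": three articles cite article_c, which cites article_a.
toy_web = {
    "article_a": ["article_c"],
    "article_b": ["article_c"],
    "article_c": ["article_a"],
    "article_d": ["article_c"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the most-linked page ends up with the highest score, and the page it links to inherits much of that importance, which mirrors how a citation from a heavily cited article counts for more than one from an obscure article.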

Much like her example of the codex, Drucker emphasizes the continuities in the backend logic between older forms of media and their descendants. In this case, leaving discovery up to Google search results likely reinvigorates the same forces that constricted traditional academic publishing, both financial and institutional, through backlinking and other SEO strategies that favor

Drucker’s treatment of the backend as critical subject matter for understanding the transition to digital formats reflects the true costs, rather than the price. One misconception has to do with the distribution of costs over the lifetime of an academic product. Whereas physical books and journals had large upfront costs, the cost of digital publication goes beyond licensing fees, or even the “author-pays” model mentioned in Fitzpatrick’s chapter in Generous Thinking. Drucker also counts the cost of maintaining the digital medium itself, which is often not included in the price tag. To draw the analogy using a physical book, it’s as if there were occasions where the pigments in ink stopped rendering, and someone needed to jostle them to recompose as letters and text on a page. Or as if the pages would occasionally fall out of the binding and become disordered on the ground.

I think there are reasonable responses to these problems offered in the motivations behind Gil’s Ed. And in general, the readings from this week and last about infrastructure show a gradual movement towards systems thinking. As someone who’s being propelled rapidly into a more “backend position” professionally, I have a lot of sympathy with Drucker’s instinct to scrutinize the complex tradeoffs in the “invisible work” required to create and maintain digital platforms.

R. W. Emerson and 19thC technoutopianism

I’ve been thinking about our conversation last night about the arguably willful blindness to “invisible” infrastructure and logistical processes that characterizes our period. The brief conversation about history made me think of the “commodity” section of Emerson’s famous essay on Nature. There, Emerson celebrates the rise of new technologies that mirror and are consonant with natural forces–the steam engines that improve upon windpower and so on–and whose benefits are widely distributed for the general welfare.

Needless to say, now that it has dawned on us that we’re in the Anthropocene and that the benefits of tech don’t generally trickle evenly through the population, we find ourselves in a very different place…

Commodity

Whoever considers the final cause of the world, will discern a multitude of uses that result. They all admit of being thrown into one of the following classes; Commodity; Beauty; Language; and Discipline.

Laptops and the “right to repair”

In a bit of kismet, the NYT’s Wirecutter featured an article on the Framework laptop, an attempt to build an eminently repairable laptop designed to last 10+ years. Dig in: it’s fascinating in an era dominated by Apple’s utterly opposite approach.

The Framework Laptop Could Revolutionize Repairability. We Hope It Does.

Framework is promising the kind of upgradable laptop that plenty of people have demanded for years, and so far things look great. Mostly.

Mapping

Project description:

Image 1: All places are shown on this map.

For my project, I wanted to create a map depicting the story of a particular copy of a book that was published in the late 1840s in Imperial Russia. That copy is unique in the sense that it bears the stamps and inscriptions of previous owners and of the institutions it belonged to. Thanks to these stamps and inscriptions, it was possible to recreate, in part, the way the copy traveled through time, countries, people, and institutions. It should also be pointed out that this story has blank spots: it’s unclear who owned the copy at certain points, and this is something that could still be researched and analyzed.

Platform:

I decided to choose Palladio as a tool to illustrate the book’s itinerary. First of all, I jotted down all the places that were known to me: two cities in what is now the Russian Federation (back then, Imperial Russia), one city in what is now Ukraine (back then, the Austro-Hungarian Empire), and two cities in the United States (both in New York state). The first city in Russia is the city the book was printed in; the other city in Russia is a place where one of the owners of the copy lived; the third city on the map is a city where a library received this copy as a donation; the fourth, a town in the US, is a place where the book was held; and the fifth is the city it ended up in.

I found the exact geographical locations of these places and included latitude and longitude as recommended, using an Excel spreadsheet. I followed the tool’s guidelines about datasets, and it took me a lot of time to finally come up with a dataset that the tool would accept and turn into a map. Although I initially prepared the dataset the way the guidelines recommended, it did not work, and that was really confusing.
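One way to reduce that kind of trial and error is to generate the dataset with a short script instead of hand-editing a spreadsheet. The sketch below is only an illustration: it assumes the mapping tool wants latitude and longitude together in a single comma-separated field (which is how Palladio's coordinate format is commonly described, but check the current guidelines), and the place names, coordinates, and column names are hypothetical placeholders, not the actual stops in the book's itinerary.

```python
import csv
import io

# Hypothetical placeholder stops; names and coordinates are NOT the real itinerary.
STOPS = [
    {"order": 1, "place": "Place A (printed)", "lat": 10.0, "lon": 20.0},
    {"order": 2, "place": "Place B (private owner)", "lat": 11.5, "lon": 21.5},
    {"order": 3, "place": "Place C (library donation)", "lat": 12.25, "lon": 22.75},
]


def to_coordinates_csv(stops):
    """Return CSV text with one 'lat,lon' coordinates field per stop.

    Assumed layout: columns 'order', 'place', 'coordinates'.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["order", "place", "coordinates"])
    for stop in stops:
        lat, lon = stop["lat"], stop["lon"]
        # Catch swapped or mistyped values before the mapping tool rejects them.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"Coordinates out of range for {stop['place']}")
        # The csv module quotes this field automatically because it contains a comma.
        writer.writerow([stop["order"], stop["place"], f"{lat},{lon}"])
    return buffer.getvalue()


csv_text = to_coordinates_csv(STOPS)
print(csv_text)
```

The printed text can be pasted into the tool or saved as a .csv file. Because the range check fails loudly on an impossible latitude or longitude, at least one class of dataset problem shows up before the upload step.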

Image 2: European part of the map.

Outcome:

The map I came up with could have been clearer: for example, it could have mentioned the approximate dates (when known) that the copy “spent,” so to speak, in a certain place. It could also show the trajectory of its itinerary: how it moved, from which place, and when. To make it even more interactive, it could have included images of the institutions that had or have the copy in their possession. To better present this information, a different tool could have been used.

A different avenue:

On the one hand, I would probably want to have a map or a set of maps that could clearly illustrate the itinerary of that copy. On the other hand, I would want these visualizations to work as illustrations to a write-up or essay about this story. This means the dataset may need more data to be found, analyzed, and included, so that the story can be reproduced well enough in the visuals.

tomorrow’s plan

I’d like to try something a bit different tomorrow and wanted to tip my hand a little bit to get you thinking.

First, as mentioned previously, think of something to bring, or just a story to tell, about something you’ve fixed or hacked. It can be as simple as a mended shoelace or as complex as a broken website or bit of software.

Second, rather than march through the readings and sites as usual, I’d like to work more synthetically. So I’d like to break us into small groups for the first part of class and work on three separate questions, each of which might be addressed with several of the articles and sites on the syllabus. Might be a total fail, but might be really fun and inspiring. Here are the questions, which are also on this Dropbox Paper doc. We’ll use the latter to jot notes in our discussion tomorrow. You don’t need to do a thing before class, but I thought it might jog your thought process while reading to have these questions in your minds:

  1. One of the foundational concepts of the traditional humanities is the “transvaluation of values” (Nietzsche), that is, a skeptical stance towards received ideas of the good, the moral, the beautiful, etc., and a boldness in imagining new and unprecedented scales of evaluation. What are some entrenched values that this week’s authors seek to “transvaluate” in Nietzsche’s sense? By all means suggest your own, but you might think about some of the following: innovation, disruption, novelty, progress. What terms recede in importance and what terms emerge to replace them in the authors’ analyses?
  2. How do digital technologies and/or the digital humanities look different from the perspective of the Global South? How might close observation of “makers” and “inventors” in these “underdeveloped” spaces teach those of us in the “developed” world new approaches? How might humanistic study, and DH in particular, benefit from attention to seemingly marginal people and spaces who, in fact, comprise the global democratic majority?
  3. What are some objects and processes discussed in this round of readings that are hard to see, hard to grasp, hard to comprehend? How do the readings/sites help us to think bigger or smaller or quicker or simply different? What research might emerge out of paying attention to ordinarily invisible aspects of our built landscape that ground the “clouds” we use in our everyday work and play?

Design/Infrastructure Week Blog Post – Maccioni

This has been my favorite week of readings thus far, lending a new way of engaging with my mapping project investigating food supply chains. In addition, as an interdisciplinary learner, the concept of studying the relationships between networks (visible and invisible) is something very intriguing to me. Novinskie’s work on creating a praxis of “care” particularly struck me as it afforded an extra layer of meaning and truth-seeking to focusing on even the smallest of sites/topics within larger networks to expose their invisibility.

That said, I was also moved by Jackson’s piece “Rethinking Repair” and by how I might flip my project to focus on, say, “breakdowns” and “errors” in food supply chains as the impetus for study. This might take the form of tracking stale food or evaluating waste. The concept of studying sites of repair and, in a sense, error is also something I’ve come across before in thinking about all of the foods and common ingredients today that came about as the result of an “accident.” Could I map those interactions, sometimes a cultural misunderstanding, by working through their “breakdowns”?

On the other hand, I could take Gil’s approach of illustrating a “technology of disobedience” and showcase how and where we make use of foods and ingredients for other purposes, which reminded me of all the things my Italian grandmother does in the kitchen.

I could also think further about Posner’s article and about just how disruptive tracing an otherwise invisible food supply chain is for a capitalist economy based on scale. As I read Posner’s work, my trouble finding sources for many of the foods in my home felt right on target, and it got me thinking about the various food businesses that might already be employing a sort of blockchain technology, including Nestle and Walmart.

Of course, there would be quite a few benefits if companies were to do this: fresher food, waste reduction, guaranteed “uniqueness” of a product, improved food safety, and more. That said, the probability of adoption feels low given the rise in prices it would spark for both consumer and producer, and the fact that while some companies (Bumble Bee tuna) purport to be doing this, the feature did not work when I gave it a try for my map. Perhaps I could investigate this technology further in my next project and weigh the outcomes.

All that said, here is where my cynical side comes in: why would companies embark on this path other than in an attempt to be “good people” and transparent entities? Of course we hope that’s enough, but where do these things fit into an economy based on such large scale, if at all? Jackson touched on this a bit in “Rethinking Repair,” but I’m curious to discuss more about how making these systems transparent can also make them sustainable, in both an ecological and a trustworthy sense, moving forward.