For this assignment, I decided to focus on the study of several translations. The subtitle of my blog post is actually a reference to the subtitle of one of the translations. Can we understand a translation, or more broadly a text, if we don’t read the original? That is, can we understand a text if we use distant reading rather than close reading?
Voyant Tools helped me explore whether (and how) a target language, i.e., the language into which a work is translated, changes through the centuries. How does the language actually change? And how does that change affect the translated text? Does the vocabulary of the translated texts differ, and if so, how? What are the actual differences? Is it even possible to ask these questions without closely reading the texts, or is distant reading the way to start thinking about these problems?
To investigate these research questions, I concentrated on English-language translations of Homer’s Odyssey. This should be a good example for trying to answer those questions because many English translations of this classical work exist and, as we know from the readings, the richer our dataset is, the more interesting our outcomes might be.
Thanks to Project Gutenberg, I was able to locate five translations of the Odyssey, and these constitute the core of my project. There are many more translations of this work: the Wikipedia page on the Odyssey, for example, lists around one hundred published translations. Thus, it could be material for a large research project.
First, I located the five translations through Project Gutenberg and copied and pasted each text into a separate Word file. I did this because the files on the website contain material that is not part of the original literary work: information added by Project Gutenberg, translators’ notes on their translations, notes to the text, and various additional materials. Since I wanted to focus entirely on the literary text, those parts had to be excluded so that they would not interfere with or influence my results.
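For readers who would rather script part of this cleanup than copy and paste by hand, here is a minimal sketch in Python. It assumes the plain-text files carry the usual "*** START OF THE PROJECT GUTENBERG EBOOK ... ***" and "*** END OF THE PROJECT GUTENBERG EBOOK ... ***" marker lines; the function name is my own, and translators' notes inside the body would still need to be removed by hand.

```python
import re

# Marker lines that usually wrap the body of a Project Gutenberg plain-text file.
# Older files say "THIS" instead of "THE", so both variants are accepted.
START_RE = re.compile(
    r"\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*",
    re.IGNORECASE | re.DOTALL,
)
END_RE = re.compile(
    r"\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*",
    re.IGNORECASE | re.DOTALL,
)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Return only the text between the START and END markers.

    If a marker is missing, the corresponding end of the file is kept as-is.
    """
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()
```

This only removes the material added by Project Gutenberg; introductions, footnotes, and other additions by the translators sit between the markers and have to be checked separately.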
After I uploaded the five translations, here are some of my findings. Cirrus produced a word cloud that visualizes the most frequent words in a corpus, in this particular case all five translations combined. The word cloud shows the 55 most frequent words.
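The counting behind such a word cloud can be approximated in a few lines of Python. This is only a sketch of the idea, not Voyant's implementation: the tokenizer is a simple regular expression, and the stop-word list is a tiny illustrative one that I made up for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "of", "to", "a", "in", "that", "he", "his", "i"}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def top_words(texts, n=55, stopwords=STOPWORDS):
    """Count words across all texts and return the n most frequent ones."""
    counts = Counter()
    for text in texts:
        counts.update(tok for tok in tokenize(text) if tok not in stopwords)
    return counts.most_common(n)
```

Calling `top_words(list_of_five_texts)` would return a list of `(word, count)` pairs, which is the information a word cloud encodes through font size.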

I wanted to explore the words in the whole dataset in more detail. The Summary tool reported the total number of words in all files (611,788) and the number of unique word forms (28,282). In addition, the summary lists the distinctive words in each of the five texts. This shows how the five translations differ and underlines the need to study why these texts have so many different distinctive words. The summary also provides interesting information about the whole corpus, which consists of five translations of the same text. We can see which texts are the longest and which are the shortest (with word counts included). It’s also possible to observe vocabulary density and how it is distributed across the corpus. Finally, it offers other interesting information: average words per sentence, frequent words, and distinctive words.
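To make these measures concrete, here is a small Python sketch that computes similar figures for a single text. It assumes that vocabulary density means unique word forms divided by total words, and it splits sentences naively on ".", "!" and "?", so the numbers will not match Voyant's exactly. For the corpus figures reported above, that ratio would be 28,282 / 611,788, or roughly 0.046.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def summarize(text):
    """Return basic corpus-summary figures for one text."""
    tokens = tokenize(text)
    types = set(tokens)  # unique word forms
    # Naive sentence split; fragments without any word are ignored.
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    return {
        "tokens": len(tokens),
        "types": len(types),
        "density": len(types) / len(tokens) if tokens else 0.0,
        "words_per_sentence": len(tokens) / len(sentences) if sentences else 0.0,
    }
```

Running `summarize` on each translation separately would allow the texts to be compared by length, density, and average sentence length.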

The next question I delved into was how particular words trended and how these trends could be depicted in a line graph. We already have the five top words in the whole dataset, and we can now see how these five words were used in the respective translations. Curiously, the use of the word “spake” increased tremendously when you compare the translations produced in 1614-16 and 1726 with those from 1879. I am also quite unsure what happened with the use of the word “Ulysses” in one of the 1879 translations. Interestingly, both “son” and “shall” were used at more or less the same level.
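A comparison like this only makes sense if the counts are normalized by the length of each translation. The following Python sketch shows one way to do that; the labels in the corpus dictionary and the per-million scaling are my own assumptions, and this is not how Voyant computes its graph internally.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def relative_frequencies(corpus, words, per=1_000_000):
    """Return, for each word, a list of (label, occurrences per `per` tokens).

    `corpus` maps a label (for example, a publication year) to the full text.
    """
    trends = {word: [] for word in words}
    for label, text in corpus.items():
        tokens = tokenize(text)
        total = len(tokens) or 1  # avoid division by zero for empty texts
        for word in words:
            trends[word].append((label, tokens.count(word) * per / total))
    return trends
```

Plotting each word's list of values in the order of the translations would give a line graph of the kind described above.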

The most widely used word in all these translations is “Ulysses,” and therefore I became interested in how it is distributed within the individual texts. Microsearch visualizes the frequency and distribution of the word in all five texts: you can view the “map” of this word in each text and consider whether its frequency changes. If it does change, what might be the reasons for that?
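The idea of such a "map" can be sketched in Python by dividing each text into equal segments and counting the word in each one. The function name and the default of ten segments are my own choices for illustration, not something taken from Voyant.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def term_distribution(text, term, bins=10):
    """Count occurrences of `term` in each of `bins` equal-sized text segments."""
    tokens = tokenize(text)
    counts = [0] * bins
    for position, token in enumerate(tokens):
        if token == term:
            # Map the token's position to the segment it falls into.
            counts[position * bins // len(tokens)] += 1
    return counts
```

Comparing the resulting lists for the five translations shows where in each text the word clusters and where it is rare.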

Conclusions. Without knowing the source language, it’s always tricky to work with translated works. However, if one intends to compare only the translated texts, these results might be of interest and might help pose further research questions, which one can then pursue with the help of traditional close reading. Distant reading can certainly diversify the study of literature.



