Exclusively Frederick Douglass:
As expected, some of the most frequent words across ALL texts were articles (“the, a, an”), conjunctions (“and, but”), and pronouns (“he, his, her, she, they, their, it”).
I chose the three largest bubbles from each of my graphs and calculated their approximate per-book frequencies by dividing their occurrences by the number of books (for all authors I divided by 47, and for Frederick Douglass I divided by 5). For the graph of all authors I also computed the inverse, dividing the number of books by the occurrences. The results are as follows…
All authors (47 books):
The: 204,094 divided by 47 = 4,342.4
The: 47 divided by 204,094 = 0.00023029
And: 125,168 divided by 47 = 2,663.1
And: 47 divided by 125,168 = 0.0003755
Of: 118,406 divided by 47 = 2,519.3
Of: 47 divided by 118,406 = 0.00039694
Frederick Douglass (5 books):
The: 11,791 divided by 5 = 2,358.2
And: 6,854 divided by 5 = 1,370.8
Of: 7,552 divided by 5 = 1,510.4
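For anyone who wants to reproduce the arithmetic, here is a minimal Python sketch of the per-book averages. The counts and book totals are copied from the figures above; the variable and function names are my own illustrative choices, not part of the original project.

```python
# Counts copied from the figures above; all names here are illustrative.
ALL_AUTHORS_BOOKS = 47
DOUGLASS_BOOKS = 5

all_authors = {"the": 204_094, "and": 125_168, "of": 118_406}
douglass = {"the": 11_791, "and": 6_854, "of": 7_552}


def per_book(counts, n_books):
    """Average occurrences of each word per book, rounded to one decimal."""
    return {word: round(total / n_books, 1) for word, total in counts.items()}


all_avg = per_book(all_authors, ALL_AUTHORS_BOOKS)
douglass_avg = per_book(douglass, DOUGLASS_BOOKS)

for word in all_authors:
    print(f"{word}: all authors {all_avg[word]:,.1f} per book, "
          f"Douglass {douglass_avg[word]:,.1f} per book")
```

Note that this is only an average per book. It does not account for how long each book is, which would require total word counts that are not listed here.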
Interestingly enough, when I divided the number of books by the word frequency, Frederick Douglass’s results were almost twice as much as all the authors. When I divided the frequencies by the number of books, Frederick Douglass’s results were roughly half of all the authors. These are really the same finding seen from two directions, since one ratio is the reciprocal of the other. I’m not sure what to attribute it to: perhaps Frederick Douglass’s books are shorter on average than those of the other authors, so that each book contributes fewer occurrences of “the, and, of.” Without the total word counts for each book, I’m not entirely certain.
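The comparison above can be checked with a short Python sketch. It uses only the counts reported earlier; the function name `compare` is hypothetical. It computes how Douglass's per-book rate compares to the rate for all authors, and does the same for the inverted (books per occurrence) figures.

```python
# Counts copied from the results listed earlier; names are illustrative.
ALL_AUTHORS_BOOKS = 47
DOUGLASS_BOOKS = 5

all_authors = {"the": 204_094, "and": 125_168, "of": 118_406}
douglass = {"the": 11_791, "and": 6_854, "of": 7_552}


def compare(word):
    """Return (Douglass / all-authors) for the per-book rate and its inverse."""
    all_rate = all_authors[word] / ALL_AUTHORS_BOOKS      # occurrences per book
    douglass_rate = douglass[word] / DOUGLASS_BOOKS
    all_inverse = ALL_AUTHORS_BOOKS / all_authors[word]   # books per occurrence
    douglass_inverse = DOUGLASS_BOOKS / douglass[word]
    return douglass_rate / all_rate, douglass_inverse / all_inverse


for word in all_authors:
    rate_ratio, inverse_ratio = compare(word)
    print(f"{word}: per-book ratio {rate_ratio:.2f}, "
          f"inverse ratio {inverse_ratio:.2f}")
```

For each word the two ratios multiply to 1, so "about half" in one direction and "almost twice" in the other describe a single difference, not two separate ones.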
The results from both graphs were pretty much aligned with the expectations I had going into this project. Since all of the authors are from a similar time period with the same demographic and write about essentially the same material, it makes sense that the results would be rather consistent and reflective of each other.
Some notable observations:
The graph of all the authors has larger bubbles for the feminine pronouns and related feminine words. I attribute this to authors such as Jane Austen, who writes mostly about female characters, and I attribute the rather insignificant presence of feminine words in Douglass’s separate graph to the lack of women in his books. Pronouns such as “me, my, his, I, he” and the like take the cake in Douglass’s graph, while the feminine words are almost nonexistent. The graph of all authors has the word “mrs,” which Douglass’s graph lacks entirely. In addition, given the authors’ subject matter, the words “mister” and “slave” appeared less frequently than I had originally presumed they would. Their appearances were similarly frequent in both graphs, just on a smaller scale than I had anticipated.
Verb tenses were also very consistent across the two graphs. “Were” and “will” had similar frequencies, as did “have” and “been.” Many of the word frequencies matched up, and the overlap was, as mentioned before, very consistent with my hypothesis.