For my analysis I compared sixteen gothic novels, nearly all from the nineteenth century (the exceptions being Ann Radcliffe’s The Mysteries of Udolpho, published 1794, and Bram Stoker’s The Lair of the White Worm, published 1911). The titles include works by Bram Stoker (2), Charles Dickens (4), Edgar Allan Poe (2), Ann Radcliffe (1), Charlotte Bronte (2), Emily Bronte (1), Jane Austen (1), and Mary Shelley (3). I thought it would be interesting to see whether there was any notable difference or separation between the writing of male and female authors. I suspected that the male authors would be grouped together and the female authors would be grouped together (relatively); at the very least, I expected works by the same author to appear near, if not directly next to, each other.
My first run-through, with only basic scrubbing, yielded interesting results: many of the authors were interspersed. Charlotte Bronte’s two novels are relatively similar, as are Dickens’ Bleak House and Edwin Drood and Mary Shelley’s Frankenstein and Mathilda. The Bronte sisters’ novels, Great Expectations, and Dracula appear to be closely related, which surprised me a little. What didn’t surprise me was seeing Poe’s work far to the left as an outlier.
My second run-through, with the NLTK list as Stop Words, yielded results closer to what I had expected in the first place, with Poe’s and Dickens’ works appearing in a row on the left and each author’s works appearing next to each other. What was interesting, though, was that Stoker’s work was interspersed among the female authors, with Dracula appearing between Shelley and the Bronte sisters. It is also interesting that the most closely related texts in this run-through are less close than those in the previous run-through, even though they now sit beside works by the same author. Again Poe’s works appear as outliers, yet in this run-through The Fall of the House of Usher is more closely related to the other branch than The Raven is.
In my final run-through I used the NLTK list as Keep Words, which again yielded interesting results. Charlotte Bronte’s works are still right next to each other, and near Emily Bronte’s, but now Dickens appears dispersed among the other texts, Shelley is split up, and Stoker (again) appears in different places. While Poe’s works are not as closely related to each other as they were in my first run-through, they again appear as outliers from the rest of the works. In this run-through, the final branch connecting his works to the others is dramatically higher than in the previous run-throughs.
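To make the difference between the Stop Words and Keep Words settings concrete, here is a minimal Python sketch of what each one might do to a text before word counts are compared. It is only an illustration under stated assumptions: the tiny FUNCTION_WORDS set is a hypothetical stand-in for the much longer NLTK list, the sample sentence is invented, and the helper names are made up for this example rather than taken from the tool used above.

```python
import re
from collections import Counter

# Hypothetical stand-in for the NLTK English stop word list (the real list is far longer).
FUNCTION_WORDS = {"the", "a", "of", "and", "to", "in", "was", "it"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def filter_tokens(tokens, word_list, mode):
    """Use the list as stop words (drop listed tokens) or keep words (keep only listed tokens)."""
    if mode == "stop":
        return [t for t in tokens if t not in word_list]
    if mode == "keep":
        return [t for t in tokens if t in word_list]
    raise ValueError("mode must be 'stop' or 'keep'")


# Invented sample sentence, not a quotation from any of the sixteen texts.
sample = "It was the fall of the house and the raven"
tokens = tokenize(sample)

# Stop Words: only the content words survive.
stop_counts = Counter(filter_tokens(tokens, FUNCTION_WORDS, "stop"))
# Keep Words: only the function words survive.
keep_counts = Counter(filter_tokens(tokens, FUNCTION_WORDS, "keep"))

print(stop_counts)
print(keep_counts)
```

With the list as Stop Words, the comparison rests on content vocabulary; with the same list as Keep Words, it rests only on how often each text uses common function words. That may be one reason the groupings shift so much between the last two run-throughs.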