Apr 20
10:25 AM

Assignment 7

To begin, the entire process, from installation to execution, was quite difficult. Besides the obvious software difficulties, there was also the need for a large collection of texts. Because we collaborated, gathering approximately one hundred texts was not too difficult. Working individually, however, it would be very hard to accumulate enough texts to use Paper Machines effectively.

There was also some difficulty concerning the speed of the process. I went into the Paper Machines preferences in Zotero to increase the memory allocation in an attempt to reduce the run time. I also reduced the number of topics from fifty to twenty because I felt that would give more meaningful results.

However, once we made it through the process, the Paper Machines results were less informative than I expected. I felt the three-word topics were not enough to accurately portray the authorship of the topics. As we discussed, Paper Machines is an attempt at user-friendly software with functionality similar to Mallet. However, the Mallet output we looked at in class returned topics of eight to ten words, which made it a lot easier to discern the specific text and/or author related to each topic. With only three words, the topic results in Paper Machines were often too generic. For example, the topic “great, love, family” consists of very general themes that appear in many novels. Another very generic topic I got was “people, character, earlier,” which does not point to any strong topic or theme. In contrast, the topic “Catherine Heathcliff etc” from Mallet was easy to identify as Wuthering Heights.

Overall, I think Mallet provided better topics, and learning to use Mallet would have been about as difficult as the entire installation and setup necessary for Paper Machines. I believe the idea behind Paper Machines, user-friendly software for topic modeling, is a great one; however, I feel it has been poorly executed.

