Scatterbrain

by Colby Duke

Sentiment Analysis of r/gatech

04/12/21

Today, for the 347th time, a student complained about the number of negative posts on r/gatech. As a user who frequents the subreddit (probably to an unhealthy degree), I have noticed these meta-type posts appearing far too often. From my personal observations, the sentiments expressed on r/gatech were certainly more negative than those expressed around campus, but I was not sure if this was normal for university subs or if r/gatech was a genuinely more depressing space. Therefore, I set out to find the answer via the tried-and-true computer science technique of Googling and clicking on the first result. Innovative, I know.

To determine if r/gatech was a typical college sub, I decided to use sentiment analysis via VADER. Why VADER? Well, its name sounded badass and it was invented by GTRI research scientists. Anyways, I found this lovely article on Medium which provided a method for applying VADER to Reddit comment sections scraped by PRAW. This was close to what I needed, except I found two problems after implementing their methods in Python. First, Reddit's API caps any single listing at roughly 1,000 items, so PRAW could not reach back far enough, and I wanted to analyze far more posts than that. Second, their code analyzed comment sections rather than post titles and (if applicable) post content, which is what I wanted to focus on.
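
For the curious, here is a minimal sketch of what scoring a post's title plus body with VADER might look like. It assumes the vaderSentiment package is installed; the helper names (build_analyzer, post_text, compound_score) are made up for illustration and are not from the Medium article or my original script.

```python
def build_analyzer():
    """Create a VADER analyzer (assumes `pip install vaderSentiment`)."""
    # Imported lazily so the helpers below work without the package installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def post_text(title, body=""):
    """Join a post's title and body, skipping empty or removed bodies."""
    body = (body or "").strip()
    if body in ("", "[removed]", "[deleted]"):
        return title.strip()
    return f"{title.strip()}\n{body}"


def compound_score(analyzer, title, body=""):
    """Return VADER's compound score for the post, a float in [-1, 1]."""
    # polarity_scores returns a dict with 'neg', 'neu', 'pos', and 'compound'.
    return analyzer.polarity_scores(post_text(title, body))["compound"]
```

Calling compound_score(build_analyzer(), "some title", "some body") gives one number per post, which is all the later steps need.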

To fix these issues, I edited my code to employ Pushshift.io via PSAW. Now that I could successfully query and parse data from subreddits, I began my analysis. After a bit of deliberation, I settled on examining six other university subreddits which I considered similar to Georgia Tech: UCB, UCLA, UIUC, UT Austin, UMich, and UDub. All of these universities are top-ranked computer science schools with very active subreddits. More importantly, all are public institutions. I focused only on public institutions because (A) Georgia Tech, the school I wished to compare the others to, is a public institution and (B) private universities often have smaller and less active subreddits.
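
A rough sketch of the PSAW side, assuming the psaw package is installed and the Pushshift service is reachable. The fetch_posts helper and its defaults are hypothetical, not my exact code; it simply pulls titles and bodies for one subreddit.

```python
def build_api():
    """Create a PSAW client (assumes `pip install psaw`)."""
    # Imported lazily so fetch_posts can be exercised without the package.
    from psaw import PushshiftAPI
    return PushshiftAPI()


def fetch_posts(api, subreddit, limit=10_000):
    """Return up to `limit` (title, selftext) pairs from one subreddit."""
    results = api.search_submissions(
        subreddit=subreddit,
        filter=["title", "selftext"],  # only request the fields we score
        limit=limit,
    )
    # Link posts may have no selftext, so fall back to an empty string.
    return [(getattr(p, "title", ""), getattr(p, "selftext", "")) for p in results]
```

Something like fetch_posts(build_api(), "gatech") would then hand back the raw text for scoring. Unlike a PRAW listing, PSAW pages through the archive for you, so the limit can go well past 1,000.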

Anyways, here are the results of my analysis:

After parsing 10,000 posts per subreddit and analyzing the sentiment expressed in each, I sorted every sentiment identified by VADER into one of three categories (as per the article above): strongly positive, strongly negative, or neither. I then calculated the percentage of positive and negative sentiments per subreddit, shown in the table below.

University      Positive %   Negative %
Georgia Tech    25.31        8.07
UCB             27.00        12.32
UCLA            25.18        11.32
UIUC            26.50        10.52
UT Austin       28.92        8.12
UMich           28.07        9.66
UDub            27.47        9.44
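
The bucketing and percentage math behind a table like this can be sketched in a few lines. Big caveat: the cutoffs below are an assumption. I am using the ±0.05 compound-score thresholds that VADER's own documentation suggests, which may not match the "strongly" positive/negative cutoffs from the article, and the function names are made up for illustration.

```python
from collections import Counter

# Assumed cutoffs: VADER's conventional compound-score thresholds,
# not necessarily the ones used to produce the table above.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def classify(compound, pos=POS_THRESHOLD, neg=NEG_THRESHOLD):
    """Bucket one compound score as 'positive', 'negative', or 'neither'."""
    if compound >= pos:
        return "positive"
    if compound <= neg:
        return "negative"
    return "neither"


def sentiment_percentages(compounds):
    """Return the share of positive and negative scores, as percentages."""
    counts = Counter(classify(c) for c in compounds)
    total = sum(counts.values())
    if total == 0:
        return {"positive": 0.0, "negative": 0.0}
    return {
        label: round(100 * counts[label] / total, 2)
        for label in ("positive", "negative")
    }
```

Feed it the list of compound scores for one subreddit and you get that subreddit's row. Raising the thresholds makes both percentages shrink, so the absolute numbers depend heavily on where the cutoffs sit.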


I was surprised at the results, to say the least. While all of the subreddits have relatively similar percentages of negative posts, r/gatech actually has the lowest, by a slim margin over UT Austin (8.07% versus 8.12%). Maybe these university subreddits are still abnormally negative compared to others, or maybe all universities tend to use their subreddits as rant spaces; I suppose a semi-anonymous platform would lend itself well to that purpose. I’m not sure, and I won’t pretend to know enough about the subject to say definitively one way or another; draw your own conclusions. I’ll stick with mine: no matter how down you feel at Georgia Tech, there is always a student at UCB who is feeling worse.