It all started with a Facebook message from a friend. And it all ended with me going “f- this” and calling it a day on the project.
The idea was simple. I go to three political subreddits on Reddit (SandersForPresident, HillaryClinton, and The_Donald) and pull a mixture of random users from each. I then look at their comment history to see where else they are commenting. Hopefully, we could see some nice data on what types of people congregate in each.
What actually happened was that the Reddit API was intolerably slow, and had a limit of 1 API call every 2 seconds. That severely limited my ability to pull in the data. Grabbing a meaningful amount of data would have meant running the app for possibly days. This seemed OK at first, but (maybe wrongly of me) I just assumed that at some point during that time Reddit would go down and I would have to start from scratch.
Instead I ran the app across just 100 randomly picked users from each subreddit. Because of the small amount of data I did manage to grab, it’s hard to draw any huge conclusions. So instead I drew some pretty graphs and called it a day. (Note: they aren’t that pretty, it’s all I could do with Excel, but you can download the data at the end of this post if you want to have a go yourself!)
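For anyone tempted to attempt the multi-day run, the worry about an outage wiping out progress can be handled by saving results after every call. The following is only a rough sketch of that idea, not the code used for this post. The `fetch` argument is a hypothetical function you would supply yourself (something that takes a username and returns JSON-friendly data, such as a list of subreddit names), and the function name, checkpoint file, and 2 second default are assumptions based on the limit described above.

```python
import json
import os
import time


def crawl_with_checkpoint(users, fetch, checkpoint_path, min_interval=2.0):
    """Call fetch(user) for each user, at most once per min_interval seconds.

    Progress is written to checkpoint_path after every call, so a crash or
    outage only costs the call that was in flight, not the whole run.
    `fetch` is a hypothetical callable supplied by the caller.
    """
    # Load results from an earlier, interrupted run if there is one.
    results = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            results = json.load(f)

    last_call = None
    for user in users:
        if user in results:
            continue  # already fetched in a previous run

        # Wait until min_interval seconds have passed since the last call.
        if last_call is not None:
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()

        results[user] = fetch(user)

        # Write to a temp file, then rename it over the checkpoint, so a
        # crash in the middle of a write cannot corrupt the saved progress.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(results, f)
        os.replace(tmp_path, checkpoint_path)

    return results
```

Running it a second time with the same checkpoint file skips every user that was already fetched, so restarting after an outage picks up where the last run stopped.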
Users Involved In Other Political Subs
The first set of graphs answers this question: given a random commentator in, say, “SandersForPresident”, what are the chances that the same user also comments in, say, “The_Donald”? They might do so because they are just very politically minded, or because they want to cause a bit of mischief. Let’s take a look.
Hopefully the legend is self-explanatory. The first letter is where we found the original user (for example, S = SandersForPresident), and the next letter is the other subreddit we checked for comments from that user. What we can see is that both Donald Trump and Hillary commentators also comment heavily in the Bernie Sanders subreddit. Draw from that what you will.
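If you want to recompute these cross-subreddit numbers yourself, the calculation is just a count over the sampled users. Here is a minimal sketch, assuming the data has been loaded into a dictionary that maps each origin subreddit to a list of sets, one set of subreddit names per sampled user. That layout and the function name are assumptions for illustration, not the actual format of the CSV.

```python
def cross_subreddit_share(samples, targets):
    """Return the fraction of each origin's users who also comment in each target.

    samples: dict mapping an origin subreddit name to a list of sets, where
             each set holds the subreddits one sampled user commented in
             (an assumed in-memory layout, not the CSV format).
    targets: iterable of subreddit names to check for.
    """
    shares = {}
    for origin, users in samples.items():
        for target in targets:
            if target == origin:
                continue  # every sampled user comments in their own origin
            hits = sum(1 for subs in users if target in subs)
            shares[(origin, target)] = hits / len(users) if users else 0.0
    return shares
```

Each key in the result is an (origin, target) pair, which lines up with the two-letter legend in the graphs: the first element is where the user was found and the second is the subreddit being checked.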
Hillary Supporters Camp Out In /r/PoliticalDiscussion
While much of the data is spread out across hundreds of subreddits rather evenly, one thing that sticks out like a sore thumb is the fact that a large percentage of HillaryClinton commentators also comment in /r/PoliticalDiscussion. See the graph below.
This stands out even more when you compare it with something like /r/politics, where commentators from all three subreddits are much more evenly distributed.
Hillary Supporters Also Have An Anti-Sanders Subreddit
Unsurprisingly, within this sample the subreddit /r/enoughsandersspam is inhabited exclusively by HillaryClinton supporters (there were zero instances of either Sanders or Trump supporters commenting there). No graph for this one since there isn’t much to show. But the numbers are that out of 100 randomly picked HillaryClinton commentators, 20 had commented on /r/enoughsandersspam.
Really, I could spend days pulling out what I think the data tells me. In all honesty I walked into this with a pretty empty mind. I didn’t have any agenda whatsoever. But the more I stare at the data, the more I see that HillaryClinton commentators have really weird patterns in what they are commenting on. I think the easiest way to describe this is to leave you with a graph of how commentators from each political subreddit comment on some fairly innocuous subreddits. These are subreddits that are popular in their own right and are not political in nature (usually).
Hopefully the graph is big enough to see what I’m talking about. HillaryClinton commentators have MUCH less overall engagement with the rest of Reddit. What does that mean? I’m really not too sure. Hypotheses in the comments are more than welcome 🙂
Small note about how I obtained the data for anyone that cares 🙂 I went to each political subreddit and took the top 25 posts from the past month. Inside each of these posts, I took 100 commentators, ordered by newest, but their comment had to have a score of more than 1. I should have ended up with a little less than 2500 users per subreddit (give or take, since we remove duplicates). I then shuffled all of these users and grabbed a random 100. From there, I went and grabbed their comment history ordered by newest. From their comments I grabbed all the subreddits they are commenting on and uniqued them all (so if they commented twice in /r/politics, that was still only one “point” for /r/politics). I then wrote out the resulting data to a CSV file which you can get below.
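The sampling and “one point per user” steps described above can be sketched in a few lines. This is a reconstruction from the description, not the original app: the function names, the `seed` parameter, and the input shapes (a flat list of usernames, and a dictionary of username to the list of subreddits they commented in) are all assumptions.

```python
import random
from collections import Counter


def pick_sample(commenters, sample_size=100, seed=None):
    """Remove duplicate usernames, shuffle, and keep the first sample_size."""
    unique_users = list(dict.fromkeys(commenters))  # dedupe, keep first-seen order
    rng = random.Random(seed)  # seed is optional, only there for repeatable runs
    rng.shuffle(unique_users)
    return unique_users[:sample_size]


def subreddit_points(comment_history_by_user):
    """Count how many users commented in each subreddit.

    comment_history_by_user maps a username to the list of subreddits their
    comments were in (repeats allowed). Turning each list into a set first
    means two comments in the same subreddit still count as one point.
    """
    points = Counter()
    for subreddits in comment_history_by_user.values():
        points.update(set(subreddits))
    return points
```

With this shape, a user who commented twice in politics and once in aww contributes exactly one point to each, which matches the counting rule described above.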
You can download the complete CSV data here! Link back or comment below if you use it so I can see what you made!