Exploring ‘wallstreetbets’ trends with Sentiment Analysis and EDA.

Simon Stausholm Rasmussen
Feb 19, 2021

— A Python implementation with praw, nltk, and yfinance.

Photo by Tech Daily on Unsplash

Anticipating a Reddit short squeeze or catching the next rocket🚀 “to the moon” is likely the reason behind the recent attention and massive increase in members of investment subreddits such as wallstreetbets. This article presents a program for ranking, summarizing, and analyzing the hottest topics of the daily investment-related discussions on Reddit.

Find all the code on my GitHub and start exploring this fascinating world of Reddit-investing.

https://github.com/simo075j

The idea

The impact of Reddit investors joining forces against capital funds has recently drawn a lot of attention in the media and the investment world. This media attention has only accelerated the interest in investment-related subreddits. Some of the most popular finance and investment subreddits, such as ‘wallstreetbets’, ‘stocks’, ‘investing’ and ‘stockmarket’, together have upwards of 15 million members. Understanding what’s going on in the minds of “redditors” is almost impossible by hand, but with some computational assistance it becomes manageable.

The approach

This program scrapes and analyzes investing- and stock-related subreddits to gain a basic understanding of what’s being discussed, the general sentiment, and basic financial metrics of the mentioned stocks. The output examples below were extracted on 17 February 2021 and reflect the topics of discussion on that date.

1. Scrape posts and comments from ‘wallstreetbets’, ‘stocks’, ‘investing’ and ‘stockmarket’ with praw. The parameters below can be tailored for specific needs.

# PROGRAM PARAMETERS:
subs = ['wallstreetbets', 'stocks', 'investing', 'stockmarket']
post_flairs = {'Daily Discussion', 'Weekend Discussion', 'Discussion'}
goodAuth = {'AutoModerator'}
uniqueCmt = True
ignoreAuthP = {'example'}
ignoreAuthC = {'example'}
upvoteRatio = 0.70  # upvote ratio for posts
ups = 20            # min upvotes for posts
limit = 10          # number of tickers to look for
upvotes = 2         # min upvotes for comments
picks = 10          # n_picks for consideration
picks_ayz = 5       # n_picks for sentiment
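To illustrate how these parameters could act as a post filter, here is a minimal sketch. It is not the code from the repository: `keep_post` is a hypothetical helper, and the way the checks are combined (ratio and upvotes treated as minimums) is an assumption. It relies only on attributes that praw exposes on a submission (`link_flair_text`, `upvote_ratio`, `ups`, `author`), and uses stand-in objects so it runs without Reddit credentials.

```python
from types import SimpleNamespace

# Values copied from the parameter block above.
post_flairs = {'Daily Discussion', 'Weekend Discussion', 'Discussion'}
ignoreAuthP = {'example'}
upvoteRatio = 0.70
ups = 20


def keep_post(post):
    """Return True if a post passes the author, flair, and upvote filters.

    Hypothetical helper: the exact combination of checks is an assumption.
    """
    # Posts from deleted accounts have no author object.
    author = post.author.name if post.author is not None else None
    if author in ignoreAuthP:
        return False
    if post.link_flair_text not in post_flairs:
        return False
    return post.upvote_ratio >= upvoteRatio and post.ups >= ups


# Stand-in objects that mimic the praw submission attributes used above.
posts = [
    SimpleNamespace(title='Daily thread', link_flair_text='Daily Discussion',
                    upvote_ratio=0.91, ups=350,
                    author=SimpleNamespace(name='AutoModerator')),
    SimpleNamespace(title='Low effort', link_flair_text='Meme',
                    upvote_ratio=0.95, ups=900,
                    author=SimpleNamespace(name='someone')),
    SimpleNamespace(title='Unpopular', link_flair_text='Discussion',
                    upvote_ratio=0.55, ups=40, author=None),
]

kept = [p.title for p in posts if keep_post(p)]
print(kept)
```

With valid API credentials, the same function could be applied to the submissions that praw yields from `reddit.subreddit(name).hot()` for each name in `subs`.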

2. Tokenize the scraped text, then aggregate and visualize the mention counts for the top 10 tickers among publicly traded companies on the NYSE, NASDAQ and NYSE American.

Example output
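The counting step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the program's actual logic: `count_tickers` is a hypothetical helper, `known_tickers` is a tiny hand-picked stand-in for the full exchange listings, and `blacklist` is an assumed set of symbols that are usually ordinary words in a comment.

```python
import re
from collections import Counter

# Assumption: a small stand-in for the full NYSE, NASDAQ and NYSE American listings.
known_tickers = {'PLTR', 'GME', 'RIOT', 'A'}
# Assumption: valid symbols that are usually plain words when they appear in a comment.
blacklist = {'A', 'I'}


def count_tickers(texts, picks=10):
    """Count ticker mentions across texts and return the `picks` most common."""
    counts = Counter()
    for text in texts:
        # Tokenize: standalone runs of 1 to 5 capital letters (also matches '$PLTR').
        for token in re.findall(r'\b[A-Z]{1,5}\b', text):
            if token in known_tickers and token not in blacklist:
                counts[token] += 1
    return counts.most_common(picks)


comments = [
    'PLTR to the moon, I like PLTR',
    'Sold GME, bought $PLTR',
    'A great day. RIOT is overpriced, holding GME',
]
top = count_tickers(comments, picks=2)
print(top)
```

The resulting list of `(ticker, count)` pairs is the kind of input a bar chart of the most mentioned tickers would need.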

3. Perform and visualize sentiment analysis with VADER’s SentimentIntensityAnalyzer (from nltk) and a custom lexicon covering common Reddit slang.

Example output

Interpretation:

Positive sentiment: Compound score >= 0.05

Neutral sentiment: (Compound score > -0.05) AND (Compound score < 0.05)

Negative sentiment: Compound score <= -0.05

In the example output, this shows a generally positive sentiment towards ‘PLTR’ and a generally negative sentiment towards ‘RIOT’.
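The thresholds above translate directly into a small labelling function. The sketch below also shows one way a custom lexicon could be merged into VADER. The names `label_sentiment`, `score_comments` and `reddit_slang` are hypothetical, and the slang scores are made-up illustrations, not the author's lexicon. `score_comments` needs nltk installed and a one-time `nltk.download('vader_lexicon')`.

```python
def label_sentiment(compound):
    """Map a VADER compound score to a label using the thresholds above."""
    if compound >= 0.05:
        return 'positive'
    if compound <= -0.05:
        return 'negative'
    return 'neutral'


# Assumption: illustrative slang scores on VADER's -4..4 scale, not the real lexicon.
reddit_slang = {'moon': 4.0, 'tendies': 2.0, 'bagholder': -2.0}


def score_comments(comments, extra_lexicon=reddit_slang):
    """Return the compound score for each comment, using VADER plus extra words.

    Imported lazily so the labelling function works without nltk installed.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    # The analyzer keeps its word scores in a plain dict, so it can be extended.
    analyzer.lexicon.update(extra_lexicon)
    return [analyzer.polarity_scores(c)['compound'] for c in comments]


labels = [label_sentiment(s) for s in (0.62, 0.0, -0.31)]
print(labels)
```

Averaging the compound scores of all comments that mention a ticker, and then labelling the average, would give one sentiment per ticker.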

4. Extract the current price of the 10 most mentioned tickers at the time of running the script. Also plot the last month’s price development for these tickers (only 2 shown below).

Example output
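A rough sketch of the price lookup, assuming yfinance's `Ticker.history` is used to fetch closing prices. `closing_prices` and `pct_change` are hypothetical helper names, and the repository may fetch and plot the data differently. The yfinance import sits inside the function because it needs the package and network access, while the arithmetic helper runs on its own.

```python
def pct_change(closes):
    """Percentage change from the first to the last closing price."""
    first, last = closes[0], closes[-1]
    return (last - first) / first * 100


def closing_prices(symbol, period='1mo'):
    """Fetch daily closing prices for a ticker (needs yfinance and a network connection)."""
    import yfinance as yf

    history = yf.Ticker(symbol).history(period=period)
    return history['Close'].tolist()


# Made-up closing prices so the example runs offline.
example_closes = [100.0, 110.0, 125.0]
change = pct_change(example_closes)
print(f'{change:.1f}%')
```

The last element of the returned list serves as the most recent price, and `period` accepts other yfinance range strings such as `'3mo'` for a longer window.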

5. Take a deep dive into the most mentioned stock with the yfinance API and plot its price development over the last 3 months.

Example output

6. Take a look at the most recent institutional recommendations and largest institutional shareholders.

Example output

The conclusion

Before running the example above on 17 February 2021, I didn’t know anything about Palantir Technologies Inc. I find it interesting that this stock has appreciated more than 235% over the last year and has a short ratio of 0.98 as of today. Maybe redditors are discussing the next short squeeze target?

I also find it interesting that Goldman Sachs lifted their recommendation from hold to buy, indicating that their analysis points towards future appreciation.

Another interesting insight from the application is the fact that GameStop (GME) is still being discussed actively and that the overall sentiment still seems to be relatively positive.

Hopefully this article and tool have sparked your interest in what’s going on in the world of social media investing. I highly encourage you to fork the program from my GitHub and play around with the configuration to fit your interests.
