MVP

Sabrina Yang

I collected 18019 $TSLA tweets posted from 07/12 to 07/19. After deleting duplicate rows, 10096 tweets remained.
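The deduplication step could look roughly like the sketch below. It assumes the tweets are loaded into a pandas DataFrame with a `text` column; the column names and sample rows are hypothetical, and the source does not say whether duplicates were matched on the full row or on the tweet text alone (this sketch uses the text).

```python
import pandas as pd

# Hypothetical sample rows standing in for the collected tweets.
tweets = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "text": [
            "$TSLA to the moon",
            "$TSLA to the moon",
            "Short $TSLA here",
            "Join our alert chatroom $TSLA",
        ],
    }
)

# Drop rows whose tweet text repeats, keeping the first occurrence.
deduped = tweets.drop_duplicates(subset="text", keep="first").reset_index(drop=True)

print(len(tweets), "->", len(deduped))
```

Passing `subset=None` instead would only drop rows that are identical in every column.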

1. Topic Modeling - NMF

Because tweets are relatively short documents, I used NMF (non-negative matrix factorization) to extract interpretable topics. Here's what I found:

topic_0: advertisement - promoting trading alert chatroom
topic_1: FUD (Fear, Uncertainty, Doubt) # insolvency = bankruptcy # crash, panic
topic_2: advertisement - stock app promotion
topic_3: advertisement - stock option trading investment advice
topic_4: how much Tesla made from its bitcoin investment
topic_5: car, FSD Subscription ("ha" = "have": car ownership)
topic_6: # fintwit # wallstreetbet # trends # top stocks: aapl, spce (Virgin Galactic), mrna
topic_7: advertisement - daily alert
topic_8: advertisement - sign up
topic_9: stock technical analysis and price charts
topic_10: advertisement - stock alert for discord
topic_11: meme stock - influencer accounts' trade ideas - promoting certain stocks - amc, gme
topic_12: elon musk tweet activity, solarcity, cybertruck, trial, court (the trial concerns Tesla's 2016 acquisition of SolarCity)
topic_13: TSLA stock personal opinions and feelings
topic_14: TSLA stock personal trading ideas - Short > Long ("Short" shows up more often than "Long", which suggests the market was more bearish at that moment)

What do the topics tell you about the overall structure of the data?

Some topics are noise (daily alerts, promotions, advertisements), but others are useful:
- Elon Musk's tweets
- technical chart patterns
- retail traders discussing Tesla news

2. Sentiment Analysis Score - Vader

I chose NLTK Vader for this part because it picks up social media sentiment better than other packages. It returns four scores: negative, neutral, positive, and compound. The compound score summarizes each tweet as a single value from -1 (most negative) to +1 (most positive). Here I focus on the compound score and make a scatterplot of the daily average. (Why the compound score rather than the others? Neutral tweets do not reflect a positive or negative mood, so they serve no purpose in this analysis.)
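The scoring and daily averaging might be sketched as follows. It assumes NLTK's `SentimentIntensityAnalyzer` (which needs the `vader_lexicon` resource downloaded once) and a pandas DataFrame; the column names `created_at` and `compound`, the helper names, and the sample timestamps and scores are placeholders, not values from the dataset.

```python
import pandas as pd


def vader_compound(texts):
    """Return the Vader compound score for each text.

    Requires nltk plus a one-time nltk.download("vader_lexicon").
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(t)["compound"] for t in texts]


def daily_mean_compound(df, date_col="created_at", score_col="compound"):
    """Average the compound score per calendar day."""
    days = pd.to_datetime(df[date_col]).dt.date
    return df.groupby(days)[score_col].mean()


# Placeholder timestamps and scores, used here instead of calling Vader.
scored = pd.DataFrame(
    {
        "created_at": ["2021-07-12 09:30", "2021-07-12 15:00", "2021-07-13 10:00"],
        "compound": [0.5, -0.1, 0.3],
    }
)

daily = daily_mean_compound(scored)
print(daily)
```

On real data, the `compound` column would be filled with `vader_compound(df["text"])` before averaging.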

a. Daily Sentiment Compound Score

b. Daily $TSLA price and Sentiment Compound Score
- No significant correlation pattern is visible, but sentiment did fall as the stock price dropped over the first 2 days (7/12 - 7/14).
- Sentiment rose and fell repeatedly from 7/15 to 7/19, even though the stock didn't recover over that period.
- At the end of the observation timeframe, the sentiment hype may have boosted stockholders' confidence, and the price recovered on 7/20.
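One way to put a number on the visual impression above is a Pearson correlation between the daily closing price and the daily mean compound score. The sketch below uses only the standard library; the `pearson` helper and both value lists are placeholders for illustration, not the actual prices or sentiment scores.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values only, NOT the real closing prices or sentiment scores.
close = [10.0, 9.0, 8.5, 9.5, 9.0]
sentiment = [0.20, 0.15, 0.18, 0.10, 0.22]

r = pearson(close, sentiment)
print(f"Pearson r = {r:.2f}")
```

With only about a week of daily points, any coefficient is a rough indication at best, which fits the cautious reading above.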

Next steps:

Look into the content of Elon Musk's tweets.
Do more data cleaning on advertisement tweets to reduce the noise, then redo the topic modeling to get clearer topics.
- Update: cut down to 9712 tweets after this cleaning.
Run K-means clustering on the topic-matrix df to group tweets by topic.
Check related tweets beyond my current "$TSLA" keyword search, for example "Tesla" tweets.
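The advertisement cleaning mentioned in the steps above could be approached with a simple keyword filter. This is only a sketch: the keyword list, the `is_advertisement` helper, and the sample tweets are hypothetical, since the source does not say which rules were used to reach 9712 tweets.

```python
import re

# Hypothetical keyword list; adjust it after inspecting the ad-heavy topics.
AD_PATTERN = re.compile(
    r"\b(chatroom|discord|sign up|free trial|join (us|now|our)|daily alerts?)\b",
    re.IGNORECASE,
)


def is_advertisement(text):
    """Flag a tweet as promotional if it contains any ad keyword."""
    return AD_PATTERN.search(text) is not None


# Hypothetical sample tweets.
tweets = [
    "Join our trading alert chatroom! $TSLA $AAPL",
    "$TSLA FSD subscription is finally here",
    "Sign up for daily alerts on Discord $TSLA",
    "Shorting $TSLA into earnings",
]

kept = [t for t in tweets if not is_advertisement(t)]
print(kept)
```

A keyword filter like this is blunt and can drop genuine tweets that happen to mention Discord or alerts, so it is worth spot-checking what gets removed before rerunning the topic model.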
