This project looks at two subreddits, deliberately chosen as polar opposites, and aims to explore several questions about the language used in them.
The chosen subreddits are /r/skeptic and /r/psychic. I drew data primarily from comments posted in these communities (obtained using PRAW, the Python Reddit API Wrapper) and analysed the comments using NLTK.
So far in this project, I have:
- Explored frequency distributions of words (e.g. what are the top 20 nouns, verbs and adjectives of each subreddit?). See https://codabunga.io/posts/2022/01/reddit-and-language-data-part-1-analysing-text/
- Made forays into language generation using n-grams (my first time doing this). See https://codabunga.io/posts/2022/10/reddit-and-language-data-part-2-generating-text/
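
As a rough illustration of the two techniques listed above, here is a minimal sketch of counting word frequencies and generating text from bigrams (n-grams with n = 2). It is not the code from the linked posts: it uses only the Python standard library instead of NLTK so that it runs without extra installs, it skips part-of-speech tagging (so it counts all words rather than nouns, verbs and adjectives separately), and the function names and sample comments are hypothetical stand-ins for real subreddit data.

```python
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?;:\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def top_words(comments, n=20):
    """Return the n most frequent words across all comments."""
    counts = Counter()
    for comment in comments:
        counts.update(tokenize(comment))
    return counts.most_common(n)


def build_bigram_model(comments):
    """Map each word to the list of words observed directly after it."""
    model = defaultdict(list)
    for comment in comments:
        tokens = tokenize(comment)
        for current, following in zip(tokens, tokens[1:]):
            model[current].append(following)
    return model


def generate(model, seed, max_words=10, rng=None):
    """Extend the seed word by repeatedly sampling an observed follower."""
    rng = rng or random.Random()
    words = [seed]
    while len(words) < max_words:
        followers = model.get(words[-1])
        if not followers:
            break  # dead end: this word was never followed by anything
        words.append(rng.choice(followers))
    return " ".join(words)


# Hypothetical sample data standing in for comments fetched from Reddit.
sample_comments = [
    "The evidence does not support the claim.",
    "The claim needs evidence before I believe it.",
]

print(top_words(sample_comments, n=3))
model = build_bigram_model(sample_comments)
print(generate(model, "the", max_words=8, rng=random.Random(0)))
```

Storing followers in a list with repeats means more common word pairs are sampled more often, which is the simplest way to make the generated text reflect the frequencies in the source comments.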