In this project, I'll visualize text data using WordCloud, employ the LDA model for topic modeling, and compute coherence scores to assess the model's quality and find the optimal number of topics. I'll create an interactive visualization with pyLDAvis, saving it as an HTML link for exploration.

Talia178/NLP_TopicModelling_LDA

About Dataset

"Friends" is an American television sitcom, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast starring Jennifer Aniston, Courteney Cox, Lisa Kudrow, Matt LeBlanc, Matthew Perry and David Schwimmer, the show revolves around six friends in their 20s and 30s who live in Manhattan, New York City. The series was produced by Bright/Kauffman/Crane Productions, in association with Warner Bros. Television. The original executive producers were Kevin S. Bright, Kauffman, and Crane.

Kaggle link: https://www.kaggle.com/datasets/sujaykapadnis/friends/data?select=friends.csv

friends.csv variables:

  • text: Dialogue as text
  • speaker: Name of the speaker
  • season: Season Number
  • episode: Episode Number
  • scene: Scene Number
  • utterance: Utterance Number
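As a minimal sketch of working with the columns listed above, the snippet below reads rows in the `friends.csv` layout and counts word frequencies, optionally for one speaker. The sample rows are made up for illustration (they are not real dialogue), and the helper names `load_rows` and `word_counts` are hypothetical. It also assumes the four numeric columns are always filled in, which may not hold for every row of the real file.

```python
import csv
import io
import re
from collections import Counter

# Made-up sample rows in the friends.csv column layout (not real dialogue).
SAMPLE_CSV = """text,speaker,season,episode,scene,utterance
Coffee sounds good to me,Speaker A,1,1,1,1
Coffee and a sandwich please,Speaker B,1,1,1,2
More coffee for everyone,Speaker A,1,1,1,3
"""


def load_rows(handle):
    """Read rows from an open CSV file and convert the numeric columns to int."""
    rows = []
    for row in csv.DictReader(handle):
        for column in ("season", "episode", "scene", "utterance"):
            row[column] = int(row[column])
        rows.append(row)
    return rows


def word_counts(rows, speaker=None):
    """Count lowercase word tokens, optionally restricted to one speaker."""
    counts = Counter()
    for row in rows:
        if speaker is None or row["speaker"] == speaker:
            counts.update(re.findall(r"[a-z']+", row["text"].lower()))
    return counts


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = word_counts(rows, speaker="Speaker A")
print(counts.most_common(3))
```

For the real dataset, the same function should work with `open("friends.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` sample. The resulting counts can be passed to `WordCloud().generate_from_frequencies(counts)` from the `wordcloud` package to draw the cloud.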

About the Topic Modelling project

I am a devoted fan of the 'Friends' sitcom and have rewatched the series many times. Chandler Bing is my favorite male character; his witty humor never fails to make me smile. The recent loss of Matthew Perry, the actor who portrayed him, deeply saddened fans around the world. In tribute to him and the entire cast of the series, I undertook a small project using this captivating dataset.

In this project, I visualize the text data with WordCloud, use an LDA model for topic modeling, and compute coherence scores to assess the model's quality and find the optimal number of topics. I also create an interactive visualization with pyLDAvis and save it as an HTML file for exploration.
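The workflow described here could be sketched roughly as follows, assuming the `gensim` and `pyLDAvis` packages are used. This is not necessarily how the project's notebook does it: the function names, the tiny stopword list, the `c_v` coherence measure, `passes=10`, the seed, and the output file name are all illustrative assumptions.

```python
import re

# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "i", "you", "it", "is"}


def tokenize(text):
    """Lowercase, split into word tokens, drop stopwords and very short tokens."""
    return [token for token in re.findall(r"[a-z']+", text.lower())
            if token not in STOPWORDS and len(token) > 2]


def fit_lda(texts, num_topics, seed=42):
    """Fit an LDA model on tokenized documents; return (model, corpus, dictionary)."""
    # gensim is imported here so the pure helpers work without it installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    dictionary = Dictionary(texts)
    corpus = [dictionary.doc2bow(doc) for doc in texts]
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics,
                   random_state=seed, passes=10)
    return lda, corpus, dictionary


def coherence_score(lda, texts, dictionary):
    """Compute the c_v coherence of a fitted model (higher is usually better)."""
    from gensim.models import CoherenceModel

    model = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                           coherence="c_v")
    return model.get_coherence()


def best_num_topics(coherence_by_k):
    """Return the topic count with the highest coherence score."""
    return max(coherence_by_k, key=coherence_by_k.get)


def save_visualization(lda, corpus, dictionary, path="lda_topics.html"):
    """Write the interactive pyLDAvis view of a fitted model to an HTML file."""
    import pyLDAvis
    import pyLDAvis.gensim_models as gensimvis

    pyLDAvis.save_html(gensimvis.prepare(lda, corpus, dictionary), path)
```

A typical run would tokenize the `text` column, call `fit_lda` and `coherence_score` for each candidate topic count, pick the winner with `best_num_topics`, then refit with that count and pass the result to `save_visualization`. Because `c_v` coherence may use multiprocessing, a script may need to run these calls under an `if __name__ == "__main__":` guard on some platforms.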
