We built a topic modeling pipeline to evaluate several topic modeling algorithms, comparing their performance on short and long texts, on preprocessed and unpreprocessed datasets, and with different embedding models. We then summarized the results and suggested how to choose an algorithm for a given task.
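As a rough illustration of what such an evaluation could look like, the sketch below enumerates an experiment grid and scores topics with a simple diversity metric. Everything in it is an assumption made for illustration: the list names, `experiment_grid`, and `topic_diversity` are hypothetical, the axis values are only borrowed from the repository tags, the embedding-model axis is left out, and the metrics actually used in the notebooks may differ.

```python
from itertools import product

# Hypothetical experiment axes. The values mirror the repository tags,
# but the configuration the authors actually used is not shown here.
ALGORITHMS = ["lda", "nmf", "ctm", "top2vec", "bertopic", "lda-bert"]
DATASETS = ["20newsgroup", "yahoo-answers"]
PREPROCESSED = [True, False]


def experiment_grid():
    """Yield one configuration per algorithm/dataset/preprocessing combination."""
    for algorithm, dataset, preprocessed in product(ALGORITHMS, DATASETS, PREPROCESSED):
        yield {
            "algorithm": algorithm,
            "dataset": dataset,
            "preprocessed": preprocessed,
        }


def topic_diversity(topics, top_k=10):
    """Return the fraction of unique words among the top_k words of all topics.

    Assumed metric, chosen for illustration only. `topics` is a list of
    word lists, each ordered from most to least important.
    """
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        return 0.0
    return len(set(top_words)) / len(top_words)
```

A value near 1.0 means the topics share few top words, while a low value suggests redundant topics. In a real run, each configuration from `experiment_grid()` would be passed to a model-fitting step, and the resulting topics scored with this or another metric.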
visualization
natural-language-processing
topic-modeling
lda
unsupervised-learning
nmf
ctm
yahoo-answers
20newsgroup
top2vec
bertopic
lda-bert
Updated Aug 26, 2022 - Jupyter Notebook