Hi @GareemaRanjan! Thanks for your report and sorry for the delay.
In your example, you are not passing the embeddings parameter. Is that intended? That is, do you want to learn word embeddings alongside the topic embeddings? If so, you also need to pass train_embeddings=True, because this feature is disabled by default.
Also, if you can share a more complete, reproducible code example, I can try to reproduce the problem myself.
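As a rough sketch of the two configurations mentioned above, the keyword arguments could be assembled as follows. The helper name build_etm_kwargs is hypothetical and not part of the library; only the embeddings and train_embeddings parameter names come from this thread, and the remaining values mirror the reporter's call.

```python
def build_etm_kwargs(num_topics, seed, embeddings=None):
    """Assemble keyword arguments for ETM (hypothetical helper, not a library function)."""
    kwargs = {
        "num_topics": num_topics,
        "epochs": 100,
        "debug_mode": True,
        "seed": seed,
    }
    if embeddings is None:
        # No pretrained embeddings supplied: ask the model to learn word
        # embeddings too, since this is disabled by default.
        kwargs["train_embeddings"] = True
    else:
        # Pretrained embeddings supplied: pass them through unchanged.
        kwargs["embeddings"] = embeddings
    return kwargs


# Assumed usage, with `vocabulary`, `k` and `random_seed` defined as in the report:
# etm_instance = ETM(vocabulary, **build_etm_kwargs(k, random_seed))
```

Whether to train embeddings or supply pretrained ones depends on the dataset; the sketch only makes the choice explicit.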
Describe the bug
I am using ETM for topic modelling on a dataset of 50K documents. I run the model multiple times (with random seed values) to find an appropriate value of K for my data. Sometimes the model reports nan loss values, even for a K that trained without problems on another run. This happens seemingly at random, and I have not been able to track down the cause.
INFO:root:Epoch 56 - Learning Rate: 0.005 - KL theta: nan - Rec loss: nan - NELBO: nan
INFO:root:Epoch 57 - Learning Rate: 0.005 - KL theta: nan - Rec loss: nan - NELBO: nan
Once this happens, the loss values stay nan for every remaining epoch in that run.
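To pin down the epoch at which the values first turn nan, a small sketch like the one below could scan the debug output. It assumes every epoch line follows the format of the two lines above; the function name first_nan_epoch is made up for illustration, and the epoch 55 line in the sample is invented to stand in for a healthy epoch.

```python
import math
import re

# Matches epoch lines in the format shown in the log excerpt above.
LOG_PATTERN = re.compile(
    r"Epoch (?P<epoch>\d+) - Learning Rate: (?P<lr>\S+) - "
    r"KL theta: (?P<kl>\S+) - Rec loss: (?P<rec>\S+) - NELBO: (?P<nelbo>\S+)"
)


def first_nan_epoch(log_lines):
    """Return the first epoch whose KL, Rec loss or NELBO is nan, or None."""
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not an epoch line
        values = [float(match[name]) for name in ("kl", "rec", "nelbo")]
        if any(math.isnan(value) for value in values):
            return int(match["epoch"])
    return None


sample = [
    # Invented healthy line, for illustration only.
    "INFO:root:Epoch 55 - Learning Rate: 0.005 - KL theta: 0.12 - Rec loss: 512.3 - NELBO: 512.42",
    "INFO:root:Epoch 56 - Learning Rate: 0.005 - KL theta: nan - Rec loss: nan - NELBO: nan",
    "INFO:root:Epoch 57 - Learning Rate: 0.005 - KL theta: nan - Rec loss: nan - NELBO: nan",
]
print(first_nan_epoch(sample))
```

Comparing the first nan epoch across seeds may show whether the failure is tied to particular runs or to a particular stage of training.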
Reproduction example
Here is how I am using the model:
etm_instance = ETM(
    vocabulary,
    num_topics=k,
    epochs=100,
    debug_mode=True,
    seed=random_seed,
)
I am new to topic modelling (and machine learning). Is there something I am missing?