High RAM usage when loading FastText Model on Google Colab #2502
Comments
Does it crash on the load, or shortly thereafter when you start using the vectors? Because: doing common operations like |
It crashes during loading, stopping at the statement above. I haven't used the model at all. |
@gojomo, then how can I get around this when using most_similar() on Google Colab? |
If There are a bunch of major memory inefficiencies and unnecessary over-allocations in gensim's FastText support up through the current released version, 3.8.3. They'll be fixed in the eventual gensim-4.0.0 release, so those FB models might be more usable within 12GB, but those & other changes are still being tested & further improved, and there's not yet any certain date for a 4.0.0 release. An advanced user capable of running in-development code that's checked out from GitHub and built locally could use that fixed code now & help test it, but I'm not sure Google Colab would support that. |
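For a rough sense of why a model file well under 12 GB can still exhaust Colab's RAM, here is a back-of-the-envelope sketch. It assumes float32 values and a 300-dimensional model with 2,000,000 vocabulary words and 2,000,000 n-gram buckets (illustrative figures, not read from any file in this thread). The `extra_copies` parameter is hypothetical and stands in for any additional full-vocabulary arrays a loader might allocate; this is not a measurement of what gensim actually allocates.

```python
GIB = 1024 ** 3


def matrix_bytes(rows, dim, bytes_per_value=4):
    """Size of a dense rows x dim matrix (float32 by default)."""
    return rows * dim * bytes_per_value


def estimate_fasttext_bytes(vocab_size, bucket, dim, extra_copies=0):
    """Rough estimate of the arrays a full FastText model needs in RAM.

    extra_copies is a hypothetical knob: the number of additional
    full-vocabulary vector arrays held in memory at the same time.
    """
    input_matrix = matrix_bytes(vocab_size + bucket, dim)  # word + n-gram vectors
    output_matrix = matrix_bytes(vocab_size, dim)          # hidden-to-output weights
    duplicates = extra_copies * matrix_bytes(vocab_size, dim)
    return input_matrix + output_matrix + duplicates


# Assumed, illustrative model shape (not taken from the thread).
for copies in (0, 1, 2):
    total = estimate_fasttext_bytes(2_000_000, 2_000_000, 300, extra_copies=copies)
    print(f"extra_copies={copies}: {total / GIB:.2f} GiB")
```

Under these assumptions, each additional full-vocabulary copy adds 2.4 GB, so a couple of redundant allocations are enough to push a model of this shape close to a 12 GB limit.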
I'm getting the same error. The Colab runtime crashes while loading the model:
|
@italodamato please post your versions, like @rianrajagede did above (or open a new ticket). |
numpy==1.18.5 gcc 7.5.0 |
In that case see @gojomo's answer above. The 4.0 beta release is here: https://github.com/RaRe-Technologies/gensim/releases/tag/4.0.0beta |
Upgraded to 4.0 but it keeps crashing. |
It looks like you're using an even larger model ( As the gensim-4.0.0-beta has removed the major sources of unnecessary memory usage in Gensim's implementation, if you are still getting "crashed after using all available RAM" errors, your main ways forward are likely to be: (1) moving to a system with more RAM at Colab or elsewhere; (2) if it's possible your other uses of RAM are contributing to the usage, reducing those usages. |
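To check option (1) before attempting a load, a small sketch like the following reports the machine's total physical RAM. It assumes a Linux runtime (such as Colab) where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`. The `fits_in_ram` helper and its 0.8 headroom factor are hypothetical, chosen only for illustration.

```python
import os

GIB = 1024 ** 3


def total_ram_bytes():
    """Total physical memory, from the page size and page count (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(required_bytes, headroom=0.8):
    """Hypothetical check: does the requirement fit in a fraction of total RAM?

    headroom leaves space for the interpreter, the notebook, and other data.
    """
    return required_bytes <= total_ram_bytes() * headroom


print(f"Total RAM: {total_ram_bytes() / GIB:.1f} GiB")
```

This only reports total RAM, not what is currently free, so it gives an upper bound on what a load could use.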
I'll try with his model. I'm not sure what the difference is between the two though. |
It crashed again. I don't have anything else in memory when I do it. |
On a VM with 32G RAM, with Python 3.6 under Ubuntu 18.04 & It took almost 4 minutes of wall clock time (!) but completed without error. Gensim in Python likely has more overhead than Facebook's C++ (Note, though, that other serious issues may remain in loading full native FastText models, such as #2969.) |
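To compare loaders on a machine with enough RAM, one could time the load and read the process's peak resident memory afterwards. A minimal sketch, assuming a Unix-like system where the standard `resource` module is available; `measure` is a hypothetical helper and the lambda is a stand-in for a real loader call:

```python
import resource
import time


def measure(load_fn):
    """Run load_fn and return (result, elapsed_seconds, peak_rss).

    peak_rss is the peak resident set size of the whole process so far,
    so memory used before the call is included in it.
    """
    start = time.perf_counter()
    result = load_fn()
    elapsed = time.perf_counter() - start
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, elapsed, peak_rss


# Stand-in loader: builds a large list instead of loading a real model.
data, seconds, peak = measure(lambda: [0.0] * 1_000_000)
print(f"loaded {len(data)} values in {seconds:.3f} s, peak RSS {peak}")
```

In practice the stand-in would be replaced by the actual load call, for example `lambda: load_facebook_model(path)` with `load_facebook_model` imported from `gensim.models.fasttext`. Note that `ru_maxrss` is reported in kibibytes on Linux and in bytes on macOS, and that nothing is reported if the process is killed for running out of memory.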
Problem description
I want to load a pre-trained FastText model using Gensim. I run this script in Google Colab with ~12GB RAM, but it always crashes with Colab's message: "Your session crashed after using all available RAM."
Steps/code/corpus to reproduce
I didn't use both methods at the same time; I only used one of them, then restarted the runtime to clear the memory before running the other. I use method 2 to avoid issue #2378. Both methods crash Colab by using all available RAM. At first I thought the problem was the model, but when I check the size of each model, they are far below 12GB:
And if I load the model using the FastText Python module, it works:
Versions
Linux-4.14.79+-x86_64-with-Ubuntu-18.04-bionic
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0]
NumPy 1.16.3
SciPy 1.3.0
gensim 3.7.3
FAST_VERSION 1