aggregation operations not working with many groups #7164
Comments
Hi @abhi211199, thanks for the question! What version of Modin are you using? Spilling occurs when there is not enough memory to store all objects in distributed storage. So there are two possibilities: either you simply do not have enough RAM to process this amount of data, or Modin uses much more memory in this case than it should. Please provide this information so that we can better understand the problem.

Please also note that spilled data in the temporary folder is not cleaned up automatically; it has to be removed manually. There is therefore a chance that the temporary folder was filled with data from previous runs, which prevented your last run from completing successfully.
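A minimal sketch of how the requested information could be gathered, using only the Python standard library. The helper names `installed_version` and `temp_dir_report` are hypothetical, and the assumption that spilled objects land in the system temporary directory may not hold for every execution engine or configuration, so treat the path as a starting point only.

```python
import importlib.metadata
import shutil
import tempfile
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def temp_dir_report(path: Optional[str] = None) -> dict:
    """Report disk usage (in GiB) for the filesystem holding the temp directory.

    Assumption: spilled data is written under the system temp directory.
    Pass `path` explicitly if your setup spills somewhere else.
    """
    path = path or tempfile.gettempdir()
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "path": path,
        "total_gib": round(usage.total / gib, 2),
        "free_gib": round(usage.free / gib, 2),
    }


if __name__ == "__main__":
    print("modin version:", installed_version("modin"))
    print("temp dir usage:", temp_dir_report())
```

If the reported free space is close to zero, leftover spill files from earlier runs are a plausible cause and are worth clearing by hand before retrying.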
Hi, I'm new to `modin.pandas`. I'm trying to perform several aggregation operations on groups that were split from a dataframe using `df.groupby()`. The aggregation util works fine when I have a small number of groups, but fails when there are more than 1000 groups.
To give an idea of the code using an example
When executed, this goes on and on, but simply using `pandas` doesn't cause this issue.
Please let me know if I'm missing something that is needed when handling a large number of groups. Thanks