Training on GPU is not successful using XGBClassifier when training data is too large #10301

Open
madakkmi opened this issue May 20, 2024 · 6 comments

madakkmi commented May 20, 2024

I have X_train and y_train with shapes (483903, 2897) and (483903,) respectively. Training XGBoost is successful on GPU using the following code:

import xgboost as xgb
fit_kwargs = {'tree_method': 'hist', 'device': 'cuda'}
dtrain = xgb.DMatrix(X_train, label=y_train, feature_names=list(X_train.columns))
model = xgb.train(fit_kwargs, dtrain)

However, the following code does not run on GPU successfully:

from xgboost import XGBClassifier
model = XGBClassifier(**fit_kwargs)
model.fit(X_train, y_train)

It throws the error:

XGBoostError: [16:43:48] C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0b3782d1791676daf-1\xgboost\xgboost-ci-windows\src\tree\updater_gpu_hist.cu:781: Exception in gpu_hist: [16:43:48] C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0b3782d1791676daf-1\xgboost\xgboost-ci-windows\src\data\../common/device_helpers.cuh:431: Memory allocation error on worker 0: bad allocation: cudaErrorMemoryAllocation: out of memory
- Free memory: 1997537280
- Requested memory: 5558567356
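
As a rough sanity check against these numbers, assuming the (483903, 2897) shape reported above, a dense float32 copy of X_train alone is roughly the size of the requested allocation:

# Back-of-the-envelope estimate only, using the reported shape (483903, 2897)
rows, cols = 483903, 2897
approx_bytes = rows * cols * 4          # 4 bytes per float32 value
print(f"{approx_bytes / 1e9:.2f} GB")   # ~5.6 GB, close to the requested allocation; only ~2.0 GB was free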

If xgb.train(fit_kwargs, dtrain) runs successfully on the GPU, the expectation is that fitting with XGBClassifier using the same parameters should also succeed on the GPU.

xgboost version = 2.0.3

@trivialfis (Member)

Hi, if you replace the DMatrix object with QuantileDMatrix in the native interface snippet, does it work? In addition, what's the type of X_train?
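
For reference, a minimal sketch of that suggestion, assuming the same pandas X_train and y_train from the first snippet:

import xgboost as xgb

# QuantileDMatrix pre-bins the data for the hist method, which usually needs
# less device memory than materialising a plain DMatrix first.
params = {'tree_method': 'hist', 'device': 'cuda'}
dtrain = xgb.QuantileDMatrix(X_train, label=y_train, feature_names=list(X_train.columns))
model = xgb.train(params, dtrain)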

@madakkmi (Author)

@trivialfis, if DMatrix is replaced with QuantileDMatrix, then training with the native interface fails with the following error:

XGBoostError: [09:56:51] C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0b3782d1791676daf-1\xgboost\xgboost-ci-windows\src\tree\updater_gpu_hist.cu:781: Exception in gpu_hist: [09:56:51] C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0b3782d1791676daf-1\xgboost\xgboost-ci-windows\src\data\../common/device_helpers.cuh:431: Memory allocation error on worker 0: bad allocation: cudaErrorMemoryAllocation: out of memory
- Free memory: 1974468608
- Requested memory: 5558567356

The dtypes of X_train are shown below:

X_train.dtypes.value_counts()
Out[78]: 
float16    2683
float64     214
Name: count, dtype: int64

trivialfis (Member) commented May 21, 2024

That makes sense; thank you for sharing. Could you please share the type of input, such as whether it's a pandas dataframe or a cudf dataframe?

@madakkmi (Author)

@trivialfis , thank you for your quick reply. Here is what you requested:

X_train.__class__
Out[79]: pandas.core.frame.DataFrame

y_train.__class__
Out[80]: pandas.core.series.Series

trivialfis (Member) commented May 21, 2024

Hi, I noticed that with the native interface you are training a regression model with the default objective (reg:squarederror), while it's a classification model when the sklearn interface is used. Could you please fix that?
Classification uses more memory since it needs to train one model for each class.
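
As a rough illustration of that point, assuming y_train from the snippets above: the sklearn wrapper infers the objective from the label, and with more than two classes it boosts one tree per class per round.

import numpy as np

# Hypothetical check of which objective XGBClassifier would infer for y_train
n_classes = np.unique(y_train).shape[0]
objective = 'binary:logistic' if n_classes == 2 else 'multi:softprob'
print(n_classes, objective)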

madakkmi (Author) commented May 21, 2024

@trivialfis, thanks for noticing that. I've modified the code (as below), and it runs successfully on the GPU.

import xgboost as xgb
# objective now set explicitly so the native interface trains the same binary classifier
fit_kwargs_native = {'objective': 'binary:logistic', 'tree_method': 'hist', 'device': 'gpu'}
dtrain = xgb.DMatrix(X_train, label=y_train, feature_names=list(X_train.columns))
model = xgb.train(fit_kwargs_native, dtrain)
