
shap.prep not working when dealing with multiclass classification problems #35

Open
Slangevar opened this issue Oct 7, 2022 · 3 comments

@Slangevar

Hi! I fit a multiclass XGBoost classification model and tried to plot the Shapley values using shap.prep followed by shap.plot.summary. However, shap.prep fails with the following error:

Error in `colnames<-`(`*tmp*`, value = c(colnames(X_train), "BIAS")) : 
  attempt to set 'colnames' on an object with less than two dimensions

I was wondering whether there was a bug here. Thank you.
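
For reference, the message itself comes from base R: `colnames<-` refuses any object that has no second dimension. A minimal sketch of how that could arise here, assuming (not confirmed for every xgboost version) that the SHAP contributions for a multiclass model come back as a list with one matrix per class rather than as a single matrix. The data and object names below are made up for illustration.

```r
# Toy feature matrix with named columns (hypothetical data).
X_train <- matrix(rnorm(20), nrow = 5,
                  dimnames = list(NULL, c("f1", "f2", "f3", "f4")))

# ASSUMPTION: stand-in for the multiclass contribution output,
# i.e. a list holding one (features + BIAS) matrix per class.
contrib_list <- replicate(3, matrix(0, nrow = 5, ncol = 5), simplify = FALSE)

# Setting colnames on the list itself reproduces the reported error,
# because a list has no dim attribute.
err <- tryCatch({
  colnames(contrib_list) <- c(colnames(X_train), "BIAS")
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Possible workaround sketch: name the columns of each class's matrix.
contrib_named <- lapply(contrib_list, function(m) {
  colnames(m) <- c(colnames(X_train), "BIAS")
  m
})
```

If the assumption holds, each element of `contrib_named` could then be handled one class at a time, though whether the downstream plotting functions accept that is something I have not verified.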

@marboe123

Error in `colnames<-`(`*tmp*`, value = c(colnames(X_train), "BIAS")) : 
  attempt to set 'colnames' on an object with less than two dimensions

I get the error above when running this command on a multiclass XGBoost classification model:

shap_values <- SHAPforxgboost::shap.values(xgb_model = model_n, X_train = trainval)

If I use the package fastshap with a similar line of code:

shap_values <- fastshap::explain(model_n, X = trainval, exact = TRUE)

I receive this error:

Error in if (ncol(res) == 1) { : argument is of length zero
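
That message generally means the condition inside `if` had length zero, which is what happens when `ncol()` returns NULL because the object is not a matrix or data frame. A base R sketch of the mechanism only; `res` here is a hypothetical stand-in, not fastshap's actual intermediate result.

```r
# Hypothetical non-matrix result, e.g. a plain list.
res <- list(a = 1:3, b = 4:6)

n <- ncol(res)  # NULL, since a list has no dim attribute

# NULL == 1 evaluates to logical(0), which `if` cannot test.
msg <- tryCatch(
  if (ncol(res) == 1) "one column" else "several columns",
  error = function(e) conditionMessage(e)
)
print(msg)

# A defensive form of the same check that does not error on non-matrices.
is_single_column <- is.matrix(res) && ncol(res) == 1
```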

fastshap runs fine on binary classification, though.

I don't know what the cause could be.
Thank you.

@nipnipj

nipnipj commented Feb 8, 2023

Same issue here. Multiclass model.

@matthewrw

One way this can happen is when your call to fastshap::explain includes extra arguments that don't match any named parameter. At https://github.com/bgreenwell/fastshap/blob/90a9cedb26ab217c98f9b917eaa8d0ba28270493/R/explain.R#L372 any unmatched arguments are absorbed into the ... , which is then likely of length one. Since foreach only runs as far as its shortest iterator, all subsequent columns (covariates) get dropped, which leaves a mismatch in the number of columns.
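
To check whether any of your arguments end up in the ... , you can compare the names you pass against the formals of the function being called. A rough base R sketch: `explain_sketch` and `unmatched_args` are hypothetical helpers written for illustration, and the signature is only shaped like an explain-style function, not copied from fastshap.

```r
# Hypothetical stand-in; NOT fastshap's real signature or code.
explain_sketch <- function(object, X = NULL, nsim = 1, ...) {
  dots <- list(...)
  names(dots)  # names of arguments that matched nothing and fell into `...`
}

# `exact` is not a named parameter above, so it is silently absorbed.
absorbed <- explain_sketch("model", X = matrix(0, 2, 2), exact = TRUE)
print(absorbed)

# Hypothetical helper: list argument names a function would not match.
unmatched_args <- function(fn, ...) {
  setdiff(names(list(...)), names(formals(fn)))
}

unmatched <- unmatched_args(explain_sketch, X = 1, exact = TRUE)
print(unmatched)
```

For an S3 generic you would need to inspect the method that is actually dispatched for your model class, since its formals can differ from the generic's.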
