Allowing the use of alternative loss functions #82
Hi Batuhan, The use of cross-entropy loss for classifiers and mean squared error loss for regressors is rooted in the early API design of torchensemble (which follows the design of Scikit-Learn's ensemble module). I agree with you that there should be no limitation on the choice of objective function.
Sure, your contributions are very welcome. Here are some ideas that come to mind:
Gradient boosting requires additional considerations, so we could skip this ensemble at first. The ideas above are still very rough; feel free to comment below if you have better solutions. If you agree with this design, how about we start with |
That makes sense. Thanks for the suggestions. Starting with |
Great! |
Would it be simpler to just set the loss function as an instance variable with a setter method? For example:

    import torch.nn as nn
    from torchensemble import VotingRegressor

    # MLP is a user-defined nn.Module class, not shown here
    model = VotingRegressor(
        estimator=MLP,
        n_estimators=10,
        cuda=True,
    )
    criterion = nn.L1Loss()
    model.set_criterion(criterion)

I've tried this and it seems to work well for all of the ensemble modules except GradientBoosting. This would also circumvent the need to create additional custom classes for each ensemble module. Let me know what you think of this idea.
The additional considerations of GradientBoosting could potentially be addressed by calculating pseudo-residuals using |
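To make the setter idea concrete, here is a minimal framework-free sketch. It is not torchensemble code: `ToyAveragingRegressor`, `mse_loss`, and `l1_loss` are hypothetical names made up for illustration, and plain Python lists stand in for tensors. The only point is that the loss lives in an instance variable with an MSE default and can be swapped after construction.

```python
from typing import Callable, List, Sequence

# A loss is any callable mapping (predictions, targets) to a scalar.
Loss = Callable[[Sequence[float], Sequence[float]], float]


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over paired predictions and targets."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


class ToyAveragingRegressor:
    """Hypothetical stand-in for an ensemble regressor (illustration only)."""

    def __init__(self, estimators: Sequence[Callable[[float], float]]) -> None:
        self.estimators = list(estimators)
        # Default to MSE, as the thread says regressors currently do.
        self._criterion: Loss = mse_loss

    def set_criterion(self, criterion: Loss) -> None:
        """Replace the loss used for evaluation (and, in a real ensemble, training)."""
        self._criterion = criterion

    def predict(self, xs: Sequence[float]) -> List[float]:
        """Average the outputs of all base estimators for each input."""
        n = len(self.estimators)
        return [sum(f(x) for f in self.estimators) / n for x in xs]

    def evaluate(self, xs: Sequence[float], targets: Sequence[float]) -> float:
        """Score the averaged predictions with whichever loss is currently set."""
        return self._criterion(self.predict(xs), targets)


model = ToyAveragingRegressor([lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0])
xs, ys = [0.0, 1.0, 2.0], [0.0, 2.0, 6.0]

default_loss = model.evaluate(xs, ys)  # uses the MSE default
model.set_criterion(l1_loss)           # swap the objective after construction
swapped_loss = model.evaluate(xs, ys)  # now uses mean absolute error
```

With this shape, no extra subclass is needed per ensemble type; each ensemble only has to read its loss from the instance variable instead of hard-coding it.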
Sorry for the late response @by256.
Sure, this looks nice! Meanwhile, I will also take a look at how to automatically calculate the first-order gradients with |
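For reference, one possible way to get first-order gradients of an arbitrary loss is PyTorch's autograd. The sketch below is an assumption about how that could look, not the approach the maintainers settled on: the helper name `pseudo_residuals` is hypothetical, and it treats the pseudo-residual as the negative gradient of the loss with respect to the current predictions.

```python
import torch
import torch.nn as nn


def pseudo_residuals(criterion: nn.Module,
                     pred: torch.Tensor,
                     target: torch.Tensor) -> torch.Tensor:
    """Negative gradient of the loss with respect to the current predictions."""
    # Work on a detached copy so no gradient flows back into earlier estimators.
    pred = pred.detach().requires_grad_(True)
    loss = criterion(pred, target)
    (grad,) = torch.autograd.grad(loss, pred)
    return -grad


pred = torch.tensor([0.5, 2.5, 4.5])
target = torch.tensor([0.0, 2.0, 6.0])

# reduction="sum" keeps each sample's gradient unscaled by the batch size.
res_mse = pseudo_residuals(nn.MSELoss(reduction="sum"), pred, target)
res_l1 = pseudo_residuals(nn.L1Loss(reduction="sum"), pred, target)
```

For summed squared error this gives `2 * (target - pred)`, and for summed absolute error it gives `sign(target - pred)`, so the same helper covers both without a hand-written derivative per loss.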
Hi,
Thank you for this super useful library.
I've noticed that the ensemble modules are all restricted to either cross-entropy loss (in the case of classification) or mean squared error loss (in the case of regression). Is there a particular reason for this? It would be great if we could pass any objective function of our choosing to the ensemble modules, as this would provide much greater flexibility.
If there are no theoretical restrictions as to why we can't use alternative losses, I could potentially have a go at implementing this.
Batuhan