Since we're adapting pathology datasets for use in machine learning, our datasets often end up imbalanced, e.g., the treatment non-responder set ends up with 2x as many graphs as the treatment responder set. This can lead to the model overfitting to the larger class, since it sees 2x or more examples from one category than from the other.
I'd like to add an option to the train/validation/test set split such that, if one or more classes have more examples than another class, the excess examples are shunted into what is currently the "unlabeled" class, which would be renamed to the "not used for training, validation, or testing" class. This function could also be propagated to spt-plugin so that forked plugins have access to it too.
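A minimal sketch of how such a balancing option could behave, assuming the labels are available as a plain mapping from graph ID to class name. The function name `balance_classes`, the `UNUSED` label, and the dictionary layout are all hypothetical and are not taken from the existing split code or from spt-plugin. Every class is downsampled to the size of the smallest class, and the excess graphs are relabeled rather than dropped.

```python
import random
from collections import Counter, defaultdict

# Hypothetical stand-in for the renamed "unlabeled" class.
UNUSED = "unused"


def balance_classes(labels, seed=0):
    """Relabel excess examples so every class keeps the same number of graphs.

    labels: dict mapping graph ID -> class label (assumed layout).
    Returns a new dict; graphs beyond the smallest class size get UNUSED.
    """
    if not labels:
        return {}
    rng = random.Random(seed)  # seeded so the split is reproducible

    # Group graph IDs by their class label.
    by_class = defaultdict(list)
    for graph_id, label in labels.items():
        by_class[label].append(graph_id)

    # Every class is cut down to the size of the smallest class.
    target = min(len(ids) for ids in by_class.values())

    balanced = {}
    for label, ids in by_class.items():
        # Sort first so the sample does not depend on insertion order.
        kept = set(rng.sample(sorted(ids), target))
        for graph_id in ids:
            balanced[graph_id] = label if graph_id in kept else UNUSED
    return balanced


# Example: 4 non-responders and 2 responders.
labels = {f"g{i}": "non_responder" for i in range(4)}
labels.update({f"g{i}": "responder" for i in range(4, 6)})
balanced = balance_classes(labels)
counts = Counter(balanced.values())
# counts: 2 non_responder, 2 responder, 2 unused
```

Relabeling instead of deleting keeps the excess graphs in the dataset, so they stay available if the option is turned off or the seed is changed. The train/validation/test split would then run only on graphs whose label is not `UNUSED`.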