-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CODE IMPROVEMENT] if user specifies a “system” column but it doesnt exist, it should error out instead of continue running silently #600
Comments
Conversation_chain_handler.py L140 change from a simple log to a raise error? There is so much stuff being printed in the log that the average person would miss the warning |
How exactly is it possible to specify a column that does not exist? |
I guess the issue is referring to the case if the training Dataframe contains a system column, but validation does not.
To keep the pipeline flexible, one should not raise an issue here. One may use a common evaluation datasets across different experiments (mt-bench, company specific evaluation dataset, ...) that does not contain any system column. As a low-priority issue, one could think about adding Dataframe checks before running an experiment (alongside cfg checks). For now, logging a warning is sufficient IMO. |
No, it doesn’t have to do with train vs valid. Just use any csv file, and in your config.yaml for training, type system=“column_that_doesnt_exist”. The code will still run, it will log a small error saying that the System column was not found. I’m suggesting that instead of logging that, you should just raise an AssertionError |
Thanks for the clarification! I'd go into the direction of adding DataFrame checks to |
🔧 Proposed code refactoring
if system column not in train dataframe.coljmns or in valid columns, then error out
Motivation
Otherwise user might erroneously believe they are using a system column
The text was updated successfully, but these errors were encountered: