After an exception happens, for whatever reason, `ep_rew_mean` and `ep_len_mean` are much higher than usual. Are we properly resetting the environment before restarting the training? Or is there a more important issue? Perhaps the opponents are reset to their weakest state, meaning that after restarting we are playing against easier opponents?
Note that after a while both values drop back to roughly the level they were at before the crash.
This is the console output I got around the exception:
`ep_rew_mean` and `ep_len_mean` are calculated by averaging the corresponding values over the last 100 episodes. This information is stored in a buffer (`ep_info_buffer`). When you restart the training, it is possible that the buffer is emptied, so until it fills up again the reported values are not true means over 100 episodes.
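To illustrate the small-sample effect described above, here is a minimal sketch in plain Python. It does not use the training library; the `deque(maxlen=100)` only mimics a 100-episode window, and the reward values are made up for illustration.

```python
from collections import deque
from statistics import mean

WINDOW = 100  # assumed window size, matching the "last 100 episodes" above


def rolling_mean(buffer):
    """Mean over whatever is currently in the buffer (NaN if it is empty)."""
    return mean(buffer) if buffer else float("nan")


# Before the crash: the window is full, so the mean covers 100 episodes.
buffer = deque(maxlen=WINDOW)
for reward in [1.0] * 100:  # made-up rewards
    buffer.append(reward)
full_mean = rolling_mean(buffer)

# After a restart with an emptied buffer: only a few episodes are averaged,
# so a handful of unusual episodes dominates the reported value.
buffer = deque(maxlen=WINDOW)
for reward in [5.0, 4.0, 6.0]:  # made-up rewards
    buffer.append(reward)
small_sample_mean = rolling_mean(buffer)

print(full_mean, small_sample_mean)
```

As more episodes are appended, the window fills up and the reported mean settles back toward the long-run value, which would be consistent with the numbers returning to their pre-crash level after a while.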
Emptying the buffer seems like the correct thing to do, since we start a completely new training run and have no real control over the interaction between runs, unless we handle this ourselves.
A much better solution would be for the training to just work, so that we could train forever without any errors.
There seem to be issues at multiple levels, and it is quite challenging to debug what is causing all of them, but I would presume there is something wrong in sapai-gym. However, I added a check that catches any errors raised inside the `step` method in sapai-gym: andreped/sapai-gym@7443f36
To my surprise I still got some errors, so I believe there might be something wrong in the sapai engine. Not sure really.
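For readers unfamiliar with the idea of catching errors inside `step`, the sketch below shows one way it could look. It is not the code from the linked commit: `SafeStepEnv` and `FlakyEnv` are hypothetical names, and the four-value `step` return (observation, reward, done, info) is an assumption about the environment interface.

```python
class FlakyEnv:
    """Hypothetical stand-in environment whose step() raises on the third call."""

    def __init__(self):
        self.calls = 0

    def reset(self):
        self.calls = 0
        return 0  # dummy observation

    def step(self, action):
        self.calls += 1
        if self.calls == 3:
            raise RuntimeError("simulated engine failure")
        return self.calls, 1.0, False, {}


class SafeStepEnv:
    """Hypothetical wrapper: ends the episode instead of raising when step() fails."""

    def __init__(self, env, error_reward=0.0):
        self.env = env
        self.error_reward = error_reward
        self.error_count = 0
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action):
        try:
            obs, reward, done, info = self.env.step(action)
        except Exception as exc:
            # Record the failure and terminate the episode with the last valid
            # observation, so the training loop can call reset() and continue.
            self.error_count += 1
            return self._last_obs, self.error_reward, True, {"step_error": repr(exc)}
        self._last_obs = obs
        return obs, reward, done, info


env = SafeStepEnv(FlakyEnv())
env.reset()
results = [env.step(action=0) for _ in range(3)]
print(results[2], env.error_count)
```

A wrapper like this only covers exceptions raised during `step`; errors raised elsewhere (for example in `reset` or outside the environment) would still stop the training.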