PyTorch Categorical GRU Policy #2196

Open · wants to merge 3 commits into base: master

Conversation

@ManavR123 ManavR123 commented Dec 9, 2020

In this PR, I add a PyTorch version of the Categorical GRU Policy. I know there is some in-progress work on adding RNN support in #2172, but that PR has been in draft for a while, and I personally needed this policy, so I figured I would try to contribute it. One thing still missing is an example that uses this policy, but usage isn't too different from other similar policies.
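For reference, here is a rough sketch of the overall shape of the policy (illustrative only; the class name, signatures, and defaults below are simplified and may not match the actual diff):

```python
# Rough sketch only -- names and defaults are illustrative, not the PR's actual code.
import torch
from torch import nn
from torch.distributions import Categorical


class CategoricalGRUPolicySketch(nn.Module):
    """GRU encoder over observation sequences with a categorical action head."""

    def __init__(self, obs_dim, action_dim, hidden_dim=32):
        super().__init__()
        self._gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self._logits = nn.Linear(hidden_dim, action_dim)

    def forward(self, observations, hidden_state=None):
        # observations: (batch, time, obs_dim)
        outputs, hidden_state = self._gru(observations, hidden_state)
        dist = Categorical(logits=self._logits(outputs))
        return dist, hidden_state


# Example: sample an action for every timestep of a length-5 rollout.
policy = CategoricalGRUPolicySketch(obs_dim=4, action_dim=3)
dist, hidden = policy(torch.zeros(1, 5, 4))
actions = dist.sample()  # shape: (1, 5)
```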

I would appreciate any feedback on this!

@krzentner @avnishn

@ManavR123 ManavR123 requested a review from a team as a code owner December 9, 2020 08:45
@ManavR123 ManavR123 requested review from ahtsan and removed request for a team December 9, 2020 08:45
Member

@avnishn avnishn left a comment

This looks pretty good to me.

One thing these tests are missing, compared with the TensorFlow tests, is a test that builds an input which concatenates the action and the observation.

To be fair, I'm not sure why those exist or what use case they're testing. I think @krzentner can probably shed some light on this.

Otherwise this is pretty awesome, and we'll wait on @krzentner's comments to address this. I think she and another student have been working on adding PyTorch RNN support for the past few weeks, and this may be helpful to them.

@avnishn
Member

avnishn commented Dec 9, 2020

Oh wait, one more thing, @ManavR123.

You'll want to include a test similar to:

tests/garage/tf/models/test_categorical_gru_model.py

The test titled "test normalized outputs" checks that your module outputs the correct values via a golden-values test.

This will ensure correctness.

You can obtain these golden values by first using your module/policy in some benchmark to confirm that it works, then running the policy and module on a simplified environment like a grid-world env, capturing the weights of that policy/module, and finally writing a test that compares the values of a trained policy to those golden values after a few optimization iterations.
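For illustration, a golden-values test for the torch module might look roughly like this (a sketch only, not the actual garage test; the seed, shapes, and expected values are placeholders you would capture yourself):

```python
# Illustrative sketch only -- shapes, seed, and expected values are placeholders.
import torch
from torch import nn


def test_normalized_outputs_sketch():
    torch.manual_seed(0)
    gru = nn.GRU(input_size=2, hidden_size=3, batch_first=True)
    head = nn.Linear(3, 4)

    obs = torch.ones(1, 2, 2)  # (batch, time, obs_dim)
    out, _ = gru(obs)
    probs = torch.softmax(head(out), dim=-1)

    # Sanity check: action probabilities sum to 1 at every timestep.
    assert torch.allclose(probs.sum(dim=-1), torch.ones(1, 2))

    # Golden-values check: compare against outputs recorded from a trusted run.
    # expected = torch.tensor([[...], [...]])  # captured offline
    # assert torch.allclose(probs, expected, atol=1e-6)
```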

Does that make sense? Please ask me anything if that wasn't clear!

Thanks
@avnishn

Member

@avnishn avnishn left a comment

Just reviewed the test you wrote. I think there's a small bug; see my comments and let me know what you think.

tests/garage/torch/modules/test_categorical_gru_module.py (outdated; resolved)
@krzentner
Contributor

Wow, this PR looks good. The main feedback I have is that we don't actually yet have any algorithms implemented in PyTorch that can train an RNN. Unfortunately, our VPG (and PPO and TRPO) shuffle all timesteps individually in their batching step, which prevents training the RNN correctly. I'm currently working on fixing that, but probably won't have anything usable before sometime next week.

A more minor piece of feedback: I think we should streamline the garage/torch/modules directory to avoid modules that output distributions whenever possible, since they're just an extra layer to reason about when working with policies (which themselves have a superset of the equivalent module's API).
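To illustrate the batching issue (rough pseudocode, not garage's actual optimizer code): shuffling individual timesteps breaks the hidden-state chain a GRU needs, while batching whole trajectories keeps each sequence intact.

```python
# Rough illustration only -- not garage's actual batching code.
import torch


def shuffled_timestep_batches(obs, batch_size):
    """Roughly what the current VPG/PPO/TRPO batching does: shuffle timesteps.

    obs: (num_timesteps, obs_dim). A random permutation destroys the temporal
    ordering that the GRU's hidden state depends on.
    """
    perm = torch.randperm(obs.shape[0])
    return torch.split(obs[perm], batch_size)


def whole_trajectory_batches(trajectories, batch_size):
    """RNN-friendly alternative: batch whole trajectories, never split them.

    trajectories: list of (T_i, obs_dim) tensors, each kept intact so the
    recurrent policy can unroll its hidden state over the full sequence.
    """
    return [trajectories[i:i + batch_size]
            for i in range(0, len(trajectories), batch_size)]
```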

@ryanjulian
Member

@krzentner I'm curious: what is your implementation plan for RNNs in torch/VPG? Is it to make the optimization trajectory-oriented (as in tf/VPG) or something else?

@nikhilxb

nikhilxb commented Sep 9, 2021

> Wow, this PR looks good. The main feedback I have is that we don't actually yet have any algorithms implemented in PyTorch that can train an RNN. Unfortunately, our VPG (and PPO and TRPO) shuffle all timesteps individually in their batching step, which prevents training the RNN correctly. I'm currently working on fixing that, but probably won't have anything usable before sometime next week.
>
> A more minor piece of feedback: I think we should streamline the garage/torch/modules directory to avoid modules that output distributions whenever possible, since they're just an extra layer to reason about when working with policies (which themselves have a superset of the equivalent module's API).

@krzentner Did this end up getting done? I'm also trying to implement an RNN policy in PyTorch but can't seem to find any working examples.
