For a Markov Decision Process (MDP) model, a policy is computed using two methods: Value Iteration and Linear Programming.
The model world is a 4x4 grid with one positive terminal state and one negative terminal state. That is the complete description of the world.
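As a rough illustration of the Value Iteration side, here is a minimal sketch for a 4x4 grid with two terminal states. The description above does not give the terminal positions, reward values, step cost, discount factor, or transition model, so everything below (`GOAL`, `PIT`, `REWARD`, `STEP_COST`, `GAMMA`, and deterministic moves) is an assumption chosen for illustration and may differ from the actual model.

```python
# Value iteration sketch for a 4x4 grid world.
# ASSUMPTIONS (not from the project description): terminal positions,
# reward values, step cost, discount factor, and deterministic moves.
N = 4
GOAL, PIT = (0, 3), (1, 3)            # hypothetical terminal cells (row, col)
REWARD = {GOAL: 1.0, PIT: -1.0}       # hypothetical terminal rewards
STEP_COST = -0.04                     # hypothetical cost per move
GAMMA = 0.9                           # hypothetical discount factor
ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def move(state, action):
    """Deterministic move; stepping off the grid leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    return (nr, nc) if 0 <= nr < N and 0 <= nc < N else state


def value_iteration(tol=1e-8):
    """Return (V, policy) for the non-terminal states of the grid."""
    states = [(r, c) for r in range(N) for c in range(N)]
    V = {s: 0.0 for s in states}
    # Terminal states keep their reward as a fixed value.
    for s, reward in REWARD.items():
        V[s] = reward
    while True:
        delta = 0.0
        for s in states:
            if s in REWARD:
                continue
            # Bellman optimality backup over the four actions.
            best = max(STEP_COST + GAMMA * V[move(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy: pick the action leading to the highest-valued successor.
    policy = {
        s: max(ACTIONS, key=lambda a, s=s: V[move(s, a)])
        for s in states
        if s not in REWARD
    }
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    for r in range(N):
        print(" ".join(policy.get((r, c), "T") for c in range(N)))
```

Running the script prints the greedy action for each cell, with `T` marking the terminal cells. A stochastic transition model would replace the single successor in the backup with an expectation over successors.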