RDDL2TensorFlow compiler and trajectory simulator in Python3.
$ pip3 install tfrddlsim
tf-rddlsim can be used as a standalone script or programmatically.
$ tfrddlsim --help
usage: tfrddlsim [-h] [--policy {default,random}] [--viz {generic,navigation}]
                 [-hr HORIZON] [-b BATCH_SIZE] [-v]
                 rddl

RDDL2TensorFlow compiler and simulator

positional arguments:
  rddl                  path to RDDL file or rddlgym problem id

optional arguments:
  -h, --help            show this help message and exit
  --policy {default,random}
                        type of policy (default=random)
  --viz {generic,navigation}
                        type of visualizer (default=generic)
  -hr HORIZON, --horizon HORIZON
                        number of timesteps of each trajectory (default=40)
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        number of trajectories in a batch (default=75)
  -v, --verbose         verbosity mode
$ tfrddlsim Navigation-v1 --policy random --viz navigation -hr 50 -b 32 -v
$ tfrddlsim Reservoir-8 --policy default --viz generic -hr 20 -b 128 -v
import rddlgym

from rddl2tf.compilers import DefaultCompiler as Compiler
from tfrddlsim.policy import RandomPolicy
from tfrddlsim.simulation.policy_simulator import PolicySimulator
from tfrddlsim.viz import GenericVisualizer

# parameters
horizon = 40
batch_size = 32

# parse and compile RDDL
rddl = rddlgym.make('Reservoir-8', mode=rddlgym.AST)
compiler = Compiler(rddl, batch_size)
compiler.init()

# run simulations
policy = RandomPolicy(compiler)
simulator = PolicySimulator(compiler, policy)
trajectories = simulator.run(horizon)

# visualize trajectories
viz = GenericVisualizer(compiler, verbose=True)
viz.render(trajectories)
The tfrddlsim.Simulator implements a stochastic Recurrent Neural Network (RNN) to sample state-action trajectories. Each RNN cell encapsulates a tfrddlsim.Policy module, which generates actions for the current states, together with the transition function (specified by the CPFs) and the reward function. Sampling is done by dynamically unrolling the RNN model with the embedded tfrddlsim.Policy.
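The recurrent structure described above can be illustrated with a minimal pure-Python sketch. It is not the tfrddlsim implementation, which builds a TensorFlow graph. The function names (policy, transition, reward, cell, unroll) and the toy one-dimensional dynamics are assumptions made only for illustration.

```python
import random


def policy(state, rng):
    """Hypothetical random policy: one action in [-1, 1] per batch element."""
    return [rng.uniform(-1.0, 1.0) for _ in state]


def transition(state, action, rng):
    """Toy stochastic transition standing in for the CPFs:
    move by the action plus a small Gaussian noise."""
    return [s + a + rng.gauss(0.0, 0.1) for s, a in zip(state, action)]


def reward(state, action, next_state):
    """Toy reward: negative distance of the next state from the origin."""
    return [-abs(s) for s in next_state]


def cell(state, rng):
    """One recurrent step: the policy picks actions for the current states,
    then the transition and reward functions are applied.
    The next state is the value carried over to the following step."""
    action = policy(state, rng)
    next_state = transition(state, action, rng)
    step_reward = reward(state, action, next_state)
    return next_state, (next_state, action, step_reward)


def unroll(initial_state, horizon, seed=0):
    """Unroll the cell for `horizon` timesteps and collect the trajectory."""
    rng = random.Random(seed)
    state = initial_state
    states, actions, rewards = [], [], []
    for _ in range(horizon):
        state, (s, a, r) = cell(state, rng)
        states.append(s)
        actions.append(a)
        rewards.append(r)
    return states, actions, rewards


batch_size, horizon = 4, 5
states, actions, rewards = unroll([0.0] * batch_size, horizon)
print(len(states), len(states[0]))  # timesteps, batch size
```

In this sketch the batch is a plain list with one entry per trajectory, so each step advances all trajectories at once, mirroring the batch_size and horizon parameters used elsewhere in this README.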
Note that the tfrddlsim package only provides a tfrddlsim.RandomPolicy and a tfrddlsim.DefaultPolicy (a constant policy that sets every action fluent to its default value).
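To make the difference between the two kinds of policy concrete, here is a small pure-Python sketch. The class names ConstantDefaultPolicy and UniformRandomPolicy, the call signature, and the action fluent name "outflow" are hypothetical and are not part of the tfrddlsim API. The sketch also ignores the state and any action preconditions, which a real policy may need to respect.

```python
import random


class ConstantDefaultPolicy:
    """Hypothetical stand-in for a default policy:
    every action fluent always takes its default value."""

    def __init__(self, defaults, batch_size):
        self.defaults = defaults      # {action fluent name: default value}
        self.batch_size = batch_size

    def __call__(self, state, timestep):
        # Same constant action for every trajectory in the batch.
        return {name: [value] * self.batch_size
                for name, value in self.defaults.items()}


class UniformRandomPolicy:
    """Hypothetical stand-in for a random policy:
    each action fluent is sampled uniformly within given bounds."""

    def __init__(self, bounds, batch_size, seed=0):
        self.bounds = bounds          # {action fluent name: (low, high)}
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __call__(self, state, timestep):
        # Independent sample for every trajectory in the batch.
        return {name: [self.rng.uniform(low, high)
                       for _ in range(self.batch_size)]
                for name, (low, high) in self.bounds.items()}


default_policy = ConstantDefaultPolicy({"outflow": 0.0}, batch_size=3)
random_policy = UniformRandomPolicy({"outflow": (0.0, 1.0)}, batch_size=3)

print(default_policy(state=None, timestep=0))
print(random_policy(state=None, timestep=0))
```

Any other policy would follow the same pattern: map the current state (and timestep) to one value per action fluent for each trajectory in the batch.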
Please refer to https://tf-rddlsim.readthedocs.io/ for the code documentation.
If you are having issues with tf-rddlsim, please let me know at: [email protected].
Copyright (c) 2018-2019 Thiago Pereira Bueno All Rights Reserved.
tf-rddlsim is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
tf-rddlsim is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with tf-rddlsim. If not, see http://www.gnu.org/licenses/.