rddlgym package¶
Submodules¶
rddlgym.env module¶
-
class rddlgym.env.RDDLEnv(rddl, config=None)¶
Bases: gym.core.Env
Gym wrapper for RDDL domains.
Parameters: rddl (str) – RDDL filename or rddlgym id.
-
set_horizon(horizon)¶
-
horizon¶
-
timestep¶
-
_eval_non_fluents()¶
-
_create_observation_space()¶
-
_create_action_space()¶
-
_build_state_inputs()¶
-
_build_action_inputs()¶
-
_build_model_ops()¶
-
reset()¶ Resets the environment state and timestep.
-
step(action)¶ Executes action in the current state and timestep. Updates state and timestep and returns the experience tuple (next_state, reward, done, info).
Parameters: action (Dict[str, np.array])
Returns: next_state (Dict[str, np.array]), reward (np.float32), done (bool), info (Dict[str, np.array])
-
close()¶ Releases resources by closing the current tf.Session.
-
render(mode='human')¶ Renders the current state of the environment.
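The reset/step contract described above can be sketched as a plain interaction loop. This is a minimal sketch, not rddlgym itself: `ToyEnv` and `rollout` are hypothetical stand-ins that mimic only the documented method names and the `(next_state, reward, done, info)` return shape. The sketch uses plain floats instead of `np.array` values to stay dependency-free, and it assumes `reset()` returns the initial state, which this page does not document.

```python
class ToyEnv:
    """Hypothetical stand-in exposing the documented RDDLEnv method names."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.timestep = 0
        self._state = {"x": 0.0}

    def reset(self):
        # Assumption: reset returns the initial state.
        self.timestep = 0
        self._state = {"x": 0.0}
        return dict(self._state)

    def step(self, action):
        # Toy dynamics: shift x by the action; reward is the negative
        # distance between x and the target value 1.0.
        self._state = {"x": self._state["x"] + action["dx"]}
        self.timestep += 1
        reward = -abs(1.0 - self._state["x"])
        done = self.timestep >= self.horizon
        info = {}
        return dict(self._state), reward, done, info

    def close(self):
        # Nothing to release in the toy environment.
        pass


def rollout(env, action):
    """Apply the same action until the environment reports done."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state, reward, done, info = env.step(action)
        total_reward += reward
    env.close()
    return state, total_reward


final_state, total = rollout(ToyEnv(horizon=3), {"dx": 0.5})
```

With a real environment, the action would be a dict keyed by action fluent names, as the `step` parameter type indicates.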
-
rddlgym.runner module¶
-
class rddlgym.runner.Runner(env, planner, debug=False)¶
Bases: object
Runner class implements the planner-environment loop.
Parameters:
- env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
- planner (tfplan.planners.Planner) – The planner.
- debug (bool) – The debug flag.
-
build()¶ Builds the runner’s underlying components.
-
run(mode=None)¶ Runs the planner-environment loop until termination.
Parameters: mode (str) – The environment render mode.
Returns:
- total_reward (float) – The total reward for the run.
- trajectory (List[Transition]) – The state-action-reward trajectory.
-
close()¶ Closes the environment.
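To make the planner-environment loop and the `(total_reward, trajectory)` return value concrete, here is a minimal sketch under stated assumptions. `SketchRunner` and `CountdownEnv` are hypothetical, the planner is modeled as a plain callable `planner(state, timestep)` because the `tfplan.planners.Planner` interface is not described on this page, and `Transition` is re-declared locally with the documented field order rather than imported.

```python
from collections import namedtuple

# Local mirror of the documented Transition fields (not imported from rddlgym).
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)


class CountdownEnv:
    """Hypothetical environment whose integer state counts down to zero."""

    def __init__(self, start=3):
        self.start = start
        self.timestep = 0
        self.state = start
        self.closed = False

    def reset(self):
        self.timestep = 0
        self.state = self.start
        return self.state

    def step(self, action):
        self.state -= action
        self.timestep += 1
        return self.state, -1.0, self.state <= 0, {}

    def close(self):
        self.closed = True


class SketchRunner:
    """Hypothetical runner: asks the planner for an action, steps the env."""

    def __init__(self, env, planner, debug=False):
        self.env = env
        self.planner = planner
        self.debug = debug

    def run(self):
        state = self.env.reset()
        total_reward, trajectory, done = 0.0, [], False
        while not done:
            step = self.env.timestep
            action = self.planner(state, step)
            next_state, reward, done, info = self.env.step(action)
            trajectory.append(
                Transition(step, state, action, reward, next_state, info, done)
            )
            total_reward += reward
            state = next_state
        return total_reward, trajectory

    def close(self):
        self.env.close()


runner = SketchRunner(CountdownEnv(start=3), planner=lambda state, t: 1)
total_reward, trajectory = runner.run()
runner.close()
```

The loop ends as soon as `step` reports `done`, so the trajectory length equals the number of actions the planner was asked for.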
rddlgym.trajectory module¶
-
class rddlgym.trajectory.Transition(step, state, action, reward, next_state, info, done)¶
Bases: tuple
-
_asdict()¶ Return a new OrderedDict which maps field names to their values.
-
_field_defaults= {}¶
-
_fields= ('step', 'state', 'action', 'reward', 'next_state', 'info', 'done')¶
-
_fields_defaults= {}¶
-
classmethod _make(iterable)¶ Make a new Transition object from a sequence or iterable.
-
_replace(**kwds)¶ Return a new Transition object replacing specified fields with new values
-
action¶ Alias for field number 2
-
done¶ Alias for field number 6
-
info¶ Alias for field number 5
-
next_state¶ Alias for field number 4
-
reward¶ Alias for field number 3
-
state¶ Alias for field number 1
-
step¶ Alias for field number 0
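The members listed above are the standard `collections.namedtuple` interface. The sketch below re-declares a `Transition` with the same field order locally, so it runs without rddlgym installed; the fluent names `x` and `dx` and all values are made up for illustration.

```python
from collections import namedtuple

# Local mirror of the documented field order; not imported from rddlgym.
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)

t = Transition(0, {"x": 0.0}, {"dx": 1.0}, -1.0, {"x": 1.0}, {}, False)

# Positional indices match the "Alias for field number N" entries.
reward_by_index = t[3]
reward_by_name = t.reward

# _replace returns a new tuple; the original is left unchanged.
last = t._replace(done=True)

# _make builds a Transition from any iterable with seven items.
copy = Transition._make(list(t))

# _asdict maps field names to values, in field order.
as_dict = t._asdict()
```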
-
-
class rddlgym.trajectory.Trajectory(env)¶
Bases: object
Trajectory class handles state-action-interm-reward sequences.
-
add_transition(step, state, action, reward, next_state, info, done)¶ Adds transition to the trajectory.
-
as_dataframe()¶ Returns the trajectory as a dataframe with columns as fluent variables.
-
save(filepath)¶ Saves the trajectory in the filepath as a CSV file.
-
states¶ Returns a dict mapping state fluent name to sequence of values.
-
actions¶ Returns a dict mapping action fluent name to sequence of values.
-
infos¶ Returns a dict mapping interm fluent name to sequence of values.
-
rewards¶ Returns list of rewards.
-
initial_state¶ Returns the trajectory’s initial state.
-
final_state¶ Returns the trajectory’s final state.
-
total_reward¶ Returns the total sum of the trajectory’s rewards.
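The properties above describe per-fluent sequences and reward aggregates. As a rough illustration of those shapes (not the class's actual implementation), the sketch below regroups a few made-up per-step dicts into the "fluent name to sequence of values" form. The helper `fluent_sequences` and the fluent names `x` and `dx` are hypothetical.

```python
from collections import defaultdict


def fluent_sequences(dicts):
    """Turn an iterable of {fluent: value} dicts into {fluent: [values...]}."""
    sequences = defaultdict(list)
    for d in dicts:
        for name, value in d.items():
            sequences[name].append(value)
    return dict(sequences)


# Three hypothetical steps as (state, action, reward) triples.
steps = [
    ({"x": 0.0}, {"dx": 1.0}, -2.0),
    ({"x": 1.0}, {"dx": 1.0}, -1.0),
    ({"x": 2.0}, {"dx": 0.0}, 0.0),
]

states = fluent_sequences(s for s, _, _ in steps)    # shape of `states`
actions = fluent_sequences(a for _, a, _ in steps)   # shape of `actions`
rewards = [r for _, _, r in steps]                   # shape of `rewards`
total_reward = sum(rewards)                          # shape of `total_reward`
initial_state = steps[0][0]
```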
-
rddlgym.utils module¶
Collection of utility functions used in the rddlgym package.
-
class rddlgym.utils.Mode¶
Bases: enum.Enum
rddlgym.Mode controls the type of object returned by rddlgym.make().
-
RAW= 1¶
-
AST= 2¶
-
SCG= 3¶
-
GYM= 4¶
-
-
rddlgym.utils.read_db()¶ Returns list of available RDDL domains as a JSON object.
-
rddlgym.utils.read_model(filename)¶ Returns RDDL string read from filename.
-
rddlgym.utils.parse_model(filename, verbose=False)¶ Returns RDDL abstract syntax tree (AST).
-
rddlgym.utils.create_env(filename, config=None)¶ Returns an RDDLEnv object for the given RDDL file.
-
rddlgym.utils.compile_model(filename)¶ Returns the rddl2tf compiler for the given RDDL file.
-
rddlgym.utils.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)¶ Loads filename with given mode.
-
rddlgym.utils.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)¶ Returns rddl object for the given mode.
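One way to picture how `mode` selects the return type is a dispatch table keyed by the enum. The sketch below is an assumption-laden illustration: it re-declares `Mode` locally with the documented values, and it assumes each mode corresponds to the helper above whose description matches (RAW to `read_model`, AST to `parse_model`, SCG to `compile_model`, GYM to `create_env`). That pairing is inferred from the descriptions, not stated on this page, and the loaders are stubs that return labeled tuples instead of real objects.

```python
from enum import Enum


class Mode(Enum):
    # Values copied from the documented enum members.
    RAW = 1
    AST = 2
    SCG = 3
    GYM = 4


# Stub loaders standing in for read_model, parse_model, compile_model
# and create_env; each returns a labeled tuple instead of a real object.
_LOADERS = {
    Mode.RAW: lambda filename: ("rddl-string", filename),
    Mode.AST: lambda filename: ("ast", filename),
    Mode.SCG: lambda filename: ("compiler", filename),
    Mode.GYM: lambda filename: ("env", filename),
}


def load(filename, mode=Mode.AST):
    """Dispatch on mode; AST is the default, as in the documented signature."""
    try:
        loader = _LOADERS[mode]
    except KeyError:
        raise ValueError("Invalid mode: {!r}".format(mode))
    return loader(filename)


default_result = load("model.rddl")
gym_result = load("model.rddl", mode=Mode.GYM)
```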
Module contents¶
-
class rddlgym.Trajectory(env)¶
Bases: object
Trajectory class handles state-action-interm-reward sequences.
-
add_transition(step, state, action, reward, next_state, info, done)¶ Adds transition to the trajectory.
-
as_dataframe()¶ Returns the trajectory as a dataframe with columns as fluent variables.
-
save(filepath)¶ Saves the trajectory in the filepath as a CSV file.
-
states¶ Returns a dict mapping state fluent name to sequence of values.
-
actions¶ Returns a dict mapping action fluent name to sequence of values.
-
infos¶ Returns a dict mapping interm fluent name to sequence of values.
-
rewards¶ Returns list of rewards.
-
initial_state¶ Returns the trajectory’s initial state.
-
final_state¶ Returns the trajectory’s final state.
-
total_reward¶ Returns the total sum of the trajectory’s rewards.
-
-
class rddlgym.Runner(env, planner, debug=False)¶
Bases: object
Runner class implements the planner-environment loop.
Parameters:
- env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
- planner (tfplan.planners.Planner) – The planner.
- debug (bool) – The debug flag.
-
build()¶ Builds the runner’s underlying components.
-
run(mode=None)¶ Runs the planner-environment loop until termination.
Parameters: mode (str) – The environment render mode.
Returns:
- total_reward (float) – The total reward for the run.
- trajectory (List[Transition]) – The state-action-reward trajectory.
-
close()¶ Closes the environment.
-
rddlgym.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)¶ Returns rddl object for the given mode.
-
rddlgym.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)¶ Loads filename with given mode.