Welcome to rddlgym’s documentation!
rddlgym package
Submodules
rddlgym.env module
class rddlgym.env.RDDLEnv(rddl, config=None)
  Bases: gym.core.Env
  Gym wrapper for RDDL domains.
  Parameters: rddl (str) – RDDL filename or rddlgym id.
  - set_horizon(horizon)
  - horizon
  - timestep
  - _eval_non_fluents()
  - _create_observation_space()
  - _create_action_space()
  - _build_state_inputs()
  - _build_action_inputs()
  - _build_model_ops()
  - reset()
    Resets the environment state and timestep.
  - step(action)
    Executes action in the current state and timestep, updates the state and timestep, and returns the experience tuple (next_state, reward, done, info).
    Parameters: action (Dict[str, np.array])
    Returns: next_state (Dict[str, np.array]), reward (np.float32), done (bool), info (Dict[str, np.array])
  - close()
    Releases resources by closing the current tf.Session.
  - render(mode='human')
    Renders the current state of the environment.
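The reset/step cycle described above can be sketched without rddlgym or TensorFlow. ToyEnv below is a hypothetical stand-in, not the RDDLEnv implementation: the fluent names "x" and "dx", the reward, and the rule that an episode ends once timestep reaches horizon are all assumptions made for illustration, as is reset returning the initial state (the reference above does not specify its return value).

```python
class ToyEnv:
    """Hypothetical stand-in that follows the reset/step shape documented above."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.timestep = 0
        self.state = {"x": 0.0}

    def set_horizon(self, horizon):
        self.horizon = horizon

    def reset(self):
        # Assumption: reset returns the initial state.
        self.timestep = 0
        self.state = {"x": 0.0}
        return self.state

    def step(self, action):
        # Made-up dynamics: shift x by the action, penalise distance from 2.0.
        x = self.state["x"] + action["dx"]
        self.state = {"x": x}
        self.timestep += 1
        reward = -abs(2.0 - x)
        # Assumption: the episode ends when the horizon is reached.
        done = self.timestep >= self.horizon
        info = {}
        return self.state, reward, done, info


env = ToyEnv()
env.set_horizon(4)

state = env.reset()
total_reward, done = 0.0, False
while not done:
    # Actions are dicts keyed by action fluent name, like the documented step().
    state, reward, done, info = env.step({"dx": 1.0})
    total_reward += reward
```

With a real RDDLEnv the loop has the same shape, but states, actions, and infos map fluent names to arrays rather than to plain floats.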
rddlgym.runner module
class rddlgym.runner.Runner(env, planner, debug=False)
  Bases: object
  Runner class implements the planner-environment loop.
  Parameters:
    - env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
    - planner (tfplan.planners.Planner) – The planner.
    - debug (bool) – The debug flag.
  - build()
    Builds the runner’s underlying components.
  - run(mode=None)
    Runs the planner-environment loop until termination.
    Parameters: mode (str) – The environment render mode.
    Returns:
      - total_reward (float) – The total reward for the run.
      - trajectory (List[Transition]) – The state-action-reward trajectory.
  - close()
    Closes the environment.
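A planner-environment loop of the kind Runner implements might look like the sketch below. It is not the library's code: CountdownEnv and ToyRunner are hypothetical, and the planner is assumed to be a plain callable taking (state, step), which is not the tfplan Planner interface. Only the overall shape (query the planner, step the environment, record the transition, accumulate the reward) is taken from the description above.

```python
class CountdownEnv:
    """Hypothetical stub environment that terminates after a fixed number of steps."""

    def __init__(self, steps=3):
        self.steps = steps
        self.remaining = steps
        self.closed = False

    def reset(self):
        self.remaining = self.steps
        return {"remaining": self.remaining}

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return {"remaining": self.remaining}, 1.0, done, {}

    def close(self):
        self.closed = True


class ToyRunner:
    """Hypothetical runner: alternates planner and environment until done."""

    def __init__(self, env, planner, debug=False):
        self.env = env
        self.planner = planner
        self.debug = debug

    def run(self):
        state = self.env.reset()
        total_reward, trajectory, done, step = 0.0, [], False, 0
        while not done:
            action = self.planner(state, step)
            next_state, reward, done, info = self.env.step(action)
            # Same field order as the documented Transition tuple.
            transition = (step, state, action, reward, next_state, info, done)
            trajectory.append(transition)
            if self.debug:
                print(transition)
            total_reward += reward
            state = next_state
            step += 1
        return total_reward, trajectory

    def close(self):
        self.env.close()


env = CountdownEnv()
runner = ToyRunner(env, planner=lambda state, step: {"noop": True})
total_reward, trajectory = runner.run()
runner.close()
```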
rddlgym.trajectory module
class rddlgym.trajectory.Transition(step, state, action, reward, next_state, info, done)
  Bases: tuple
  - _asdict()
    Return a new OrderedDict which maps field names to their values.
  - _field_defaults = {}
  - _fields = ('step', 'state', 'action', 'reward', 'next_state', 'info', 'done')
  - _fields_defaults = {}
  - classmethod _make(iterable)
    Make a new Transition object from a sequence or iterable
  - _replace(**kwds)
    Return a new Transition object replacing specified fields with new values
  - action
    Alias for field number 2
  - done
    Alias for field number 6
  - info
    Alias for field number 5
  - next_state
    Alias for field number 4
  - reward
    Alias for field number 3
  - state
    Alias for field number 1
  - step
    Alias for field number 0
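The members listed above are the standard collections.namedtuple interface. The sketch below rebuilds a stand-in tuple with the same field order rather than importing rddlgym, so it runs with the standard library alone. The fluent names and values are placeholders.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Transition.
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)

# Placeholder fluent names and values, purely illustrative.
t = Transition(
    step=0,
    state={"x": 0.0},
    action={"dx": 1.0},
    reward=-1.0,
    next_state={"x": 1.0},
    info={},
    done=False,
)

# Fields are reachable by name or by position (reward is field number 3).
reward_by_name = t.reward
reward_by_index = t[3]

# _replace returns a new tuple; the original is left unchanged.
last = t._replace(step=1, done=True)

# _make builds a Transition from any iterable given in field order.
copy = Transition._make(list(t))

# _asdict maps field names to their values.
record = t._asdict()
```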
class rddlgym.trajectory.Trajectory(env)
  Bases: object
  Trajectory class handles state-action-interm-reward sequences.
  - add_transition(step, state, action, reward, next_state, info, done)
    Adds transition to the trajectory.
  - as_dataframe()
    Returns the trajectory as a dataframe with columns as fluent variables.
  - save(filepath)
    Saves the trajectory in the filepath as a CSV file.
  - states
    Returns a dict mapping state fluent name to sequence of values.
  - actions
    Returns a dict mapping action fluent name to sequence of values.
  - infos
    Returns a dict mapping interm fluent name to sequence of values.
  - rewards
    Returns list of rewards.
  - initial_state
    Returns the trajectory’s initial state.
  - final_state
    Returns the trajectory’s final state.
  - total_reward
    Returns the total sum of the trajectory’s rewards.
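To make the accessor semantics concrete, here is a minimal stand-in container, assuming plain Python values instead of arrays and a CSV writer from the standard library instead of a dataframe. ToyTrajectory and write_csv are hypothetical names, and treating final_state as the last transition's next_state is an assumption; the real class takes an env argument and may differ in all of these details.

```python
import csv
import io


class ToyTrajectory:
    """Hypothetical container mirroring the documented Trajectory accessors."""

    def __init__(self):
        self._transitions = []

    def add_transition(self, step, state, action, reward, next_state, info, done):
        self._transitions.append((step, state, action, reward, next_state, info, done))

    def _column(self, index):
        # Turn a list of per-step dicts into a dict of per-fluent sequences.
        names = self._transitions[0][index].keys()
        return {name: [t[index][name] for t in self._transitions] for name in names}

    @property
    def states(self):
        return self._column(1)

    @property
    def actions(self):
        return self._column(2)

    @property
    def rewards(self):
        return [t[3] for t in self._transitions]

    @property
    def initial_state(self):
        return self._transitions[0][1]

    @property
    def final_state(self):
        # Assumption: the final state is the last transition's next_state.
        return self._transitions[-1][4]

    @property
    def total_reward(self):
        return sum(self.rewards)

    def write_csv(self, stream):
        # One row per step, one column per state and action fluent.
        states, actions = self.states, self.actions
        writer = csv.writer(stream)
        writer.writerow(["step", *states, *actions, "reward"])
        for i, t in enumerate(self._transitions):
            row = [t[0]]
            row += [values[i] for values in states.values()]
            row += [values[i] for values in actions.values()]
            row.append(t[3])
            writer.writerow(row)


traj = ToyTrajectory()
traj.add_transition(0, {"x": 0.0}, {"dx": 1.0}, -1.0, {"x": 1.0}, {}, False)
traj.add_transition(1, {"x": 1.0}, {"dx": 1.0}, 0.0, {"x": 2.0}, {}, True)

buffer = io.StringIO()
traj.write_csv(buffer)
```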
rddlgym.utils module
Collection of utility functions used in the rddlgym package.
class rddlgym.utils.Mode
  Bases: enum.Enum
  rddlgym.Mode controls the type of return in rddlgym.make().
  - RAW = 1
  - AST = 2
  - SCG = 3
  - GYM = 4
rddlgym.utils.read_db()
  Returns list of available RDDL domains as a JSON object.
rddlgym.utils.read_model(filename)
  Returns RDDL string read from filename.
rddlgym.utils.parse_model(filename, verbose=False)
  Returns RDDL abstract syntax tree (AST).
rddlgym.utils.create_env(filename, config=None)
  Returns an RDDLEnv object for the given RDDL file.
rddlgym.utils.compile_model(filename)
  Returns the rddl2tf compiler for the given RDDL file.
rddlgym.utils.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)
  Loads filename with given mode.
rddlgym.utils.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)
  Returns rddl object for the given mode.
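One way to read the relationship between Mode and the helpers above is as a dispatch from mode to loader. The sketch below assumes that RAW, AST, SCG, and GYM select read_model, parse_model, compile_model, and create_env respectively. That pairing is inferred from the function descriptions, not confirmed by this reference. The loaders are stubs that return tagged strings so the example runs without rddlgym.

```python
from enum import Enum


class Mode(Enum):
    """Stand-in with the documented member values."""
    RAW = 1
    AST = 2
    SCG = 3
    GYM = 4


# Stub loaders: each returns a tagged string instead of a real object.
def read_model(filename):
    return f"raw:{filename}"


def parse_model(filename):
    return f"ast:{filename}"


def compile_model(filename):
    return f"scg:{filename}"


def create_env(filename):
    return f"env:{filename}"


def load(filename, mode=Mode.AST):
    # Assumed pairing of modes and loaders, for illustration only.
    loaders = {
        Mode.RAW: read_model,
        Mode.AST: parse_model,
        Mode.SCG: compile_model,
        Mode.GYM: create_env,
    }
    return loaders[mode](filename)


default_result = load("model.rddl")
env_result = load("model.rddl", mode=Mode.GYM)
```

The default mode in the documented signatures is Mode.AST, so calling load or make without a mode would yield the parsed model rather than a gym environment.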
Module contents
class rddlgym.Trajectory(env)
  Bases: object
  Trajectory class handles state-action-interm-reward sequences.
  - add_transition(step, state, action, reward, next_state, info, done)
    Adds transition to the trajectory.
  - as_dataframe()
    Returns the trajectory as a dataframe with columns as fluent variables.
  - save(filepath)
    Saves the trajectory in the filepath as a CSV file.
  - states
    Returns a dict mapping state fluent name to sequence of values.
  - actions
    Returns a dict mapping action fluent name to sequence of values.
  - infos
    Returns a dict mapping interm fluent name to sequence of values.
  - rewards
    Returns list of rewards.
  - initial_state
    Returns the trajectory’s initial state.
  - final_state
    Returns the trajectory’s final state.
  - total_reward
    Returns the total sum of the trajectory’s rewards.
class rddlgym.Runner(env, planner, debug=False)
  Bases: object
  Runner class implements the planner-environment loop.
  Parameters:
    - env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
    - planner (tfplan.planners.Planner) – The planner.
    - debug (bool) – The debug flag.
  - build()
    Builds the runner’s underlying components.
  - run(mode=None)
    Runs the planner-environment loop until termination.
    Parameters: mode (str) – The environment render mode.
    Returns:
      - total_reward (float) – The total reward for the run.
      - trajectory (List[Transition]) – The state-action-reward trajectory.
  - close()
    Closes the environment.
rddlgym.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)
  Returns rddl object for the given mode.
rddlgym.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)
  Loads filename with given mode.