rddlgym package¶

Submodules¶

rddlgym.env module¶

class rddlgym.env.RDDLEnv(rddl, config=None)¶

Bases: gym.core.Env

Gym wrapper for RDDL domains.

Parameters:	rddl (str) – RDDL filename or rddlgym id.

set_horizon(horizon)¶

horizon¶

timestep¶

_eval_non_fluents()¶

_create_observation_space()¶

_create_action_space()¶

_build_state_inputs()¶

_build_action_inputs()¶

_build_model_ops()¶

reset()¶: Resets the environment state and timestep.

step(action)¶

Executes action in the current state and timestep, updates both, and returns the experience tuple (next_state, reward, done, info).

Parameters:	action (Dict[str, np.array]) – Mapping from action fluent name to its value.
Returns:	next_state (Dict[str, np.array]), reward (np.float32), done (bool), info (Dict[str, np.array])

close()¶: Release resources by closing current tf.Session.

render(mode='human')¶: Renders the current state of the environment.

rddlgym.runner module¶

class rddlgym.runner.Runner(env, planner, debug=False)¶

Bases: object

Runner class implements the planner-environment loop.

Parameters:	env (rddlgym.RDDLEnv) – The RDDLEnv gym environment. planner (tfplan.planners.Planner) – The planner. debug (bool) – The debug flag.

build()¶: Builds the runner’s underlying components.

run(mode=None)¶

Runs the planner-environment loop until termination.

Parameters:	mode (str) – The environment render mode.
Returns:	total_reward (float) – The total reward for the run. trajectory (List[Transition]) – The state-action-reward trajectory.

close()¶: Closes the environment.
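
The planner-environment loop that run() describes can be pictured with a small stand-in. The sketch below does not import rddlgym or tfplan: ToyEnv, toy_planner and run_episode are hypothetical names that only mimic the documented step() contract and the (total_reward, trajectory) return value. The planner's call signature and reset() returning the initial state are assumptions, not the library's confirmed API.

```python
from collections import namedtuple

# Same field order as the documented rddlgym.trajectory.Transition.
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)


class ToyEnv:
    """Hypothetical stand-in for an RDDLEnv: one state fluent, fixed horizon."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.timestep = 0
        self.state = {"x": 0.0}

    def reset(self):
        # Assumption: reset() hands back the initial state.
        self.timestep = 0
        self.state = {"x": 0.0}
        return self.state

    def step(self, action):
        next_state = {"x": self.state["x"] + action["move"]}
        reward = -abs(next_state["x"] - 3.0)  # closer to 3.0 is better
        self.state = next_state
        self.timestep += 1
        done = self.timestep == self.horizon
        return next_state, reward, done, {}


def toy_planner(state, timestep):
    """Hypothetical planner: always moves one unit to the right."""
    return {"move": 1.0}


def run_episode(env, planner):
    """Run planner and environment until done; return (total_reward, trajectory)."""
    state = env.reset()
    total_reward, trajectory, done = 0.0, [], False
    while not done:
        step = env.timestep
        action = planner(state, step)
        next_state, reward, done, info = env.step(action)
        trajectory.append(
            Transition(step, state, action, reward, next_state, info, done)
        )
        total_reward += reward
        state = next_state
    return total_reward, trajectory


total_reward, trajectory = run_episode(ToyEnv(horizon=3), toy_planner)
```

With a horizon of 3 the loop records three transitions, and only the last one has done set to True.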

rddlgym.trajectory module¶

class rddlgym.trajectory.Transition(step, state, action, reward, next_state, info, done)¶

Bases: tuple

_asdict()¶: Return a new OrderedDict which maps field names to their values.

_field_defaults = {}¶

_fields = ('step', 'state', 'action', 'reward', 'next_state', 'info', 'done')¶

_fields_defaults = {}¶

classmethod _make(iterable)¶: Make a new Transition object from a sequence or iterable.

_replace(**kwds)¶: Return a new Transition object replacing specified fields with new values.

action¶: Alias for field number 2

done¶: Alias for field number 6

info¶: Alias for field number 5

next_state¶: Alias for field number 4

reward¶: Alias for field number 3

state¶: Alias for field number 1

step¶: Alias for field number 0
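
Because Transition is a named tuple, the members listed above behave like those of any collections.namedtuple. The sketch below rebuilds an equivalent tuple locally from the documented _fields so that it runs without rddlgym; the fluent names "x" and "move" are made up for illustration.

```python
from collections import namedtuple

# Rebuilt locally for illustration; field order copied from the documented _fields.
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)

t = Transition(
    step=0,
    state={"x": 0.0},
    action={"move": 1.0},
    reward=-2.0,
    next_state={"x": 1.0},
    info={},
    done=False,
)

# Fields are reachable by name or by position ("Alias for field number N").
by_name = t.reward
by_index = t[3]

# _replace returns a new tuple; the original is left unchanged.
last = t._replace(done=True)

# _make builds a Transition from any iterable given in field order.
same = Transition._make([0, {"x": 0.0}, {"move": 1.0}, -2.0, {"x": 1.0}, {}, False])

# _asdict maps field names to values, in field order.
as_dict = t._asdict()
```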

class rddlgym.trajectory.Trajectory(env)¶

Bases: object

Trajectory class handles state-action-interm-reward sequences.

add_transition(step, state, action, reward, next_state, info, done)¶: Adds transition to the trajectory.

as_dataframe()¶: Returns the trajectory as a dataframe with columns as fluent variables.

save(filepath)¶: Saves the trajectory in the filepath as a CSV file.

states¶: Returns a dict mapping state fluent name to sequence of values.

actions¶: Returns a dict mapping action fluent name to sequence of values.

infos¶: Returns a dict mapping intermediate fluent name to sequence of values.

rewards¶: Returns list of rewards.

initial_state¶: Returns the trajectory’s initial state.

final_state¶: Returns the trajectory’s final state.

total_reward¶: Returns the total sum of the trajectory’s rewards.
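
The properties above are all views over the same list of transitions. As a rough illustration of what they return, the sketch below derives comparable values from plain Transition tuples. It is not the library's implementation: fluent_sequences is a hypothetical helper, the fluent names are made up, and taking final_state from the last transition's next_state is an assumption.

```python
from collections import namedtuple

Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)


def fluent_sequences(dicts):
    """Turn an iterable of {name: value} dicts into {name: [values...]}."""
    sequences = {}
    for d in dicts:
        for name, value in d.items():
            sequences.setdefault(name, []).append(value)
    return sequences


transitions = [
    Transition(0, {"x": 0.0}, {"move": 1.0}, -2.0, {"x": 1.0}, {}, False),
    Transition(1, {"x": 1.0}, {"move": 1.0}, -1.0, {"x": 2.0}, {}, True),
]

states = fluent_sequences(t.state for t in transitions)
actions = fluent_sequences(t.action for t in transitions)
rewards = [t.reward for t in transitions]
total_reward = sum(rewards)
initial_state = transitions[0].state
final_state = transitions[-1].next_state  # assumption: state after the last step
```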

rddlgym.utils module¶

Collection of utility functions used in the rddlgym package.

class rddlgym.utils.Mode¶

Bases: enum.Enum

rddlgym.Mode controls the type of object returned by rddlgym.make().

RAW = 1¶

AST = 2¶

SCG = 3¶

GYM = 4¶
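
Mode is a standard enum.Enum, so members can be looked up by value or by name. The sketch below declares a local copy of the four documented members so that it runs without rddlgym. It also shows where the mode=<Mode.AST: 2> text in the load() and make() signatures comes from: it is the repr of the default member.

```python
import enum


class Mode(enum.Enum):
    """Local copy of the documented members, for illustration only."""

    RAW = 1
    AST = 2
    SCG = 3
    GYM = 4


by_value = Mode(4)      # look up a member by its value
by_name = Mode["AST"]   # look up a member by its name
shown = repr(Mode.AST)  # the text Sphinx prints for the default argument
names = [m.name for m in Mode]
```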

rddlgym.utils.read_db()¶: Returns list of available RDDL domains as a JSON object.

rddlgym.utils.read_model(filename)¶: Returns RDDL string read from filename.

rddlgym.utils.parse_model(filename, verbose=False)¶: Returns RDDL abstract syntax tree (AST).

rddlgym.utils.create_env(filename, config=None)¶: Returns an RDDLEnv object for the given RDDL file.

rddlgym.utils.compile_model(filename)¶: Returns the rddl2tf compiler for the given RDDL file.

rddlgym.utils.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)¶: Loads filename with given mode.

rddlgym.utils.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)¶: Returns rddl object for the given mode.

Module contents¶

class rddlgym.Trajectory(env)¶

Bases: object

Trajectory class handles state-action-interm-reward sequences.

add_transition(step, state, action, reward, next_state, info, done)¶: Adds transition to the trajectory.

as_dataframe()¶: Returns the trajectory as a dataframe with columns as fluent variables.

save(filepath)¶: Saves the trajectory in the filepath as a CSV file.

states¶: Returns a dict mapping state fluent name to sequence of values.

actions¶: Returns a dict mapping action fluent name to sequence of values.

infos¶: Returns a dict mapping intermediate fluent name to sequence of values.

rewards¶: Returns list of rewards.

initial_state¶: Returns the trajectory’s initial state.

final_state¶: Returns the trajectory’s final state.

total_reward¶: Returns the total sum of the trajectory’s rewards.

class rddlgym.Runner(env, planner, debug=False)¶

Bases: object

Runner class implements the planner-environment loop.

Parameters:	env (rddlgym.RDDLEnv) – The RDDLEnv gym environment. planner (tfplan.planners.Planner) – The planner. debug (bool) – The debug flag.

build()¶: Builds the runner’s underlying components.

run(mode=None)¶

Runs the planner-environment loop until termination.

Parameters:	mode (str) – The environment render mode.
Returns:	total_reward (float) – The total reward for the run. trajectory (List[Transition]) – The state-action-reward trajectory.

close()¶: Closes the environment.

rddlgym.make(rddl, mode=<Mode.AST: 2>, config=None, verbose=False)¶: Returns rddl object for the given mode.

rddlgym.load(filename, mode=<Mode.AST: 2>, config=None, verbose=False)¶: Loads filename with given mode.