rddlgym package

Submodules

rddlgym.env module

class rddlgym.env.RDDLEnv(rddl, config=None)

Bases: gym.core.Env

Gym wrapper for RDDL domains.

Parameters: rddl (str) – RDDL filename or rddlgym id.
set_horizon(horizon)
horizon
timestep
_eval_non_fluents()
_create_observation_space()
_create_action_space()
_build_state_inputs()
_build_action_inputs()
_build_model_ops()
reset()

Resets the environment state and timestep.

step(action)

Executes the action in the current state and timestep, updates the state and timestep, and returns the experience tuple (next_state, reward, done, info).

Parameters: action (Dict[str, np.array]) – The action to execute.
Returns: next_state (Dict[str, np.array]), reward (np.float32), done (bool), info (Dict[str, np.array])
close()

Release resources by closing current tf.Session.

render(mode='human')

Renders the current state of the environment.
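
The reset/step contract described above can be illustrated without installing rddlgym. The following is only a sketch: StubEnv is a hypothetical stand-in, not the real RDDLEnv; the fluent names "level" and "inflow" are invented; plain lists replace np.array values; and returning the initial state from reset() is an assumption borrowed from the usual gym convention.

```python
class StubEnv:
    """Hypothetical stand-in that mimics the documented reset/step contract."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.timestep = 0
        self._level = 0.0

    def reset(self):
        # Resets both the state and the timestep.
        # Returning the initial state is an assumption (gym convention).
        self.timestep = 0
        self._level = 0.0
        return {"level": [self._level]}

    def step(self, action):
        # States and actions are dicts keyed by (invented) fluent names.
        self._level += action["inflow"][0]
        self.timestep += 1
        next_state = {"level": [self._level]}
        reward = -abs(self._level - 2.0)
        done = self.timestep >= self.horizon
        info = {}
        return next_state, reward, done, info

    def close(self):
        # The real environment releases its tf.Session here; the stub has nothing to free.
        pass


def rollout(env):
    """Steps the environment with a fixed action until done is True."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = {"inflow": [1.0]}
        state, reward, done, info = env.step(action)
        total += reward
    env.close()
    return state, total


final_state, total_reward = rollout(StubEnv(horizon=3))
```

The loop unpacks the four-element tuple in the order given under Returns, and stops as soon as done is set.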

rddlgym.runner module

class rddlgym.runner.Runner(env, planner, debug=False)

Bases: object

Runner class implements the planner-environment loop.

Parameters:
  • env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
  • planner (tfplan.planners.Planner) – The planner.
  • debug (bool) – The debug flag.
build()

Builds the runner’s underlying components.

run(mode=None)

Runs the planner-environment loop until termination.

Parameters: mode (str) – The environment render mode.
Returns:
  • total_reward (float) – The total reward for the run.
  • trajectory (List[Transition]) – The state-action-reward trajectory.
close()

Closes the environment.
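
A planner-environment loop of the kind Runner.run() describes might look roughly like the sketch below. Everything here is illustrative: MiniRunner and CountdownEnv are hypothetical names, the planner is assumed to be a plain callable taking (state, step), which may not match the tfplan planner interface, and rendering and debug output are left out.

```python
from collections import namedtuple

# Local copy of the documented Transition field layout (not imported from rddlgym).
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)


class CountdownEnv:
    """Hypothetical environment that terminates after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.timestep = 0

    def reset(self):
        self.timestep = 0
        return {"t": 0}

    def step(self, action):
        self.timestep += 1
        next_state = {"t": self.timestep}
        reward = float(action["u"])
        return next_state, reward, self.timestep >= self.horizon, {}

    def close(self):
        pass


class MiniRunner:
    """Sketch of a planner-environment loop; not the rddlgym implementation."""

    def __init__(self, env, planner):
        self.env = env
        self.planner = planner  # assumed: callable (state, step) -> action dict

    def run(self):
        state = self.env.reset()
        total_reward, trajectory, done, step = 0.0, [], False, 0
        while not done:
            action = self.planner(state, step)
            next_state, reward, done, info = self.env.step(action)
            trajectory.append(
                Transition(step, state, action, reward, next_state, info, done)
            )
            total_reward += reward
            state, step = next_state, step + 1
        return total_reward, trajectory

    def close(self):
        self.env.close()


runner = MiniRunner(CountdownEnv(horizon=3), planner=lambda state, step: {"u": step})
total_reward, trajectory = runner.run()
runner.close()
```

The return value mirrors the documented pair: the accumulated reward first, then the list of transitions.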

rddlgym.trajectory module

class rddlgym.trajectory.Transition(step, state, action, reward, next_state, info, done)

Bases: tuple

_asdict()

Return a new OrderedDict which maps field names to their values.

_field_defaults = {}
_fields = ('step', 'state', 'action', 'reward', 'next_state', 'info', 'done')
_fields_defaults = {}
classmethod _make(iterable)

Make a new Transition object from a sequence or iterable

_replace(**kwds)

Return a new Transition object replacing specified fields with new values

action

Alias for field number 2

done

Alias for field number 6

info

Alias for field number 5

next_state

Alias for field number 4

reward

Alias for field number 3

state

Alias for field number 1

step

Alias for field number 0
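
Since Transition is a named tuple, the members listed above behave like those of any collections.namedtuple. The sketch below rebuilds the same field layout locally rather than importing rddlgym, and the fluent names "x" and "u" are invented for illustration.

```python
from collections import namedtuple

# Local re-creation of the documented field order; not imported from rddlgym.
Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)

t = Transition(
    step=0,
    state={"x": 0.0},
    action={"u": 1.0},
    reward=-1.0,
    next_state={"x": 1.0},
    info={},
    done=False,
)

# Attribute access and positional access refer to the same field
# ("Alias for field number 3" means t.reward is t[3]).
reward_by_name = t.reward
reward_by_index = t[3]

# _asdict() maps field names to values (a plain dict on Python 3.8+,
# an OrderedDict on earlier versions).
as_dict = t._asdict()

# _replace() returns a new tuple and leaves the original untouched.
t_done = t._replace(done=True)

# _make() builds a Transition from any iterable with seven items.
t_copy = Transition._make(list(t))
```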

class rddlgym.trajectory.Trajectory(env)

Bases: object

Trajectory class handles sequences of states, actions, intermediate fluents, and rewards.

add_transition(step, state, action, reward, next_state, info, done)

Adds transition to the trajectory.

as_dataframe()

Returns the trajectory as a dataframe with columns as fluent variables.

save(filepath)

Saves the trajectory in the filepath as a CSV file.

states

Returns a dict mapping state fluent name to sequence of values.

actions

Returns a dict mapping action fluent name to sequence of values.

infos

Returns a dict mapping info (intermediate) fluent name to sequence of values.

rewards

Returns list of rewards.

initial_state

Returns the trajectory’s initial state.

final_state

Returns the trajectory’s final state.

total_reward

Returns the total sum of the trajectory’s rewards.
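
The accessors above amount to regrouping a list of transitions by fluent name. The sketch below shows one way that aggregation could work. MiniTrajectory is a hypothetical class, not the rddlgym implementation; it takes no env argument, skips as_dataframe() and save(), and assumes that final_state is the next_state of the last transition.

```python
from collections import defaultdict, namedtuple

Transition = namedtuple(
    "Transition",
    ["step", "state", "action", "reward", "next_state", "info", "done"],
)


class MiniTrajectory:
    """Hypothetical sketch of per-fluent aggregation over transitions."""

    def __init__(self):
        self._transitions = []

    def add_transition(self, step, state, action, reward, next_state, info, done):
        self._transitions.append(
            Transition(step, state, action, reward, next_state, info, done)
        )

    def _collect(self, field):
        # Regroup a per-step dict field into {fluent name: [value per step]}.
        series = defaultdict(list)
        for transition in self._transitions:
            for name, value in getattr(transition, field).items():
                series[name].append(value)
        return dict(series)

    @property
    def states(self):
        return self._collect("state")

    @property
    def actions(self):
        return self._collect("action")

    @property
    def rewards(self):
        return [transition.reward for transition in self._transitions]

    @property
    def initial_state(self):
        return self._transitions[0].state

    @property
    def final_state(self):
        # Assumption: the final state is the last transition's next_state.
        return self._transitions[-1].next_state

    @property
    def total_reward(self):
        return sum(self.rewards)


traj = MiniTrajectory()
traj.add_transition(0, {"x": 0.0}, {"u": 1.0}, -1.0, {"x": 1.0}, {}, False)
traj.add_transition(1, {"x": 1.0}, {"u": 1.0}, 0.5, {"x": 2.0}, {}, True)
```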

rddlgym.utils module

Collection of utility functions used in the rddlgym package.

class rddlgym.utils.Mode

Bases: enum.Enum

rddlgym.Mode controls the type of object returned by rddlgym.make().

RAW = 1
AST = 2
SCG = 3
GYM = 4
rddlgym.utils.read_db()

Returns list of available RDDL domains as a JSON object.

rddlgym.utils.read_model(filename)

Returns RDDL string read from filename.

rddlgym.utils.parse_model(filename, verbose=False)

Returns RDDL abstract syntax tree (AST).

rddlgym.utils.create_env(filename, config=None)

Returns an RDDLEnv object for the given RDDL file.

rddlgym.utils.compile_model(filename)

Returns the rddl2tf compiler for the given RDDL file.

rddlgym.utils.load(filename, mode=Mode.AST, config=None, verbose=False)

Loads filename with given mode.

rddlgym.utils.make(rddl, mode=Mode.AST, config=None, verbose=False)

Returns rddl object for the given mode.
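
To make the role of the mode argument concrete, the sketch below shows how an Enum can select what a loader returns. The Mode members match the values documented above, but the rest is assumed for illustration: load_sketch is a hypothetical function, and the pairing of each mode with a result kind (RAW with the RDDL string, AST with the parsed tree, SCG with the compiler, GYM with the environment) is inferred from the helper functions listed in this module, not confirmed by it. The loaders are placeholders that return labelled tuples instead of real objects.

```python
from enum import Enum


class Mode(Enum):
    RAW = 1
    AST = 2
    SCG = 3
    GYM = 4


# Placeholder loaders: each returns a labelled tuple instead of a real object.
# The mode-to-result pairing is an assumption based on the listed helpers.
_LOADERS = {
    Mode.RAW: lambda filename: ("rddl-string", filename),
    Mode.AST: lambda filename: ("ast", filename),
    Mode.SCG: lambda filename: ("compiler", filename),
    Mode.GYM: lambda filename: ("env", filename),
}


def load_sketch(filename, mode=Mode.AST):
    """Dispatches on the mode; AST is the default, as in the signatures above."""
    if not isinstance(mode, Mode):
        raise TypeError(f"mode must be a Mode member, got {mode!r}")
    return _LOADERS[mode](filename)


default_result = load_sketch("model.rddl")
gym_result = load_sketch("model.rddl", mode=Mode.GYM)
```

Using enum members rather than bare integers or strings means a mistyped mode fails immediately instead of silently falling through to a default.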

Module contents

class rddlgym.Trajectory(env)

Bases: object

Trajectory class handles sequences of states, actions, intermediate fluents, and rewards.

add_transition(step, state, action, reward, next_state, info, done)

Adds transition to the trajectory.

as_dataframe()

Returns the trajectory as a dataframe with columns as fluent variables.

save(filepath)

Saves the trajectory in the filepath as a CSV file.

states

Returns a dict mapping state fluent name to sequence of values.

actions

Returns a dict mapping action fluent name to sequence of values.

infos

Returns a dict mapping info (intermediate) fluent name to sequence of values.

rewards

Returns list of rewards.

initial_state

Returns the trajectory’s initial state.

final_state

Returns the trajectory’s final state.

total_reward

Returns the total sum of the trajectory’s rewards.

class rddlgym.Runner(env, planner, debug=False)

Bases: object

Runner class implements the planner-environment loop.

Parameters:
  • env (rddlgym.RDDLEnv) – The RDDLEnv gym environment.
  • planner (tfplan.planners.Planner) – The planner.
  • debug (bool) – The debug flag.
build()

Builds the runner’s underlying components.

run(mode=None)

Runs the planner-environment loop until termination.

Parameters: mode (str) – The environment render mode.
Returns:
  • total_reward (float) – The total reward for the run.
  • trajectory (List[Transition]) – The state-action-reward trajectory.
close()

Closes the environment.

rddlgym.make(rddl, mode=Mode.AST, config=None, verbose=False)

Returns rddl object for the given mode.

rddlgym.load(filename, mode=Mode.AST, config=None, verbose=False)

Loads filename with given mode.