BaseTask#

safety_gymnasium.bases.base_task#

class safety_gymnasium.bases.base_task.BaseTask(config: dict)#

Base task class that defines common characteristics and mechanisms shared by all tasks.

Methods:

  • dist_goal(): Return the distance from the agent to the goal XY position.

  • calculate_cost(): Determine costs depending on the agent and obstacles. The actual cost calculation is done in safety_gymnasium.bases.base_obstacle.BaseObject.cal_cost(), which each object type implements; this method just combines all of their results.

  • build_observation_space(): Build the observation space by combining the agent-specific and task-specific observation spaces.

  • _build_placements_dict(): Build placement dictionary for all types of object.

  • toggle_observation_space(): Toggle observation space.

  • _build_world_config(): Create a world_config from all separate configs of different types of object.

  • _build_static_geoms_config(): Build static geoms config from yaml files.

  • build_goal_position(): Build the goal position; called when the task is initialized or when the goal is achieved.

  • _placements_dict_from_object(): Build placement dictionary for a specific type of object.

  • obs(): Combine and return all separate observations of different types of object.

  • _obs_lidar(): Return the lidar observation, unifying natural and pseudo lidar under one API.

  • _obs_lidar_natural(): Return natural lidar observation.

  • _obs_lidar_pseudo(): Return pseudo lidar observation.

  • _obs_compass(): Return compass observation.

  • _obs_vision(): Return the vision observation, i.e. the RGB image captured by the camera mounted in front of the agent.

  • _ego_xy(): Return the egocentric XY vector to a position from the agent.

  • calculate_reward(): Calculate the reward; called at every timestep and implemented by each specific task.

  • specific_reset(): Reset task-specific parameters; called on every reset.

  • specific_step(): Step task-specific parameters; called at every timestep.

  • update_world(): Update the world; called on env.reset() or when goal_achieved is True.

Attributes:

  • num_steps (int): Maximum number of environment steps in an episode.

  • lidar_conf (LidarConf): Lidar observation parameters.

  • reward_conf (RewardConf): Reward options.

  • cost_conf (CostConf): Cost options.

  • mechanism_conf (MechanismConf): Mechanism options.

  • action_space (gymnasium.spaces.Box): Action space.

  • observation_space (gymnasium.spaces.Dict): Observation space.

  • obs_info (ObservationInfo): Observation information generated in running.

  • _is_load_static_geoms (bool): Whether to load static geoms (i.e. geoms without randomness) in the current task.

  • goal_achieved (bool): Whether the goal has been achieved; checked at every timestep and implemented by each specific task.

Initialize the task.

Parameters:

config (dict) – Configuration dictionary, used to pre-configure some attributes per task via safety_gymnasium.register().
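
Tasks are not usually instantiated by hand: a registered environment id resolves to a BaseTask subclass plus a pre-filled config dict, and the Builder constructs the task. A minimal usage sketch with one of the built-in ids (the 6-tuple step return, including cost, is the Safety-Gymnasium convention):

    import safety_gymnasium

    # 'SafetyPointGoal1-v0' is built on a BaseTask subclass with a config
    # that, among other things, selects the Point agent.
    env = safety_gymnasium.make('SafetyPointGoal1-v0')

    obs, info = env.reset(seed=0)
    obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())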

DataClass#

class safety_gymnasium.bases.base_task.LidarConf(num_bins: int = 16, max_dist: float = 3, exp_gain: float = 1.0, type: str = 'pseudo', alias: bool = True)#

Lidar observation parameters.

Variables:
  • num_bins (int) – Bins (around a full circle) for lidar sensing.

  • max_dist (float) – Maximum distance for lidar sensitivity (if None, exponential distance).

  • exp_gain (float) – Scaling factor for distance in exponential distance lidar.

  • type (str) – ‘pseudo’, ‘natural’, see self._obs_lidar().

  • alias (bool) – Lidar bins alias into each other.
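
As an illustration, the defaults listed above can be constructed explicitly:

    from safety_gymnasium.bases.base_task import LidarConf

    # Sixteen pseudo-lidar bins, 3 m sensing radius, bin aliasing enabled.
    lidar_conf = LidarConf(num_bins=16, max_dist=3, exp_gain=1.0, type='pseudo', alias=True)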

class safety_gymnasium.bases.base_task.RewardConf(reward_orientation: bool = False, reward_orientation_scale: float = 0.002, reward_orientation_body: str = 'agent', reward_exception: float = -10.0, reward_clip: float = 10)#

Reward options.

Variables:
  • reward_orientation (bool) – Reward for being upright.

  • reward_orientation_scale (float) – Scale for uprightness reward.

  • reward_orientation_body (str) – What body to get orientation from.

  • reward_exception (float) – Reward when encountering a mujoco exception.

  • reward_clip (float) – Clip reward, last resort against physics errors causing magnitude spikes.

class safety_gymnasium.bases.base_task.CostConf(constrain_indicator: bool = True)#

Cost options.

Variables:

constrain_indicator (bool) – If true, all costs are either 1 or 0 for a given step.

class safety_gymnasium.bases.base_task.MechanismConf(randomize_layout: bool = True, continue_goal: bool = True, terminate_resample_failure: bool = True)#

Mechanism options.

Controls layout randomization (including the starting position distribution), goal continuation, and resampling behavior.

Variables:
  • randomize_layout (bool) – If false, set the random seed to a constant before sampling the layout, so the layout is fixed.

  • continue_goal (bool) – If true, draw a new goal after achievement.

  • terminate_resample_failure (bool) – If true, end the episode when resampling fails; otherwise, raise a Python exception.
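
The remaining option dataclasses are constructed the same way; a sketch using only the defaults documented on this page:

    from safety_gymnasium.bases.base_task import CostConf, MechanismConf, RewardConf

    reward_conf = RewardConf(reward_exception=-10.0, reward_clip=10)  # clip reward spikes
    cost_conf = CostConf(constrain_indicator=True)  # binary 0/1 per-step costs
    mechanism_conf = MechanismConf(randomize_layout=True, continue_goal=True)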

class safety_gymnasium.bases.base_task.ObservationInfo(obs_space_dict: Dict | None = None)#

Observation information generated in running.

Variables:

obs_space_dict (gymnasium.spaces.Dict) – Observation space dictionary.

Methods#

safety_gymnasium.bases.base_task.BaseTask.__init__(self, config: dict) → None#

Initialize the task.

Parameters:

config (dict) – Configuration dictionary, used to pre-configure some attributes per task via safety_gymnasium.register().

safety_gymnasium.bases.base_task.BaseTask.dist_goal(self) → float#

Return the distance from the agent to the goal XY position.

safety_gymnasium.bases.base_task.BaseTask.calculate_cost(self) → dict#

Determine costs depending on the agent and obstacles.

safety_gymnasium.bases.base_task.BaseTask.build_observation_space(self) → Dict#

Construct observation space. Happens only once during __init__ in Builder.

safety_gymnasium.bases.base_task.BaseTask._build_placements_dict(self) → None#

Build a dict of placements.

Happens only once.

safety_gymnasium.bases.base_task.BaseTask.toggle_observation_space(self) → None#

Toggle observation space.

safety_gymnasium.bases.base_task.BaseTask._build_world_config(self, layout: dict) → dict#

Create a world_config from our own config.

safety_gymnasium.bases.base_task.BaseTask._build_static_geoms_config(self, geoms_config: dict) → None#

Load static geoms from .yaml file.

Static geoms are geoms that, in general, are not considered when calculating reward and cost, and that have no randomness. Some tasks may still generate cost when the agent contacts static geoms.

safety_gymnasium.bases.base_task.BaseTask.build_goal_position(self) → None#

Build a new goal position, maybe with resampling due to hazards.

safety_gymnasium.bases.base_task.BaseTask._placements_dict_from_object(self, object_name: dict) → dict#

Get the placements dict subset just for a given object name.

safety_gymnasium.bases.base_task.BaseTask.obs(self) → dict | np.ndarray#

Return the observation of our agent.

safety_gymnasium.bases.base_task.BaseTask._obs_lidar(self, positions: np.ndarray | list, group: int) → np.ndarray#

Calculate and return a lidar observation.

See sub methods for implementation.

safety_gymnasium.bases.base_task.BaseTask._obs_lidar_natural(self, group: int) → np.ndarray#

Natural lidar casts rays based on the ego-frame of the agent.

Rays are circularly projected from the agent body origin around the agent z axis.

safety_gymnasium.bases.base_task.BaseTask._obs_lidar_pseudo(self, positions: np.ndarray) → np.ndarray#

Return an agent-centric lidar observation of a list of positions.

Lidar is a set of bins around the agent (divided evenly in a circle). The detection directions are exclusive and exhaustive for a full 360 view. Each bin reads 0 if there are no objects in that direction. If there are multiple objects, the distance to the closest one is used. Otherwise the bin reads the fraction of the distance towards the agent.

E.g. if the object is 90% of lidar_max_dist away, the bin will read 0.1, and if the object is 10% of lidar_max_dist away, the bin will read 0.9. (The reading can be thought of as “closeness” or inverse distance; a minimal sketch follows the list below.)

This encoding has some desirable properties:
  • bins read 0 when empty

  • bins smoothly increase as objects get close

  • maximum reading is 1.0 (where the object overlaps the agent)

  • close objects occlude far objects

  • constant size observation with variable numbers of objects
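
A minimal sketch of this closeness encoding for a single bin (not the library's implementation, which additionally handles binning and aliasing):

    import numpy as np

    def bin_reading(dist: float, max_dist: float = 3.0) -> float:
        """Closeness reading: 1.0 when the object overlaps the agent, 0.0 at max_dist."""
        return float(np.clip(1.0 - dist / max_dist, 0.0, 1.0))

    bin_reading(0.9 * 3.0)  # object at 90% of max_dist -> reads ~0.1
    bin_reading(0.1 * 3.0)  # object at 10% of max_dist -> reads ~0.9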

safety_gymnasium.bases.base_task.BaseTask._obs_compass(self, pos: np.ndarray) → np.ndarray#

Return an agent-centric compass observation of a list of positions.

Compass is a normalized (unit-length) egocentric XY vector, from the agent to the object.

This is equivalent to observing the egocentric XY angle to the target, projected into the sin/cos space we use for joints. (See comment on joint observation for why we do this.)
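
A sketch of the normalization described above, assuming ego_xy is the egocentric XY vector that _ego_xy() would return:

    import numpy as np

    def compass_obs(ego_xy: np.ndarray) -> np.ndarray:
        # Unit-length direction from the agent to the object, in the agent frame.
        return ego_xy / np.linalg.norm(ego_xy)

    compass_obs(np.array([3.0, 4.0]))  # -> array([0.6, 0.8])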

safety_gymnasium.bases.base_task.BaseTask._obs_vision(self) → np.ndarray#

Return pixels from the agent camera.

Note

This is a 3D array of shape (rows, cols, channels). The channels are RGB, in that order. If you are on a headless machine, you may need to set up off-screen rendering; see the related issue in the project's issue tracker.
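
On headless machines, one common workaround (an assumption to verify against that issue) is to select an off-screen MuJoCo rendering backend before creating the environment:

    import os

    # 'egl' (GPU) or 'osmesa' (CPU) are the usual off-screen MuJoCo backends.
    os.environ['MUJOCO_GL'] = 'egl'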

safety_gymnasium.bases.base_task.BaseTask._ego_xy(self, pos: np.ndarray) → np.ndarray#

Return the egocentric XY vector to a position from the agent.

safety_gymnasium.bases.base_task.BaseTask.calculate_reward(self) → float#

Determine the reward depending on the agent and task.

safety_gymnasium.bases.base_task.BaseTask.specific_reset(self) → None#

Set positions and orientations of agent and obstacles.

safety_gymnasium.bases.base_task.BaseTask.specific_step(self) → None#

Each task can define a specific step function.

It is called whenever safety_gymnasium.builder.Builder.step() is invoked via env.step(); for example, a task can perform task-specific data updates here.
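
For instance, a hypothetical subclass could advance a task-internal counter each step (other required members omitted for brevity):

    from safety_gymnasium.bases.base_task import BaseTask

    class MyTask(BaseTask):
        def specific_step(self) -> None:
            # Runs once per env.step(); the counter is illustrative only.
            self.elapsed_steps = getattr(self, 'elapsed_steps', 0) + 1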

safety_gymnasium.bases.base_task.BaseTask.update_world(self) → None#

Update the task-specific goal.

Additional Methods#

abstract property BaseTask.goal_achieved: bool#

Check if the task-specific goal is achieved.
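
A concrete task typically implements this by comparing dist_goal() against the goal radius; a sketch assuming a goal object with a size attribute, as in the built-in goal tasks:

    @property
    def goal_achieved(self) -> bool:
        # Achieved when the agent is within the goal's radius.
        return self.dist_goal() <= self.goal.size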