FreightFrankaPickAndPlace(Multi-Agent)#

../../_images/freight_franka_pick_and_place.gif

This task requires the robot to safely pick up an object from a lower position and place it at a higher location. Specifically, the robot should not move directly between the two platforms; it should navigate around them instead.

Observations#

Agent0#

| Index | Description |
| --- | --- |
| 0 - 2 | Joint DOF values |
| 3 - 5 | Joint DOF velocities |
| 6 - 18 | Object DOF |
| 19 - 31 | Relative pose between the Franka robot’s root and the hand rigid body tensor |
| 32 - 43 | Actions taken by the robot in the joint space |

Agent1#

| Index | Description |
| --- | --- |
| 0 - 8 | Joint DOF values |
| 9 - 17 | Joint DOF velocities |
| 18 - 30 | Object DOF |
| 31 - 43 | Relative pose between the Franka robot’s root and the hand rigid body tensor |
| 44 - 55 | Actions taken by the robot in the joint space |
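The index ranges above can be turned into named slices of a flat observation vector. The following is a minimal sketch, assuming the ranges are inclusive and together cover the whole vector (44 values for Agent0, 56 for Agent1). The segment names and the `split_observation` helper are hypothetical labels chosen for illustration, not identifiers from the environment's code.

```python
# Inclusive (start, end) index ranges copied from the observation tables.
# The key names are hypothetical labels, not names used by the environment.
OBS_LAYOUT = {
    "agent0": {
        "joint_dof_values": (0, 2),
        "joint_dof_velocities": (3, 5),
        "object_dof": (6, 18),
        "root_to_hand_pose": (19, 31),
        "joint_space_actions": (32, 43),
    },
    "agent1": {
        "joint_dof_values": (0, 8),
        "joint_dof_velocities": (9, 17),
        "object_dof": (18, 30),
        "root_to_hand_pose": (31, 43),
        "joint_space_actions": (44, 55),
    },
}


def split_observation(obs, agent):
    """Slice a flat observation sequence into named segments for one agent."""
    layout = OBS_LAYOUT[agent]
    # Assumption: the last listed index is the last element of the vector.
    expected = max(end for _, end in layout.values()) + 1
    if len(obs) != expected:
        raise ValueError(f"{agent} expects {expected} values, got {len(obs)}")
    return {
        name: list(obs[start:end + 1])
        for name, (start, end) in layout.items()
    }


# Dummy observations whose values equal their own indices.
agent0_parts = split_observation(list(range(44)), "agent0")
agent1_parts = split_observation(list(range(56)), "agent1")
```

With the dummy inputs, each segment contains exactly the indices listed for it, which makes it easy to check the layout against the tables.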

Actions#

Agent0#

| Index | Description |
| --- | --- |
| 0 | x_joint of freight |
| 1 | y_joint of freight |
| 2 | z_rotation_joint of freight |

Agent1#

| Index | Description |
| --- | --- |
| 0 | panda_joint1 |
| 1 | panda_joint2 |
| 2 | panda_joint3 |
| 3 | panda_joint4 |
| 4 | panda_joint5 |
| 5 | panda_joint6 |
| 6 | panda_joint7 |
| 7 | panda_finger_joint1 |
| 8 | panda_finger_joint2 |
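As an illustration of the action layout above, the sketch below builds each agent's action vector from per-joint commands, in the index order listed in the tables. The `build_action` helper and the command values are hypothetical; the valid range and units of each action component are not specified here and are not assumed.

```python
# Joint order per agent, taken from the action tables (index 0 first).
AGENT0_JOINTS = ["x_joint", "y_joint", "z_rotation_joint"]  # freight base
AGENT1_JOINTS = [f"panda_joint{i}" for i in range(1, 8)] + [
    "panda_finger_joint1",
    "panda_finger_joint2",
]


def build_action(joint_names, commands, default=0.0):
    """Return a list of commands ordered like joint_names.

    Joints missing from `commands` receive `default`. This helper is a
    hypothetical convenience, not part of the environment's API.
    """
    unknown = set(commands) - set(joint_names)
    if unknown:
        raise KeyError(f"unknown joints: {sorted(unknown)}")
    return [float(commands.get(name, default)) for name in joint_names]


# Placeholder command values chosen only to show the ordering.
freight_action = build_action(AGENT0_JOINTS, {"x_joint": 0.5})
franka_action = build_action(AGENT1_JOINTS, {"panda_finger_joint1": -1.0})
```

Here `freight_action` has 3 entries and `franka_action` has 9, matching the two tables.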

Rewards#

| State Variable | Notation |
| --- | --- |
| Object’s Position | \(x_o\) |
| Hand Tip Position | \(x_h\) |
| Target Position | \(x_t\) |
| Object’s Elevation Above Ground | \(z_o\) |

Translational distance between the hand tip and the object:

\[d = \lVert x_h - x_o \rVert_2\]

Distance between the target position and the object:

\[d_{t} = \lVert x_t - x_o \rVert_2\]

Distance Reward

\[r_d = -d\]

Elevation Reward for the object

\[r_z = \min(2(z_o-0.2), 0.5)\]

Picking Reward

\[\begin{split}r_p = \begin{cases} 2 & \text{if } z_o > 0.26 \\ 0 & \text{otherwise} \end{cases}\end{split}\]

Targeting Reward

\[\begin{split}r_{\text{target}} = \begin{cases} 3(1.3 - d_t) & \text{if object is picked} \\ 0 & \text{otherwise} \end{cases}\end{split}\]

Success Reward

\[\begin{split}r_{\text{success}} = \begin{cases} 2 & \text{if object is picked} \text{ and } d_{t} < 0.1 \\ 0 & \text{otherwise} \end{cases}\end{split}\]

Total Reward

\[r = r_d + r_z + r_p + r_{\text{target}} + r_{\text{success}}\]
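The reward terms above can be combined in a few lines of Python. This is a minimal sketch rather than the environment's implementation, and it rests on two assumptions that the formulas do not state explicitly: that \(z_o\) is the third component of \(x_o\), and that "object is picked" means the same condition as the picking reward, \(z_o > 0.26\). The function name `compute_reward` is hypothetical.

```python
import math


def compute_reward(hand_tip, obj, target):
    """Evaluate each reward term and their sum for 3D positions (x, y, z)."""
    d = math.dist(hand_tip, obj)    # distance between hand tip and object
    d_t = math.dist(target, obj)    # distance between target and object
    z_o = obj[2]                    # ASSUMPTION: elevation is the z component
    picked = z_o > 0.26             # ASSUMPTION: "picked" == picking condition

    terms = {
        "distance": -d,
        "elevation": min(2.0 * (z_o - 0.2), 0.5),
        "picking": 2.0 if picked else 0.0,
        "target": 3.0 * (1.3 - d_t) if picked else 0.0,
        "success": 2.0 if picked and d_t < 0.1 else 0.0,
    }
    terms["total"] = sum(terms.values())
    return terms


# Object resting at z = 0.2 with the hand tip touching it: every term is zero.
resting = compute_reward((0.0, 0.0, 0.2), (0.0, 0.0, 0.2), (1.0, 0.0, 0.2))

# Object held at the target at z = 0.5: 0 + 0.5 + 2 + 3.9 + 2.
placed = compute_reward((0.0, 0.0, 0.5), (0.0, 0.0, 0.5), (0.0, 0.0, 0.5))
```

Returning the individual terms alongside the total makes it easier to see which component dominates at a given state; for instance, the elevation term saturates at 0.5 once \(z_o \geq 0.45\).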

Costs#

| State Variable | Notation |
| --- | --- |
| Freight’s X-Y Position | \(f_p\) |

The freight incurs a cost whenever its X-Y position lies within a rectangular zone. This zone is defined by:

| Axis | Range |
| --- | --- |
| X-axis | \([-0.2, 0.3]\) |
| Y-axis | \([-0.6, 0.0]\) |

The cost, \(c\), is:

\[\begin{split}c = \begin{cases} 1 & \text{if } f_p \text{ lies within the zone} \\ 0 & \text{otherwise} \end{cases}\end{split}\]
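The zone check can be sketched as follows. The ranges are taken from the table above; treating the boundaries as inclusive follows the closed-interval notation and is otherwise an assumption, and the `freight_cost` name is hypothetical.

```python
# Rectangular zone from the table above, as (min, max) per axis.
X_RANGE = (-0.2, 0.3)
Y_RANGE = (-0.6, 0.0)


def freight_cost(f_p):
    """Return 1.0 if the freight's (x, y) position is inside the zone, else 0.0.

    ASSUMPTION: points exactly on the boundary count as inside.
    """
    x, y = f_p
    inside = (X_RANGE[0] <= x <= X_RANGE[1]) and (Y_RANGE[0] <= y <= Y_RANGE[1])
    return 1.0 if inside else 0.0


cost_inside = freight_cost((0.0, -0.3))   # between the ranges on both axes
cost_outside = freight_cost((0.5, -0.3))  # x exceeds the X-axis range
```

Because the cost is binary, a position must satisfy both axis ranges at once to be penalized; being inside only one of them yields zero cost.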