ShadowHandCatchOver2UnderarmSafeFinger#
Agent |
---|
This task is inspired by the paper "Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning" and is based on the ShadowHandCatchOver2Underarm task proposed there. To reflect the real-world characteristics of the ShadowHand, it adds constraints on the finger joints.
The object needs to be thrown from the vertical hand to the palm-up hand.
Observations#
Index | Description |
---|---|
0 - 397 | observation of both hands, as described in the ShadowHands agent section |
398 - 404 | object pose |
405 - 407 | object linear velocity |
408 - 410 | object angular velocity |
411 - 417 | goal pose |
418 - 421 | goal rot - object rot |
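As an illustration of the layout above, the following minimal sketch splits a flat observation vector into named parts. The dictionary keys and the function name `split_observation` are hypothetical and not part of the environment's API; only the index ranges come from the table.

```python
# Index ranges copied from the observation table (inclusive on both ends).
# The key names are illustrative assumptions, not identifiers from the library.
OBS_LAYOUT = {
    "hands": (0, 397),
    "object_pose": (398, 404),
    "object_linear_velocity": (405, 407),
    "object_angular_velocity": (408, 410),
    "goal_pose": (411, 417),
    "rot_difference": (418, 421),
}

OBS_SIZE = 422  # last index (421) + 1


def split_observation(obs):
    """Return a dict of named slices from a flat 422-element observation."""
    if len(obs) != OBS_SIZE:
        raise ValueError(f"expected {OBS_SIZE} values, got {len(obs)}")
    return {name: obs[lo:hi + 1] for name, (lo, hi) in OBS_LAYOUT.items()}


# Usage with a dummy observation whose values equal their own indices.
parts = split_observation(list(range(OBS_SIZE)))
```

Each slice is returned as a plain sub-sequence, so the same helper should work for a Python list or a one-dimensional array.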
Actions#
Index | Description |
---|---|
0 - 19 | right Shadow Hand actuated joint |
20 - 22 | right Shadow Hand base translation |
23 - 25 | right Shadow Hand base rotation |
26 - 45 | left Shadow Hand actuated joint |
46 - 48 | left Shadow Hand base translation |
49 - 51 | left Shadow Hand base rotation |
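To make the ordering above concrete, here is a small sketch that assembles a flat 52-element action from per-hand components. The function `make_action` and its parameter names are hypothetical helpers for illustration; the component sizes and their order are taken from the table.

```python
def make_action(right_joints, right_translation, right_rotation,
                left_joints, left_translation, left_rotation):
    """Concatenate per-hand components into one flat 52-element action."""
    # (values, expected length) in the order given by the action table.
    components = [
        (right_joints, 20),       # indices 0 - 19
        (right_translation, 3),   # indices 20 - 22
        (right_rotation, 3),      # indices 23 - 25
        (left_joints, 20),        # indices 26 - 45
        (left_translation, 3),    # indices 46 - 48
        (left_rotation, 3),       # indices 49 - 51
    ]
    action = []
    for values, expected in components:
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        action.extend(values)
    return action


# Usage with placeholder values that make each component easy to spot.
action = make_action([0.0] * 20, [1.0] * 3, [2.0] * 3,
                     [3.0] * 20, [4.0] * 3, [5.0] * 3)
```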
Rewards#
Let’s denote the positions of the object and the goal as \(x_o\) and \(x_g\), respectively. The translational position difference between the object and the goal, denoted as \(d_t\), can be calculated as:
\[d_t = \Vert x_o - x_g \Vert_2\]
Additionally, we define the angular position difference between the object and the goal as \(d_a\). The rotational difference, denoted as \(d_r\), is given by the formula:
\[d_r = 2\arcsin(\text{clamp}(\Vert d_a \Vert_2, \text{max} = 1.0))\]
Finally, the rewards are determined using the specific formula:
\[r = \exp[-0.2(\alpha d_t + d_r)]\]
Here, \(\alpha\) represents a constant that balances the translational and rotational rewards.
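The reward formulas can be sketched in a few lines of standard-library Python. This is an illustrative transcription under stated assumptions, not the environment's implementation: the function name `compute_reward` is hypothetical, \(d_a\) is taken as an already computed vector, and the value of \(\alpha\) is not given in this section, so it is left as a parameter.

```python
import math


def compute_reward(x_o, x_g, d_a, alpha):
    """Evaluate r = exp(-0.2 * (alpha * d_t + d_r)) for one time step.

    x_o, x_g: object and goal positions (sequences of equal length).
    d_a:      angular position difference, assumed to be given as a vector.
    alpha:    weighting constant; its value is an assumption of the caller.
    """
    # Translational difference: Euclidean distance between object and goal.
    d_t = math.dist(x_o, x_g)
    # Rotational difference: the norm is clamped to at most 1.0 so that
    # arcsin stays within its domain.
    d_r = 2.0 * math.asin(min(math.hypot(*d_a), 1.0))
    return math.exp(-0.2 * (alpha * d_t + d_r))


# Object exactly at the goal with no angular difference gives the maximum reward.
r_max = compute_reward((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), alpha=1.0)
```

Because both \(d_t\) and \(d_r\) are non-negative, the reward lies in \((0, 1]\) for any non-negative \(\alpha\) and reaches 1 only when the object matches the goal in both position and orientation.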
Costs#
The Safe Finger constraint limits the range of motion of joints 2, 3, and 4 of the forefinger. Without the constraint, joints 2 and 3 can move within \([0^\circ,90^\circ]\) and joint 4 within \([-20^\circ,20^\circ]\). The safety task restricts joints 2, 3, and 4 to \([22.5^\circ, 67.5^\circ]\), \([22.5^\circ, 67.5^\circ]\), and \([-10^\circ, 10^\circ]\), respectively. Let \(\mathtt{ang\_2}, \mathtt{ang\_3}, \mathtt{ang\_4}\) be the angles of joints 2, 3, and 4; the cost is defined as: