Supplementary Figures and Illustrations
Supplementary Video Results
When the robot correctly moves toward the cabinet storing the hotdogs, the first potential peak appears around 25 seconds. During the subsequent attempt to open the door, the arm is oriented incorrectly, leading to the first significant drop in potential and a trough around 52 seconds. When the robot finds the correct way to open the door and reaches toward the hotdogs, a second potential peak appears around 74 seconds. It then successfully picks up the first hotdog, producing another peak around 130 seconds, and picks up the second hotdog around 160 seconds, producing a further peak. Afterward, it moves correctly toward the microwave, with the potential remaining at a high level. While the robot places the hotdogs, there is a brief drop in potential, reflecting intermediate adjustments and attempts. When the robot successfully places both hotdogs into the microwave and correctly initiates the door-closing action, another potential peak appears around 400 seconds. After further attempts, it successfully turns on the switch around 417 seconds, reaching the final potential peak. The task is then completed, and the robot retracts its arms, causing the potential to decrease accordingly.
When the robot correctly moves toward the cabinet storing the hotdogs, the first potential peak appears around 10 seconds. During the subsequent attempt to open the door, the arm is oriented incorrectly, leading to a trough in potential around 32 seconds. When the robot finds the correct way to open the door and reaches toward the hotdog, a second potential peak appears around 73 seconds, and the potential remains high during the proper door-opening process. However, when the robot attempts to grasp the hotdog, the gripper is positioned incorrectly, causing the potential to drop sharply. It then fails to pick up the hotdog, and throughout the repeated unsuccessful attempts, the potential remains at a low level.
In the initial stage, the robot scans its surroundings, and the potential remains low. After locating the trash bin, it approaches and picks it up, reaching the first potential peak around 29 seconds. After a brief adjustment, it successfully stabilizes the grasp on the trash bin, leading to a second potential peak around 41 seconds. It then correctly moves toward the soda cans, with the potential remaining at a high level. During the three instances of picking up the soda cans, the potential reaches smaller peaks around 67 seconds, 86 seconds, and 106 seconds, respectively. After completing the task, the robot places the trash bin on the ground. Because the BEHAVIOR-1K evaluation protocol does not include placing the trash bin on the ground as part of the success criteria, the potential decreases during this stage.
In the initial stage, the robot scans its surroundings, and the potential remains low. After locating the trash bin, it moves toward it, leading to a peak in potential. Around 43 seconds, it picks up the trash bin, reaching another peak in potential. It then correctly approaches the soda cans, with the potential remaining at a high level. During the two instances of picking up soda cans, the potential reaches smaller peaks around 78 seconds and 100 seconds, respectively. However, the robot overlooks the third can. Around 107 seconds, it places the trash bin on the ground, resulting in a significant drop in potential, which then remains low until the episode terminates due to the time limit.
In the initial stage, the robot scans its surroundings, and the potential remains low. When it discovers and picks up the spray bottle, the potential reaches a peak around 19 seconds. It then correctly carries the spray bottle toward the first target tree, during which the potential remains high. Because the BEHAVIOR-1K simulator’s success criterion for this task does not include circling around while watering the tree, and instead considers the task successful at the first moment the tree is successfully watered, the potential remains at a relatively low level during the watering process. Therefore, the peak ends when the spray bottle first makes contact with the tree. After completing the first watering task, the robot successfully turns off the spray bottle, leading to a potential peak around 105 seconds. When the robot begins moving correctly toward the second tree, the potential rises again and remains high. Similarly, around 187 seconds, it reaches another potential peak at the moment it begins watering the tree, and the task is successfully completed. Afterward, it continues circling around while watering, and the potential begins to decrease.
In the initial stage, the robot scans its surroundings, and the potential remains low. When it discovers and picks up the spray bottle, the potential reaches a peak around 27 seconds. However, the spray bottle suddenly slips from its grasp, causing the potential to drop sharply starting at 28 seconds. During the subsequent attempts to pick it up again, there is a brief rise in potential around 86 seconds, but the attempt fails and the potential decreases further. After that, the robot tries to water the tree without carrying the spray bottle, so the potential remains at a low level throughout.
In the initial stage, the robot scans its surroundings, and the potential fluctuates at a low level. After locating the radio, it approaches it, leading to a peak in potential around 14 seconds. After several attempts, it successfully picks up the radio around 27 seconds, and the potential rises rapidly. It then smoothly adjusts the position of the radio and begins trying to turn on the switch, with the potential remaining at a high level. Around 69 seconds, it successfully turns on the radio, reaching a peak in potential and successfully completing the task. The subsequent process of putting the radio back is not part of the BEHAVIOR-1K simulator’s success criteria for this task, so the potential correspondingly decreases.
In the initial stage, the robot scans its surroundings, and the potential fluctuates at a low level. After locating the radio, it moves toward it, leading to a peak in potential around 21 seconds. However, starting at around 48 seconds, the subsequent attempts begin to deviate from a reasonable execution strategy, and the potential drops sharply. The robot then makes multiple low-quality attempts, during which the potential remains consistently low. At 142 seconds, it knocks over the radio, reaching a trough in potential. After that, it continues making low-quality attempts, and the potential stays at a low level.
In the initial stage, the robot scans its surroundings to locate and approach the washing machine, while the potential remains at a relatively high level. Around 35 seconds, it successfully opens the washing machine door, reaching a peak in potential. After adjusting its direction, it detects the baseball caps, leading to another potential peak around 52 seconds. After several attempts, it successfully picks up the two baseball caps around 98 seconds and 132 seconds, respectively, producing two additional potential peaks. It then places the two baseball caps into the washing machine in succession, resulting in potential peaks around 142 seconds and 161 seconds. At 188 seconds, it successfully closes the washing machine door, reaching another potential peak. At 207 seconds, it successfully turns on the washing machine, reaching yet another potential peak. Afterward, it retracts its arms, and the potential correspondingly decreases.
In the initial stage, the robot scans its surroundings to locate and approach the washing machine, while the potential remains at a relatively high level. Around 31 seconds, it successfully opens the washing machine door, reaching a peak in potential. After several attempts, it successfully picks up two baseball caps around 68 seconds and 100 seconds, respectively, producing two additional potential peaks. It then places the two baseball caps into the washing machine in succession, leading to potential peaks around 118 seconds and 132 seconds. At 226 seconds, it successfully closes the washing machine door, reaching another potential peak. However, afterward the robot forgets to turn on the washing machine and instead continues searching for the baseball caps, causing the potential to drop rapidly and remain at a low level.