Level 01
Navigation / Escape (proof of concept)
Train the bot with one feedback action per AI turn and guide it from the center to the escape.
Turn cadence: 0.25s movement + 0.75s wait. Controls: D reward · A punish · R reset
Status: running
Time: 0.0s
Turn: 0
Phase: wait
Score: 230
Best: 0
Last action: idle
Feedback token: spent
Training controls
The bot policy is intentionally hidden. You see behavior and outcomes, not internal action weights.
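For intuition only, here is one way a hidden policy with per-action weights could respond to reward and punish. Everything in it is an assumption: the action set, the `HiddenPolicy` class, and the multiplicative update factors are hypothetical and are not taken from the game, whose real policy is not shown.

```python
import random

# Hypothetical action set; the level text does not list the bot's actions.
ACTIONS = ["up", "down", "left", "right"]


class HiddenPolicy:
    """Samples actions in proportion to private weights (illustrative only)."""

    def __init__(self, seed: int = 0) -> None:
        self._weights = {action: 1.0 for action in ACTIONS}
        self._rng = random.Random(seed)
        self.last_action = "idle"  # matches the HUD's initial "Last action"

    def act(self) -> str:
        """Pick the next action with probability proportional to its weight."""
        actions = list(self._weights)
        weights = [self._weights[a] for a in actions]
        self.last_action = self._rng.choices(actions, weights=weights)[0]
        return self.last_action

    def feedback(self, kind: str) -> None:
        """Scale the weight of the last action; the factors are assumed values."""
        if self.last_action not in self._weights:
            return  # nothing to reinforce before the first move
        factor = 1.5 if kind == "reward" else 0.5
        self._weights[self.last_action] *= factor
```

In this sketch the player only ever observes `last_action` and the resulting outcome, which mirrors the level's framing of seeing behavior rather than weights.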
death
escape