Abstract (EN):
Reinforcement learning techniques make it possible to learn complex behaviors that handle a variety of situations in a matter of hours. This complexity is even more prominent in multi-agent continuous 3D environments. This paper compares the actions taken by two agents trained independently via self-play with the actions taken when both agents are controlled by the same policy. It also explores the emergence of competitive or collaborative behaviors in a natural game setting. A simulated 3D version of Dance Dance Revolution was implemented to test the acquisition of more specific abilities such as equilibrium, balance, and dexterity. The approach achieved very good results in learning a predefined sequence of buttons (7 arrows correctly pressed in 20M timesteps), revealing learning behavior similar to that of human beings: performance improved with training and was better on this kind of sequence than on random ones. The self-play approach also produced some interesting effects, with cooperative behaviors developing in theoretically competitive scenarios.
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
6