Abstract (EN):
Compared with their single-agent counterparts, multi-agent systems pose an additional set of challenges for reinforcement learning algorithms, including increased complexity, non-stationary environments, credit assignment, partial observability, and the need for coordination. Deep reinforcement learning has been shown to achieve successful policies through implicit coordination, but it does not handle partial observability. This paper describes a deep reinforcement learning algorithm, based on multi-agent actor-critic, that simultaneously learns an action policy for each agent and a communication protocol that compensates for partial observability and helps enforce coordination. We also study the effects of noisy communication, where messages can be late, lost, noisy, or jumbled, and how this affects the learned policies. We show that agents are able to learn both high-level policies and complex communication protocols in several partially observable environments, and that our proposal outperforms other state-of-the-art algorithms that do not take advantage of communication, even over noisy communication channels. © 2019 IEEE.
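The abstract combines two mechanisms: each agent's actor conditions on its local observation plus messages received from the other agents, and those messages travel over an unreliable channel that can lose, delay, perturb, or jumble them. The sketch below illustrates these two ideas only; it is not the paper's implementation, and all names and parameters (NoisyChannel, actor_step, the drop/delay/noise/jumble probabilities, the mean-pooling of the inbox) are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' code) of an actor that emits
# messages alongside actions, and of a channel with the four failure modes
# named in the abstract: late, lost, noisy, and jumbled messages.
import random
from collections import deque

import numpy as np


class NoisyChannel:
    """Delivers fixed-size message vectors with loss, delay, noise, and jumbling."""

    def __init__(self, msg_dim, drop_p=0.1, delay_p=0.1, noise_std=0.05, jumble_p=0.05):
        self.msg_dim = msg_dim
        self.drop_p = drop_p        # probability a message is lost outright
        self.delay_p = delay_p      # probability delivery slips to the next step
        self.noise_std = noise_std  # std of additive Gaussian corruption
        self.jumble_p = jumble_p    # probability the vector's entries are shuffled
        self.delayed = deque()      # messages held back until the next step

    def send(self, messages):
        """Take this step's outgoing messages; return what actually arrives."""
        delivered = list(self.delayed)  # late messages from the previous step
        self.delayed.clear()
        for msg in messages:
            if random.random() < self.drop_p:
                continue  # lost
            msg = msg + np.random.normal(0.0, self.noise_std, self.msg_dim)  # noisy
            if random.random() < self.jumble_p:
                msg = np.random.permutation(msg)  # jumbled
            if random.random() < self.delay_p:
                self.delayed.append(msg)  # late: arrives next step
            else:
                delivered.append(msg)
        return delivered


def actor_step(obs, inbox, w_act, w_msg, msg_dim):
    """Toy linear actor: produce action logits and an outgoing message from
    the local observation and the (possibly empty) set of received messages."""
    pooled = np.mean(inbox, axis=0) if inbox else np.zeros(msg_dim)
    features = np.concatenate([obs, pooled])
    logits = w_act @ features             # action head
    message = np.tanh(w_msg @ features)   # communication head
    return logits, message


# Usage: three agents exchanging messages over the unreliable channel.
rng = np.random.default_rng(0)
obs_dim, msg_dim, n_actions, n_agents = 4, 3, 2, 3
w_act = rng.normal(size=(n_actions, obs_dim + msg_dim))
w_msg = rng.normal(size=(msg_dim, obs_dim + msg_dim))
channel = NoisyChannel(msg_dim)
inbox = []
for t in range(5):
    outgoing = []
    for _ in range(n_agents):
        obs = rng.normal(size=obs_dim)
        logits, msg = actor_step(obs, inbox, w_act, w_msg, msg_dim)
        outgoing.append(msg)
    inbox = channel.send(outgoing)  # next step's agents see a degraded inbox
```

In this sketch the weights are shared across agents and the inbox is mean-pooled so the policy is indifferent to message order; both are common simplifications in communication-learning setups, chosen here only to keep the example short.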
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
8