Abstract (EN):
There are many open issues and challenges in the reinforcement learning field, such as handling high-dimensional environments. Function approximators, such as deep neural networks, have been successfully used in both single- and multi-agent environments with high-dimensional state spaces. The multi-agent learning paradigm faces even more problems, due to the effect of several agents learning simultaneously in the environment. One of its main concerns is how to learn mixed policies that prevent opponents from exploiting them in competitive environments, thus achieving a Nash equilibrium. We extend several algorithms able to achieve Nash equilibria in single-state games to the deep-learning paradigm. We compare their deep-learning and table-based implementations, and demonstrate how the Weighted Policy Learner (WPL) is able to achieve an equilibrium strategy in a complex environment, where agents must find each other in an infinite-state game and play a modified version of the Rock Paper Scissors game. © Springer International Publishing AG 2018.
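
For context, a minimal sketch of the tabular WPL update the abstract builds on may help. This is not the paper's deep-learning variant; it follows the published WPL rule (the per-action advantage Q(a) − V is damped by π(a) when negative and by 1 − π(a) when positive, which slows movement near the simplex boundary so the policy can spiral toward a mixed Nash equilibrium, e.g. the uniform strategy in Rock Paper Scissors). Function and variable names here are illustrative assumptions.

```python
import numpy as np

def wpl_update(policy, q_values, lr=0.01):
    """One tabular WPL step for a single-state game (illustrative sketch).

    policy   : current mixed strategy pi over actions (sums to 1)
    q_values : estimated payoff Q(a) of each action against the opponent
    """
    value = np.dot(policy, q_values)       # V = expected payoff under pi
    advantage = q_values - value           # gradient direction per action
    # WPL weighting: shrink positive steps near pi(a)=1 and
    # negative steps near pi(a)=0, keeping the policy mixed.
    weights = np.where(advantage > 0, 1.0 - policy, policy)
    policy = policy + lr * weights * advantage
    policy = np.clip(policy, 1e-6, None)   # stay a valid distribution
    return policy / policy.sum()
```

In self-play on Rock Paper Scissors, iterating this update with Q estimated from sampled payoffs would be expected to converge toward the uniform (1/3, 1/3, 1/3) equilibrium, which is the behavior the abstract reports for WPL in the more complex infinite-state setting.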
Language:
English
Type (Professor's evaluation):
Scientific