Abstract (EN):
Coverage Path Planning (CPP) is a common task in robotics that consists of computing collision-free paths passing through all specified points in an area of interest. The task is known to be NP-hard and becomes even more complex when the agent relies exclusively on sensor information. Reinforcement Learning (RL) methods are a promising way to handle this complexity and obtain efficient solutions. This paper presents an online CPP algorithm based on Tabular Temporal Difference Learning methods for a generic robotic platform equipped with a ranging sensor. The problem is formulated as a Partially Observable Markov Decision Process, and an RL scheme whose policy is modified by a heuristic method is proposed. The presented approach combines concepts from classical algorithms with RL, enabling the tabular algorithm to cope with the inherently large state space of CPP and accelerating training by optimizing and reducing the policy space. The proposed algorithm is tested in simulation and its performance is compared across different Temporal Difference Learning methods, showing that it can efficiently complete the task with no prior information, for different map sizes, starting positions, and a random number of obstacles.
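To make the idea concrete, below is a minimal sketch of tabular one-step Temporal Difference learning (Q-learning) with a heuristically restricted epsilon-greedy policy on a toy grid-coverage task. This is not the paper's implementation: the environment (GridCoverageEnv), the heuristic (heuristic_actions), the partial observation, the reward shaping, and all hyperparameters are illustrative assumptions invented for this example.

```python
# Minimal sketch (not the authors' implementation): tabular Q-learning
# with a heuristically restricted policy on a toy grid-coverage task.
# All names and hyperparameters here are illustrative assumptions.
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

class GridCoverageEnv:
    """Toy coverage environment: visit every free cell of a small grid."""
    def __init__(self, size=5, obstacles=frozenset()):
        self.size = size
        self.obstacles = obstacles
        self.reset()

    def reset(self):
        self.pos = (0, 0)
        self.visited = {self.pos}
        return self._obs()

    def _obs(self):
        # Partial observation: current cell plus which neighbors are blocked,
        # a stand-in for the ranging-sensor information in the paper.
        blocked = tuple(self._blocked((self.pos[0] + dr, self.pos[1] + dc))
                        for dr, dc in ACTIONS)
        return (self.pos, blocked)

    def _blocked(self, cell):
        r, c = cell
        return not (0 <= r < self.size and 0 <= c < self.size) or cell in self.obstacles

    def step(self, a):
        dr, dc = ACTIONS[a]
        nxt = (self.pos[0] + dr, self.pos[1] + dc)
        if self._blocked(nxt):
            return self._obs(), -1.0, False      # penalize collisions
        self.pos = nxt
        reward = 1.0 if nxt not in self.visited else -0.1  # reward new coverage
        self.visited.add(nxt)
        free = self.size * self.size - len(self.obstacles)
        done = len(self.visited) == free         # full coverage reached
        return self._obs(), reward, done

def heuristic_actions(env):
    # Heuristic policy restriction: never propose an action that drives
    # straight into a wall or obstacle, shrinking the effective policy space.
    return [a for a, (dr, dc) in enumerate(ACTIONS)
            if not env._blocked((env.pos[0] + dr, env.pos[1] + dc))]

def train(episodes=500, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    random.seed(seed)
    Q = defaultdict(float)                       # tabular values Q[(obs, a)]
    env = GridCoverageEnv()
    for _ in range(episodes):
        obs, done, steps = env.reset(), False, 0
        while not done and steps < 200:
            valid = heuristic_actions(env) or list(range(len(ACTIONS)))
            if random.random() < eps:            # epsilon-greedy over valid actions
                a = random.choice(valid)
            else:
                a = max(valid, key=lambda x: Q[(obs, x)])
            nxt, r, done = env.step(a)
            # One-step temporal-difference (Q-learning) backup
            best_next = max(Q[(nxt, x)] for x in range(len(ACTIONS)))
            Q[(obs, a)] += alpha * (r + gamma * best_next - Q[(obs, a)])
            obs, steps = nxt, steps + 1
    return Q

if __name__ == "__main__":
    Q = train()
    print(f"learned {len(Q)} tabular entries")
```

Restricting the epsilon-greedy choice to heuristic_actions mirrors the abstract's idea of mixing a classical heuristic into the RL policy: the table only has to rank actions the heuristic admits, which shrinks the effective policy space the agent must explore.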
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
6