Apprenticeship Learning for Continuous State Spaces and Actions in a Swarm-Guidance Shepherding Task

Nguyen, H.T. and Garratt, M. and Bui, L.T. and Abbass, H. (2019) Apprenticeship Learning for Continuous State Spaces and Actions in a Swarm-Guidance Shepherding Task. In: 2019 IEEE Symposium Series on Computational Intelligence, SSCI 2019, 6 December 2019 through 9 December 2019.

Full text not available from this repository.

Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

Apprenticeship learning (AL) is a learning scheme that uses demonstrations collected from human operators. Apprenticeship learning via inverse reinforcement learning (AL via IRL) has been used as one of the primary candidate approaches for obtaining a near-optimal policy that is as good as the human policy. The algorithm works by attempting to recover and approximate the human reward function from the demonstrations. This approach helps overcome limitations such as sensitivity to variance in the quality of human data and short-sighted decision making that does not consider future states. However, handling continuous action and state spaces remains challenging for the AL via IRL algorithm. In this paper, we propose a new AL via IRL approach that is able to work with continuous action and state spaces. Our approach is used to train an artificial intelligence (AI) agent acting as a shepherd of artificial sheep-inspired swarm agents in a complex and dynamic environment. The results show that the performance of our approach is as good as that of the human operator and, in particular, that the agent's movements seem to be smoother and more effective. © 2019 IEEE.

Item Type:	Conference or Workshop Item (Paper)
Divisions:	Faculties > Faculty of Information Technology
Identification Number:	10.1109/SSCI44817.2019.9002756
Uncontrolled Keywords:	Apprentices; Inverse problems; Reinforcement learning; Apprenticeship learning; Continuous actions; Continuous State Space; Dynamic environments; Inverse reinforcement learning; Learning schemes; Near-optimal policies; shepherding; Swarm intelligence
Additional Information:	Conference code: 157933. Language of original document: English.
URI:	http://eprints.lqdtu.edu.vn/id/eprint/9211
