Lifelong reinforcement learning with temporal logic formulas and reward machines

Publication Name

Knowledge-Based Systems

Abstract

Continuously learning new tasks using high-level ideas or knowledge is a key capability of humans. In this paper, we propose lifelong reinforcement learning with sequential linear temporal logic formulas and reward machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks. For a more flexible specification of tasks, we first introduce sequential linear temporal logic (SLTL), which is a supplement to the existing linear temporal logic (LTL) formal language. We then utilize reward machines (RMs) to exploit structural reward functions for tasks encoded with high-level events, and propose an automatic extension of RMs and efficient knowledge transfer over tasks for continuous lifelong learning. Experimental results show that LSRM outperforms methods that learn the target tasks from scratch by taking advantage of the task decomposition using SLTL and the knowledge transfer over RMs during the lifelong learning process.

Open Access Status

This publication may be available as open access

Volume

257

Article Number

109650

Funding Sponsor

National Natural Science Foundation of China

Link to publisher version (DOI)

http://dx.doi.org/10.1016/j.knosys.2022.109650