Lifelong reinforcement learning with temporal logic formulas and reward machines
The ability to continually learn new tasks by reusing high-level knowledge is a key capability of humans. In this paper, we propose lifelong reinforcement learning with sequential linear temporal logic formulas and reward machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks. To specify tasks more flexibly, we first introduce sequential linear temporal logic (SLTL), an extension of the existing linear temporal logic (LTL) formal language. We then employ reward machines (RMs) to exploit the reward structure of tasks encoded with high-level events, and propose automatic RM extension and efficient inter-task knowledge transfer for continual lifelong learning. Experimental results show that LSRM outperforms methods that learn each target task from scratch, by exploiting task decomposition via SLTL and knowledge transfer over RMs throughout the lifelong learning process.
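To make the reward-machine concept concrete, the following is a minimal sketch (not from the paper; all names and the example task are hypothetical) of an RM as a finite-state machine over high-level events whose transitions emit rewards:

```python
# Hypothetical minimal reward machine: a finite-state machine over
# high-level events whose transitions emit scalar rewards.
class RewardMachine:
    def __init__(self, transitions, initial, terminal):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.terminal = terminal
        self.state = initial

    def step(self, event):
        # Events with no listed transition leave the state
        # unchanged and yield zero reward.
        if (self.state, event) in self.transitions:
            self.state, reward = self.transitions[(self.state, event)]
            return reward
        return 0.0

    def done(self):
        return self.state in self.terminal

# Example task "first observe 'key', then 'door'" as a two-step RM:
rm = RewardMachine(
    transitions={("u0", "key"): ("u1", 0.0),
                 ("u1", "door"): ("u2", 1.0)},
    initial="u0",
    terminal={"u2"},
)
```

During learning, the environment's labeling function maps raw observations to events such as `"key"`; the agent receives the RM's reward and can condition its policy on the RM state, which is what exposes the task's structure for transfer.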
Funding: National Natural Science Foundation of China