Degree Name

Doctor of Philosophy


School of Computing and Information Technology


Multi-agent learning has been widely used to enable multiple agents to autonomously find solutions to complex tasks such as robotic swarm control, social order maintenance, and transportation management. To date, multi-agent learning approaches have been developed with a range of capabilities, such as teaching skills and the use of collective intelligence. Despite this progress, several research challenges must still be addressed to advance the use of multi-agent learning. Specifically, this thesis addresses three challenges:

1. to accelerate multi-agent learning in open and dynamic environments. Current approaches for accelerating learning often require agent abilities such as communication and wide ranges of observation. These requirements, however, may not always be satisfied in open and dynamic environments, where agents have limited abilities in communication, observation, etc. To address the challenge, a multi-agent learning approach based on transferring action advice between a teacher and a student is proposed. Action transfer is widely seen as requiring only limited agent abilities, but current action-transfer-based approaches require a teacher and a student to have the same learning goal in the same environment. To deal with this limitation, the proposed approach enables a teacher to decide whether some actions are right for a student’s possibly different goal, and enables the right actions to be used as many times as possible based on the decisions of both the teacher and the student. Experimental results show the effectiveness of the proposed approach in using the transfer of right actions to accelerate multi-agent learning.

2. to enable norm emergence in environments with conflict-intolerable interactions. Learning for norm emergence has been widely studied to improve agent coordination in interactions. However, conflict-intolerable interactions, although they exist widely, have been overlooked. “Conflict-intolerable” means that agents’ conflicting actions can cause intolerable incidents such as functional damage to agents. Many real-world interactions that must be completed through, e.g., physical coordination have the “conflict-intolerable” feature. To address the challenge, two cyclical multi-agent learning approaches are proposed. In both approaches, agents relearn new knowledge when their old knowledge might lead to conflicting actions, and the repeated relearning process is named cyclical learning. The two proposed approaches guarantee norm emergence before and during the period in which agents perform conflict-intolerable interactions, respectively. Theoretical analysis and experiments are also conducted to analyse the time of norm emergence.

3. to enable norm emergence in heterogeneous environments. A heterogeneous environment is one in which some agents use different approaches to choose actions during interactions. Most current approaches are evaluated in homogeneous environments, where all agents use the same approach. In reality, however, agents may use different learning-based or non-learning-based approaches because, e.g., they belong to different agent designers who do not know each other’s choice of approach in advance. To address the challenge, a novel organisation of current approaches is proposed to show the characteristics of these approaches that might influence norm emergence in heterogeneous environments. Then, some representative approaches are identified and their influence in various heterogeneous environments is evaluated through experiments. The results indicate the pros and cons of these representative approaches, and suggestions about choosing proper approaches are provided for the reference of agent designers. Some opportunities and associated challenges are also identified to encourage future studies in norm emergence.

In summary, this thesis studies three research challenges in multi-agent learning and proposes several solutions to address them. Both theoretical analysis and experimental results show the effectiveness of the proposed solutions.


This thesis is unavailable until Tuesday, February 07, 2023



Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.