Doctor of Philosophy
School of Information Technology and Computer Science
Ward, Koren, Acquiring mobile robot behaviours by learning from experiences, Doctor of Philosophy thesis, School of Information Technology and Computer Science, University of Wollongong, 2002. http://ro.uow.edu.au/theses/2002
This thesis addresses the problem of providing mobile robots with autonomous adaptive control so that behaviours can be acquired automatically in unknown and unstructured environments. As the range of applications for autonomous robots widens and the exploration of hazardous and extraterrestrial environments becomes necessary, the development of robots that can learn from experience and adapt to environmental conditions becomes increasingly important. Furthermore, the high cost of custom-engineering controllers for specific robotic applications suggests that robots will need to adjust automatically to diverse circumstances before they can become commercially viable for most domestic and field service applications.
A control architecture based on learning trajectory velocities is presented. This architecture enables a mobile robot equipped with ultrasonic sensors to quickly acquire multiple behaviours such as obstacle avoidance, wall and corridor following, dead-end escape and goal seeking. These behaviours are acquired autonomously, without human supervision or specially prepared environments. This robot learning approach is considerably faster than previous unassisted robot learning approaches such as reinforcement learning or classifier methods because it does not suffer from the credit assignment problem and does not need to perform fitness evaluations. There is also no need to devise reinforcement signals or performance measures to guide the learning process. In addition, the acquired behaviours provide appropriate velocity control of the robot and are adjustable in that object clearance distances can be controlled. Experiments show that, through continued learning, the robot is also able to adapt quickly to changed environments and to recover from partially damaged sensors.
Although previous research has shown that multiple behaviours can be learnt via reinforcement learning, these methods require each desired behaviour to be engaged regularly so that it can be learnt on its own separate associative map. This not only results in slow learning but also requires different reinforcement signals and behaviour switching mechanisms to be carefully devised and implemented. Instead, it is shown that by learning to perceive the world in terms of trajectory velocities, the robot acquires multiple behaviours quickly and simultaneously without needing to perform all of the desired behaviours. It is also shown that the control mapping used to learn trajectory velocities can be optimised to suit the robot's sensors and environmental conditions by using a genetic algorithm to evolve the fuzzy associative maps that map sensor readings to trajectory velocities. Thus, by using the presented learning method, the mobile robot can learn multiple adjustable behaviours quickly and simultaneously and can adapt to both changed sensors and changed environments.
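As an illustration only, the idea of a fuzzy associative map from a sensor reading to a trajectory velocity can be sketched as follows. This is a minimal sketch under stated assumptions, not the controller developed in the thesis: the single range input, the three triangular fuzzy sets, the rule velocities and the names `triangular`, `RANGE_SETS`, `RULE_VELOCITY` and `trajectory_velocity` are all hypothetical. The sketch does not implement learning or the genetic algorithm; in an evolved map, parameters such as the set boundaries would be the quantities being tuned.

```python
def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over an ultrasonic range reading, in metres.
RANGE_SETS = {
    "near": (-1.0, 0.0, 1.0),
    "medium": (0.0, 1.0, 2.0),
    "far": (1.0, 2.0, 3.0),
}

# Hypothetical rule outputs: the velocity (m/s) associated with each set.
RULE_VELOCITY = {"near": 0.0, "medium": 0.2, "far": 0.5}


def trajectory_velocity(range_m):
    """Map one range reading to a velocity by weighted-average defuzzification."""
    # Clip to the interval covered by the fuzzy sets so the weights never sum to zero.
    clipped = min(max(range_m, 0.0), 2.0)
    weights = {name: triangular(clipped, *abc) for name, abc in RANGE_SETS.items()}
    total = sum(weights.values())
    return sum(weights[name] * RULE_VELOCITY[name] for name in weights) / total


if __name__ == "__main__":
    for reading in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(reading, trajectory_velocity(reading))
```

With these assumed sets, a reading that falls between two peaks activates both neighbouring rules, so the output velocity varies smoothly with distance rather than switching abruptly.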