Master of Engineering
School of Electrical, Computer and Telecommunications Engineering
Liu, Yiwei, Autonomous blimp control with reinforcement learning, Master of Engineering thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2009. http://ro.uow.edu.au/theses/3116
Blimps are airships without a rigid internal structure. Most existing blimps are operated manually, either directly by a pilot or through radio control. One of the most famous examples is the Goodyear Blimp, used for commercial advertising. With the rapid development of microcontroller and electronic technologies, autonomous blimps have recently attracted great research interest as a platform for reaching dangerous or difficult-to-access environments in applications such as disaster exploration and rescue, security surveillance at public events, and climate monitoring.
This thesis investigates the problem of learning an optimal control policy for autonomous blimp navigation in a rescue task, and presents a new approach to navigation control of an autonomous blimp using an intelligent reinforcement learning algorithm. Compared with traditional model-based control methods, this control strategy does not require a dynamic model of the blimp, which provides a significant advantage in many practical situations where the blimp system model is either hard to acquire or too complicated to apply.
The blimp in this research is used as a prototype for the "UAV Outback Challenge" organized by the Australian Research Centre for Aerospace Automation (ARCAA). The Challenge requires the UAV to fly autonomously to a designated area and rescue a dummy named Jack. The objective of this research is to develop a control system that can autonomously adjust the blimp's heading toward the rescue target. Because the blimp acquires a range of piloting skills through learning and reinforcement during actual navigation trials, it can automatically account for environmental changes during navigation tasks.
The basic hardware structure and devices of the blimp control system were developed in preliminary form. The developed controller does not require a dynamic model of the blimp, yet it adapts to changes in the surrounding environment. Simulation data generated with the Webots Robotics Simulator (WRS) demonstrate satisfactory results for planar steering motion control. Matlab was used to analyse the simulation data produced by WRS.
Within the simulation environment, the blimp using the Q-learning method was successfully tested in single-target and continuous-target tasks subject to various environmental disturbances. Different learning parameters and initial conditions were also tested to find better solutions for autonomous blimp steering. Reinforcement learning is shown in this research to be a promising and effective solution for blimp autonomous navigation tasks.
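For readers unfamiliar with Q-learning, the following is a minimal sketch of the tabular form of the algorithm applied to a toy heading-alignment task. It is not the controller developed in the thesis. The state discretisation (12 heading-error bins), the three turn actions, the reward values, the learning parameters, and all function names are assumptions chosen for illustration, and the `step` function is a simple kinematic stand-in rather than the Webots simulation.

```python
import random

N_BINS = 12            # heading error split into 30-degree bins (assumed)
ACTIONS = (-1, 0, 1)   # turn one bin left, hold, turn one bin right (assumed)
GOAL = 0               # bin 0: the blimp points at the target


def step(state, action):
    """Toy stand-in for the simulator: rotate by one bin, no dynamics."""
    next_state = (state + action) % N_BINS
    reward = 1.0 if next_state == GOAL else -0.1   # assumed reward shaping
    return next_state, reward, next_state == GOAL


def train(episodes=2000, alpha=0.2, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_BINS)]
    for _ in range(episodes):
        state = rng.randrange(1, N_BINS)           # start off-target
        for _ in range(50):                        # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))    # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: bootstrap from the best next action
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_action(q, state):
    """Return the turn command with the highest learned value."""
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[state][i])]


if __name__ == "__main__":
    q_table = train()
    for s in range(1, N_BINS):
        print(s, greedy_action(q_table, s))
```

No dynamic model appears anywhere in the update: the table is adjusted only from observed transitions and rewards, which is the model-free property the abstract emphasises. Changing `alpha`, `gamma`, `epsilon`, or the starting states is the toy analogue of varying the learning parameters and initial conditions.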