Optimization of building demand flexibility using reinforcement learning and rule-based expert systems
The increasing use of renewable energy in buildings requires optimization of building demand flexibility to reduce energy costs and carbon emissions. However, this optimization is challenging because it must account for intermittent on-site energy supply, dynamic building energy demand, and proper utilization of energy storage systems. Leveraging the growing availability of operational data in buildings, data-driven strategies such as reinforcement learning (RL) have emerged as effective approaches to optimizing building demand flexibility. Yet training a reliable RL agent is data-demanding and time-consuming, which limits its practical applicability. This study proposes a new strategy that integrates a rule-based expert system (RBES) and RL agents into the decision-making process to jointly reduce building energy costs, minimize the peak-to-average ratio (PAR) of grid power, and maximize photovoltaic (PV) self-consumption. In this strategy, the RBES determines system operation directly in less complex decision-making scenarios, while in more intricate decision-making environments it provides a reference decision from which the RL agent explores further for optimal solutions. This integration allows RL agents to avoid unnecessary exploration and significantly improves learning efficiency. The proposed strategy was tested using PV generation data and energy consumption data from a low-energy office building. The results demonstrated an 85.7% improvement in RL learning efficiency and showed that the strategy can avoid sub-optimal convergence during policy learning. Compared to relying solely on the RBES, the proposed strategy led to 5.4% and 19.2% reductions in electricity costs and in the daily PAR of grid power at peak hours, respectively. The strategy also achieved a PV self-consumption ratio of 62.4%, which is only 0.4% lower than the optimal value determined by the RBES strategy that prioritized maximizing PV self-consumption. Additionally, compared with a model predictive control method developed for cost reduction, the strategy achieved similar cost savings while significantly reducing the decision time.
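The two performance metrics named above can be illustrated with a minimal Python sketch. The abstract does not give the formulas used in the study, so the definitions below are common textbook forms and are assumptions: PAR is taken as the maximum grid power divided by the mean grid power over a window, and the PV self-consumption ratio is taken as the share of PV generation consumed on site, ignoring energy storage. The function names and sample values are hypothetical.

```python
def peak_to_average_ratio(grid_power):
    """PAR of grid power: peak value divided by the mean over the window.

    Assumed definition; the study may restrict the window to peak hours.
    """
    if not grid_power:
        raise ValueError("grid_power must not be empty")
    mean_power = sum(grid_power) / len(grid_power)
    if mean_power == 0:
        raise ValueError("mean grid power is zero; PAR is undefined")
    return max(grid_power) / mean_power


def pv_self_consumption_ratio(pv_generation, load):
    """Share of PV generation used on site in the same time step.

    Assumed simplification: no battery, so the PV used on site in each
    step is min(generation, load). Both series share the same time step.
    """
    if len(pv_generation) != len(load):
        raise ValueError("series must have the same length")
    total_pv = sum(pv_generation)
    if total_pv == 0:
        raise ValueError("no PV generation; ratio is undefined")
    used_on_site = sum(min(p, l) for p, l in zip(pv_generation, load))
    return used_on_site / total_pv


# Hypothetical hourly values in kW, for illustration only.
grid = [1.0, 2.0, 3.0, 2.0]
pv = [0.0, 4.0, 2.0, 0.0]
demand = [1.0, 2.0, 3.0, 1.0]

par = peak_to_average_ratio(grid)
scr = pv_self_consumption_ratio(pv, demand)
```

With storage present, PV energy charged into the battery would normally also count as self-consumed, so the second function understates the ratio for a system like the one studied.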
Open Access Status
This publication may be available as open access