On Sampling Time Maximization in Wireless Powered Internet of Things
Sensing devices operating in the upcoming Internet of Things (IoT) are likely to rely on the radio frequency (RF) transmissions of a hybrid access point (HAP) for energy. The HAP is also responsible for setting the sampling or monitoring time of these devices according to their harvested energy. This task, however, becomes challenging when the HAP has imprecise knowledge of the channel gain to each device: the HAP then does not know exactly how much energy each device has harvested, and may program a device with an incorrect sampling time. To address this problem, we employ stochastic programming to jointly determine the charging time and the sampling time of each device, with the objective of maximizing the minimum sampling time across devices. The formulated stochastic program, however, requires a model or the probability distribution of channel gains. We therefore also propose a reinforcement learning (RL) approach that solves the same problem without this knowledge. As the state space contains continuous quantities, we use linear function approximation together with a set of novel features to represent the large state space. Our experimental results show that the RL approach achieves 93% of the minimum sampling time computed by the stochastic program.
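To make the RL idea concrete, the sketch below trains a Q-learner with linear function approximation to pick the HAP's charging-time fraction so as to maximize the minimum sampling time across devices. This is a minimal illustration under assumed toy values: the harvesting model, device parameters, feature set, and noise model are all illustrative assumptions, not the paper's actual system model or novel features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy parameters (assumptions, not from the paper)
N_DEV = 3                              # number of sensing devices
T = 1.0                                # frame length (s)
P_HAP = 3.0                            # HAP transmit power (W)
ETA = 0.6                              # energy-harvesting efficiency
P_SENSE = np.array([0.5, 0.8, 1.0])    # per-device sensing power (W)
ACTIONS = np.linspace(0.1, 0.9, 9)     # candidate charging-time fractions

def sample_gains():
    # Exponentially distributed channel gains (Rayleigh fading power)
    return rng.exponential(0.2, size=N_DEV)

def reward(tau, g):
    # Energy harvested by each device during the charging phase tau*T
    e = ETA * P_HAP * g * tau * T
    # A device samples until its energy or the frame remainder runs out
    s = np.minimum(e / P_SENSE, (1.0 - tau) * T)
    return s.min()                     # max-min sampling-time objective

def features(g_est, a_idx):
    # Simple linear features: bias, estimated gains, one-hot action
    phi = np.zeros(1 + N_DEV + len(ACTIONS))
    phi[0] = 1.0
    phi[1:1 + N_DEV] = g_est
    phi[1 + N_DEV + a_idx] = 1.0
    return phi

# Q-learning with linear function approximation; each frame is a
# one-step episode, so the TD(0) target is just the observed reward.
w = np.zeros(1 + N_DEV + len(ACTIONS))
alpha, eps = 0.05, 0.2
for episode in range(5000):
    g = sample_gains()
    # The HAP only observes noisy channel estimates (imprecise CSI)
    g_est = g * rng.normal(1.0, 0.1, size=N_DEV)
    if rng.random() < eps:             # epsilon-greedy exploration
        a = rng.integers(len(ACTIONS))
    else:
        a = int(np.argmax([w @ features(g_est, i)
                           for i in range(len(ACTIONS))]))
    r = reward(ACTIONS[a], g)
    phi = features(g_est, a)
    w += alpha * (r - w @ phi) * phi   # gradient TD update

# Greedy charging-time choice for a nominal channel estimate
g_est = np.full(N_DEV, 0.2)
best = int(np.argmax([w @ features(g_est, i)
                      for i in range(len(ACTIONS))]))
print(f"learned charging fraction: {ACTIONS[best]:.2f}")
```

The one-hot action encoding keeps the example simple; the paper's feature construction for the continuous state space is more elaborate, and a practical version would also decay the exploration rate over time.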