2013 IEEE. We consider a Hybrid Access Point (HAP) that charges one or more energy harvesting devices via Radio Frequency (RF). These devices then transmit their data to the HAP. To date, prior works assume that devices use Time Division Multiple Access (TDMA) for channel access and that they are able to transmit using any amount of harvested energy. By contrast, we consider Dynamic Framed Slotted Aloha (DFSA), where devices can transmit only if they have sufficient energy. Moreover, nodes are not aware of each other's energy level, meaning neither the HAP nor the devices know how many devices are ready to transmit. In addition, we consider different non-linear energy conversion models. To this end, we propose a two-layer approach. At the first layer, the HAP adjusts its transmission power using a Sequential Monte Carlo (SMC) approach and selects the frame size according to the Softmax function. At the second layer, devices use another Softmax function to learn the time slot that yields the highest reward for a given frame size. Our results show that throughput is affected by the minimum energy required for each transmission, the temperature of the Softmax function, the transmission power used for charging devices, the channel gain, and the network density. Our results also indicate that our two-layer learning approach achieves at least 7%, 19%, and 40% higher throughput than TDMA, $\epsilon$-greedy, and Aloha, respectively.
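To illustrate the second layer, the following is a minimal sketch of Softmax (Boltzmann) slot selection on a device for a fixed frame size. It is not the authors' implementation: the function names, the incremental value update, the step size, and the reward definition (for example, 1 for a successful transmission and 0 otherwise) are assumptions made for illustration only.

```python
import math
import random

def softmax_probs(values, temperature):
    """Softmax probabilities over per-slot reward estimates.

    `values[i]` is the device's current estimate for slot i (assumed),
    and `temperature` controls exploration: larger means more uniform.
    """
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def choose_slot(values, temperature, rng=random):
    """Sample a slot index according to the Softmax probabilities."""
    probs = softmax_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]

def update_value(values, slot, reward, step=0.1):
    """Move the chosen slot's estimate toward the observed reward.

    The exponential-average update and `step` are illustrative assumptions.
    """
    values[slot] += step * (reward - values[slot])
```

A device would call `choose_slot` once per frame when it has enough energy to transmit, then call `update_value` with the observed outcome. A low temperature makes the device favor the slot with the highest estimate, while a high temperature spreads its choices across slots, which is consistent with the abstract's observation that the Softmax temperature affects throughput.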