With the rapid development of next-generation Internet of Things (NG-IoT) networks, the growing number of connected devices has led to a surge in power consumption. This rise in energy demand strains resource availability and raises sustainability concerns for large-scale IoT deployments. Efficient energy utilization in communication networks, particularly for power-constrained IoT devices, has thus become a critical area of research. In this paper, we deploy flying LoRa gateways (GWs) mounted on unmanned aerial vehicles (UAVs) to collect data from LoRa end devices (EDs) and transmit it to a central server. Our primary objective is to maximize the global system energy efficiency (EE) of wireless LoRa networks by jointly optimizing the transmission power (TP), spreading factor (SF), bandwidth (W), and ED association. To solve this challenging problem, we model it as a partially observable Markov decision process (POMDP), in which each flying LoRa GW acts as a learning agent within a cooperative multi-agent reinforcement learning (MARL) framework under centralized training and decentralized execution (CTDE). Simulation results demonstrate that our proposed method, based on the multi-agent proximal policy optimization (MAPPO) algorithm, significantly improves the global system EE and outperforms conventional MARL schemes.
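To make the optimization target concrete, the following minimal sketch computes a simplified energy efficiency (bits per joule) from the three radio parameters named above. It is an illustration under stated assumptions, not the system model of this paper: it uses the standard nominal LoRa bit rate SF · (W / 2^SF) · CR, assumes a coding rate of 4/5, counts only transmit power as consumed power, and ignores path loss, interference, and collisions. The function names and the definition of global EE as total rate divided by total transmit power are hypothetical choices made for this example.

```python
from typing import Iterable, Tuple


def lora_bit_rate(sf: int, bandwidth_hz: float, coding_rate: float = 4 / 5) -> float:
    """Nominal LoRa bit rate in bit/s: SF * (W / 2^SF) * CR."""
    return sf * (bandwidth_hz / 2 ** sf) * coding_rate


def dbm_to_watt(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30) / 10)


def energy_efficiency(sf: int, bandwidth_hz: float, tx_power_dbm: float) -> float:
    """Per-device EE in bit/J, assuming transmit power is the only power drawn."""
    return lora_bit_rate(sf, bandwidth_hz) / dbm_to_watt(tx_power_dbm)


def global_energy_efficiency(devices: Iterable[Tuple[int, float, float]]) -> float:
    """Assumed global EE: total bit rate divided by total transmit power.

    Each device is given as (sf, bandwidth_hz, tx_power_dbm).
    """
    devices = list(devices)
    total_rate = sum(lora_bit_rate(sf, bw) for sf, bw, _ in devices)
    total_power = sum(dbm_to_watt(p) for _, _, p in devices)
    return total_rate / total_power
```

Under these assumptions, a lower SF or a wider bandwidth raises the nominal rate, and a lower TP raises the EE. In a real deployment those same choices reduce link robustness and range, which is the trade-off that motivates optimizing the parameters jointly with ED association.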