Abstract: LoRaWAN is a low-power, long-range protocol that enables reliable and robust communication. This paper addresses the challenge of predicting the spreading factor (SF) in LoRaWAN networks using machine learning (ML) techniques. Optimal SF allocation is crucial for efficient data transmission in IoT-enabled mobile devices, yet it remains challenging due to fluctuations in environmental and network conditions. We evaluated ML model performance on a large publicly available dataset to identify the best features among key LoRaWAN features: RSSI, SNR, frequency, the distance between end devices and gateways, and the antenna height of the end device. Furthermore, we experimented with all 31 possible combinations of these five features. We trained and evaluated models using the k-nearest neighbors (k-NN), decision tree classifier (DTC), random forest (RF), and multinomial logistic regression (MLR) algorithms. The combination of RSSI and SNR was identified as the best feature set. The findings of this paper provide valuable guidance for reducing the overall cost of dataset collection for ML model training and for extending the battery life of LoRaWAN devices. This work contributes to more reliable LoRaWAN systems by clarifying the importance of specific feature sets for optimized SF allocation.
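For illustration, the sketch below shows one way the exhaustive feature-subset search summarized above could be set up: all 31 non-empty subsets of the five features are enumerated and each is scored with a cross-validated classifier. The column names, file path, and the choice of a random forest with 5-fold cross-validation are assumptions made for this example, not details taken from the paper.

```python
# Minimal sketch of an exhaustive feature-subset search over the five LoRaWAN features.
# Column names ('rssi', 'snr', ...), the CSV path, and the classifier choice are
# illustrative assumptions, not the paper's actual dataset schema or method.
from itertools import combinations

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FEATURES = ["rssi", "snr", "frequency", "distance", "antenna_height"]

df = pd.read_csv("lorawan_dataset.csv")   # hypothetical file name
X_all, y = df[FEATURES], df["sf"]         # target: spreading factor class

results = {}
for k in range(1, len(FEATURES) + 1):
    for subset in combinations(FEATURES, k):   # 2^5 - 1 = 31 non-empty subsets
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        scores = cross_val_score(clf, X_all[list(subset)], y, cv=5)
        results[subset] = scores.mean()

# Rank feature combinations by mean cross-validated accuracy.
for subset, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{subset}: {acc:.3f}")
```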