Abstract:Vision Large Language Models (VLMs) combine visual understanding with natural language processing, enabling tasks like image captioning, visual question answering, and video analysis. While VLMs show impressive capabilities across domains such as autonomous vehicles, smart surveillance, and healthcare, their deployment on resource-constrained edge devices remains challenging due to processing power, memory, and energy limitations. This survey explores recent advancements in optimizing VLMs for edge environments, focusing on model compression techniques, including pruning, quantization, knowledge distillation, and specialized hardware solutions that enhance efficiency. We provide a detailed discussion of efficient training and fine-tuning methods, edge deployment challenges, and privacy considerations. Additionally, we discuss the diverse applications of lightweight VLMs across healthcare, environmental monitoring, and autonomous systems, illustrating their growing impact. By highlighting key design strategies, current challenges, and offering recommendations for future directions, this survey aims to inspire further research into the practical deployment of VLMs, ultimately making advanced AI accessible in resource-limited settings.
Abstract:Disaster recovery and management present significant challenges, particularly in unstable environments and hard-to-reach terrains. These difficulties can be overcome by employing unmanned aerial vehicles (UAVs) equipped with onboard embedded platforms and camera sensors. In this work, we address the critical need for accurate and timely disaster detection by enabling onboard aerial imagery processing and avoiding connectivity, privacy, and latency issues despite the challenges posed by limited onboard hardware resources. We propose a UAV-assisted edge framework for real-time disaster management, leveraging our proposed model optimized for real-time aerial image classification. The optimization of the model employs post-training quantization techniques. For real-world disaster scenarios, we introduce a novel dataset, DisasterEye, featuring UAV-captured disaster scenes as well as ground-level images taken by individuals on-site. Experimental results demonstrate the effectiveness of our model, achieving high accuracy with reduced inference latency and memory usage on resource-constrained devices. The framework's scalability and adaptability make it a robust solution for real-time disaster detection on resource-limited UAV platforms.
Abstract:Sixth-Generation (6G)-based Internet of Everything applications (e.g. autonomous driving cars) have witnessed a remarkable interest. Autonomous driving cars using federated learning (FL) has the ability to enable different smart services. Although FL implements distributed machine learning model training without the requirement to move the data of devices to a centralized server, it its own implementation challenges such as robustness, centralized server security, communication resources constraints, and privacy leakage due to the capability of a malicious aggregation server to infer sensitive information of end-devices. To address the aforementioned limitations, a dispersed federated learning (DFL) framework for autonomous driving cars is proposed to offer robust, communication resource-efficient, and privacy-aware learning. A mixed-integer non-linear (MINLP) optimization problem is formulated to jointly minimize the loss in federated learning model accuracy due to packet errors and transmission latency. Due to the NP-hard and non-convex nature of the formulated MINLP problem, we propose the Block Successive Upper-bound Minimization (BSUM) based solution. Furthermore, the performance comparison of the proposed scheme with three baseline schemes has been carried out. Extensive numerical results are provided to show the validity of the proposed BSUM-based scheme.