Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mohsen Guizani

Real-Time Aerial Fire Detection on Resource-Constrained Devices Using Knowledge Distillation

Feb 28, 2025

Sabina Jangirova, Branislava Jankovic, Waseem Ullah, Latif U. Khan, Mohsen Guizani

Abstract:Wildfire catastrophes cause significant environmental degradation, human losses, and financial damage. To mitigate these severe impacts, early fire detection and warning systems are crucial. Current systems rely primarily on fixed CCTV cameras with a limited field of view, restricting their effectiveness in large outdoor environments. The fusion of intelligent fire detection with remote sensing improves coverage and mobility, enabling monitoring in remote and challenging areas. Existing approaches predominantly utilize convolutional neural networks and vision transformer models. While these architectures provide high accuracy in fire detection, their computational complexity limits real-time performance on edge devices such as UAVs. In our work, we present a lightweight fire detection model based on MobileViT-S, compressed through the distillation of knowledge from a stronger teacher model. The ablation study highlights the impact of a teacher model and the chosen distillation technique on the model's performance improvement. We generate activation map visualizations using Grad-CAM to confirm the model's ability to focus on relevant fire regions. The high accuracy and efficiency of the proposed model make it well-suited for deployment on satellites, UAVs, and IoT devices for effective fire detection. Experiments on common fire benchmarks demonstrate that our model suppresses the state-of-the-art model by 0.44%, 2.00% while maintaining a compact model size. Our model delivers the highest processing speed among existing works, achieving real-time performance on resource-constrained devices.

Via

Access Paper or Ask Questions

A Multi-Agent DRL-Based Framework for Optimal Resource Allocation and Twin Migration in the Multi-Tier Vehicular Metaverse

Feb 26, 2025

Nahom Abishu Hayla, A. Mohammed Seid, Aiman Erbad, Tilahun M. Getu, Ala Al-Fuqaha, Mohsen Guizani

Abstract:Although multi-tier vehicular Metaverse promises to transform vehicles into essential nodes -- within an interconnected digital ecosystem -- using efficient resource allocation and seamless vehicular twin (VT) migration, this can hardly be achieved by the existing techniques operating in a highly dynamic vehicular environment, since they can hardly balance multi-objective optimization problems such as latency reduction, resource utilization, and user experience (UX). To address these challenges, we introduce a novel multi-tier resource allocation and VT migration framework that integrates Graph Convolutional Networks (GCNs), a hierarchical Stackelberg game-based incentive mechanism, and Multi-Agent Deep Reinforcement Learning (MADRL). The GCN-based model captures both spatial and temporal dependencies within the vehicular network; the Stackelberg game-based incentive mechanism fosters cooperation between vehicles and infrastructure; and the MADRL algorithm jointly optimizes resource allocation and VT migration in real time. By modeling this dynamic and multi-tier vehicular Metaverse as a Markov Decision Process (MDP), we develop a MADRL-based algorithm dubbed the Multi-Objective Multi-Agent Deep Deterministic Policy Gradient (MO-MADDPG), which can effectively balances the various conflicting objectives. Extensive simulations validate the effectiveness of this algorithm that is demonstrated to enhance scalability, reliability, and efficiency while considerably improving latency, resource utilization, migration cost, and overall UX by 12.8%, 9.7%, 14.2%, and 16.1%, respectively.

* 15 pages, 16 figures

Via

Access Paper or Ask Questions

Rectifying Conformity Scores for Better Conditional Coverage

Feb 22, 2025

Vincent Plassier, Alexander Fishkov, Victor Dheur, Mohsen Guizani, Souhaib Ben Taieb, Maxim Panov, Eric Moulines

Abstract:We present a new method for generating confidence sets within the split conformal prediction framework. Our method performs a trainable transformation of any given conformity score to improve conditional coverage while ensuring exact marginal coverage. The transformation is based on an estimate of the conditional quantile of conformity scores. The resulting method is particularly beneficial for constructing adaptive confidence sets in multi-output problems where standard conformal quantile regression approaches have limited applicability. We develop a theoretical bound that captures the influence of the accuracy of the quantile estimate on the approximate conditional validity, unlike classical bounds for conformal prediction methods that only offer marginal coverage. We experimentally show that our method is highly adaptive to the local data structure and outperforms existing methods in terms of conditional coverage, improving the reliability of statistical inference in various applications.

Via

Access Paper or Ask Questions

Vision-Language Models for Edge Networks: A Comprehensive Survey

Feb 11, 2025

Ahmed Sharshar, Latif U. Khan, Waseem Ullah, Mohsen Guizani

Abstract:Vision Large Language Models (VLMs) combine visual understanding with natural language processing, enabling tasks like image captioning, visual question answering, and video analysis. While VLMs show impressive capabilities across domains such as autonomous vehicles, smart surveillance, and healthcare, their deployment on resource-constrained edge devices remains challenging due to processing power, memory, and energy limitations. This survey explores recent advancements in optimizing VLMs for edge environments, focusing on model compression techniques, including pruning, quantization, knowledge distillation, and specialized hardware solutions that enhance efficiency. We provide a detailed discussion of efficient training and fine-tuning methods, edge deployment challenges, and privacy considerations. Additionally, we discuss the diverse applications of lightweight VLMs across healthcare, environmental monitoring, and autonomous systems, illustrating their growing impact. By highlighting key design strategies, current challenges, and offering recommendations for future directions, this survey aims to inspire further research into the practical deployment of VLMs, ultimately making advanced AI accessible in resource-limited settings.

Via

Access Paper or Ask Questions

Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Feb 06, 2025

Mohammed Aledhari, Rehma Razzak, Mohamed Rahouti, Abbas Yazdinejad, Reza M. Parizi, Basheer Qolomany, Mohsen Guizani, Junaid Qadir, Ala Al-Fuqaha

Figure 1 for Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Figure 2 for Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Figure 3 for Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Figure 4 for Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Abstract:The advancements in autonomous driving technology, coupled with the growing interest from automotive manufacturers and tech companies, suggest a rising adoption of Connected Autonomous Vehicles (CAVs) in the near future. Despite some evidence of higher accident rates in AVs, these incidents tend to result in less severe injuries compared to traditional vehicles due to cooperative safety measures. However, the increased complexity of CAV systems exposes them to significant security vulnerabilities, potentially compromising their performance and communication integrity. This paper contributes by presenting a detailed analysis of existing security frameworks and protocols, focusing on intra- and inter-vehicle communications. We systematically evaluate the effectiveness of these frameworks in addressing known vulnerabilities and propose a set of best practices for enhancing CAV communication security. The paper also provides a comprehensive taxonomy of attack vectors in CAV ecosystems and suggests future research directions for designing more robust security mechanisms. Our key contributions include the development of a new classification system for CAV security threats, the proposal of practical security protocols, and the introduction of use cases that demonstrate how these protocols can be integrated into real-world CAV applications. These insights are crucial for advancing secure CAV adoption and ensuring the safe integration of autonomous vehicles into intelligent transportation systems.

Via

Access Paper or Ask Questions

PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Jan 29, 2025

Ahmed Sharshar, Yasser Attia, Mohammad Yaqub, Mohsen Guizani

Figure 1 for PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Figure 2 for PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Figure 3 for PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Figure 4 for PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Abstract:Traditional remote spirometry lacks the precision required for effective pulmonary monitoring. We present a novel, non-invasive approach using multimodal predictive models that integrate RGB or thermal video data with patient metadata. Our method leverages energy-efficient Spiking Neural Networks (SNNs) for the regression of Peak Expiratory Flow (PEF) and classification of Forced Expiratory Volume (FEV1) and Forced Vital Capacity (FVC), using lightweight CNNs to overcome SNN limitations in regression tasks. Multimodal data integration is improved with a Multi-Head Attention Layer, and we employ K-Fold validation and ensemble learning to boost robustness. Using thermal data, our SNN models achieve 92% accuracy on a breathing-cycle basis and 99.5% patient-wise. PEF regression models attain Relative RMSEs of 0.11 (thermal) and 0.26 (RGB), with an MAE of 4.52% for FEV1/FVC predictions, establishing state-of-the-art performance. Code and dataset can be found on https://github.com/ahmed-sharshar/RespiroDynamics.git

* (ISBI 2025) 2025 IEEE International Symposium on Biomedical Imaging

Via

Access Paper or Ask Questions

UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Jan 21, 2025

Branislava Jankovic, Sabina Jangirova, Waseem Ullah, Latif U. Khan, Mohsen Guizani

Figure 1 for UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Figure 2 for UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Figure 3 for UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Figure 4 for UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Abstract:Disaster recovery and management present significant challenges, particularly in unstable environments and hard-to-reach terrains. These difficulties can be overcome by employing unmanned aerial vehicles (UAVs) equipped with onboard embedded platforms and camera sensors. In this work, we address the critical need for accurate and timely disaster detection by enabling onboard aerial imagery processing and avoiding connectivity, privacy, and latency issues despite the challenges posed by limited onboard hardware resources. We propose a UAV-assisted edge framework for real-time disaster management, leveraging our proposed model optimized for real-time aerial image classification. The optimization of the model employs post-training quantization techniques. For real-world disaster scenarios, we introduce a novel dataset, DisasterEye, featuring UAV-captured disaster scenes as well as ground-level images taken by individuals on-site. Experimental results demonstrate the effectiveness of our model, achieving high accuracy with reduced inference latency and memory usage on resource-constrained devices. The framework's scalability and adaptability make it a robust solution for real-time disaster detection on resource-limited UAV platforms.

Via

Access Paper or Ask Questions

Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Dec 19, 2024

Qimei Cui, Xiaohu You, Ni Wei, Guoshun Nan, Xuefei Zhang, Jianhua Zhang, Xinchen Lyu, Ming Ai, Xiaofeng Tao, Zhiyong Feng(+15 more)

Figure 1 for Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Figure 2 for Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Figure 3 for Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Figure 4 for Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Abstract:With the increasing demand for seamless connectivity and intelligent communication, the integration of artificial intelligence (AI) and communication for sixth-generation (6G) network is emerging as a revolutionary architecture. This paper presents a comprehensive overview of AI and communication for 6G networks, emphasizing their foundational principles, inherent challenges, and future research opportunities. We commence with a retrospective analysis of AI and the evolution of large-scale AI models, underscoring their pivotal roles in shaping contemporary communication technologies. The discourse then transitions to a detailed exposition of the envisioned integration of AI within 6G networks, delineated across three progressive developmental stages. The initial stage, AI for Network, focuses on employing AI to augment network performance, optimize efficiency, and enhance user service experiences. The subsequent stage, Network for AI, highlights the role of the network in facilitating and buttressing AI operations and presents key enabling technologies, including digital twins for AI and semantic communication. In the final stage, AI as a Service, it is anticipated that future 6G networks will innately provide AI functions as services and support application scenarios like immersive communication and intelligent industrial robots. Specifically, we have defined the quality of AI service, which refers to the measurement framework system of AI services within the network. In addition to these developmental stages, we thoroughly examine the standardization processes pertinent to AI in network contexts, highlighting key milestones and ongoing efforts. Finally, we outline promising future research opportunities that could drive the evolution and refinement of AI and communication for 6G, positioning them as a cornerstone of next-generation communication infrastructure.

Via

Access Paper or Ask Questions

BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Dec 04, 2024

Xianzhi Zhang, Yipeng Zhou, Miao Hu, Di Wu, Pengshan Liao, Mohsen Guizani, Michael Sheng

Figure 1 for BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Figure 2 for BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Figure 3 for BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Figure 4 for BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Abstract:To mitigate the rising concern about privacy leakage, the federated recommender (FR) paradigm emerges, in which decentralized clients co-train the recommendation model without exposing their raw user-item rating data. The differentially private federated recommender (DPFR) further enhances FR by injecting differentially private (DP) noises into clients. Yet, current DPFRs, suffering from noise distortion, cannot achieve satisfactory accuracy. Various efforts have been dedicated to improving DPFRs by adaptively allocating the privacy budget over the learning process. However, due to the intricate relation between privacy budget allocation and model accuracy, existing works are still far from maximizing DPFR accuracy. To address this challenge, we develop BGTplanner (Budget Planner) to strategically allocate the privacy budget for each round of DPFR training, improving overall training performance. Specifically, we leverage the Gaussian process regression and historical information to predict the change in recommendation accuracy with a certain allocated privacy budget. Additionally, Contextual Multi-Armed Bandit (CMAB) is harnessed to make privacy budget allocation decisions by reconciling the current improvement and long-term privacy constraints. Our extensive experimental results on real datasets demonstrate that \emph{BGTplanner} achieves an average improvement of 6.76\% in training performance compared to state-of-the-art baselines.

Via

Access Paper or Ask Questions

GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing

Oct 25, 2024

Hosam Elgendy, Ahmed Sharshar, Ahmed Aboeitta, Yasser Ashraf, Mohsen Guizani

Abstract:Detecting temporal changes in geographical landscapes is critical for applications like environmental monitoring and urban planning. While remote sensing data is abundant, existing vision-language models (VLMs) often fail to capture temporal dynamics effectively. This paper addresses these limitations by introducing an annotated dataset of video frame pairs to track evolving geographical patterns over time. Using fine-tuning techniques like Low-Rank Adaptation (LoRA), quantized LoRA (QLoRA), and model pruning on models such as Video-LLaVA and LLaVA-NeXT-Video, we significantly enhance VLM performance in processing remote sensing temporal changes. Results show significant improvements, with the best performance achieving a BERT score of 0.864 and ROUGE-1 score of 0.576, demonstrating superior accuracy in describing land-use transformations.

* 14 pages, 5 figures, 3 tables

Via

Access Paper or Ask Questions