Abstract: Federated Learning (FL) is a privacy-preserving distributed learning paradigm designed to build a highly accurate global model. In Mobile Edge IoT (MEIoT), the training and communication processes can significantly deplete the limited battery resources of devices. Existing research primarily focuses on reducing overall energy consumption, but this may inadvertently create energy consumption imbalances, leading to the premature dropout of energy-sensitive devices. To address these challenges, we propose BEFL, a joint optimization framework that balances three objectives: enhancing global model accuracy, minimizing total energy consumption, and reducing energy usage disparities among devices. First, taking into account the communication constraints of MEIoT and the heterogeneity of devices, we employ the Sequential Least Squares Programming (SLSQP) algorithm to allocate communication resources. Building on this allocation, we introduce a heuristic client selection algorithm that combines cluster partitioning with utility-driven approaches to reduce both the total energy consumption of all devices and the discrepancies in energy usage among them. Furthermore, we use the proposed heuristic client selection algorithm as a template for offline imitation learning during pre-training, while adopting a ranking-based reinforcement learning approach online to further boost training efficiency. Our experiments show that BEFL improves global model accuracy by 1.6\%, reduces energy consumption variance by 72.7\%, and lowers total energy consumption by 28.2\% compared to existing methods. The code is available at \href{https://github.com/juzehao/BEFL}{https://github.com/juzehao/BEFL}.
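The tension between total energy and energy balance described in this abstract can be illustrated with a toy simulation. The sketch below is not the BEFL algorithm: the device names, per-round costs, and both selection rules (`cheapest_first`, `balance_aware`) are hypothetical, and the clustering, SLSQP allocation, and learning stages are omitted. It only shows why a selector that minimizes total energy can leave per-device consumption highly uneven.

```python
from statistics import pvariance

# Hypothetical per-round energy cost (arbitrary units) for four devices.
ROUND_COST = {"a": 1.0, "b": 2.0, "c": 4.0, "d": 1.0}


def cheapest_first(cumulative, round_cost, k):
    """Baseline rule: always pick the k devices with the lowest per-round cost."""
    return sorted(round_cost, key=lambda c: (round_cost[c], c))[:k]


def balance_aware(cumulative, round_cost, k):
    """Illustrative utility (an assumption, not the paper's): pick the k devices
    whose cumulative energy after this round would be lowest."""
    projected = {c: cumulative[c] + round_cost[c] for c in round_cost}
    return sorted(projected, key=lambda c: (projected[c], c))[:k]


def simulate(select, round_cost, k, rounds):
    """Run the selection rule for a number of rounds and track energy per device."""
    cumulative = {c: 0.0 for c in round_cost}
    for _ in range(rounds):
        for c in select(cumulative, round_cost, k):
            cumulative[c] += round_cost[c]
    return cumulative


def summarize(cumulative):
    """Return (total energy, population variance of per-device energy)."""
    values = list(cumulative.values())
    return sum(values), pvariance(values)


baseline_total, baseline_var = summarize(
    simulate(cheapest_first, ROUND_COST, k=2, rounds=6)
)
balanced_total, balanced_var = summarize(
    simulate(balance_aware, ROUND_COST, k=2, rounds=6)
)

print("cheapest-first:", baseline_total, baseline_var)
print("balance-aware: ", balanced_total, balanced_var)
```

In this toy setting the cheapest-first rule spends less energy overall but concentrates it on two devices, while the balance-aware rule spreads the load at a higher total cost. A joint objective such as the one the abstract describes aims to trade these off rather than optimize either alone.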
Abstract: Recent advances in Large Vision-Language Models (LVLMs) have significantly improved performance in image comprehension tasks involving, for example, formatted charts and rich-content images. Yet Graphical User Interfaces (GUIs) pose a greater challenge due to their structured format and detailed textual information. Existing LVLMs often depend too heavily on internal knowledge and neglect image content, resulting in hallucinations and incorrect responses in GUI comprehension. To address these issues, we introduce VGA, a fine-tuned model designed for comprehensive GUI understanding. Our model aims to enhance the interpretation of visual data in GUIs and reduce hallucinations. We first construct a Vision Question Answering (VQA) dataset of 63.8k high-quality examples with our proposed Referent Method, which ensures that the model's responses depend strongly on the visual content within the image. We then design a two-stage fine-tuning method called Foundation and Advanced Comprehension (FAC) to enhance both the model's ability to extract information from image content and its alignment with human intent. Experiments show that our approach enhances the model's ability to extract information from images and achieves state-of-the-art results in GUI understanding tasks. Our dataset and fine-tuning script will be released soon.
Abstract: Along with the proliferation of Artificial Intelligence (AI) and Internet of Things (IoT) techniques, various kinds of adversarial attacks are increasingly emerging to fool Deep Neural Networks (DNNs) used by Industrial IoT (IIoT) applications. Due to biased training data or vulnerable underlying models, imperceptible modifications to inputs made by adversarial attacks may result in devastating consequences. Although existing methods are promising in defending against such malicious attacks, most of them can only deal with a limited set of existing attack types, which makes the large-scale deployment of IIoT devices a great challenge. To address this problem, we present an effective federated defense approach named FDA3 that can aggregate defense knowledge against adversarial examples from different sources. Inspired by federated learning, our proposed cloud-based architecture enables the sharing of defense capabilities against different attacks among IIoT devices. Comprehensive experimental results show that the DNNs generated by our approach can not only resist more malicious attacks than existing attack-specific adversarial training methods, but also protect IIoT applications from new attacks.
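The abstract does not state how FDA3 combines defense knowledge in the cloud. As a rough illustration of the federated-learning idea it draws on, the sketch below shows generic FedAvg-style weighted averaging of model parameters, assuming each device has already trained locally (for instance, on adversarial examples of the attacks it has seen). The function name `federated_average` and the flat-list parameter format are assumptions made for this example only.

```python
def federated_average(updates):
    """Weighted average of parameter vectors.

    `updates` is a list of (weights, num_examples) pairs, where `weights` is a
    flat list of floats and `num_examples` is the size of the local training
    set used as the averaging weight (a FedAvg-style assumption).
    """
    if not updates:
        raise ValueError("at least one update is required")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all weight vectors must have the same length")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Two hypothetical devices, each hardened against a different attack type.
device_updates = [
    ([1.0, 2.0], 1),  # device A: 1 local example
    ([3.0, 4.0], 3),  # device B: 3 local examples
]
global_weights = federated_average(device_updates)
print(global_weights)
```

Under this assumption, the cloud would send the averaged parameters back to every device, so each one benefits from adversarial training performed elsewhere.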