Yuwei Sun

Detection of Global Anomalies on Distributed IoT Edges with Device-to-Device Communication

Jul 16, 2024

Hideya Ochiai, Riku Nishihata, Eisuke Tomiyama, Yuwei Sun, Hiroshi Esaki

Abstract: Anomaly detection is an important function in IoT applications for finding outliers caused by abnormal events. It sometimes requires high-frequency data sampling, which should be carried out at edge devices rather than in the cloud. In this paper, we consider the case where multiple IoT devices are installed at a single remote site and collaboratively detect anomalies from their observations using device-to-device communication. For this, we propose a fully distributed collaborative scheme for training distributed anomaly detectors with Wireless Ad Hoc Federated Learning, namely "WAFL-Autoencoder". We introduce the concept of a Global Anomaly: a sample that is rare not only to the local device but to all the devices in the target domain. We also propose a distributed threshold-finding algorithm for Global Anomaly detection. In our evaluation on standard benchmarks, we confirmed that our scheme trained anomaly detectors perfectly across the devices. We also confirmed that the devices collaboratively found thresholds for Global Anomaly detection with low false positive rates while achieving high true positive rates, with few exceptions.

* 6 pages, 3 figures, ACM MobiHoc AIoT 2023 (accepted)
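
The abstract above does not spell out the threshold-finding algorithm, so the following is only a minimal sketch of one plausible reading. It assumes each device derives a local threshold from the autoencoder reconstruction errors of its own normal samples, then exchanges thresholds with peers and keeps the largest, so that a sample is flagged only if it is rare to every device. The function names, the quantile rule, and the max-merge step are all assumptions for illustration, not the paper's method.

```python
from typing import List


def local_threshold(errors: List[float], quantile: float = 0.99) -> float:
    """Error level at the given quantile of a device's own normal samples (assumed rule)."""
    if not errors:
        raise ValueError("need at least one reconstruction error")
    ordered = sorted(errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def merge_thresholds(own: float, received: List[float]) -> float:
    """Keep the largest threshold seen so far (assumed merge step).

    With the maximum, a sample must look rare to every contributing
    device before it exceeds the shared threshold.
    """
    return max([own, *received])


def is_global_anomaly(error: float, threshold: float) -> bool:
    """Flag a sample whose reconstruction error exceeds the shared threshold."""
    return error > threshold


# Hypothetical usage: one device merges thresholds received from two peers.
own = local_threshold([0.3, 0.1, 0.2, 0.4], quantile=0.5)
shared = merge_thresholds(own, [0.5, 0.2])
print(is_global_anomaly(0.6, shared))
```

Taking the maximum is a conservative choice that favors a low false positive rate; a real deployment would likely need repeated exchanges until every device has seen every threshold.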

Remembering Transformer for Continual Learning

Apr 11, 2024

Yuwei Sun, Jun Sakuma, Ryota Kanai

Abstract: Neural networks encounter the challenge of Catastrophic Forgetting (CF) in continual learning, where new task knowledge interferes with previously learned knowledge. We propose Remembering Transformer, inspired by the brain's Complementary Learning Systems (CLS), to tackle this issue. Remembering Transformer employs a mixture-of-adapters and a generative model-based routing mechanism to alleviate CF by dynamically routing task data to relevant adapters. Our approach achieved new state-of-the-art (SOTA) performance on various vision continual learning tasks, along with high parameter efficiency.
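
To illustrate what generative model-based routing could look like, here is a minimal sketch. It assumes one generative model per adapter and routes an input to the adapter whose model reconstructs it with the lowest error. The reconstruction-error criterion, the names `route` and `reconstruction_error`, and the toy generators are assumptions for illustration; the abstract does not specify the routing rule.

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def route(x: Vector, generators: Dict[str, Callable[[Vector], Vector]]) -> str:
    """Return the name of the adapter whose generative model reconstructs x best."""
    if not generators:
        raise ValueError("no adapters to route to")
    return min(
        generators,
        key=lambda name: reconstruction_error(x, generators[name](x)),
    )


# Toy stand-ins for per-task generative models (hypothetical).
generators = {
    "task_a": lambda x: [0.0 for _ in x],  # reconstructs everything as zeros
    "task_b": lambda x: list(x),           # reconstructs its input exactly
}
print(route([1.0, 2.0], generators))
```

The appeal of this kind of routing is that no task label is needed at inference time: the input itself selects the adapter.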

Associative Transformer Is A Sparse Representation Learner

Sep 22, 2023

Yuwei Sun, Hideya Ochiai, Zhirong Wu, Stephen Lin, Ryota Kanai

Abstract: Emerging from the monolithic pairwise attention mechanism in conventional Transformer models, there is a growing interest in leveraging sparse interactions that align more closely with biological principles. Approaches including the Set Transformer and the Perceiver employ cross-attention consolidated with a latent space that forms an attention bottleneck with limited capacity. Building upon recent neuroscience studies of Global Workspace Theory and associative memory, we propose the Associative Transformer (AiT). AiT induces low-rank explicit memory that serves as both priors to guide bottleneck attention in the shared workspace and attractors within associative memory of a Hopfield network. Through joint end-to-end training, these priors naturally develop module specialization, each contributing a distinct inductive bias to form attention bottlenecks. A bottleneck can foster competition among inputs for writing information into the memory. We show that AiT is a sparse representation learner, learning distinct priors through the bottlenecks that are complexity-invariant to input quantities and dimensions. AiT demonstrates its superiority over methods such as the Set Transformer, Vision Transformer, and Coordination in various vision tasks.

Meta Neural Coordination

May 20, 2023

Yuwei Sun

Abstract: Meta-learning aims to develop algorithms that can learn from other learning algorithms to adapt to new and changing environments. This requires a model of how other learning algorithms operate and perform in different contexts, which is similar to representing and reasoning about mental states in the theory of mind. Furthermore, the problem of uncertainty in the predictions of conventional deep neural networks highlights the partial predictability of the world, requiring the representation of multiple predictions simultaneously. This is facilitated by coordination among neural modules, where different modules' beliefs and desires are attributed to others. The neural coordination among modular and decentralized neural networks is a fundamental prerequisite for building autonomous intelligence machines that can interact flexibly and adaptively. In this work, several pieces of evidence demonstrate a new avenue for tackling the problems above, termed Meta Neural Coordination. We discuss the potential advancements required to build biologically-inspired machine intelligence, drawing from both machine learning and cognitive science communities.

Instance-level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space

Apr 02, 2023

Yuwei Sun, Hideya Ochiai, Jun Sakuma

Abstract: Malicious perturbations embedded in input data, known as Trojan attacks, can cause neural networks to misbehave. However, the impact of a Trojan attack is reduced during fine-tuning of the model, which involves transferring knowledge from a pretrained large-scale model like visual question answering (VQA) to the target model. To mitigate the effects of a Trojan attack, replacing and fine-tuning multiple layers of the pretrained model is possible. This research focuses on sample efficiency, stealthiness and variation, and robustness to model fine-tuning. To address these challenges, we propose an instance-level Trojan attack that generates diverse Trojans across input samples and modalities. Adversarial learning establishes a correlation between a specified perturbation layer and the misbehavior of the fine-tuned model. We conducted extensive experiments on the VQA-v2 dataset using a range of metrics. The results show that our proposed method can effectively adapt to a fine-tuned model with minimal samples. Specifically, we found that a model with a single fine-tuning layer can be compromised using a single shot of adversarial samples, while a model with more fine-tuning layers can be compromised using only a few shots.

Meta Learning in Decentralized Neural Networks: Towards More General AI

Feb 02, 2023

Yuwei Sun

Abstract: Meta-learning usually refers to a learning algorithm that learns from other learning algorithms. The problem of uncertainty in the predictions of neural networks shows that the world is only partially predictable and a learned neural network cannot generalize to its ever-changing surrounding environments. Therefore, the question is how a predictive model can represent multiple predictions simultaneously. We aim to provide a fundamental understanding of learning to learn in the context of Decentralized Neural Networks (Decentralized NNs), and we believe this is one of the most important questions and prerequisites for building an autonomous intelligence machine. To this end, we present several pieces of evidence for tackling the problems above with Meta Learning in Decentralized NNs. In particular, we present three different approaches to building such a decentralized learning system: (1) learning from many replica neural networks, (2) building a hierarchy of neural networks for different functions, and (3) leveraging different modality experts to learn cross-modal representations.

* Accepted for AAAI 2023 workshop

Resilience of Wireless Ad Hoc Federated Learning against Model Poisoning Attacks

Nov 07, 2022

Naoya Tezuka, Hideya Ochiai, Yuwei Sun, Hiroshi Esaki

Abstract: Wireless ad hoc federated learning (WAFL) is a fully decentralized collaborative machine learning framework organized by opportunistically encountered mobile nodes. Compared to conventional federated learning, WAFL performs model training by weakly synchronizing the model parameters with others, and this shows great resilience to a poisoned model injected by an attacker. In this paper, we provide a theoretical analysis of WAFL's resilience against model poisoning attacks by formulating the force balance between the poisoned model and the legitimate model. In our experiments, we confirmed that the nodes that directly encountered the attacker were somewhat compromised by the poisoned model, but other nodes showed great resilience. More importantly, after the attacker left the network, all the nodes eventually found stronger model parameters combined with the poisoned model. Most of the attack-experienced cases achieved higher accuracy than the no-attack-experienced cases.

* 10 pages, 7 figures, to be published in IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications 2022

UniCon: Unidirectional Split Learning with Contrastive Loss for Visual Question Answering

Aug 24, 2022

Yuwei Sun, Hideya Ochiai

Abstract: Visual question answering (VQA) that leverages multi-modality data has attracted intensive interest in real-life applications, such as home robots and clinical diagnosis. Nevertheless, one of the challenges is to design robust learning for different client tasks. This work aims to bridge the gap between the prerequisite of large-scale training data and the constraint on client data sharing, which is mainly due to confidentiality. We propose Unidirectional Split Learning with Contrastive Loss (UniCon) to tackle VQA task training on distributed data silos. In particular, UniCon trains a global model over the entire data distribution of different clients, learning refined cross-modal representations via contrastive learning. The learned representations of the global model aggregate knowledge from different local tasks. Moreover, we devise a unidirectional split learning framework to enable more efficient knowledge sharing. Comprehensive experiments with five state-of-the-art VQA models on the VQA-v2 dataset demonstrated the efficacy of UniCon, achieving an accuracy of 49.89% on the validation set of VQA-v2. This work is the first study of VQA under the constraint of data confidentiality using self-supervised Split Learning.

Wireless Ad Hoc Federated Learning: A Fully Distributed Cooperative Machine Learning

May 24, 2022

Hideya Ochiai, Yuwei Sun, Qingzhe Jin, Nattanon Wongwiwatchai, Hiroshi Esaki

Abstract: Federated learning has allowed training of a global model by aggregating local models trained on local nodes. However, it still relies on a client-server model, which could be made further distributed, fully decentralized, partially connected, or even totally opportunistic. In this paper, we propose wireless ad hoc federated learning (WAFL), a fully distributed cooperative machine learning scheme organized by physically nearby nodes. Here, each node has a wireless interface and can communicate with other nodes when they are within radio range. The nodes are expected to move with people, vehicles, or robots, producing opportunistic contacts with each other. In WAFL, each node trains a model individually with the local data it has. When a node encounters others, they exchange their trained models and generate new aggregated models, which are expected to be more general than the models trained locally on Non-IID data. For evaluation, we prepared four static communication networks and two types of dynamic and opportunistic communication networks based on random waypoint mobility and a community-structured environment, and then studied the training process of a fully connected neural network on a 90% Non-IID MNIST dataset. The evaluation results indicate that WAFL allowed the model parameters of the nodes to converge toward generalization, even in opportunistic node contact scenarios, whereas in the self-training (or lonely training) case they diverged. WAFL's model generalization contributed to achieving a higher accuracy of 94.7-96.2% on the IID test dataset, compared to 84.7% in the self-training case.

* 13 pages, 8 figures
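
The exchange-and-aggregate step described in the abstract above can be sketched as follows. This is a minimal illustration that assumes a simple rule in which a node moves its parameters toward those of the nodes currently in radio range, scaled by a coefficient; the names `aggregate_on_contact`, `exchange_round`, and `lam`, as well as the flat list representation of parameters, are hypothetical and not taken from the abstract.

```python
from typing import Dict, List

Params = List[float]  # flattened model parameters (illustrative representation)


def aggregate_on_contact(own: Params, neighbors: List[Params], lam: float = 1.0) -> Params:
    """Move `own` toward the parameters of the nodes currently in contact."""
    if not neighbors:
        return list(own)  # no contact: keep the locally trained model
    scale = lam / (len(neighbors) + 1)
    return [
        w + scale * sum(nb[i] - w for nb in neighbors)
        for i, w in enumerate(own)
    ]


def exchange_round(
    models: Dict[str, Params],
    contacts: Dict[str, List[str]],
    lam: float = 1.0,
) -> Dict[str, Params]:
    """One synchronous exchange: each node aggregates a snapshot of its neighbors' models."""
    snapshot = {name: list(p) for name, p in models.items()}
    return {
        name: aggregate_on_contact(
            snapshot[name], [snapshot[n] for n in contacts.get(name, [])], lam
        )
        for name in models
    }


# Hypothetical usage: nodes "a" and "b" are in contact, "c" is isolated.
models = {"a": [0.0, 2.0], "b": [2.0, 0.0], "c": [5.0, 5.0]}
contacts = {"a": ["b"], "b": ["a"], "c": []}
print(exchange_round(models, contacts))
```

With `lam = 1.0` the update reduces to a plain average over the node and its neighbors; smaller values would synchronize the models more weakly. In a full system each node would continue local training between contacts.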

Multi-Source Domain Adaptation Based on Federated Knowledge Alignment

Mar 22, 2022

Yuwei Sun, Ng Chong, Hideya Ochiai

Abstract: Federated Learning (FL) facilitates distributed model learning to protect users' privacy. In the absence of labels for a new user's data, the knowledge transfer in FL allows a learned global model to adapt to the new samples quickly. The multi-source domain adaptation in FL aims to improve the model's generality in a target domain by learning domain-invariant features from different clients. In this paper, we propose Federated Knowledge Alignment (FedKA) that aligns features from different clients and those of the target task. We identify two types of negative transfer arising in multi-source domain adaptation of FL and demonstrate how FedKA can alleviate such negative transfers with the help of a global features disentangler enhanced by embedding matching. To further facilitate representation learning of the target task, we devise a federated voting mechanism to provide labels for samples from the target domain via a consensus from querying local models and fine-tune the global model with these labeled samples. Extensive experiments, including an ablation study, on an image classification task of Digit-Five and a text sentiment classification task of Amazon Review, show that FedKA could be augmented to existing FL algorithms to improve the generality of the learned model for tackling a new task.
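
The federated voting idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes each local model is a callable returning a class label and that a pseudo-label is kept only when more than a set fraction of the models agree; the name `vote_label`, the `min_agreement` parameter, and the majority rule are assumptions for illustration, not the paper's exact mechanism.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, TypeVar

Sample = TypeVar("Sample")


def vote_label(
    sample: Sample,
    local_models: Sequence[Callable[[Sample], Hashable]],
    min_agreement: float = 0.5,
) -> Optional[Hashable]:
    """Return the consensus label for `sample`, or None if agreement is too weak."""
    if not local_models:
        return None
    votes = Counter(model(sample) for model in local_models)
    label, count = votes.most_common(1)[0]
    # Keep the pseudo-label only if a sufficient share of models agree on it.
    return label if count / len(local_models) > min_agreement else None


# Toy stand-ins for client models (hypothetical).
models = [lambda x: "cat", lambda x: "cat", lambda x: "dog"]
print(vote_label("unlabeled-image", models))
```

Samples that receive a label this way could then be used to fine-tune the global model, while samples without consensus are left unlabeled.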
