Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Richard Dazeley

DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Dec 26, 2024

Yinghui Li, Qianyu Zhou, Jingyu Gong, Ye Zhu, Richard Dazeley, Xinkui Zhao, Xuequan Lu

Figure 1 for DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Figure 2 for DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Figure 3 for DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Figure 4 for DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Abstract:Point Transformers (PoinTr) have shown great potential in point cloud completion recently. Nevertheless, effective domain adaptation that improves transferability toward target domains remains unexplored. In this paper, we delve into this topic and empirically discover that direct feature alignment on point Transformer's CNN backbone only brings limited improvements since it cannot guarantee sequence-wise domain-invariant features in the Transformer. To this end, we propose a pioneering Domain Adaptive Point Transformer (DAPoinTr) framework for point cloud completion. DAPoinTr consists of three key components: Domain Query-based Feature Alignment (DQFA), Point Token-wise Feature alignment (PTFA), and Voted Prediction Consistency (VPC). In particular, DQFA is presented to narrow the global domain gaps from the sequence via the presented domain proxy and domain query at the Transformer encoder and decoder, respectively. PTFA is proposed to close the local domain shifts by aligning the tokens, \emph{i.e.,} point proxy and dynamic query, at the Transformer encoder and decoder, respectively. VPC is designed to consider different Transformer decoders as multiple of experts (MoE) for ensembled prediction voting and pseudo-label generation. Extensive experiments with visualization on several domain adaptation benchmarks demonstrate the effectiveness and superiority of our DAPoinTr compared with state-of-the-art methods. Code will be publicly available at: https://github.com/Yinghui-Li-New/DAPoinTr

* Accepted to AAAI 2025

Via

Access Paper or Ask Questions

Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI

Oct 31, 2024

Hadassah Harland, Richard Dazeley, Peter Vamplew, Hashini Senaratne, Bahareh Nakisa, Francisco Cruz

Figure 1 for Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI

Abstract:Emerging research in Pluralistic Artificial Intelligence (AI) alignment seeks to address how intelligent systems can be designed and deployed in accordance with diverse human needs and values. We contribute to this pursuit with a dynamic approach for aligning AI with diverse and shifting user preferences through Multi Objective Reinforcement Learning (MORL), via post-learning policy selection adjustment. In this paper, we introduce the proposed framework for this approach, outline its anticipated advantages and assumptions, and discuss technical details about the implementation. We also examine the broader implications of adopting a retroactive alignment approach through the sociotechnical systems perspective.

* Accepted for the Pluralistic Alignment workshop at NeurIPS 2024

Via

Access Paper or Ask Questions

Multi-objective Reinforcement Learning: A Tool for Pluralistic Alignment

Oct 15, 2024

Peter Vamplew, Conor F Hayes, Cameron Foale, Richard Dazeley, Hadassah Harland

Abstract:Reinforcement learning (RL) is a valuable tool for the creation of AI systems. However it may be problematic to adequately align RL based on scalar rewards if there are multiple conflicting values or stakeholders to be considered. Over the last decade multi-objective reinforcement learning (MORL) using vector rewards has emerged as an alternative to standard, scalar RL. This paper provides an overview of the role which MORL can play in creating pluralistically-aligned AI.

* Accepted for the Pluralistic Alignment workshop at NeurIPS 2024. https://pluralistic-alignment.github.io/

Via

Access Paper or Ask Questions

IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Aug 15, 2024

Xin Hao, Bahareh Nakisa, Mohmmad Naim Rastgoo, Richard Dazeley

Figure 1 for IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Figure 2 for IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Figure 3 for IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Figure 4 for IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination

Abstract:In human-AI coordination scenarios, human agents usually exhibit asymmetric behaviors that are significantly sparse and unpredictable compared to those of AI agents. These characteristics introduce two primary challenges to human-AI coordination: the effectiveness of obtaining sparse rewards and the efficiency of training the AI agents. To tackle these challenges, we propose an Intrinsic Reward-enhanced Context-aware (IReCa) reinforcement learning (RL) algorithm, which leverages intrinsic rewards to facilitate the acquisition of sparse rewards and utilizes environmental context to enhance training efficiency. Our IReCa RL algorithm introduces three unique features: (i) it encourages the exploration of sparse rewards by incorporating intrinsic rewards that supplement traditional extrinsic rewards from the environment; (ii) it improves the acquisition of sparse rewards by prioritizing the corresponding sparse state-action pairs; and (iii) it enhances the training efficiency by optimizing the exploration and exploitation through innovative context-aware weights of extrinsic and intrinsic rewards. Extensive simulations executed in the Overcooked layouts demonstrate that our IReCa RL algorithm can increase the accumulated rewards by approximately 20% and reduce the epochs required for convergence by approximately 67% compared to state-of-the-art baselines.

Via

Access Paper or Ask Questions

Data-driven Machinery Fault Detection: A Comprehensive Review

May 29, 2024

Dhiraj Neupane, Mohamed Reda Bouadjenek, Richard Dazeley, Sunil Aryal

Figure 1 for Data-driven Machinery Fault Detection: A Comprehensive Review

Figure 2 for Data-driven Machinery Fault Detection: A Comprehensive Review

Figure 3 for Data-driven Machinery Fault Detection: A Comprehensive Review

Figure 4 for Data-driven Machinery Fault Detection: A Comprehensive Review

Abstract:In this era of advanced manufacturing, it's now more crucial than ever to diagnose machine faults as early as possible to guarantee their safe and efficient operation. With the massive surge in industrial big data and advancement in sensing and computational technologies, data-driven Machinery Fault Diagnosis (MFD) solutions based on machine/deep learning approaches have been used ubiquitously in manufacturing. Timely and accurately identifying faulty machine signals is vital in industrial applications for which many relevant solutions have been proposed and are reviewed in many articles. Despite the availability of numerous solutions and reviews on MFD, existing works often lack several aspects. Most of the available literature has limited applicability in a wide range of manufacturing settings due to their concentration on a particular type of equipment or method of analysis. Additionally, discussions regarding the challenges associated with implementing data-driven approaches, such as dealing with noisy data, selecting appropriate features, and adapting models to accommodate new or unforeseen faults, are often superficial or completely overlooked. Thus, this survey provides a comprehensive review of the articles using different types of machine learning approaches for the detection and diagnosis of various types of machinery faults, highlights their strengths and limitations, provides a review of the methods used for condition-based analyses, comprehensively discusses the available machinery fault datasets, introduces future researchers to the possible challenges they have to encounter while using these approaches for MFD and recommends the probable solutions to mitigate those problems. The future research prospects are also pointed out for a better understanding of the field. We believe this article will help researchers and contribute to the further development of the field.

Via

Access Paper or Ask Questions

Value function interference and greedy action selection in value-based multi-objective reinforcement learning

Feb 09, 2024

Peter Vamplew, Cameron Foale, Richard Dazeley

Figure 1 for Value function interference and greedy action selection in value-based multi-objective reinforcement learning

Figure 2 for Value function interference and greedy action selection in value-based multi-objective reinforcement learning

Figure 3 for Value function interference and greedy action selection in value-based multi-objective reinforcement learning

Figure 4 for Value function interference and greedy action selection in value-based multi-objective reinforcement learning

Abstract:Multi-objective reinforcement learning (MORL) algorithms extend conventional reinforcement learning (RL) to the more general case of problems with multiple, conflicting objectives, represented by vector-valued rewards. Widely-used scalar RL methods such as Q-learning can be modified to handle multiple objectives by (1) learning vector-valued value functions, and (2) performing action selection using a scalarisation or ordering operator which reflects the user's utility with respect to the different objectives. However, as we demonstrate here, if the user's utility function maps widely varying vector-values to similar levels of utility, this can lead to interference in the value-function learned by the agent, leading to convergence to sub-optimal policies. This will be most prevalent in stochastic environments when optimising for the Expected Scalarised Return criterion, but we present a simple example showing that interference can also arise in deterministic environments. We demonstrate empirically that avoiding the use of random tie-breaking when identifying greedy actions can ameliorate, but not fully overcome, the problems caused by value function interference.

Via

Access Paper or Ask Questions

Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning

Feb 05, 2024

Peter Vamplew, Cameron Foale, Conor F. Hayes, Patrick Mannion, Enda Howley, Richard Dazeley, Scott Johnson, Johan Källström, Gabriel Ramos, Roxana Rădulescu(+2 more)

Abstract:Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.

* Accepted for the Blue Sky Track at AAMAS'24

Via

Access Paper or Ask Questions

An Empirical Investigation of Value-Based Multi-objective Reinforcement Learning for Stochastic Environments

Jan 06, 2024

Kewen Ding, Peter Vamplew, Cameron Foale, Richard Dazeley

Abstract:One common approach to solve multi-objective reinforcement learning (MORL) problems is to extend conventional Q-learning by using vector Q-values in combination with a utility function. However issues can arise with this approach in the context of stochastic environments, particularly when optimising for the Scalarised Expected Reward (SER) criterion. This paper extends prior research, providing a detailed examination of the factors influencing the frequency with which value-based MORL Q-learning algorithms learn the SER-optimal policy for an environment with stochastic state transitions. We empirically examine several variations of the core multi-objective Q-learning algorithm as well as reward engineering approaches, and demonstrate the limitations of these methods. In particular, we highlight the critical impact of the noisy Q-value estimates issue on the stability and convergence of these algorithms.

* arXiv admin note: substantial text overlap with arXiv:2211.08669

Via

Access Paper or Ask Questions

Weighted Point Cloud Normal Estimation

May 06, 2023

Weijia Wang, Xuequan Lu, Di Shao, Xiao Liu, Richard Dazeley, Antonio Robles-Kelly, Wei Pan

Abstract:Existing normal estimation methods for point clouds are often less robust to severe noise and complex geometric structures. Also, they usually ignore the contributions of different neighbouring points during normal estimation, which leads to less accurate results. In this paper, we introduce a weighted normal estimation method for 3D point cloud data. We innovate in two key points: 1) we develop a novel weighted normal regression technique that predicts point-wise weights from local point patches and use them for robust, feature-preserving normal regression; 2) we propose to conduct contrastive learning between point patches and the corresponding ground-truth normals of the patches' central points as a pre-training process to facilitate normal regression. Comprehensive experiments demonstrate that our method can robustly handle noisy and complex point clouds, achieving state-of-the-art performance on both synthetic and real-world datasets.

* Accepted by ICME 2023

Via

Access Paper or Ask Questions

Broad-persistent Advice for Interactive Reinforcement Learning Scenarios

Oct 11, 2022

Francisco Cruz, Adam Bignold, Hung Son Nguyen, Richard Dazeley, Peter Vamplew

Figure 1 for Broad-persistent Advice for Interactive Reinforcement Learning Scenarios

Figure 2 for Broad-persistent Advice for Interactive Reinforcement Learning Scenarios

Figure 3 for Broad-persistent Advice for Interactive Reinforcement Learning Scenarios

Figure 4 for Broad-persistent Advice for Interactive Reinforcement Learning Scenarios

Abstract:The use of interactive advice in reinforcement learning scenarios allows for speeding up the learning process for autonomous agents. Current interactive reinforcement learning research has been limited to real-time interactions that offer relevant user advice to the current state only. Moreover, the information provided by each interaction is not retained and instead discarded by the agent after a single use. In this paper, we present a method for retaining and reusing provided knowledge, allowing trainers to give general advice relevant to more than just the current state. Results obtained show that the use of broad-persistent advice substantially improves the performance of the agent while reducing the number of interactions required for the trainer.

* Extended abstract accepted at the 2nd RL-CONFORM Workshop at IEEE/RSJ IROS'22 Conference. 5 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:2102.02441, arXiv:2110.08003

Via

Access Paper or Ask Questions