Yan Zuo

Nesterov Method for Asynchronous Pipeline Parallel Optimization

May 02, 2025

Thalaiyasingam Ajanthan, Sameera Ramasinghe, Yan Zuo, Gil Avraham, Alexander Long

Abstract: Pipeline Parallelism (PP) enables large neural network training on small, interconnected devices by splitting the model into multiple stages. To maximize pipeline utilization, asynchronous optimization is appealing as it offers 100% pipeline utilization by construction. However, it is inherently challenging as the weights and gradients are no longer synchronized, leading to stale (or delayed) gradients. To alleviate this, we introduce a variant of Nesterov Accelerated Gradient (NAG) for asynchronous optimization in PP. Specifically, we modify the look-ahead step in NAG to effectively address the staleness in gradients. We theoretically prove that our approach converges at a sublinear rate in the presence of fixed delay in gradients. Our experiments on large-scale language modelling tasks using decoder-only architectures with up to 1B parameters demonstrate that our approach significantly outperforms existing asynchronous methods, even surpassing the synchronous baseline.
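For readers unfamiliar with the look-ahead step mentioned in the abstract, the following is a minimal sketch of textbook NAG on a scalar problem, with an optional fixed gradient delay to mimic staleness. It is not the paper's modified look-ahead, whose details are not given in this listing. The function name `nag_minimize` and the parameters `lr`, `momentum`, `steps`, and `delay` are illustrative assumptions.

```python
from collections import deque


def nag_minimize(grad, x0, lr=0.1, momentum=0.9, steps=200, delay=0):
    """Textbook Nesterov momentum on a scalar (hypothetical helper).

    When delay > 0, the gradient is evaluated at a look-ahead point
    computed `delay` steps earlier, which imitates a stale gradient.
    This is NOT the paper's variant; it only illustrates the setting.
    """
    x, v = float(x0), 0.0
    # Buffer of recent look-ahead points; the oldest entry is the stale one.
    history = deque(maxlen=delay + 1)
    for _ in range(steps):
        lookahead = x + momentum * v       # standard NAG look-ahead point
        history.append(lookahead)
        stale_point = history[0]           # equals lookahead when delay == 0
        v = momentum * v - lr * grad(stale_point)
        x = x + v
    return x


# Example: minimise f(x) = 0.5 * x**2, whose gradient is x.
x_final = nag_minimize(lambda x: x, x0=5.0)
```

Setting `delay` to a positive integer shows how a fixed staleness changes the trajectory; whether the iterates still converge then depends on `lr` and `momentum`.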

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Feb 29, 2024

Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton van den Hengel

Abstract: Novel-view synthesis through diffusion models has demonstrated remarkable potential for generating diverse and high-quality images. Yet, the independent process of image generation in these prevailing methods leads to challenges in maintaining multiple-view consistency. To address this, we introduce ViewFusion, a novel, training-free algorithm that can be seamlessly integrated into existing pre-trained diffusion models. Our approach adopts an auto-regressive method that implicitly leverages previously generated views as context for the next view generation, ensuring robust multi-view consistency during the novel-view generation process. Through a diffusion process that fuses known-view information via interpolated denoising, our framework successfully extends single-view conditioned models to work in multiple-view conditional settings without any additional fine-tuning. Extensive experimental results demonstrate the effectiveness of ViewFusion in generating consistent and detailed novel views.

* CVPR 2024, homepage: https://wi-sc.github.io/ViewFusion.github.io/

Divide and Conquer: Rethinking the Training Paradigm of Neural Radiance Fields

Jan 29, 2024

Rongkai Ma, Leo Lebrat, Rodrigo Santa Cruz, Gil Avraham, Yan Zuo, Clinton Fookes, Olivier Salvado

Abstract: Neural radiance fields (NeRFs) have exhibited potential in synthesizing high-fidelity views of 3D scenes, but the standard training paradigm of NeRF presupposes an equal importance for each image in the training set. This assumption poses a significant challenge for rendering specific views presenting intricate geometries, thereby resulting in suboptimal performance. In this paper, we take a closer look at the implications of the current training paradigm and redesign it for superior rendering quality by NeRFs. Dividing input views into multiple groups based on their visual similarities and training individual models on each of these groups enables each model to specialize on specific regions without sacrificing speed or efficiency. Subsequently, the knowledge of these specialized models is aggregated into a single entity via a teacher-student distillation paradigm, enabling spatial efficiency for online rendering. Empirically, we evaluate our novel training framework on two publicly available datasets, namely NeRF synthetic and Tanks&Temples. Our evaluation demonstrates that our DaC training pipeline enhances the rendering quality of a state-of-the-art baseline model while exhibiting convergence to a superior minimum.

Bayesian Optimisation for Mixed-Variable Inputs using Value Proposals

Feb 17, 2022

Yan Zuo, Amir Dezfouli, Iadine Chades, David Alexander, Benjamin Ward Muir

Abstract: Many real-world optimisation problems are defined over both categorical and continuous variables, yet efficient optimisation methods such as Bayesian Optimisation (BO) are not designed to handle such mixed-variable search spaces. Recent approaches to this problem cast the selection of the categorical variables as a bandit problem, operating independently alongside a BO component which optimises the continuous variables. In this paper, we adopt a holistic view and aim to consolidate optimisation of the categorical and continuous sub-spaces under a single acquisition metric. We derive candidates from the Expected Improvement criterion, which we call value proposals, and use these proposals to make selections on both the categorical and continuous components of the input. We show that this unified approach significantly outperforms existing mixed-variable optimisation approaches across several mixed-variable black-box optimisation tasks.
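As background for the Expected Improvement criterion named in the abstract, here is a minimal sketch of the standard closed-form EI for a Gaussian posterior in a maximisation setting. It does not implement the paper's value proposals, which are not described in this listing. The function name `expected_improvement` and the parameters `mu`, `sigma`, `best`, and `xi` are illustrative assumptions.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form EI for maximisation under a Gaussian posterior.

    mu, sigma: posterior mean and standard deviation at a candidate point.
    best: best objective value observed so far.
    xi: optional exploration margin (an assumption of this sketch).
    """
    improvement = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: EI reduces to the plain improvement.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return improvement * cdf + sigma * pdf


# Example: a candidate whose mean equals the incumbent still has positive EI
# because of its predictive uncertainty.
ei = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
```

In a BO loop, a score of this kind would be evaluated over candidate inputs and the highest-scoring candidate selected next.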

Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning

Dec 07, 2021

Rongkai Ma, Pengfei Fang, Gil Avraham, Yan Zuo, Tom Drummond, Mehrtash Harandi

Abstract: Learning and generalizing to novel concepts with few samples (Few-Shot Learning) is still an essential challenge to real-world applications. A principled way of achieving few-shot learning is to realize a model that can rapidly adapt to the context of a given task. Dynamic networks have been shown capable of learning content-adaptive parameters efficiently, making them suitable for few-shot learning. In this paper, we propose to learn the dynamic kernels of a convolution network as a function of the task at hand, enabling faster generalization. To this end, we obtain our dynamic kernels based on the entire task and each sample and develop a mechanism further conditioning on each individual channel and position independently. This results in dynamic kernels that simultaneously attend to the global information whilst also considering minuscule details available. We empirically show that our model improves performance on few-shot classification and detection tasks, achieving a tangible improvement over several baseline models. This includes state-of-the-art results on 4 few-shot classification benchmarks: mini-ImageNet, tiered-ImageNet, CUB and FC100 and competitive results on a few-shot detection dataset: MS COCO-PASCAL-VOC.

Localising In Complex Scenes Using Balanced Adversarial Adaptation

Nov 09, 2020

Gil Avraham, Yan Zuo, Tom Drummond

Abstract: Domain adaptation and generative modelling have collectively mitigated the expensive nature of data collection and labelling by leveraging the rich abundance of accurate, labelled data in simulation environments. In this work, we study the performance gap that exists between representations optimised for localisation on simulation environments and the application of such representations in a real-world setting. Our method exploits the shared geometric similarities between simulation and real-world environments whilst maintaining invariance towards visual discrepancies. This is achieved by optimising a representation extractor to project both simulated and real representations into a shared representation space. Our method uses a symmetrical adversarial approach which encourages the representation extractor to conceal the domain that features are extracted from and simultaneously preserves robust attributes between source and target domains that are beneficial for localisation. We evaluate our method by adapting representations optimised for indoor Habitat simulated environments (Matterport3D and Replica) to a real-world indoor environment (Active Vision Dataset), showing that it compares favourably against fully-supervised approaches.

* Accepted at 3DV 2020

Residual Likelihood Forests

Nov 04, 2020

Yan Zuo, Tom Drummond

Abstract: This paper presents a novel ensemble learning approach called Residual Likelihood Forests (RLF). Our weak learners produce conditional likelihoods that are sequentially optimized using global loss in the context of previous learners within a boosting-like framework (rather than probability distributions that are measured from observed data) and are combined multiplicatively (rather than additively). This increases the efficiency of our strong classifier, allowing for the design of classifiers which are more compact in terms of model capacity. We apply our method to several machine learning classification tasks, showing significant improvements in performance. When compared against several ensemble approaches including Random Forests and Gradient Boosted Trees, RLFs offer a significant improvement in performance whilst concurrently reducing the required model size.

* Accepted at BMVC 2020

EMPNet: Neural Localisation and Mapping Using Embedded Memory Points

Aug 02, 2019

Gil Avraham, Yan Zuo, Thanuja Dharmasiri, Tom Drummond

Abstract: Continuously estimating an agent's state space and a representation of its surroundings has proven vital towards full autonomy. A shared common ground among systems which successfully achieve this feat is the integration of previously encountered observations into the current state being estimated. This necessitates the use of a memory module for incorporating previously visited states whilst simultaneously offering an internal representation of the observed environment. In this work, we develop a memory module which contains rigidly aligned point-embeddings that represent a coherent scene structure acquired from an RGB-D sequence of observations. The point-embeddings are extracted using modern convolutional neural network architectures, and alignment is performed by computing a dense correspondence matrix between a new observation and the current embeddings residing in the memory module. The whole framework is end-to-end trainable, resulting in a recurrent joint optimisation of the point-embeddings contained in the memory. This process amplifies the shared information across states, providing increased robustness and accuracy. We show significant improvement of our method across a set of experiments performed on the synthetic VIZDoom environment and a real-world Active Vision Dataset.

* Accepted at ICCV 2019

Traversing Latent Space using Decision Ferns

Dec 06, 2018

Yan Zuo, Gil Avraham, Tom Drummond

Abstract: The practice of transforming raw data to a feature space so that inference can be performed in that space has been popular for many years. Recently, rapid progress in deep neural networks has given both researchers and practitioners enhanced methods that increase the richness of feature representations, be it from images, text or speech. In this work, we show how a constructed latent space can be explored in a controlled manner and argue that this complements well-founded inference methods. For constructing the latent space, a Variational Autoencoder is used. We present a novel controller module that allows for smooth traversal in the latent space and construct an end-to-end trainable framework. We explore the applicability of our method for performing spatial transformations as well as kinematics for predicting future latent vectors of a video sequence.

Generative Adversarial Forests for Better Conditioned Adversarial Learning

May 14, 2018

Yan Zuo, Gil Avraham, Tom Drummond

Abstract: In recent times, many of the breakthroughs in various vision-related tasks have revolved around improving learning of deep models; these methods have ranged from network architectural improvements such as Residual Networks, to various forms of regularisation such as Batch Normalisation. In essence, many of these techniques revolve around better conditioning, allowing for deeper and deeper models to be successfully learned. In this paper, we look towards better conditioning Generative Adversarial Networks (GANs) in an unsupervised learning setting. Our method embeds the powerful discriminating capabilities of a decision forest into the discriminator of a GAN. This results in a better conditioned model which learns in an extremely stable way. We demonstrate empirical results which show both clear qualitative and quantitative evidence of the effectiveness of our approach, gaining significant performance improvements over several popular GAN-based approaches on the Oxford Flowers and Aligned Celebrity Faces datasets.
