Thanh Vu

LossMix: Simplify and Generalize Mixup for Object Detection and Beyond

Mar 18, 2023

Thanh Vu, Baochen Sun, Bodi Yuan, Alex Ngai, Yueqi Li, Jan-Michael Frahm

Abstract: The success of data mixing augmentations in image classification tasks has been well-received. However, these techniques cannot be readily applied to object detection due to challenges such as spatial misalignment, foreground/background distinction, and plurality of instances. To tackle these issues, we first introduce a novel conceptual framework called Supervision Interpolation, which offers a fresh perspective on interpolation-based augmentations by relaxing and generalizing Mixup. Building on this framework, we propose LossMix, a simple yet versatile and effective regularization that enhances the performance and robustness of object detectors and more. Our key insight is that we can effectively regularize the training on mixed data by interpolating their loss errors instead of ground truth labels. Empirical results on the PASCAL VOC and MS COCO datasets demonstrate that LossMix consistently outperforms currently popular mixing strategies. Furthermore, we design a two-stage domain mixing method that leverages LossMix to surpass Adaptive Teacher (CVPR 2022) and set a new state of the art for unsupervised domain adaptation.
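
The key insight in the abstract above, interpolating losses rather than ground-truth labels, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_model` and `squared_error` are hypothetical stand-ins for a real detector and its detection loss, and the function names are not taken from the paper or its code.

```python
def mix_inputs(x_a, x_b, lam):
    """Blend two inputs element-wise, as Mixup does with images."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def squared_error(pred, target):
    """Hypothetical stand-in for a detection loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def loss_mix(model, loss_fn, x_a, y_a, x_b, y_b, lam):
    """Score one mixed prediction against each original target separately,
    then interpolate the two losses with the same weight used for the inputs.
    The targets y_a and y_b are never blended."""
    pred = model(mix_inputs(x_a, x_b, lam))
    return lam * loss_fn(pred, y_a) + (1.0 - lam) * loss_fn(pred, y_b)


def toy_model(x):
    """Hypothetical model: doubles every input value."""
    return [2.0 * v for v in x]


example_loss = loss_mix(
    toy_model, squared_error,
    x_a=[1.0, 0.0], y_a=[2.0, 0.0],
    x_b=[0.0, 1.0], y_b=[0.0, 2.0],
    lam=0.75,
)
print(example_loss)
```

Because each target is only ever passed to the loss function on its own, the same pattern would apply when a target is a set of boxes and classes that has no meaningful weighted average.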

Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search

Oct 04, 2022

Thanh Vu, Yanqi Zhou, Chunfeng Wen, Yueqi Li, Jan-Michael Frahm

Abstract: In this work, we propose a novel and scalable solution to address the challenges of developing efficient dense predictions on edge platforms. Our first key insight is that Multi-Task Learning (MTL) and hardware-aware Neural Architecture Search (NAS) can work in synergy to greatly benefit on-device Dense Predictions (DP). Empirical results reveal that the joint learning of the two paradigms is surprisingly effective at improving DP accuracy, achieving superior performance over both the transfer learning of single-task NAS and prior state-of-the-art approaches in MTL, all with just 1/10th of the computation. To the best of our knowledge, our framework, named EDNAS, is the first to successfully leverage the synergistic relationship of NAS and MTL for DP. Our second key insight is that the standard depth training for multi-task DP can cause significant instability and noise in MTL evaluation. Instead, we propose JAReD, an improved, easy-to-adopt Joint Absolute-Relative Depth loss that reduces up to 88% of the undesired noise while simultaneously boosting accuracy. We conduct extensive evaluations on standard datasets, benchmark against strong baselines and state-of-the-art approaches, and provide an analysis of the discovered optimal architectures.

* WACV 2023. 14 pages, 5 figures

Automatic Post-Editing for Translating Chinese Novels to Vietnamese

Apr 25, 2021

Thanh Vu, Dai Quoc Nguyen

Abstract: Automatic post-editing (APE) is an important remedy for reducing errors of raw translated texts that are produced by machine translation (MT) systems or software-aided translation. In this paper, we present the first attempt to tackle the APE task for Vietnamese. Specifically, we construct the first large-scale dataset of 5M Vietnamese translated and corrected sentence pairs. We then apply strong neural MT models to handle the APE task, using our constructed dataset. Experimental results from both automatic and human evaluations show the effectiveness of the neural MT models in handling the Vietnamese APE task.

Any-Width Networks

Dec 06, 2020

Thanh Vu, Marc Eder, True Price, Jan-Michael Frahm

Abstract: Despite remarkable improvements in speed and accuracy, convolutional neural networks (CNNs) still typically operate as monolithic entities at inference time. This poses a challenge for resource-constrained practical applications, where both computational budgets and performance needs can vary with the situation. To address these constraints, we propose the Any-Width Network (AWN), an adjustable-width CNN architecture and associated training routine that allow for fine-grained control over speed and accuracy during inference. Our key innovation is the use of lower-triangular weight matrices which explicitly address width-varying batch statistics while being naturally suited for multi-width operations. We also show that this design facilitates an efficient training routine based on random width sampling. We empirically demonstrate that our proposed AWNs compare favorably to existing methods while providing maximally granular control during inference.

* 8 pages. Published at CVPR 2020 Workshop on Efficient Deep Learning in Computer Vision. Code at https://github.com/thanhmvu/awn
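
One way to see why a lower-triangular weight matrix suits multi-width operation is the toy sketch below. It is an illustration under stated assumptions, not the authors' implementation: it uses a small fully connected layer instead of a convolution, and the helper names `lower_triangular` and `forward_at_width` are hypothetical. With a lower-triangular matrix, output channel i reads only input channels 0 through i, so running at a narrower width reproduces the leading outputs of the full-width layer.

```python
def lower_triangular(weights):
    """Zero every entry above the diagonal of a square weight matrix."""
    size = len(weights)
    return [
        [weights[i][j] if j <= i else 0.0 for j in range(size)]
        for i in range(size)
    ]


def forward_at_width(weights, inputs, width):
    """Apply only the top-left `width` x `width` block of the layer."""
    return [
        sum(weights[i][j] * inputs[j] for j in range(width))
        for i in range(width)
    ]


dense = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
masked = lower_triangular(dense)
inputs = [1.0, 2.0, 3.0]

full_output = forward_at_width(masked, inputs, width=3)
narrow_output = forward_at_width(masked, inputs, width=2)

# The narrow output matches the first two channels of the full output.
print(narrow_output == full_output[:2])
```

The same check fails for the unmasked `dense` matrix, because its first output channel also depends on the channels that the narrower width drops.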

Active and Interactive Mapping with Dynamic Gaussian Process Implicit Surfaces for Mobile Manipulators

Oct 25, 2020

Liyang Liu, Simon Fryc, Lan Wu, Thanh Vu, Gavin Paul, Teresa Vidal-Calleja

Abstract: In this paper, we present an interactive probabilistic framework for a mobile manipulator that moves through the environment, makes changes to it, and maps the changing scene as it goes. The framework is motivated by interactive robotic applications found in warehouses, construction sites and additive manufacturing, where a mobile robot manipulates objects in the scene. The proposed framework uses a novel dynamic Gaussian Process (GP) Implicit Surface method to incrementally build and update a scene map that reflects environment changes. The framework actively provides the next-best-view (NBV), balancing the reachability of the object to be picked against the map's information gain (IG). To prioritize visiting boundary segments over unknown regions, the IG formulation includes an uncertainty-gradient-based frontier score that exploits the GP kernel derivative. This leads to an efficient strategy that addresses the often conflicting requirements of exploring an unknown environment and exploiting it for object picking, given a limited execution horizon. We demonstrate the effectiveness of our framework in software simulation and real-life experiments.

WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets

Oct 16, 2020

Dat Quoc Nguyen, Thanh Vu, Afshin Rahimi, Mai Hoang Dao, Linh The Nguyen, Long Doan

Abstract: In this paper, we provide an overview of the WNUT-2020 shared task on the identification of informative COVID-19 English Tweets. We describe how we construct a corpus of 10K Tweets and organize the development and evaluation phases for this task. We also present a brief summary of results obtained from the final system evaluation submissions of 55 teams, finding that (i) many systems obtain very high performance, up to 0.91 F1 score, (ii) the majority of the submissions achieve substantially higher results than the baseline fastText (Joulin et al., 2017), and (iii) fine-tuning pre-trained language models on relevant language data followed by supervised training performs well in this task.

* In Proceedings of the 6th Workshop on Noisy User-generated Text

QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings

Sep 26, 2020

Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dinh Phung

Abstract: We propose a simple and effective embedding model, named QuatRE, to learn quaternion embeddings for entities and relations in knowledge graphs. QuatRE aims to enhance correlations between head and tail entities given a relation within the quaternion space using the Hamilton product. QuatRE achieves this by associating each relation with two quaternion vectors which are used to rotate the quaternion embeddings of the head and tail entities, respectively. To obtain the triple score, QuatRE rotates the rotated embedding of the head entity using the normalized quaternion embedding of the relation, followed by a quaternion-inner product with the rotated embedding of the tail entity. Experimental results show that our QuatRE outperforms up-to-date embedding models on well-known benchmark datasets for knowledge graph completion.
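
The scoring steps described in the abstract above can be sketched as follows. This is a simplified reading, not the authors' code: each embedding is a single quaternion rather than a vector of quaternions, normalizing the two relation-specific quaternions `rel_head` and `rel_tail` is an assumption, and all function and parameter names are hypothetical. The Hamilton product itself is the standard quaternion multiplication.

```python
import math


def hamilton(p, q):
    """Standard Hamilton product of quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def normalize(q):
    """Scale a quaternion to unit norm."""
    norm = math.sqrt(sum(v * v for v in q))
    return tuple(v / norm for v in q)


def inner(p, q):
    """Quaternion inner product: sum of component-wise products."""
    return sum(x * y for x, y in zip(p, q))


def quatre_style_score(head, tail, rel, rel_head, rel_tail):
    """Rotate head and tail by their relation-specific quaternions, rotate
    the head again by the normalized relation, then take the inner product."""
    head_rot = hamilton(head, normalize(rel_head))  # assumption: normalized
    tail_rot = hamilton(tail, normalize(rel_tail))  # assumption: normalized
    return inner(hamilton(head_rot, normalize(rel)), tail_rot)


identity = (1.0, 0.0, 0.0, 0.0)
entity = (1.0, 2.0, 3.0, 4.0)
score = quatre_style_score(entity, entity, identity, identity, identity)
print(score)
```

With identity rotations the score reduces to the plain inner product of the head and tail quaternions, which is a convenient sanity check for the product formula.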

HSD Shared Task in VLSP Campaign 2019: Hate Speech Detection for Social Good

Jul 13, 2020

Xuan-Son Vu, Thanh Vu, Mai-Vu Tran, Thanh Le-Cong, Huyen T M. Nguyen

Abstract: The paper describes the organisation of the "Hate Speech Detection" (HSD) task at the VLSP workshop 2019 on detecting the fine-grained presence of hate speech in Vietnamese textual items (i.e., messages) extracted from Facebook, which is the most popular social network site (SNS) in Vietnam. The task is organised as a multi-class classification task and based on a large-scale dataset containing 25,431 Vietnamese textual items from Facebook. The task participants were challenged to build a classification model capable of classifying an item into one of three classes, i.e., "HATE", "OFFENSIVE" and "CLEAN". HSD attracted a large number of participants and was a popular task at VLSP 2019. In particular, 71 teams signed up for the task, and 14 of them submitted results, with 380 valid submissions from 20th September 2019 to 4th October 2019.

A Label Attention Model for ICD Coding from Clinical Text

Jul 13, 2020

Thanh Vu, Dat Quoc Nguyen, Anthony Nguyen

Abstract: ICD coding is a process of assigning the International Classification of Diseases diagnosis codes to clinical/medical notes documented by health professionals (e.g. clinicians). This process requires significant human resources, and thus is costly and prone to error. To handle the problem, machine learning has been utilized for automatic ICD coding. Previous state-of-the-art models were based on convolutional neural networks, using one or several fixed window sizes. However, the lengths and interdependence of text fragments related to ICD codes in clinical text vary significantly, making it difficult to decide what the best window sizes are. In this paper, we propose a new label attention model for automatic ICD coding, which can handle both the various lengths and the interdependence of the ICD-code-related text fragments. Furthermore, as the majority of ICD codes are not frequently used, leading to an extremely imbalanced data issue, we additionally propose a hierarchical joint learning mechanism extending our label attention model to handle the issue, using the hierarchical relationships among the codes. Our label attention model achieves new state-of-the-art results on three benchmark MIMIC datasets, and the joint learning mechanism helps improve performance for infrequent codes.

* In Proceedings of IJCAI 2020 (Main Track)
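
The general idea of label attention, letting every label attend separately over the tokens of a note instead of using fixed convolution windows, can be sketched with a simplified dot-product form. This is an assumption-laden illustration and not the paper's formulation: the token states, the per-label query vectors, and the names `softmax` and `label_attention` are all hypothetical, and no hierarchical joint learning is modeled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_states, label_queries):
    """For each label, weight every token by its match with the label query
    and return (attention weights, label-specific document vector)."""
    dim = len(token_states[0])
    results = []
    for query in label_queries:
        scores = [
            sum(q * h for q, h in zip(query, state)) for state in token_states
        ]
        weights = softmax(scores)
        summary = [
            sum(w * state[d] for w, state in zip(weights, token_states))
            for d in range(dim)
        ]
        results.append((weights, summary))
    return results


# Three toy token states and two toy label queries (hypothetical values).
token_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
label_queries = [[5.0, 0.0], [0.0, 5.0]]
attended = label_attention(token_states, label_queries)

for weights, summary in attended:
    print([round(w, 3) for w in weights], [round(v, 3) for v in summary])
```

Each label ends up with its own weighting of the same tokens, so fragments of any length or position can contribute to a code without a window size being chosen in advance.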

BERTweet: A pre-trained language model for English Tweets

May 20, 2020

Dat Quoc Nguyen, Thanh Vu, Anh Tuan Nguyen

Abstract: We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet is trained using the RoBERTa pre-training procedure (Liu et al., 2019), with the same model configuration as BERT-base (Devlin et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: part-of-speech tagging, named-entity recognition and text classification. We release BERTweet to facilitate future research and downstream applications on Tweet data. Our BERTweet is available at: https://github.com/VinAIResearch/BERTweet
