Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tong Ye

Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation

Mar 21, 2025

Congyi Fan, Jian Guan, Xuanjia Zhao, Dongli Xu, Youtian Lin, Tong Ye, Pengming Feng, Haiwei Pan

Abstract:Automatically generating natural, diverse and rhythmic human dance movements driven by music is vital for virtual reality and film industries. However, generating dance that naturally follows music remains a challenge, as existing methods lack proper beat alignment and exhibit unnatural motion dynamics. In this paper, we propose Danceba, a novel framework that leverages gating mechanism to enhance rhythm-aware feature representation for music-driven dance generation, which achieves highly aligned dance poses with enhanced rhythmic sensitivity. Specifically, we introduce Phase-Based Rhythm Extraction (PRE) to precisely extract rhythmic information from musical phase data, capitalizing on the intrinsic periodicity and temporal structures of music. Additionally, we propose Temporal-Gated Causal Attention (TGCA) to focus on global rhythmic features, ensuring that dance movements closely follow the musical rhythm. We also introduce Parallel Mamba Motion Modeling (PMMM) architecture to separately model upper and lower body motions along with musical features, thereby improving the naturalness and diversity of generated dance movements. Extensive experiments confirm that Danceba outperforms state-of-the-art methods, achieving significantly better rhythmic alignment and motion diversity. Project page: https://danceba.github.io/ .

* 10 pages, 6 figures

Via

Access Paper or Ask Questions

Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection

Feb 10, 2025

Yan Weng, Fengbin Zhu, Tong Ye, Haoyan Liu, Fuli Feng, Tat-Seng Chua

Abstract:Retrieval-Augmented Generation (RAG), which integrates external knowledge into Large Language Models (LLMs), has proven effective in enabling LLMs to produce more accurate and reliable responses. However, it remains a significant challenge how to effectively integrate external retrieved knowledge with internal parametric knowledge in LLMs. In this work, we propose a novel Self-Selection RAG framework, where the LLM is made to select from pairwise responses generated with internal parametric knowledge solely and with external retrieved knowledge together to achieve enhanced accuracy. To this end, we devise a Self-Selection-RGP method to enhance the capabilities of the LLM in both generating and selecting the correct answer, by training the LLM with Direct Preference Optimization (DPO) over a curated Retrieval Generation Preference (RGP) dataset. Experimental results with two open-source LLMs (i.e., Llama2-13B-Chat and Mistral-7B) well demonstrate the superiority of our approach over other baseline methods on Natural Questions (NQ) and TrivialQA datasets.

* 12 pages, 6 figures

Via

Access Paper or Ask Questions

Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization

Jun 17, 2024

Tong Ye, Tengfei Ma, Lingfei Wu, Xuhong Zhang, Shouling Ji, Wenhai Wang

Figure 1 for Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization

Figure 2 for Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization

Figure 3 for Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization

Figure 4 for Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization

Abstract:Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks. However, LLMs have rarely been explored for code optimization. In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time. The recently proposed first PIE dataset for performance optimization constructs program optimization pairs based on iterative submissions from the same programmer for the same problem. However, this approach restricts LLMs to local performance improvements, neglecting global algorithmic innovation. Therefore, we adopt a completely different perspective by reconstructing the optimization pairs into a problem-oriented approach. This allows for the integration of various ingenious ideas from different programmers tackling the same problem. Experimental results demonstrate that adapting LLMs to problem-oriented optimization pairs significantly enhances their optimization capabilities. Meanwhile, we identified performance bottlenecks within the problem-oriented perspective. By employing model merge, we further overcame bottlenecks and ultimately elevated the program optimization ratio ($51.76\%\rightarrow76.65\%$) and speedup ($2.65\times\rightarrow5.09\times$) to new levels.

Via

Access Paper or Ask Questions

Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting

May 30, 2024

Tong Ye, Yangkai Du, Tengfei Ma, Lingfei Wu, Xuhong Zhang, Shouling Ji, Wenhai Wang

Abstract:Large Language Models (LLMs) have exhibited remarkable proficiency in generating code. However, the misuse of LLM-generated (Synthetic) code has prompted concerns within both educational and industrial domains, highlighting the imperative need for the development of synthetic code detectors. Existing methods for detecting LLM-generated content are primarily tailored for general text and often struggle with code content due to the distinct grammatical structure of programming languages and massive "low-entropy" tokens. Building upon this, our work proposes a novel zero-shot synthetic code detector based on the similarity between the code and its rewritten variants. Our method relies on the intuition that the differences between the LLM-rewritten and original codes tend to be smaller when the original code is synthetic. We utilize self-supervised contrastive learning to train a code similarity model and assess our approach on two synthetic code detection benchmarks. Our results demonstrate a notable enhancement over existing synthetic content detectors designed for general texts, with an improvement of 20.5% in the APPS benchmark and 29.1% in the MBPP benchmark.

* Previously submitted to EMNLP2023

Via

Access Paper or Ask Questions

Quantification of cardiac capillarization in single-immunostained myocardial slices using weakly supervised instance segmentation

Nov 30, 2023

Zhao Zhang, Xiwen Chen, William Richardson, Bruce Z. Gao, Abolfazl Razi, Tong Ye

Abstract:Decreased myocardial capillary density has been reported as an important histopathological feature associated with various heart disorders. Quantitative assessment of cardiac capillarization typically involves double immunostaining of cardiomyocytes (CMs) and capillaries in myocardial slices. In contrast, single immunostaining of basement membrane components is a straightforward approach to simultaneously label CMs and capillaries, presenting fewer challenges in background staining. However, subsequent image analysis always requires manual work in identifying and segmenting CMs and capillaries. Here, we developed an image analysis tool, AutoQC, to automatically identify and segment CMs and capillaries in immunofluorescence images of collagen type IV, a predominant basement membrane protein within the myocardium. In addition, commonly used capillarization-related measurements can be derived from segmentation masks. AutoQC features a weakly supervised instance segmentation algorithm by leveraging the power of a pre-trained segmentation model via prompt engineering. AutoQC outperformed YOLOv8-Seg, a state-of-the-art instance segmentation model, in both instance segmentation and capillarization assessment. Furthermore, the training of AutoQC required only a small dataset with bounding box annotations instead of pixel-wise annotations, leading to a reduced workload during network training. AutoQC provides an automated solution for quantifying cardiac capillarization in basement-membrane-immunostained myocardial slices, eliminating the need for manual image analysis once it is trained.

Via

Access Paper or Ask Questions

CP-BCS: Binary Code Summarization Guided by Control Flow Graph and Pseudo Code

Oct 24, 2023

Tong Ye, Lingfei Wu, Tengfei Ma, Xuhong Zhang, Yangkai Du, Peiyu Liu, Shouling Ji, Wenhai Wang

Abstract:Automatically generating function summaries for binaries is an extremely valuable but challenging task, since it involves translating the execution behavior and semantics of the low-level language (assembly code) into human-readable natural language. However, most current works on understanding assembly code are oriented towards generating function names, which involve numerous abbreviations that make them still confusing. To bridge this gap, we focus on generating complete summaries for binary functions, especially for stripped binary (no symbol table and debug information in reality). To fully exploit the semantics of assembly code, we present a control flow graph and pseudo code guided binary code summarization framework called CP-BCS. CP-BCS utilizes a bidirectional instruction-level control flow graph and pseudo code that incorporates expert knowledge to learn the comprehensive binary function execution behavior and logic semantics. We evaluate CP-BCS on 3 different binary optimization levels (O1, O2, and O3) for 3 different computer architectures (X86, X64, and ARM). The evaluation results demonstrate CP-BCS is superior and significantly improves the efficiency of reverse engineering.

* EMNLP 2023 Main Conference

Via

Access Paper or Ask Questions

Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

May 18, 2023

Tong Ye, Lingfei Wu, Tengfei Ma, Xuhong Zhang, Yangkai Du, Peiyu Liu, Wenhai Wang, Shouling Ji

Figure 1 for Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

Figure 2 for Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

Figure 3 for Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

Figure 4 for Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization

Abstract:Automatically generating human-readable text describing the functionality of a program is the intent of source code summarization. Although Neural Language Models achieve significant performance in this field, an emerging trend is combining neural models with external knowledge. Most previous approaches rely on the sentence-level retrieval and combination paradigm (retrieval of similar code snippets and use of the corresponding code and summary pairs) on the encoder side. However, this paradigm is coarse-grained and cannot directly take advantage of the high-quality retrieved summary tokens on the decoder side. In this paper, we explore a fine-grained token-level retrieval-augmented mechanism on the decoder side to help the vanilla neural model generate a better code summary. Furthermore, to mitigate the limitation of token-level retrieval on capturing contextual code semantics, we propose to integrate code semantics into summary tokens. Extensive experiments and human evaluation reveal that our token-level retrieval-augmented approach significantly improves performance and is more interpretive.

Via

Access Paper or Ask Questions

On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Mar 15, 2023

Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Zhitao Li, Jing Xiao

Figure 1 for On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Figure 2 for On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Figure 3 for On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Figure 4 for On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models

Abstract:Deep neural retrieval models have amply demonstrated their power but estimating the reliability of their predictions remains challenging. Most dialog response retrieval models output a single score for a response on how relevant it is to a given question. However, the bad calibration of deep neural network results in various uncertainty for the single score such that the unreliable predictions always misinform user decisions. To investigate these issues, we present an efficient calibration and uncertainty estimation framework PG-DRR for dialog response retrieval models which adds a Gaussian Process layer to a deterministic deep neural network and recovers conjugacy for tractable posterior inference by P\'{o}lya-Gamma augmentation. Finally, PG-DRR achieves the lowest empirical calibration error (ECE) in the in-domain datasets and the distributional shift task while keeping $R_{10}@1$ and MAP performance.

* Accepted by AAAI 2023

Via

Access Paper or Ask Questions

Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval

Mar 15, 2023

Tong Ye, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao

Figure 1 for Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval

Figure 2 for Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval

Figure 3 for Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval

Figure 4 for Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval

Abstract:Deep neural networks have achieved remarkable performance in retrieval-based dialogue systems, but they are shown to be ill calibrated. Though basic calibration methods like Monte Carlo Dropout and Ensemble can calibrate well, these methods are time-consuming in the training or inference stages. To tackle these challenges, we propose an efficient uncertainty calibration framework GPF-BERT for BERT-based conversational search, which employs a Gaussian Process layer and the focal loss on top of the BERT architecture to achieve a high-quality neural ranker. Extensive experiments are conducted to verify the effectiveness of our method. In comparison with basic calibration methods, GPF-BERT achieves the lowest empirical calibration error (ECE) in three in-domain datasets and the distributional shift tasks, while yielding the highest $R_{10}@1$ and MAP performance on most cases. In terms of time consumption, our GPF-BERT has an 8$\times$ speedup.

* Accepted by ICASSP 2023

Via

Access Paper or Ask Questions

Uncertainty Calibration for Deep Audio Classifiers

Jun 27, 2022

Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao

Figure 1 for Uncertainty Calibration for Deep Audio Classifiers

Figure 2 for Uncertainty Calibration for Deep Audio Classifiers

Figure 3 for Uncertainty Calibration for Deep Audio Classifiers

Figure 4 for Uncertainty Calibration for Deep Audio Classifiers

Abstract:Although deep Neural Networks (DNNs) have achieved tremendous success in audio classification tasks, their uncertainty calibration are still under-explored. A well-calibrated model should be accurate when it is certain about its prediction and indicate high uncertainty when it is likely to be inaccurate. In this work, we investigate the uncertainty calibration for deep audio classifiers. In particular, we empirically study the performance of popular calibration methods: (i) Monte Carlo Dropout, (ii) ensemble, (iii) focal loss, and (iv) spectral-normalized Gaussian process (SNGP), on audio classification datasets. To this end, we evaluate (i-iv) for the tasks of environment sound and music genre classification. Results indicate that uncalibrated deep audio classifiers may be over-confident, and SNGP performs the best and is very efficient on the two datasets of this paper.

* Accepted by InterSpeech 2022, the first two authors contributed equally

Via

Access Paper or Ask Questions