Shengjie Sun

Delta Decompression for MoE-based LLMs Compression

Feb 24, 2025

Hao Gu, Wei Li, Lujun Li, Qiyuan Zhu, Mark Lee, Shengjie Sun, Wei Xue, Yike Guo

Abstract: Mixture-of-Experts (MoE) architectures in large language models (LLMs) achieve exceptional performance but face prohibitive storage and memory requirements. To address these challenges, we present $D^2$-MoE, a new delta decompression compressor for reducing the parameters of MoE LLMs. Based on observations of expert diversity, we decompose expert weights into a shared base weight and unique delta weights. Specifically, our method first merges each expert's weight into the base weight using the Fisher information matrix to capture shared components. Then, we compress the delta weights through Singular Value Decomposition (SVD) by exploiting their low-rank properties. Finally, we introduce a semi-dynamical structured pruning strategy for the base weights, combining static and dynamic redundancy analysis to achieve further parameter reduction while maintaining input adaptivity. In this way, $D^2$-MoE compacts MoE LLMs to high compression ratios without additional training. Extensive experiments highlight the superiority of our approach, with performance gains of over 13% compared with other compressors on Mixtral, Phi-3.5, DeepSeek, and Qwen2 MoE LLMs at 40$\sim$60% compression rates. Code is available at https://github.com/lliai/D2MoE.

* Work in progress
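
The base-plus-delta decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names are hypothetical, a single scalar importance score per expert stands in for the Fisher information matrix, and the SVD step is represented only by the parameter count of a rank-r factorisation.

```python
def merge_base(experts, importance):
    """Importance-weighted average of expert weight matrices (the shared base).

    `importance` holds one scalar per expert; this is an assumed stand-in
    for the Fisher information used in the paper.
    """
    total = sum(importance)
    rows, cols = len(experts[0]), len(experts[0][0])
    return [
        [sum(s * w[r][c] for s, w in zip(importance, experts)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


def subtract(a, b):
    """Element-wise a - b for two matrices of equal shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Element-wise a + b for two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def low_rank_params(rows, cols, rank):
    """Parameters stored by a rank-`rank` factorisation U (rows x rank), V (rank x cols)."""
    return rank * (rows + cols)


# Two toy 2x2 experts; the second is treated as three times as important.
experts = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 2.0], [1.0, 4.0]],
]
importance = [1.0, 3.0]

base = merge_base(experts, importance)
deltas = [subtract(w, base) for w in experts]

# Each expert is recovered exactly as base + delta before any compression.
reconstructed = [add(base, d) for d in deltas]
```

Storing each delta as rank-r factors is where the savings would come from: for a 1000 x 1000 delta, `low_rank_params(1000, 1000, 50)` gives 100,000 parameters instead of 1,000,000.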

VLP: Vision-Language Preference Learning for Embodied Manipulation

Feb 17, 2025

Runze Liu, Chenjia Bai, Jiafei Lyu, Shengjie Sun, Yali Du, Xiu Li

Abstract: Reward engineering is one of the key challenges in Reinforcement Learning (RL). Preference-based RL effectively addresses this issue by learning from human feedback. However, it is both time-consuming and expensive to collect human preference labels. In this paper, we propose a novel Vision-Language Preference learning framework, named VLP, which learns a vision-language preference model to provide preference feedback for embodied manipulation tasks. To achieve this, we define three types of language-conditioned preferences and construct a vision-language preference dataset, which contains versatile implicit preference orders without human annotations. The preference model learns to extract language-related features, and then serves as a preference annotator in various downstream tasks. The policy can be learned according to the annotated preferences via reward learning or direct policy optimization. Extensive empirical results on simulated embodied manipulation tasks demonstrate that our method provides accurate preferences and generalizes to unseen tasks and unseen language instructions, outperforming the baselines by a large margin.
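
For the reward-learning route mentioned in the abstract, a common formulation in preference-based RL is the Bradley-Terry model over summed segment rewards. The abstract does not state which model VLP uses, so the sketch below is an assumption, with hypothetical function names.

```python
import math


def preference_prob(rewards_a, rewards_b):
    """P(segment a is preferred over segment b) under a Bradley-Terry model.

    Each argument is the list of per-step rewards a reward model assigns
    to one trajectory segment.
    """
    return_a, return_b = sum(rewards_a), sum(rewards_b)
    return 1.0 / (1.0 + math.exp(return_b - return_a))


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss; label is 1.0 if a is preferred, 0.0 if b is."""
    p = preference_prob(rewards_a, rewards_b)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


# Equal returns give no preference either way.
tie = preference_prob([1.0, 1.0], [1.0, 1.0])

# The loss is smaller when the annotated label agrees with the reward model.
loss_agree = preference_loss([2.0, 1.0], [0.5, 0.5], label=1.0)
loss_disagree = preference_loss([2.0, 1.0], [0.5, 0.5], label=0.0)
```

Minimising this loss over annotated pairs pushes the reward model to rank preferred segments higher, which is what lets a preference annotator replace hand-written rewards.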

A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning

Oct 18, 2024

Shengjie Sun, Runze Liu, Jiafei Lyu, Jing-Wen Yang, Liangpeng Zhang, Xiu Li

Abstract: Large Language Models (LLMs) have shown significant potential in designing reward functions for Reinforcement Learning (RL) tasks. However, obtaining high-quality reward code often involves human intervention, numerous LLM queries, or repetitive RL training. To address these issues, we propose CARD, an LLM-driven Reward Design framework that iteratively generates and improves reward function code. Specifically, CARD includes a Coder that generates and verifies the code, while an Evaluator provides dynamic feedback to guide the Coder in improving the code, eliminating the need for human feedback. In addition to process feedback and trajectory feedback, we introduce Trajectory Preference Evaluation (TPE), which evaluates the current reward function based on trajectory preferences. If the code fails the TPE, the Evaluator provides preference feedback, avoiding RL training at every iteration and making the reward function better aligned with the task objective. Empirical results on Meta-World and ManiSkill2 demonstrate that our method achieves an effective balance between task performance and token efficiency, outperforming or matching the baselines across all tasks. On 10 out of 12 tasks, CARD shows better or comparable performance to policies trained with expert-designed rewards, and our method even surpasses the oracle on 3 tasks.
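
One way to read the TPE idea is as a consistency check: a candidate reward function passes if the returns it assigns agree with known trajectory preferences. The sketch below follows that reading only. The function names, the pass threshold, and the toy "distance to goal" states are all assumptions, not details from the paper.

```python
def trajectory_return(reward_fn, trajectory):
    """Sum of rewards a candidate reward function assigns along a trajectory."""
    return sum(reward_fn(state) for state in trajectory)


def preference_agreement(reward_fn, preference_pairs):
    """Fraction of (better, worse) pairs that the reward function ranks correctly."""
    agree = sum(
        trajectory_return(reward_fn, better) > trajectory_return(reward_fn, worse)
        for better, worse in preference_pairs
    )
    return agree / len(preference_pairs)


def passes_check(reward_fn, preference_pairs, threshold=1.0):
    """Hypothetical pass rule: agreement must reach the threshold."""
    return preference_agreement(reward_fn, preference_pairs) >= threshold


# Toy setup: each state is the distance to the goal, so approaching is better.
pairs = [
    ([3.0, 2.0, 1.0], [3.0, 3.0, 3.0]),
    ([2.0, 1.0, 0.0], [2.0, 2.5, 3.0]),
]


def good_reward(distance):
    return -distance  # rewards getting closer


def bad_reward(distance):
    return distance  # mistakenly rewards moving away


good_score = preference_agreement(good_reward, pairs)
bad_score = preference_agreement(bad_reward, pairs)
```

A check of this kind needs only stored trajectories, which is consistent with the abstract's point that preference feedback avoids RL training at every iteration.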

EdgeNet : Encoder-decoder generative Network for Auction Design in E-commerce Online Advertising

May 09, 2023

Guangyuan Shen, Shengjie Sun, Dehong Gao, Libin Yang, Yongping Shi, Wei Ning

Abstract: We present a new encoder-decoder generative network dubbed EdgeNet, which introduces a novel encoder-decoder framework for data-driven auction design in online e-commerce advertising. We break the neural auction paradigm of Generalized-Second-Price (GSP), and improve the utilization efficiency of data while ensuring the economic characteristics of the auction mechanism. Specifically, EdgeNet introduces a transformer-based encoder to better capture the mutual influence among different candidate advertisements. In contrast to GSP-based neural auction models, we design an autoregressive decoder to better utilize the rich context information in online advertising auctions. EdgeNet is conceptually simple and easy to extend to the existing end-to-end neural auction framework. We validate the efficiency of EdgeNet on a wide range of e-commerce advertising auctions, demonstrating its potential in improving user experience and platform revenue.

* under review. arXiv admin note: substantial text overlap with arXiv:2106.03593 by other authors
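
For context, the classical GSP rule that the abstract contrasts EdgeNet with can be sketched in a few lines. This shows only the textbook baseline (rank by bid, each winner pays the next-highest bid), not EdgeNet itself. The function name and the simplified reserve handling are assumptions.

```python
def gsp_auction(bids, num_slots, reserve=0.0):
    """Plain Generalized Second Price auction.

    bids: mapping from ad id to bid. Ads are ranked by bid, and each
    winner pays the bid of the ad ranked directly below it. If no ad is
    ranked below, the winner pays `reserve` (a simplification).
    """
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    results = []
    for position, (ad, _bid) in enumerate(ranked[:num_slots]):
        if position + 1 < len(ranked):
            price = ranked[position + 1][1]
        else:
            price = reserve
        results.append((ad, price))
    return results


bids = {"ad_a": 5.0, "ad_b": 3.0, "ad_c": 1.0}
two_slots = gsp_auction(bids, num_slots=2)
three_slots = gsp_auction(bids, num_slots=3)
```

Because each slot's price depends only on the next bid, this rule scores every ad independently of the others shown alongside it, which is the limitation a context-aware encoder-decoder model aims to address.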

Time-aware Graph Embedding: A temporal smoothness and task-oriented approach

Jul 22, 2020

Yonghui Xu, Shengjie Sun, Yuan Miao, Dong Yang, Xiaonan Meng, Yi Hu, Ke Wang, Hengjie Song, Chuanyan Miao

Abstract: Knowledge graph embedding, which aims to learn low-dimensional representations of entities and relationships, has attracted considerable research effort recently. However, most knowledge graph embedding methods focus on the structural relationships in fixed triples while ignoring temporal information. Existing time-aware graph embedding methods focus only on factual plausibility, while ignoring temporal smoothness, which models the interactions between a fact and its contexts and can thus capture fine-grained temporal relationships. This limits the performance of embedding-related applications. To solve this problem, this paper presents a Robustly Time-aware Graph Embedding (RTGE) method that incorporates temporal smoothness. Two major innovations of our paper are presented here. First, RTGE integrates a measure of temporal smoothness into the learning process of the time-aware graph embedding. Via the proposed additional smoothing factor, RTGE can preserve both the structural information and the evolutionary patterns of a given graph. Second, RTGE provides a general task-oriented negative sampling strategy associated with temporally-aware information, which further improves the adaptive ability of the proposed algorithm and plays an essential role in obtaining superior performance in various tasks. Extensive experiments conducted on multiple benchmark tasks show that RTGE can increase performance in entity/relationship/temporal scoping prediction tasks.

Optimal Delivery with Budget Constraint in E-Commerce Advertising

Oct 08, 2019

Chao Wei, Weiru Zhang, Shengjie Sun, Fei Li, Xiaonan Meng, Yi Hu, Hao Wang

Abstract: Online advertising on e-commerce platforms gives sellers an opportunity to reach potential audiences with different target goals. Ad serving systems (such as display and search advertising systems) that assign ads to pages should satisfy objectives such as a large audience for branding advertisers and clicks or conversions for performance-based advertisers, while at the same time trying to maximize the overall revenue of the platform. In this paper, we propose an approach based on linear programming subject to constraints in order to optimize the revenue and improve different performance goals simultaneously. We have validated our algorithm by implementing an offline simulation system on the Alibaba e-commerce platform that runs the auctions from online requests and takes system performance, ranking, and pricing schemas into account. We have also compared our algorithm with related work, and the results show that our algorithm can effectively improve campaign performance and the revenue of the platform.

* 13 pages, 5 figures

Important Attribute Identification in Knowledge Graph

Oct 12, 2018

Shengjie Sun, Dong Yang, Hongchun Zhang, Yanxu Chen, Chao Wei, Xiaonan Meng, Yi Hu

Abstract: The knowledge graph (KG), composed of entities with their descriptions and attributes and the relationships between entities, is finding more and more application scenarios in various natural language processing tasks. In a typical knowledge graph like Wikidata, entities usually have a large number of attributes, but it is difficult to know which ones are important. The importance of attributes can be a valuable piece of information in various applications spanning from information retrieval to natural language generation. In this paper, we propose a general method of using external user-generated text data to evaluate the relative importance of an entity's attributes. To be more specific, we use word/sub-word embedding techniques to match the external textual data back to entities' attribute names and values, and rank the attributes by their matching cohesiveness. To the best of our knowledge, this is the first work to apply vector-based semantic matching to important attribute identification, and our method outperforms the previous traditional methods. We also apply the detected important attributes to a language generation task; compared with previously generated text, the new method generates much more customized and informative messages.
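
The embedding-based matching described in the abstract can be sketched as ranking attributes by how closely their vectors match vectors of user-generated text. The abstract does not define "matching cohesiveness", so mean cosine similarity is used here as an assumed stand-in, and the attribute names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_attributes(attribute_vecs, text_vecs):
    """Order attribute names by mean cosine similarity to the text vectors."""
    scores = {
        name: sum(cosine(vec, t) for t in text_vecs) / len(text_vecs)
        for name, vec in attribute_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-d embeddings; real ones would come from a word/sub-word embedding model.
attribute_vecs = {
    "battery life": [1.0, 0.0],
    "color": [0.0, 1.0],
    "weight": [1.0, 1.0],
}
text_vecs = [[1.0, 0.1], [0.9, 0.0]]  # vectors for snippets of user text

ranking = rank_attributes(attribute_vecs, text_vecs)
```

In this toy data the user text points almost entirely along the first axis, so "battery life" ranks first and "color" last.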
