Yilei Zhang

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models

May 23, 2025

Xuchen Pan, Yanxi Chen, Yushuo Chen, Yuchang Sun, Daoyuan Chen, Wenhao Zhang, Yuexiang Xie, Yilun Huang, Yilei Zhang, Dawei Gao(+3 more)

Abstract: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models. It is built with a decoupled design, consisting of (1) an RFT-core that unifies and generalizes synchronous/asynchronous, on-policy/off-policy, and online/offline modes of RFT, (2) seamless integration for agent-environment interaction with high efficiency and robustness, and (3) systematic data pipelines optimized for RFT. Trinity-RFT can be easily adapted for diverse application scenarios, and serves as a unified platform for exploring advanced reinforcement learning paradigms. This technical report outlines the vision, features, design and implementations of Trinity-RFT, accompanied by extensive examples demonstrating the utility and user-friendliness of the proposed framework.

* This technical report will be continuously updated as the codebase evolves. GitHub: https://github.com/modelscope/Trinity-RFT

Optimizing Collaboration of LLM based Agents for Finite Element Analysis

Aug 23, 2024

Chuan Tian, Yilei Zhang

Abstract: This paper investigates the interactions between multiple agents within Large Language Models (LLMs) in the context of programming and coding tasks. We utilize the AutoGen framework to facilitate communication among agents, evaluating different configurations based on the success rates from 40 random runs for each setup. The study focuses on developing a flexible automation framework for applying the Finite Element Method (FEM) to solve linear elastic problems. Our findings emphasize the importance of optimizing agent roles and clearly defining their responsibilities, rather than merely increasing the number of agents. Effective collaboration among agents is shown to be crucial for addressing general FEM challenges. This research demonstrates the potential of LLM multi-agent systems to enhance computational automation in simulation methodologies, paving the way for future advancements in engineering and artificial intelligence.

KLIF: An optimized spiking neuron unit for tuning surrogate gradient slope and membrane potential

Feb 18, 2023

Chunming Jiang, Yilei Zhang

Abstract: Spiking neural networks (SNNs) have attracted much attention due to their ability to process temporal information, low power consumption, and higher biological plausibility. However, it is still challenging to develop efficient and high-performing learning algorithms for SNNs. Methods like artificial neural network (ANN)-to-SNN conversion can transform ANNs into SNNs with slight performance loss, but they need a long simulation to approximate the rate coding. Directly training SNNs by spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible. Yet the performance of SNNs is still not competitive with that of ANNs. In this paper, we propose a novel k-based leaky integrate-and-fire (KLIF) neuron model to improve the learning ability of SNNs. Compared with the popular leaky integrate-and-fire (LIF) model, KLIF adds a learnable scaling factor to dynamically update the slope and width of the surrogate gradient curve during training and incorporates a ReLU activation function that selectively delivers membrane potential to spike firing and resetting. The proposed spiking unit is evaluated on the static MNIST, Fashion-MNIST, and CIFAR-10 datasets, as well as the neuromorphic N-MNIST, CIFAR10-DVS, and DVS128-Gesture datasets. Experiments indicate that KLIF performs much better than LIF without introducing additional computational cost and achieves state-of-the-art performance on these datasets with few time steps. Also, KLIF is believed to be more biologically plausible than LIF. The good performance of KLIF could allow it to fully replace LIF in SNNs for various tasks.
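
The mechanism described in the abstract above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the update rule, the sigmoid surrogate, the function names `klif_step` and `surrogate_grad_wrt_h`, and the defaults `tau`, `v_th`, `v_reset`, and `alpha` are all assumptions made for illustration. The point it shows is that when a factor `k` scales the charged potential before a ReLU, the same `k` also scales the surrogate gradient with respect to that potential.

```python
import math

def klif_step(v, x, k, tau=2.0, v_th=1.0, v_reset=0.0):
    """One time step of a KLIF-style neuron (illustrative assumption).

    v: membrane potential from the previous step
    x: input current at this step
    k: scaling factor (learnable in training; a plain float here)
    Returns (spike, next_potential).
    """
    # Standard leaky integration toward the input (LIF charging).
    h = v + (x - (v - v_reset)) / tau
    # Scale by k, then pass through ReLU so negative potential is not delivered.
    f = max(0.0, k * h)
    # Fire when the scaled potential reaches the threshold.
    spike = 1.0 if f >= v_th else 0.0
    # Hard reset after a spike; otherwise carry the rectified potential forward.
    v_next = v_reset if spike else f
    return spike, v_next

def surrogate_grad_wrt_h(h, k, v_th=1.0, alpha=4.0):
    """Sigmoid surrogate gradient of the spike with respect to h (assumed form).

    Because the spike depends on k * h, the chain rule multiplies the
    surrogate derivative by k, so k changes the slope seen by h.
    """
    f = k * h
    if f <= 0.0:
        return 0.0  # ReLU blocks the gradient for non-positive potential
    s = 1.0 / (1.0 + math.exp(-alpha * (f - v_th)))
    return alpha * s * (1.0 - s) * k
```

With `k = 1` the step reduces to an LIF neuron with a rectified potential; raising `k` makes the same input cross the threshold sooner and steepens the surrogate gradient at the threshold.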

A noise based novel strategy for faster SNN training

Nov 10, 2022

Chunming Jiang, Yilei Zhang

Abstract: Spiking neural networks (SNNs) are receiving increasing attention due to their low power consumption and strong bio-plausibility. Optimization of SNNs is a challenging task. The two main methods, artificial neural network (ANN)-to-SNN conversion and spike-based backpropagation (BP), both have their advantages and limitations. ANN-to-SNN conversion requires a long inference time to approach the accuracy of the ANN, thus diminishing the benefits of SNNs. With spike-based BP, training high-precision SNNs typically consumes dozens of times more computational resources and time than their ANN counterparts. In this paper, we propose a novel SNN training approach that combines the benefits of the two methods. We first train a single-step SNN by approximating the neural potential distribution with random noise, then convert the single-step SNN to a multi-step SNN losslessly. The introduction of Gaussian distributed noise leads to a significant gain in accuracy after conversion. The results show that our method considerably reduces the training and inference times of SNNs while maintaining their high accuracy. Compared to the previous two methods, ours can reduce training time by 65%-75% and achieves more than 100 times faster inference speed. We also argue that augmenting the neuron model with noise makes it more bio-plausible.

Spiking sampling network for image sparse representation and dynamic vision sensor data compression

Nov 08, 2022

Chunming Jiang, Yilei Zhang

Abstract: Sparse representation has attracted great attention because it can greatly save storage resources and find representative features of data in a low-dimensional space. As a result, it may be widely applied in engineering domains including feature extraction, compressed sensing, signal denoising, picture clustering, and dictionary learning, just to name a few. In this paper, we propose a spiking sampling network. This network is composed of spiking neurons, and it can dynamically decide which pixel points should be retained and which ones need to be masked according to the input. Our experiments demonstrate that this approach enables better sparse representation of the original image and facilitates image reconstruction compared to random sampling. We thus use this approach for compressing massive data from the dynamic vision sensor, which greatly reduces the storage requirements for event data.

Adversarial Defense via Neural Oscillation inspired Gradient Masking

Nov 04, 2022

Chunming Jiang, Yilei Zhang

Abstract: Spiking neural networks (SNNs) attract great attention due to their low power consumption, low latency, and biological plausibility. As they are widely deployed in neuromorphic devices for low-power brain-inspired computing, security issues become increasingly important. However, compared to deep neural networks (DNNs), SNNs currently lack specifically designed defense methods against adversarial attacks. Inspired by neural membrane potential oscillation, we propose a novel neural model that incorporates the bio-inspired oscillation mechanism to enhance the security of SNNs. Our experiments show that SNNs with neural oscillation neurons have better resistance to adversarial attacks than ordinary SNNs with LIF neurons across various architectures and datasets. Furthermore, we propose a defense method that changes the model's gradients by replacing the form of oscillation, which hides the original training gradients and confuses the attacker into using gradients of 'fake' neurons to generate invalid adversarial samples. Our experiments suggest that the proposed defense method can effectively resist both single-step and iterative attacks with comparable defense effectiveness and much lower computational cost than adversarial training methods on DNNs. To the best of our knowledge, this is the first work that establishes adversarial defense through masking surrogate gradients on SNNs.

FocalClick: Towards Practical Interactive Image Segmentation

Apr 17, 2022

Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, Hengshuang Zhao

Abstract: Interactive segmentation allows users to extract target masks by making positive/negative clicks. Although explored by many previous works, there is still a gap between academic approaches and industrial needs: first, existing models are not efficient enough to work on low-power devices; second, they perform poorly when used to refine preexisting masks because they cannot avoid destroying the correct parts. FocalClick solves both issues at once by predicting and updating the mask in localized areas. For higher efficiency, we decompose the slow prediction on the entire image into two fast inferences on small crops: a coarse segmentation on the Target Crop, and a local refinement on the Focus Crop. To make the model work with preexisting masks, we formulate a sub-task termed Interactive Mask Correction, and propose Progressive Merge as the solution. Progressive Merge exploits morphological information to decide where to preserve and where to update, enabling users to refine any preexisting mask effectively. FocalClick achieves competitive results against SOTA methods with significantly smaller FLOPs. It also shows significant superiority when making corrections on preexisting masks. Code and data will be released at github.com/XavierCHEN34/ClickSEG

* CVPR2022
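
The idea of running inference on small crops rather than the whole image can be sketched as follows. This is a hypothetical helper, not code from the FocalClick repository: the function name `expand_box`, the `(x0, y0, x1, y1)` box convention, and the expansion ratio are assumptions, and the abstract does not state how the Target Crop or Focus Crop are actually chosen. The sketch only shows the common step of enlarging a region of interest around its center and clipping it to the image bounds before cropping.

```python
def expand_box(box, ratio, width, height):
    """Enlarge a box around its center and clip it to the image.

    box: (x0, y0, x1, y1) in pixels, e.g. the bounds of the current mask
    ratio: expansion factor (1.0 keeps the box size unchanged)
    width, height: image size used for clipping
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = (x1 - x0) * ratio / 2.0
    half_h = (y1 - y0) * ratio / 2.0
    # Clip so the crop never leaves the image.
    return (
        max(0, int(cx - half_w)),
        max(0, int(cy - half_h)),
        min(width, int(cx + half_w)),
        min(height, int(cy + half_h)),
    )
```

For instance, a larger ratio might be used around the whole target for a coarse pass and a smaller box around the most recent click for a refinement pass; both choices are illustrative only.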