Zhiming Ma

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning

Apr 06, 2025
Via arXiv

TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection

Apr 01, 2025
Via arXiv

SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation

Feb 12, 2025
Via arXiv

Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms

Feb 05, 2025
Via arXiv

Language Models as Continuous Self-Evolving Data Engineers

Dec 19, 2024
Via arXiv

Molecule Joint Auto-Encoding: Trajectory Pretraining with 2D and 3D Diffusion

Dec 06, 2023
Via arXiv

Elastic Information Bottleneck

Nov 07, 2023
Via arXiv

Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment

Oct 11, 2023
Via arXiv

Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials

Jun 15, 2023
Via arXiv

Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization

Jun 05, 2023
Via arXiv