Jie Zhu

MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions

Mar 30, 2026
Via arXiv

FusionAgent: A Multimodal Agent with Dynamic Model Selection for Human Recognition

Mar 27, 2026
Via arXiv

FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol

Mar 26, 2026
Via arXiv

A Simple Baseline for Unifying Understanding, Generation, and Editing via Vanilla Next-token Prediction

Mar 05, 2026
Via arXiv

MDL: A Unified Multi-Distribution Learner in Large-scale Industrial Recommendation through Tokenization

Feb 07, 2026
Via arXiv

TokenMixer-Large: Scaling Up Large Ranking Models in Industrial Recommenders

Feb 06, 2026
Via arXiv

LocalScore: Local Density-Aware Similarity Scoring for Biometrics

Feb 01, 2026
Via arXiv

Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification?

Jan 11, 2026
Via arXiv

Forge-and-Quench: Enhancing Image Generation for Higher Fidelity in Unified Multimodal Models

Jan 08, 2026
Via arXiv

On the Holistic Approach for Detecting Human Image Forgery

Jan 08, 2026
Via arXiv