Yang Feng

Alibaba Group

Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model

Jun 16, 2025
Via arXiv

CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making

Jun 15, 2025
Via arXiv

FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation

Jun 14, 2025
Via arXiv

Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space

May 19, 2025
Via arXiv

GeoERM: Geometry-Aware Multi-Task Representation Learning on Riemannian Manifolds

May 05, 2025
Via arXiv

LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis

May 05, 2025
Via arXiv

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment

Apr 17, 2025
Via arXiv

StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models

Apr 14, 2025
Via arXiv

LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers

Feb 25, 2025
Via arXiv

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Jan 07, 2025
Via arXiv