Yanfeng Wang

Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, China and Shanghai AI Laboratory, China

Bohrium + SciMaster: Building the Infrastructure and Ecosystem for Agentic Science at Scale

Dec 23, 2025
Via arXiv

How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline

Dec 08, 2025
Via arXiv

VocalBench-zh: Decomposing and Benchmarking the Speech Conversational Abilities in Mandarin Context

Nov 17, 2025
Via arXiv

VocalNet-M2: Advancing Low-Latency Spoken Language Modeling via Integrated Multi-Codebook Tokenization and Multi-Token Prediction

Nov 13, 2025
Via arXiv

Selecting Auxiliary Data via Neural Tangent Kernels for Low-Resource Domains

Nov 10, 2025
Via arXiv

CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching

Oct 09, 2025
Via arXiv

Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs

Jul 24, 2025
Via arXiv

Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning

Jul 17, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Jun 23, 2025
Via arXiv