
Heng Wang

Dual-Mode Magnetic Continuum Robot for Targeted Drug Delivery

Oct 02, 2025

Continuously Steering LLMs Sensitivity to Contextual Knowledge with Proxy Models

Aug 28, 2025

OpenCUA: Open Foundations for Computer-Use Agents

Aug 12, 2025

DesignLab: Designing Slides Through Iterative Detection and Correction

Jul 23, 2025

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection

Apr 28, 2025

Kimi-VL Technical Report

Apr 10, 2025

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Apr 02, 2025

ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Mar 15, 2025

BannerAgency: Advertising Banner Design with Multimodal LLM Agents

Mar 14, 2025