Heng Wang

ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Mar 15, 2025
Via arXiv

BannerAgency: Advertising Banner Design with Multimodal LLM Agents

Mar 14, 2025
Via arXiv

Reward Shaping to Mitigate Reward Hacking in RLHF

Feb 26, 2025
Via arXiv

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025
Via arXiv

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Feb 13, 2025
Via arXiv

Cosmos World Foundation Model Platform for Physical AI

Jan 07, 2025
Via arXiv

Fast Prompt Alignment for Text-to-Image Generation

Dec 11, 2024
Via arXiv

Gotta Hear Them All: Sound Source Aware Vision to Audio Generation

Nov 26, 2024
Via arXiv

Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism

Nov 04, 2024
Via arXiv

DINeuro: Distilling Knowledge from 2D Natural Images via Deformable Tubular Transferring Strategy for 3D Neuron Reconstruction

Oct 29, 2024
Via arXiv