Bohan Yu

OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding

Dec 29, 2025
Via arXiv

StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding

Dec 14, 2025
Via arXiv

OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models

Nov 18, 2025
Via arXiv

SR-KI: Scalable and Real-Time Knowledge Integration into LLMs via Supervised Attention

Nov 09, 2025
Via arXiv

EvolKV: Evolutionary KV Cache Compression for LLM Inference

Sep 10, 2025
Via arXiv

TableEval: A Real-World Benchmark for Complex, Multilingual, and Multi-Structured Table Question Answering

Jun 04, 2025
Via arXiv

Deep Speech Synthesis from Multimodal Articulatory Representations

Dec 17, 2024
Via arXiv

Fast, High-Quality and Parameter-Efficient Articulatory Synthesis using Differentiable DSP

Sep 04, 2024
Via arXiv

Towards EMG-to-Speech with a Necklace Form Factor

Jul 31, 2024
Via arXiv

E2VIDiff: Perceptual Events-to-Video Reconstruction using Diffusion Priors

Jul 11, 2024
Via arXiv