Tao Wu

TransiT: Transient Transformer for Non-line-of-sight Videography

Mar 14, 2025
Via arXiv

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025
Via arXiv

iFADIT: Invertible Face Anonymization via Disentangled Identity Transform

Jan 08, 2025
Via arXiv

Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method

Dec 31, 2024
Via arXiv

Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study

Dec 29, 2024
Via arXiv

VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models

Dec 27, 2024
Via arXiv

NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries

Dec 14, 2024
Via arXiv

p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

Dec 05, 2024
Via arXiv

CamI2V: Camera-Controlled Image-to-Video Diffusion Model

Oct 21, 2024
Via arXiv

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Aug 23, 2024
Via arXiv