Hao Feng

Summer

WonderVerse: Extendable 3D Scene Generation with Video Generative Models

Mar 13, 2025
Via arXiv

Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models

Mar 11, 2025
Via arXiv

Pre-train and Fine-tune: Recommenders as Large Models

Jan 24, 2025
Via arXiv

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Dec 31, 2024
Via arXiv

LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty Guidance

Dec 20, 2024
Via arXiv

EPIC: Efficient Position-Independent Context Caching for Serving Large Language Models

Oct 20, 2024
Via arXiv

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

Oct 20, 2024
Via arXiv

GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling

Sep 02, 2024
Via arXiv

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

Aug 30, 2024
Via arXiv

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation

Aug 25, 2024
Via arXiv