Yan Xia

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought

Jan 13, 2025

Semantic Residual for Multimodal Unified Discrete Representation

Dec 26, 2024

UniLoc: Towards Universal Place Recognition Using Any Single Modality

Dec 16, 2024

TrafficLoc: Localizing Traffic Surveillance Cameras in 3D Scenes

Dec 13, 2024

SADG: Segment Any Dynamic Gaussian Without Object Trackers

Nov 28, 2024

SMILE-UHURA Challenge -- Small Vessel Segmentation at Mesoscopic Scale from Ultra-High Resolution 7T Magnetic Resonance Angiograms

Nov 14, 2024

ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset

Nov 07, 2024

1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Oct 21, 2024

CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays

Sep 29, 2024

World-Grounded Human Motion Recovery via Gravity-View Coordinates

Sep 10, 2024