Xue Yang

A Simple Aerial Detection Baseline of Multimodal Language Models

Jan 16, 2025
Via arXiv

Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding

Jan 14, 2025
Via arXiv

RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark

Jan 08, 2025
Via arXiv

Efficiently Achieving Secure Model Training and Secure Aggregation to Ensure Bidirectional Privacy-Preservation in Federated Learning

Dec 16, 2024
Via arXiv

DiffCLIP: Few-shot Language-driven Multimodal Classifier

Dec 10, 2024
Via arXiv

Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement

Dec 05, 2024
Via arXiv

GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data

Nov 27, 2024
Via arXiv

Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation

Nov 04, 2024
Via arXiv

PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Oct 10, 2024
Via arXiv

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Oct 10, 2024
Via arXiv