Letian Zhang

Kestrel: Grounding Self-Refinement for LVLM Hallucination Mitigation

Mar 17, 2026
Via arXiv

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

Feb 12, 2026
Via arXiv

Controllable Layered Image Generation for Real-World Editing

Jan 21, 2026
Via arXiv

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Jan 21, 2026
Via arXiv

Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning

Sep 18, 2025
Via arXiv

Enhancing Retrieval Augmentation via Adversarial Collaboration

Sep 18, 2025
Via arXiv

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Jul 28, 2025
Via arXiv

EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models

May 24, 2025
Via arXiv

FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-the-World LoRA

Mar 14, 2025
Via arXiv

Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis

Mar 13, 2025
Via arXiv