Tong Zhang

Nanjing University of Science and Technology, Nanjing, China

EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery

Apr 17, 2025

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Apr 15, 2025

Multi-Modal Hypergraph Enhanced LLM Learning for Recommendation

Apr 13, 2025

Refining CLIP's Spatial Awareness: A Visual-Centric Perspective

Apr 03, 2025

VGRP-Bench: Visual Grid Reasoning Puzzle Benchmark for Large Vision-Language Models

Apr 02, 2025

ASGO: Adaptive Structured Gradient Optimization

Mar 26, 2025

FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing

Mar 24, 2025

Generating Multimodal Driving Scenes via Next-Scene Prediction

Mar 19, 2025

RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning

Mar 17, 2025

Monte Carlo Diffusion for Generalizable Learning-Based RANSAC

Mar 12, 2025