Jiansheng Chen

CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization

Mar 31, 2025
Via arXiv

LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text

Mar 25, 2025
Via arXiv

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025
Via arXiv

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025
Via arXiv

Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents

Jan 03, 2025
Via arXiv

Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"

Dec 21, 2024
Via arXiv

A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Dec 18, 2024
Via arXiv

$\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Dec 13, 2024
Via arXiv

DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models

Nov 27, 2024
Via arXiv

Exploring Information-Theoretic Metrics Associated with Neural Collapse in Supervised Training

Sep 25, 2024
Via arXiv