Ning Zhang

Sid

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Jan 08, 2025
Via arXiv

Neuromorphic Optical Tracking and Imaging of Randomly Moving Targets through Strongly Scattering Media

Jan 07, 2025
Via arXiv

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Dec 13, 2024
Via arXiv

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Dec 03, 2024
Via arXiv

Accelerating Multimodel Large Language Models by Searching Optimal Vision Token Reduction

Nov 30, 2024
Via arXiv

Sequential LLM Framework for Fashion Recommendation

Oct 15, 2024
Via arXiv

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

Oct 08, 2024
Via arXiv

ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue

Sep 26, 2024
Via arXiv

Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation with LLMs

Sep 16, 2024
Via arXiv

MGSA: Multi-granularity Graph Structure Attention for Knowledge Graph-to-Text Generation

Sep 16, 2024
Via arXiv