
Ruiping Wang

EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment

Oct 08, 2024

Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models

Sep 03, 2024

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Jul 18, 2024

Q-Sparse: All Large Language Models can be Fully Sparsely-Activated

Jul 15, 2024

AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation

Jun 17, 2024

M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models

May 24, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Feb 27, 2024

Glance and Focus: Memory Prompting for Multi-Event Video Question Answering

Jan 03, 2024

BitNet: Scaling 1-bit Transformers for Large Language Models

Oct 17, 2023

SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning

Nov 08, 2021