Yue Cao

Distributed Cooperative Positioning in Dense Wireless Networks: A Neural Network Enhanced Fast Convergent Parametric Message Passing Method

Dec 22, 2024

Learning Novel Skills from Language-Generated Demonstrations

Dec 12, 2024

MAGIC: Mastering Physical Adversarial Generation in Context through Collaborative LLM Agents

Dec 11, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Dec 06, 2024

SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Nov 28, 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Nov 15, 2024

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding

Oct 15, 2024

BA-Net: Bridge Attention in Deep Neural Networks

Oct 10, 2024

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

Jul 22, 2024

Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection

Mar 20, 2024