Tianqing Fang

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Oct 25, 2024
Via arXiv

Concept-Reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models via Abstraction

Oct 15, 2024
Via arXiv

DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects

Oct 03, 2024
Via arXiv

LEOPARD: A Vision Language Model For Text-Rich Multi-Image Tasks

Oct 02, 2024
Via arXiv

CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

Jul 30, 2024
Via arXiv

On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

Jun 16, 2024
Via arXiv

MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

Jun 15, 2024
Via arXiv

IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce

Jun 14, 2024
Via arXiv

Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

Apr 15, 2024
Via arXiv

Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs

Mar 12, 2024
Via arXiv