Tao Yu

CIRCUIT: A Benchmark for Circuit Interpretation and Reasoning Capabilities of LLMs

Feb 11, 2025

Systematic Outliers in Large Language Models

Feb 10, 2025

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Jan 18, 2025

View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection

Dec 16, 2024

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Dec 12, 2024

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Dec 05, 2024

IREE Oriented Active RIS-Assisted Green communication System with Outdated CSI

Nov 17, 2024

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Nov 12, 2024

Attacking Vision-Language Computer Agents via Pop-ups

Nov 04, 2024

Can Uncertainty Quantification Enable Better Learning-based Index Tuning?

Oct 23, 2024