Ruoxi Sun

Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling

Dec 20, 2024
Via arXiv

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Nov 12, 2024
Via arXiv

AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems

Nov 09, 2024
Via arXiv

Edge Unlearning is Not "on Edge"! An Adaptive Exact Unlearning System on Resource-Constrained Devices

Oct 15, 2024
Via arXiv

Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

Oct 09, 2024
Via arXiv

CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

Oct 02, 2024
Via arXiv

SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging

Aug 22, 2024
Via arXiv

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Jul 16, 2024
Via arXiv

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jul 15, 2024
Via arXiv

Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization

Jun 22, 2024
Via arXiv