Pengcheng Yin

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Nov 12, 2024

Natural Language Outlines for Code: Literate Programming in the LLM Era

Aug 09, 2024

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jul 15, 2024

NExT: Teaching Large Language Models to Reason about Code Execution

Apr 23, 2024

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

Mar 21, 2024

Unsupervised Evaluation of Code LLMs with Round-Trip Correctness

Feb 13, 2024

Grounding Data Science Code Generation with Input-Output Specifications

Feb 12, 2024

Universal Self-Consistency for Large Language Model Generation

Nov 29, 2023

SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data

Nov 06, 2023

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

Oct 02, 2023