Baishakhi Ray

Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation

Feb 23, 2025

AI Software Engineer: Programming with Trust

Feb 19, 2025

CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation

Jan 14, 2025

Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection

Dec 16, 2024

On Mitigating Code LLM Hallucinations with API Documentation

Jul 13, 2024

Solving Zebra Puzzles Using Constraint-Guided Multi-Agent Systems

Jul 04, 2024

Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies

Jun 11, 2024

SemCoder: Training Code Language Models with Comprehensive Semantics

Jun 03, 2024

Training LLMs to Better Self-Debug and Explain Code

May 28, 2024

Automatic Programming: Large Language Models and Beyond

May 03, 2024