Wenhan Yu

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

Jan 26, 2026
Via arXiv

EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation

Jan 08, 2026
Via arXiv

MCPAgentBench: A Real-world Task Benchmark for Evaluating LLM Agent MCP Tool Use

Dec 31, 2025
Via arXiv

Benchmarking Multi-Step Legal Reasoning and Analyzing Chain-of-Thought Effects in Large Language Models

Nov 19, 2025
Via arXiv

BBox DocVQA: A Large Scale Bounding Box Grounded Dataset for Enhancing Reasoning in Document Visual Question Answer

Nov 19, 2025
Via arXiv

Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning

Feb 06, 2025
Via arXiv

Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation

Apr 19, 2024
Via arXiv

Mobile Edge Computing and AI Enabled Web3 Metaverse over 6G Wireless Communications: A Deep Reinforcement Learning Approach

Dec 11, 2023
Via arXiv

FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing

Oct 26, 2023
Via arXiv

Orchestration of Emulator Assisted Mobile Edge Tuning for AI Foundation Models: A Multi-Agent Deep Reinforcement Learning Approach

Oct 26, 2023
Via arXiv