Renxi Wang

SCALAR: Scientific Citation-based Live Assessment of Long-context Academic Reasoning

Feb 19, 2025
Via arXiv

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Jan 16, 2025
Via arXiv

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Dec 24, 2024
Via arXiv

Explore the Reasoning Capability of LLMs in the Chess Testbed

Nov 11, 2024
Via arXiv

ToolGen: Unified Tool Retrieval and Calling via Generation

Oct 04, 2024
Via arXiv

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Mar 31, 2024
Via arXiv

Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents

Feb 18, 2024
Via arXiv

Understanding the Instruction Mixture for Large Language Model Fine-tuning

Dec 19, 2023
Via arXiv