Boshi Wang

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Oct 07, 2024
Via arXiv

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

May 27, 2024
Via arXiv

Beamforming Design for Double-Active-RIS-aided Communication Systems with Inter-Excitation

Mar 17, 2024
Via arXiv

LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error

Mar 07, 2024
Via arXiv

Reconfigurable Intelligent Surface-Aided Dual-Function Radar and Communication Systems With MU-MIMO Communication

Feb 08, 2024
Via arXiv

Joint Beamforming Design for Double Active RIS-assisted Radar-Communication Coexistence Systems

Feb 07, 2024
Via arXiv

Secure Wireless Communication in Active RIS-Assisted DFRC System

Feb 03, 2024
Via arXiv

How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

Nov 15, 2023
Via arXiv

Mind2Web: Towards a Generalist Agent for the Web

Jun 15, 2023
Via arXiv

Can ChatGPT Defend the Truth? Automatic Dialectical Evaluation Elicits LLMs' Deficiencies in Reasoning

May 22, 2023
Via arXiv