Sihang Jiang

HINT: Helping Ineffective Rollouts Navigate Towards Effectiveness

Oct 10, 2025
Via arXiv

CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs

Sep 19, 2025
Via arXiv

AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Jan 19, 2025
Via arXiv

EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data

Oct 25, 2024
Via arXiv

LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models

Sep 03, 2024
Via arXiv

Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception

Dec 29, 2023
Via arXiv

Beyond the Obvious: Evaluating the Reasoning Ability In Real-life Scenarios of Language Models on Life Scapes Reasoning Benchmark (LSR-Benchmark)

Jul 11, 2023
Via arXiv

Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

Jun 15, 2023
Via arXiv

Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release

Apr 23, 2023
Via arXiv

GANTEE: Generative Adversatial Network for Taxonomy Entering Evaluation

Mar 25, 2023
Via arXiv