
Yingshui Tan

ChineseSimpleVQA -- "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

Feb 19, 2025

Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models

Feb 17, 2025

RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting

Dec 25, 2024

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models

Dec 23, 2024

Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment

Nov 18, 2024

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models

Nov 13, 2024

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment

Oct 23, 2024

Safety Alignment for Vision Language Models

May 22, 2024

Generalizing Fault Detection Against Domain Shifts Using Stratification-Aware Cross-Validation

Aug 20, 2020

Using Ensemble Classifiers to Detect Incipient Anomalies

Aug 20, 2020