Yinong Oliver Wang

DSO: Direct Steering Optimization for Bias Mitigation

Dec 20, 2025

Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution

Aug 09, 2025

Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs

May 29, 2025

Unsupervised Model Diagnosis

Oct 08, 2024