Zhebo Wang

Policy of Thoughts: Scaling LLM Reasoning via Test-time Policy Evolution

Jan 28, 2026
Via arXiv

ICPO: Illocution-Calibrated Policy Optimization for Multi-Turn Conversation

Jan 20, 2026
Via arXiv

ForgetMark: Stealthy Fingerprint Embedding via Targeted Unlearning in Language Models

Jan 13, 2026
Via arXiv

FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition

Sep 13, 2024
Via arXiv