Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Congzhi Zhang

VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism

Jun 10, 2025

Congzhi Zhang, Jiawei Peng, Zhenglin Wang, Yilong Lai, Haowen Sun, Heng Chang, Fei Ma, Weijiang Yu

Abstract:Large Vision-Language Models (LVLMs) have shown exceptional performance in multimodal tasks, but their effectiveness in complex visual reasoning is still constrained, especially when employing Chain-of-Thought prompting techniques. In this paper, we propose VReST, a novel training-free approach that enhances Reasoning in LVLMs through Monte Carlo Tree Search and Self-Reward mechanisms. VReST meticulously traverses the reasoning landscape by establishing a search tree, where each node encapsulates a reasoning step, and each path delineates a comprehensive reasoning sequence. Our innovative multimodal Self-Reward mechanism assesses the quality of reasoning steps by integrating the utility of sub-questions, answer correctness, and the relevance of vision-language clues, all without the need for additional models. VReST surpasses current prompting methods and secures state-of-the-art performance across three multimodal mathematical reasoning benchmarks. Furthermore, it substantiates the efficacy of test-time scaling laws in multimodal tasks, offering a promising direction for future research.

* Accepted by ACL 2025 main

Via

Access Paper or Ask Questions

AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment

Jul 02, 2024

Yilong Lai, Jialong Wu, Congzhi Zhang, Haowen Sun, Deyu Zhou

Abstract:Conversational Query Reformulation (CQR) has significantly advanced in addressing the challenges of conversational search, particularly those stemming from the latent user intent and the need for historical context. Recent works aimed to boost the performance of CRQ through alignment. However, they are designed for one specific retrieval system, which potentially results in poor generalization. To overcome this limitation, we present a novel framework AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a dual-phase training strategy. We also developed two effective approaches for acquiring superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental evaluations on the TopiOCQA and QReCC datasets demonstrate that AdaCQR significantly outperforms existing methods, offering both quantitative and qualitative improvements in conversational query reformulation.

Via

Access Paper or Ask Questions

SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Jun 26, 2024

Zhenglin Wang, Jialong Wu, Yilong Lai, Congzhi Zhang, Deyu Zhou

Figure 1 for SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Figure 2 for SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Figure 3 for SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Figure 4 for SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Abstract:Large Language Models (LLMs) demonstrate remarkable emergent abilities across various tasks, yet fall short of complex reasoning and planning tasks. The tree-search-based reasoning methods address this by surpassing the capabilities of chain-of-thought prompting, encouraging exploration of intermediate steps. However, such methods introduce significant inference latency due to the systematic exploration and evaluation of multiple thought paths. This paper introduces SeeD, a novel and efficient inference framework to optimize runtime speed and GPU memory management concurrently. By employing a scheduled speculative execution, SeeD efficiently handles multiple iterations for the thought generation and the state evaluation, leveraging a rounds-scheduled strategy to manage draft model dispatching. Extensive experimental evaluations on three reasoning datasets demonstrate superior speedup performance of SeeD, providing a viable path for batched inference in training-free speculative decoding.

Via

Access Paper or Ask Questions

Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Mar 05, 2024

Congzhi Zhang, Linhai Zhang, Deyu Zhou, Guoqiang Xu

Figure 1 for Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Figure 2 for Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Figure 3 for Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Figure 4 for Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Abstract:Despite the significant achievements of existing prompting methods such as in-context learning and chain-of-thought for large language models (LLMs), they still face challenges of various biases. Traditional debiasing methods primarily focus on the model training stage, including data augmentation-based and reweight-based approaches, with the limitations of addressing the complex biases of LLMs. To address such limitations, the causal relationship behind the prompting methods is uncovered using a structural causal model, and a novel causal prompting method based on front-door adjustment is proposed to effectively mitigate the bias of LLMs. In specific, causal intervention is implemented by designing the prompts without accessing the parameters and logits of LLMs.The chain-of-thoughts generated by LLMs are employed as the mediator variable and the causal effect between the input prompt and the output answers is calculated through front-door adjustment to mitigate model biases. Moreover, to obtain the representation of the samples precisely and estimate the causal effect more accurately, contrastive learning is used to fine-tune the encoder of the samples by aligning the space of the encoder with the LLM. Experimental results show that the proposed causal prompting approach achieves excellent performance on 3 natural language processing datasets on both open-source and closed-source LLMs.

Via

Access Paper or Ask Questions

Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Mar 05, 2024

Congzhi Zhang, Linhai Zhang, Deyu Zhou

Figure 1 for Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Figure 2 for Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Figure 3 for Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Figure 4 for Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Abstract:Conventional multi-hop fact verification models are prone to rely on spurious correlations from the annotation artifacts, leading to an obvious performance decline on unbiased datasets. Among the various debiasing works, the causal inference-based methods become popular by performing theoretically guaranteed debiasing such as casual intervention or counterfactual reasoning. However, existing causal inference-based debiasing methods, which mainly formulate fact verification as a single-hop reasoning task to tackle shallow bias patterns, cannot deal with the complicated bias patterns hidden in multiple hops of evidence. To address the challenge, we propose Causal Walk, a novel method for debiasing multi-hop fact verification from a causal perspective with front-door adjustment. Specifically, in the structural causal model, the reasoning path between the treatment (the input claim-evidence graph) and the outcome (the veracity label) is introduced as the mediator to block the confounder. With the front-door adjustment, the causal effect between the treatment and the outcome is decomposed into the causal effect between the treatment and the mediator, which is estimated by applying the idea of random walk, and the causal effect between the mediator and the outcome, which is estimated with normalized weighted geometric mean approximation. To investigate the effectiveness of the proposed method, an adversarial multi-hop fact verification dataset and a symmetric multi-hop fact verification dataset are proposed with the help of the large language model. Experimental results show that Causal Walk outperforms some previous debiasing methods on both existing datasets and the newly constructed datasets. Code and data will be released at https://github.com/zcccccz/CausalWalk.

* Accepted by AAAI 2024

Via

Access Paper or Ask Questions

Genes in Intelligent Agents

Jun 17, 2023

Fu Feng, Jing Wang, Congzhi Zhang, Wenqian Li, Xu Yang, Xin Geng

Abstract:Training intelligent agents in Reinforcement Learning (RL) is much more time-consuming than animal learning. This is because agents learn from scratch, but animals learn with genes inherited from ancestors and are born with some innate abilities. Inspired by genes in animals, here we conceptualize the gene in intelligent agents and introduce Genetic Reinforcement Learning (GRL), a computational framework to represent, evaluate, and evolve genes (in agents). Leveraging GRL we identify genes and demonstrate several advantages of genes. First, we find that genes take the form of the fragment of agents' neural networks and can be inherited across generations. Second, we validate that genes bring better and stabler learning ability to agents, since genes condense knowledge from ancestors and bring agent with innate abilities. Third, we present evidence of Lamarckian evolution in intelligent agents. The continuous encoding of knowledge into genes across generations facilitates the evolution of genes. Overall, our work promotes a novel paradigm to train agents by incorporating genes.

Via

Access Paper or Ask Questions