Alex Deng

FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data

Jan 28, 2025

Deren Lei, Yaxi Li, Siyao Li, Mengya Hu, Rui Xu, Ken Archer, Mingyu Wang, Emily Ching, Alex Deng

Abstract: Prior research on training grounded factuality classification models to detect hallucinations in large language models (LLMs) has relied on public natural language inference (NLI) data and synthetic data. However, conventional NLI datasets are not well suited to document-level reasoning, which is critical for detecting LLM hallucinations. Recent approaches to document-level synthetic data generation iteratively remove sentences from documents and annotate factuality using LLM-based prompts. While effective, this method is computationally expensive for long documents and limited by the LLM's capabilities. In this work, we analyze the differences between existing synthetic training data used in state-of-the-art models and real LLM output claims. Based on our findings, we propose a novel approach for synthetic data generation, CG2C, that leverages multi-hop reasoning on context graphs extracted from documents. Our fact checker model, FactCG, demonstrates improved performance with more connected reasoning, using the same backbone models. Experiments show it even outperforms GPT-4o on the LLM-Aggrefact benchmark with a much smaller model size.

* NAACL 2025
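
The FactCG abstract above mentions multi-hop reasoning on context graphs extracted from documents. The following is only a minimal sketch of that general idea, not the paper's CG2C procedure: the triples, the `build_graph` and `multi_hop_paths` helpers, and the example entities are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, as if extracted from one document.
TRIPLES = [
    ("Marie Curie", "was born in", "Warsaw"),
    ("Warsaw", "is the capital of", "Poland"),
    ("Marie Curie", "won", "the Nobel Prize in Physics"),
]


def build_graph(triples):
    """Index outgoing edges by subject entity."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def multi_hop_paths(graph, start, hops):
    """Enumerate all simple paths of exactly `hops` edges starting at `start`."""
    paths = []

    def walk(node, path, visited):
        if len(path) == hops:
            paths.append(list(path))
            return
        # .get avoids creating empty entries in the defaultdict.
        for rel, obj in graph.get(node, []):
            if obj not in visited:
                path.append((node, rel, obj))
                walk(obj, path, visited | {obj})
                path.pop()

    walk(start, [], {start})
    return paths


graph = build_graph(TRIPLES)
two_hop = multi_hop_paths(graph, "Marie Curie", 2)
```

Here `two_hop` holds a single path that chains "Marie Curie was born in Warsaw" with "Warsaw is the capital of Poland". A claim built from such a path can only be verified by connecting two separate facts from the document, which is the kind of connected reasoning the abstract refers to.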

InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance

Nov 19, 2024

Rui Xu, Mengya Hu, Deren Lei, Yaxi Li, David Lowe, Alex Gorevski, Mingyu Wang, Emily Ching, Alex Deng

Abstract: The proliferation of AI-generated images has intensified the need for robust content authentication methods. We present InvisMark, a novel watermarking technique designed for high-resolution AI-generated images. Our approach leverages advanced neural network architectures and training strategies to embed imperceptible yet highly robust watermarks. InvisMark achieves state-of-the-art performance in imperceptibility (PSNR ~51, SSIM ~0.998) while maintaining over 97% bit accuracy across various image manipulations. Notably, we demonstrate the successful encoding of 256-bit watermarks, significantly expanding payload capacity while preserving image quality. This enables the embedding of UUIDs with error correction codes, achieving near-perfect decoding success rates even under challenging image distortions. We also address potential vulnerabilities against advanced attacks and propose mitigation strategies. By combining high imperceptibility, extended payload capacity, and resilience to manipulations, InvisMark provides a robust foundation for ensuring media provenance in an era of increasingly sophisticated AI-generated content. Source code of this paper is available at: https://github.com/microsoft/InvisMark.

* Accepted to WACV 2025
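
The InvisMark abstract above reports bit accuracy for 256-bit watermarks. As a rough illustration of what that metric measures, the sketch below compares an embedded payload with a decoded one. It does not use the InvisMark code; `bit_accuracy`, the random payload, and the six simulated bit flips are assumptions made for the example.

```python
import random


def bit_accuracy(embedded, decoded):
    """Fraction of positions where the decoded bit equals the embedded bit."""
    if len(embedded) != len(decoded):
        raise ValueError("payloads must have the same length")
    matches = sum(e == d for e, d in zip(embedded, decoded))
    return matches / len(embedded)


rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical 256-bit payload standing in for an embedded watermark.
payload = [rng.randint(0, 1) for _ in range(256)]

# Simulate a distorted image by flipping six distinct bits in the decoded copy.
decoded = list(payload)
for i in rng.sample(range(256), 6):
    decoded[i] ^= 1

acc = bit_accuracy(payload, decoded)  # 250 of 256 bits match
```

With six of 256 bits flipped, `acc` is 250/256, or about 0.977. A UUID occupies 128 bits, so a 256-bit payload leaves room for redundancy, which is what makes it possible to recover the identifier when a few bits are decoded incorrectly.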

Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning

Aug 23, 2024

Dillon Davis, Huiji Gao, Weiwei Guo, Thomas Legrand, Malay Haldar, Alex Deng, Han Zhao, Liwei He, Sanjeev Katariya

Figures 1-4 for Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning

Abstract: The Airbnb search system grapples with many unique challenges as it continues to evolve. We oversee a marketplace that is nuanced by geography, diversity of homes, and guests with a variety of preferences. Crafting an efficient search system that can accommodate diverse guest needs, while showcasing relevant homes, lies at the heart of Airbnb's success. Airbnb search shares many challenges with other recommendation and search systems, but it also has a unique information retrieval problem, upstream of ranking, called location retrieval. It requires defining a topological map area that is relevant to the searched query for home listing retrieval. The purpose of this paper is to demonstrate the methodology, challenges, and impact of building a machine-learning-based location retrieval product from the ground up. Despite the lack of suitable, prevalent machine-learning-based approaches, we tackle cold start, generalization, differentiation, and algorithmic bias. We detail the efficacy of heuristics, statistics, machine learning, and reinforcement learning approaches to solve these challenges, particularly for systems that are often unexplored by current literature.

SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection

Aug 22, 2024

Mengya Hu, Rui Xu, Deren Lei, Yaxi Li, Mingyu Wang, Emily Ching, Eslam Kamal, Alex Deng

Abstract: Large language models (LLMs) are highly capable but face latency challenges in real-time applications, such as online hallucination detection. To overcome this issue, we propose a novel framework that uses a small language model (SLM) classifier for initial detection, followed by an LLM acting as a constrained reasoner that generates detailed explanations for the detected hallucinated content. This study optimizes real-time interpretable hallucination detection by introducing effective prompting techniques that align LLM-generated explanations with SLM decisions. Experimental results demonstrate its effectiveness, thereby enhancing the overall user experience.

* preprint under review
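
The two-stage flow described in the abstract above (a fast SLM classifier first, an LLM explanation only for flagged content) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' system: `detect`, `Detection`, the threshold, and the toy scoring and explanation functions are hypothetical stand-ins for real models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Detection:
    hallucinated: bool
    score: float
    explanation: Optional[str] = None


def detect(
    claim: str,
    context: str,
    slm_score: Callable[[str, str], float],
    llm_explain: Callable[[str, str], str],
    threshold: float = 0.5,
) -> Detection:
    """Run the cheap classifier first; call the slow reasoner only when flagged."""
    score = slm_score(claim, context)
    if score < threshold:
        return Detection(False, score)  # fast path: no LLM call at all
    return Detection(True, score, llm_explain(claim, context))


def toy_slm_score(claim: str, context: str) -> float:
    """Toy stand-in for an SLM: fraction of claim tokens absent from the context."""
    tokens = claim.lower().split()
    ctx = set(context.lower().split())
    if not tokens:
        return 0.0
    return sum(t not in ctx for t in tokens) / len(tokens)


def toy_llm_explain(claim: str, context: str) -> str:
    """Toy stand-in for an LLM reasoner that must explain the SLM's decision."""
    return f"The claim '{claim}' is not supported by the given context."


context = "The sky is blue today"
grounded = detect("The sky is blue", context, toy_slm_score, toy_llm_explain, threshold=0.2)
flagged = detect("The sky is green", context, toy_slm_score, toy_llm_explain, threshold=0.2)
```

Only `flagged` triggers the explanation step, so the latency of the larger model is paid just for the content the classifier marks as hallucinated.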

Continuous Attribution of Episodical Outcomes for More Efficient and Targeted Online Measurement

Oct 28, 2022

Alex Deng, Michelle Du, Anna Matlin

Abstract: Online experimentation platforms collect user feedback at low cost and large scale. Some systems even support real-time or near real-time data processing, and can update metrics and statistics continuously. Many commonly used metrics, such as clicks and page views, can be observed without much delay. However, many important signals can only be observed after several hours or days, with noise adding up over the duration of the episode. When episodical outcomes follow a complex sequence of user-product interactions, it is difficult to understand which interactions lead to the final outcome. There is no obvious attribution logic for associating a positive or negative outcome with the actions and choices made at different times. This attribution logic is critical to unlocking more targeted and efficient measurement at a finer granularity that could eventually lead to the full capability of reinforcement learning. In this paper, we borrow the idea of Causal Surrogacy to model a long-term outcome using leading indicators that are incrementally observed, and apply the model as a value function to track progress towards the final outcome and attribute credit incrementally to individual user-product interaction steps. Applying this approach to the guest booking metric at Airbnb resulted in significant variance reductions of 50% to 85%, while aligning well with the booking metric itself. Continuous attribution allows us to assign a utility score to each product page-view, and this score can be flexibly aggregated further to a variety of units of interest, such as searches and listings. We provide multiple real-world applications of attribution to illustrate its versatility.
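
One simple way to picture incremental attribution with a value function is to credit each interaction step with the change in the predicted outcome it produces. The sketch below assumes that reading; it is not taken from the paper, and `attribute_increments`, the baseline, and the per-step predictions are made-up values for illustration.

```python
import math


def attribute_increments(baseline, values):
    """Credit each step with the change in predicted outcome it caused.

    `baseline` is the prediction before any interaction; `values[i]` is the
    prediction after step i.
    """
    credits = []
    prev = baseline
    for v in values:
        credits.append(v - prev)
        prev = v
    return credits


# Hypothetical predicted booking probabilities after four page views.
baseline = 0.05
values = [0.10, 0.25, 0.20, 0.60]

credits = attribute_increments(baseline, values)

# The credits telescope: they sum to the total change from the baseline.
total_matches = math.isclose(sum(credits), values[-1] - baseline)
```

Because the per-step credits sum to the overall change in the prediction, they can be regrouped by any unit of interest (for example, by search or by listing) without double counting. The third step in this example receives a negative credit, reflecting an interaction that lowered the predicted outcome.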
