Abstract: We propose a domain-adapted reward model that works alongside an offline A/B testing system for evaluating ranking models. This approach effectively measures the reward of ranking model changes in large-scale Ads recommender systems, where model-free methods such as inverse propensity scoring (IPS) are not feasible. Our experiments demonstrate that the proposed technique outperforms both the vanilla IPS method and approaches using non-generalized reward models.
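
For intuition on the contrast above, the following sketch compares a vanilla IPS estimate with a reward-model-based (direct-method) estimate on synthetic logged data; the data, the least-squares reward model, and all variable names are illustrative stand-ins rather than the paper's actual system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logged bandit data: context x, action a taken by a uniform logging
# policy, observed reward r, and the logging propensity p(a | x).
n, d, n_actions = 5000, 8, 4
X = rng.normal(size=(n, d))
theta = rng.normal(size=(d, n_actions))          # hidden "true" reward weights
A = rng.integers(0, n_actions, size=n)           # uniform logging policy
R = (X @ theta)[np.arange(n), A] + rng.normal(scale=0.1, size=n)
prop = np.full(n, 1.0 / n_actions)

# Target ranking policy to evaluate offline: greedy on perturbed weights.
target_A = np.argmax(X @ (theta + 0.5 * rng.normal(size=theta.shape)), axis=1)

# --- Vanilla IPS estimate of the target policy's value ---
ips_value = np.mean((target_A == A) * R / prop)

# --- Direct method: fit a reward model on logged data, score target actions ---
# One least-squares reward model per action (a stand-in for the domain-adapted model).
reward_hat = np.zeros_like(theta)
for a in range(n_actions):
    mask = A == a
    reward_hat[:, a], *_ = np.linalg.lstsq(X[mask], R[mask], rcond=None)
dm_value = np.mean((X @ reward_hat)[np.arange(n), target_A])

print(f"IPS estimate:           {ips_value:.3f}")
print(f"Reward-model estimate:  {dm_value:.3f}")
```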
Abstract: Adaptive experimental design (AED) methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods. However, the behavior and guarantees of such methods are not well-understood beyond idealized stationary settings. This paper shares lessons learned regarding the challenges of naively using AED systems in industrial settings where non-stationarity is prevalent, while also providing perspectives on the proper objectives and system specifications in such settings. We developed an AED framework for counterfactual inference based on these experiences, and tested it in a commercial environment.
Abstract: We study dynamic discrete choice models, where a commonly studied problem involves estimating parameters of agent reward functions (also known as "structural" parameters) using agent behavioral data. Maximum likelihood estimation for such models requires dynamic programming, which is limited by the curse of dimensionality. In this work, we present a novel algorithm that provides a data-driven method for selecting and aggregating states, which lowers the computational and sample complexity of estimation. Our method works in two stages. In the first stage, we use a flexible inverse reinforcement learning approach to estimate agent Q-functions. We use these estimated Q-functions, along with a clustering algorithm, to select a subset of states that are most pivotal for driving changes in Q-functions. In the second stage, with these selected "aggregated" states, we conduct maximum likelihood estimation using a commonly used nested fixed-point algorithm. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. Theoretically, we derive finite-sample bounds on the associated estimation error, which also characterize the trade-off among computational complexity, estimation error, and sample complexity. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.
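
To make the two-stage idea concrete, here is a minimal sketch of the state-aggregation step: given first-stage Q-function estimates (stubbed out with random values here), states are clustered by their Q-vectors and the transition structure is collapsed onto the clusters, which is the reduced problem the second-stage nested fixed-point MLE would operate on. The use of k-means and all names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

n_states, n_actions, n_clusters = 200, 3, 10

# Stage 1 (stub): Q-function estimates per (state, action), e.g. produced by a
# flexible IRL estimator fit on agent behavioral data.
Q_hat = rng.normal(size=(n_states, n_actions))

# Cluster states by their estimated Q-vectors; states with similar Q-values are
# treated as interchangeable in the structural estimation that follows.
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(Q_hat)

# Aggregate a transition tensor P[a] (n_states x n_states) onto the clusters, so
# the stage-2 nested fixed-point MLE only iterates over n_clusters states.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # toy transitions
membership = np.zeros((n_states, n_clusters))
membership[np.arange(n_states), labels] = 1.0
weights = membership / membership.sum(axis=0)       # uniform weights within a cluster
P_agg = np.einsum("ks,ast,tj->akj", weights.T, P, membership)

print("aggregated transition tensor:", P_agg.shape)  # (n_actions, n_clusters, n_clusters)
```

Here `n_clusters` plays the role of the knob that trades estimation error against computational and sample complexity.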
Abstract: In digital marketing, experimenting with new website content is one of the key levers to improve customer engagement. However, creating successful marketing content is a manual and time-consuming process that lacks clear guiding principles. This paper seeks to close the loop between content creation and online experimentation by offering marketers AI-driven actionable insights based on historical data to improve their creative process. We present a neural-network-based system that scores and extracts insights from a marketing content design: a multimodal neural network predicts the attractiveness of marketing content, and a post-hoc attribution method generates actionable insights for marketers to improve their content in specific marketing locations. Our insights not only point out the advantages and drawbacks of a given piece of content, but also provide design recommendations based on historical data. We show that our scoring model and insights work well both quantitatively and qualitatively.
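
As a rough illustration of the two components described above, the sketch below wires a toy two-branch (text plus image) scorer to a simple gradient-times-input attribution over the text features; the architecture, feature dimensions, and attribution choice are placeholder assumptions rather than the system reported in the paper.

```python
import torch
import torch.nn as nn

class ToyMultimodalScorer(nn.Module):
    """Fuses a text-embedding branch and an image-embedding branch into one score."""
    def __init__(self, text_dim=64, image_dim=128, hidden=32):
        super().__init__()
        self.text_branch = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.image_branch = nn.Sequential(nn.Linear(image_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, text_emb, image_emb):
        fused = torch.cat([self.text_branch(text_emb), self.image_branch(image_emb)], dim=-1)
        return torch.sigmoid(self.head(fused)).squeeze(-1)  # predicted attractiveness

model = ToyMultimodalScorer()
text_emb = torch.randn(1, 64, requires_grad=True)
image_emb = torch.randn(1, 128)

score = model(text_emb, image_emb)
score.sum().backward()

# Post-hoc attribution (gradient x input): which text features pushed the score most.
attribution = (text_emb.grad * text_emb).detach().squeeze(0)
top = torch.topk(attribution, k=5)
print("predicted attractiveness:", float(score))
print("text features with the largest attribution:", top.indices.tolist())
```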
Abstract: In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the $(\epsilon,\delta)$-$\textit{PAC}$ setting: given a policy class $\Pi$, the goal of the learner is to return a policy $\pi\in \Pi$ whose expected reward is within $\epsilon$ of the optimal policy with probability greater than $1-\delta$. We characterize the first $\textit{instance-dependent}$ PAC sample complexity of contextual bandits through a quantity $\rho_{\Pi}$, and provide matching upper and lower bounds in terms of $\rho_{\Pi}$ for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle.
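
For intuition about the argmax-oracle interface mentioned above, the sketch below shows the oracle's role on a toy finite policy class: given logged data, the oracle returns the policy in $\Pi$ with the highest importance-weighted reward estimate. This illustrates only the oracle abstraction such algorithms are built on, not the instance-optimal algorithm itself; all names and data are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

n, d, n_actions, n_policies = 2000, 5, 3, 50

# Logged data collected by a uniform logging policy.
X = rng.normal(size=(n, d))
A = rng.integers(0, n_actions, size=n)
theta_true = rng.normal(size=(d, n_actions))
R = (X @ theta_true)[np.arange(n), A] + rng.normal(scale=0.1, size=n)
prop = 1.0 / n_actions

# A finite policy class Pi: each policy acts greedily on a random linear scorer.
policy_params = rng.normal(size=(n_policies, d, n_actions))

def argmax_oracle(X, A, R, prop):
    """Return the index of the policy in Pi with the highest IPS-estimated value."""
    values = np.empty(n_policies)
    for i, W in enumerate(policy_params):
        chosen = np.argmax(X @ W, axis=1)
        values[i] = np.mean((chosen == A) * R / prop)
    return int(np.argmax(values)), values

best_idx, values = argmax_oracle(X, A, R, prop)
print("oracle-selected policy:", best_idx, "estimated value:", round(values[best_idx], 3))
```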
Abstract: We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bounds of Li et al. (2017) by leveraging the self-concordance of the logistic loss, inspired by Faury et al. (2020). Specifically, our confidence width does not scale with the problem-dependent parameter $1/\kappa$, where $\kappa$ is the worst-case variance of an arm reward. At worst, $1/\kappa$ scales exponentially with the norm of the unknown linear parameter $\theta^*$. Instead, our bound scales directly with the local variance induced by $\theta^*$. We present two applications of our novel bounds to logistic bandit problems: regret minimization and pure exploration. Our analysis shows that the new confidence bounds improve upon previous state-of-the-art performance guarantees.
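
For reference, a standard parameterization in this literature (the paper's exact notation may differ) defines the mean reward of an arm $x$ as $\mu(x^\top\theta^*)$ with

$$\mu(z) = \frac{1}{1 + e^{-z}}, \qquad \dot\mu(z) = \mu(z)\bigl(1 - \mu(z)\bigr), \qquad \kappa = \min_{x \in \mathcal{X}} \dot\mu(x^\top\theta^*),$$

so that $1/\kappa$ can grow exponentially in $\lVert\theta^*\rVert$ whenever some arm's mean reward is close to 0 or 1.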
Abstract: We propose a reward function estimation framework for inverse reinforcement learning with deep energy-based policies. We name our method PQR, as it sequentially estimates the Policy, the $Q$-function, and the Reward function by deep learning. PQR does not assume that the reward solely depends on the state; instead, it allows for a dependency on the choice of action. Moreover, PQR allows for stochastic state transitions. To accomplish this, we assume the existence of one anchor action whose reward is known, typically the action of doing nothing, yielding no reward. We present both estimators and algorithms for the PQR method. When the environment transition is known, we prove that the PQR reward estimator uniquely recovers the true reward. With unknown transitions, we bound the estimation error of PQR. Finally, the performance of PQR is demonstrated on synthetic and real-world datasets.
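
A tabular sketch of the anchor-action identification argument under known transitions, assuming the standard energy-based (soft-optimal) policy form $\pi(a|s) \propto \exp(Q(s,a))$: the policy and the transition matrix alone pin down $V$, then $Q$, then the reward. The tabular setting and all names are simplifying assumptions; the paper's estimators use deep learning.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, gamma, anchor = 6, 3, 0.9, 0

# Toy tabular MDP with known transitions P[a, s, s'] and a ground-truth reward
# whose anchor action (a=0, "do nothing") yields zero reward in every state.
P = rng.dirichlet(np.ones(nS), size=(nA, nS))
r_true = rng.normal(size=(nS, nA))
r_true[:, anchor] = 0.0

# Soft (entropy-regularized) value iteration to get the agent's energy-based policy.
Q = np.zeros((nS, nA))
for _ in range(2000):
    V = np.log(np.exp(Q).sum(axis=1))                 # soft value: log-sum-exp over actions
    Q = r_true + gamma * np.einsum("ast,t->sa", P, V)
pi = np.exp(Q - np.log(np.exp(Q).sum(axis=1, keepdims=True)))  # pi(a|s) proportional to exp(Q(s,a))

# --- Reward recovery from (pi, P) alone, using the anchor action ---
# Q(s, anchor) = gamma * E[V(s') | s, anchor] (since its reward is zero) and
# Q(s, a) = V(s) + log pi(a|s) imply a linear fixed point for V, after which
# r(s, a) = Q(s, a) - gamma * E[V(s') | s, a].
b = -np.log(pi[:, anchor])
V_hat = np.linalg.solve(np.eye(nS) - gamma * P[anchor], b)
Q_hat = V_hat[:, None] + np.log(pi)
r_hat = Q_hat - gamma * np.einsum("ast,t->sa", P, V_hat)

print("max reward recovery error:", np.abs(r_hat - r_true).max())
```

With exact transitions and a converged policy, the printed recovery error is numerically zero, mirroring the unique-recovery claim above.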
Abstract: In this work, we propose an Empirical Bayes approach to decouple the learning rates of first-order and second-order features (or any other feature grouping) in a Generalized Linear Model. Such needs arise in small-batch or low-traffic use cases. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the observed data for the deployed model to empirically compute a hierarchical prior in hindsight. We apply our method to a standard classification setting, as well as to a contextual bandit setting in an Amazon production system. In both simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge. Furthermore, our approach can be applied to any problem instance modeled within a Bayesian framework.
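
A minimal sketch of the hindsight-prior idea, assuming a Gaussian prior per feature group: weights from an already-deployed model are pooled within each group to form empirical prior means and variances, which then act as group-specific regularizers (and hence effective learning rates) when fitting the next model by MAP. The feature groups, model, and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature groups: e.g. indices 0..9 are first-order features, 10..59 second-order.
groups = {"first_order": np.arange(0, 10), "second_order": np.arange(10, 60)}
d = 60

# Weights of the already-deployed model (stand-in for historical data).
deployed_w = np.concatenate([rng.normal(0.0, 1.0, 10), rng.normal(0.0, 0.1, 50)])

# Empirical Bayes in hindsight: pool weights within each group to get a
# group-level Gaussian prior N(mu_g, sigma_g^2).
prior_mu = np.zeros(d)
prior_var = np.zeros(d)
for name, idx in groups.items():
    prior_mu[idx] = deployed_w[idx].mean()
    prior_var[idx] = deployed_w[idx].var() + 1e-6
    print(f"{name}: prior mean {prior_mu[idx][0]:+.3f}, prior var {prior_var[idx][0]:.4f}")

# MAP logistic regression with the per-group prior: a group with a tight prior
# (small variance) is pulled harder toward its prior mean, i.e. it learns slower.
X = rng.normal(size=(500, d))
y = (rng.random(500) < 1 / (1 + np.exp(-(X @ deployed_w)))).astype(float)

w = prior_mu.copy()
lr = 0.1
for _ in range(300):
    p = 1 / (1 + np.exp(-(X @ w)))
    grad = X.T @ (p - y) / len(y) + (w - prior_mu) / prior_var / len(y)
    w -= lr * grad

for name, idx in groups.items():
    print(f"{name}: fitted weight norm {np.linalg.norm(w[idx]):.3f}")
```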
Abstract: This paper introduces Seeker, a system that allows users to interactively refine search rankings in real time through feedback in the form of likes and dislikes. When searching online, users may not know how to accurately describe their product of choice in words. An alternative approach is to search an embedding space, allowing the user to query using a representation of the item (like a tune for a song, or a picture for an object). However, this approach requires the user to possess an example representation of their desired item. Additionally, most current search systems do not allow the user to dynamically adapt the results with further feedback. On the other hand, users often have a mental picture of the desired item and are able to answer ordinal questions of the form: "Is this item similar to what you have in mind?" With this assumption, our algorithm allows users to provide sequential feedback on search results to adapt the search feed. We show that our proposed approach works well both qualitatively and quantitatively. Unlike most previous representation-based search systems, we can quantify the quality of our algorithm through human-in-the-loop experiments.
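
To illustrate the kind of sequential feedback loop described above, here is a toy Rocchio-style sketch in an embedding space: each like pulls the query vector toward the liked items and each dislike pushes it away, after which results are re-ranked. The update rule, parameters, and simulated user are illustrative assumptions, not the Seeker algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, dim = 1000, 32
catalog = rng.normal(size=(n_items, dim))
catalog /= np.linalg.norm(catalog, axis=1, keepdims=True)

def rank(query, k=5):
    """Return the k catalog items closest to the current query embedding."""
    scores = catalog @ query
    return np.argsort(-scores)[:k]

def update(query, liked_ids, disliked_ids, alpha=0.5, beta=0.3):
    """Rocchio-style refinement: move toward likes, away from dislikes."""
    if len(liked_ids):
        query = query + alpha * catalog[liked_ids].mean(axis=0)
    if len(disliked_ids):
        query = query - beta * catalog[disliked_ids].mean(axis=0)
    return query / np.linalg.norm(query)

# The user's (hidden) mental picture of the desired item.
target = catalog[42]

# Start from a generic query and refine with three rounds of like/dislike feedback,
# simulating a user who likes the shown item closest to what they have in mind.
query = rng.normal(size=dim)
query /= np.linalg.norm(query)
for round_ in range(3):
    shown = rank(query)
    liked = [int(shown[np.argmax(catalog[shown] @ target)])]
    disliked = [int(i) for i in shown if i not in liked]
    query = update(query, liked, disliked)
    print(f"round {round_}: target similarity {query @ target:.3f}")
```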