Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yihan Zhou

Near-Polynomially Competitive Active Logistic Regression

Mar 07, 2025

Yihan Zhou, Eric Price, Trung Nguyen

Figure 1 for Near-Polynomially Competitive Active Logistic Regression

Figure 2 for Near-Polynomially Competitive Active Logistic Regression

Figure 3 for Near-Polynomially Competitive Active Logistic Regression

Abstract:We address the problem of active logistic regression in the realizable setting. It is well known that active learning can require exponentially fewer label queries compared to passive learning, in some cases using $\log \frac{1}{\eps}$ rather than $\poly(1/\eps)$ labels to get error $\eps$ larger than the optimum. We present the first algorithm that is polynomially competitive with the optimal algorithm on every input instance, up to factors polylogarithmic in the error and domain size. In particular, if any algorithm achieves label complexity polylogarithmic in $\eps$, so does ours. Our algorithm is based on efficient sampling and can be extended to learn more general class of functions. We further support our theoretical results with experiments demonstrating performance gains for logistic regression compared to existing active learning algorithms.

Via

Access Paper or Ask Questions

FactorSim: Generative Simulation via Factorized Representation

Sep 26, 2024

Fan-Yun Sun, S. I. Harini, Angela Yi, Yihan Zhou, Alex Zook, Jonathan Tremblay, Logan Cross, Jiajun Wu, Nick Haber

Abstract:Generating simulations to train intelligent agents in game-playing and robotics from natural language input, from user input or task documentation, remains an open-ended challenge. Existing approaches focus on parts of this challenge, such as generating reward functions or task hyperparameters. Unlike previous work, we introduce FACTORSIM that generates full simulations in code from language input that can be used to train agents. Exploiting the structural modularity specific to coded simulations, we propose to use a factored partially observable Markov decision process representation that allows us to reduce context dependence during each step of the generation. For evaluation, we introduce a generative simulation benchmark that assesses the generated simulation code's accuracy and effectiveness in facilitating zero-shot transfers in reinforcement learning settings. We show that FACTORSIM outperforms existing methods in generating simulations regarding prompt alignment (e.g., accuracy), zero-shot transfer abilities, and human evaluation. We also demonstrate its effectiveness in generating robotic tasks.

* neurips 2024, project website: https://cs.stanford.edu/~sunfanyun/factorsim/

Via

Access Paper or Ask Questions

Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Dec 26, 2023

Yiwen Lu, Zishuo Li, Yihan Zhou, Na Li, Yilin Mo

Figure 1 for Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Figure 2 for Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Figure 3 for Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Figure 4 for Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Abstract:In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations of common controllers with Multi-Layer Perceptron (MLP) or other general neural network architecture used in DRL, in terms of verifiability and performance guarantees, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noises. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.

Via

Access Paper or Ask Questions

A Competitive Algorithm for Agnostic Active Learning

Oct 28, 2023

Eric Price, Yihan Zhou

Abstract:For some hypothesis classes and input distributions, active agnostic learning needs exponentially fewer samples than passive learning; for other classes and distributions, it offers little to no improvement. The most popular algorithms for agnostic active learning express their performance in terms of a parameter called the disagreement coefficient, but it is known that these algorithms are inefficient on some inputs. We take a different approach to agnostic active learning, getting an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$. In particular, if any algorithm can use $m^*$ queries to get $O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to get $O(\eta)$ error. Our algorithm lies in the vein of the splitting-based approach of Dasgupta [2004], which gets a similar result for the realizable ($\eta = 0$) setting. We also show that it is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.

Via

Access Paper or Ask Questions

Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Jul 03, 2023

Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou, Jonathan Wilder Lavington, Jennifer She

Figure 1 for Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Figure 2 for Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Figure 3 for Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Figure 4 for Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Abstract:We consider minimizing a smooth function subject to a summation constraint over its variables. By exploiting a connection between the greedy 2-coordinate update for this problem and equality-constrained steepest descent in the 1-norm, we give a convergence rate for greedy selection under a proximal Polyak-Lojasiewicz assumption that is faster than random selection and independent of the problem dimension $n$. We then consider minimizing with both a summation constraint and bound constraints, as arises in the support vector machine dual problem. Existing greedy rules for this setting either guarantee trivial progress only or require $O(n^2)$ time to compute. We show that bound- and summation-constrained steepest descent in the L1-norm guarantees more progress per iteration than previous rules and can be computed in only $O(n \log n)$ time.

Via

Access Paper or Ask Questions

Consecutive Inertia Drift of Autonomous RC Car via Primitive-based Planning and Data-driven Control

Jun 21, 2023

Yiwen Lu, Bo Yang, Jiayun Li, Yihan Zhou, Hongshuai Chen, Yilin Mo

Figure 1 for Consecutive Inertia Drift of Autonomous RC Car via Primitive-based Planning and Data-driven Control

Figure 2 for Consecutive Inertia Drift of Autonomous RC Car via Primitive-based Planning and Data-driven Control

Figure 3 for Consecutive Inertia Drift of Autonomous RC Car via Primitive-based Planning and Data-driven Control

Figure 4 for Consecutive Inertia Drift of Autonomous RC Car via Primitive-based Planning and Data-driven Control

Abstract:Inertia drift is an aggressive transitional driving maneuver, which is challenging due to the high nonlinearity of the system and the stringent requirement on control and planning performance. This paper presents a solution for the consecutive inertia drift of an autonomous RC car based on primitive-based planning and data-driven control. The planner generates complex paths via the concatenation of path segments called primitives, and the controller eases the burden on feedback by interpolating between multiple real trajectories with different initial conditions into one near-feasible reference trajectory. The proposed strategy is capable of drifting through various paths containing consecutive turns, which is validated in both simulation and reality.

* 9 pages, 10 figures, to appear to IROS 2023

Via

Access Paper or Ask Questions

Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses

Oct 22, 2020

Yihan Zhou, Victor S. Portella, Mark Schmidt, Nicholas J. A. Harvey

Abstract:In online convex optimization (OCO), Lipschitz continuity of the functions is commonly assumed in order to obtain sublinear regret. Moreover, many algorithms have only logarithmic regret when these functions are also strongly convex. Recently, researchers from convex optimization proposed the notions of "relative Lipschitz continuity" and "relative strong convexity". Both of the notions are generalizations of their classical counterparts. It has been shown that subgradient methods in the relative setting have performance analogous to their performance in the classical setting. In this work, we consider OCO for relative Lipschitz and relative strongly convex functions. We extend the known regret bounds for classical OCO algorithms to the relative setting. Specifically, we show regret bounds for the follow the regularized leader algorithms and a variant of online mirror descent. Due to the generality of these methods, these results yield regret bounds for a wide variety of OCO algorithms. Furthermore, we further extend the results to algorithms with extra regularization such as regularized dual averaging.

* 22 pages, camera-ready version, accepted in NeurIPS 2020 for poster presentation

Via

Access Paper or Ask Questions