Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hideitsu Hino

An $(ε,δ)$-accurate level set estimation with a stopping criterion

Mar 26, 2025

Hideaki Ishibashi, Kota Matsui, Kentaro Kutsukake, Hideitsu Hino

Figure 1 for An $(ε,δ)$-accurate level set estimation with a stopping criterion

Figure 2 for An $(ε,δ)$-accurate level set estimation with a stopping criterion

Figure 3 for An $(ε,δ)$-accurate level set estimation with a stopping criterion

Figure 4 for An $(ε,δ)$-accurate level set estimation with a stopping criterion

Abstract:The level set estimation problem seeks to identify regions within a set of candidate points where an unknown and costly to evaluate function's value exceeds a specified threshold, providing an efficient alternative to exhaustive evaluations of function values. Traditional methods often use sequential optimization strategies to find $\epsilon$-accurate solutions, which permit a margin around the threshold contour but frequently lack effective stopping criteria, leading to excessive exploration and inefficiencies. This paper introduces an acquisition strategy for level set estimation that incorporates a stopping criterion, ensuring the algorithm halts when further exploration is unlikely to yield improvements, thereby reducing unnecessary function evaluations. We theoretically prove that our method satisfies $\epsilon$-accuracy with a confidence level of $1 - \delta$, addressing a key gap in existing approaches. Furthermore, we show that this also leads to guarantees on the lower bounds of performance metrics such as F-score. Numerical experiments demonstrate that the proposed acquisition function achieves comparable precision to existing methods while confirming that the stopping criterion effectively terminates the algorithm once adequate exploration is completed.

Via

Access Paper or Ask Questions

Scalable Sobolev IPM for Probability Measures on a Graph

Feb 02, 2025

Tam Le, Truyen Nguyen, Hideitsu Hino, Kenji Fukumizu

Abstract:We investigate the Sobolev IPM problem for probability measures supported on a graph metric space. Sobolev IPM is an important instance of integral probability metrics (IPM), and is obtained by constraining a critic function within a unit ball defined by the Sobolev norm. In particular, it has been used to compare probability measures and is crucial for several theoretical works in machine learning. However, to our knowledge, there are no efficient algorithmic approaches to compute Sobolev IPM effectively, which hinders its practical applications. In this work, we establish a relation between Sobolev norm and weighted $L^p$-norm, and leverage it to propose a \emph{novel regularization} for Sobolev IPM. By exploiting the graph structure, we demonstrate that the regularized Sobolev IPM provides a \emph{closed-form} expression for fast computation. This advancement addresses long-standing computational challenges, and paves the way to apply Sobolev IPM for practical applications, even in large-scale settings. Additionally, the regularized Sobolev IPM is negative definite. Utilizing this property, we design positive-definite kernels upon the regularized Sobolev IPM, and provide preliminary evidences of their advantages on document classification and topological data analysis for measures on a graph.

Via

Access Paper or Ask Questions

Orlicz-Sobolev Transport for Unbalanced Measures on a Graph

Feb 02, 2025

Tam Le, Truyen Nguyen, Hideitsu Hino, Kenji Fukumizu

Figure 1 for Orlicz-Sobolev Transport for Unbalanced Measures on a Graph

Figure 2 for Orlicz-Sobolev Transport for Unbalanced Measures on a Graph

Figure 3 for Orlicz-Sobolev Transport for Unbalanced Measures on a Graph

Figure 4 for Orlicz-Sobolev Transport for Unbalanced Measures on a Graph

Abstract:Moving beyond $L^p$ geometric structure, Orlicz-Wasserstein (OW) leverages a specific class of convex functions for Orlicz geometric structure. While OW remarkably helps to advance certain machine learning approaches, it has a high computational complexity due to its two-level optimization formula. Recently, Le et al. (2024) exploits graph structure to propose generalized Sobolev transport (GST), i.e., a scalable variant for OW. However, GST assumes that input measures have the same mass. Unlike optimal transport (OT), it is nontrivial to incorporate a mass constraint to extend GST for measures on a graph, possibly having different total mass. In this work, we propose to take a step back by considering the entropy partial transport (EPT) for nonnegative measures on a graph. By leveraging Caffarelli & McCann (2010)'s observations, EPT can be reformulated as a standard complete OT between two corresponding balanced measures. Consequently, we develop a novel EPT with Orlicz geometric structure, namely Orlicz-EPT, for unbalanced measures on a graph. Especially, by exploiting the dual EPT formulation and geometric structures of the graph-based Orlicz-Sobolev space, we derive a novel regularization to propose Orlicz-Sobolev transport (OST). The resulting distance can be efficiently computed by simply solving a univariate optimization problem, unlike the high-computational two-level optimization problem for Orlicz-EPT. Additionally, we derive geometric structures for the OST and draw its relations to other transport distances. We empirically show that OST is several-order faster than Orlicz-EPT. We further illustrate preliminary evidences on the advantages of OST for document classification, and several tasks in topological data analysis.

Via

Access Paper or Ask Questions

A Family of Distributions of Random Subsets for Controlling Positive and Negative Dependence

Aug 02, 2024

Takahiro Kawashima, Hideitsu Hino

Abstract:Positive and negative dependence are fundamental concepts that characterize the attractive and repulsive behavior of random subsets. Although some probabilistic models are known to exhibit positive or negative dependence, it is challenging to seamlessly bridge them with a practicable probabilistic model. In this study, we introduce a new family of distributions, named the discrete kernel point process (DKPP), which includes determinantal point processes and parts of Boltzmann machines. We also develop some computational methods for probabilistic operations and inference with DKPPs, such as calculating marginal and conditional probabilities and learning the parameters. Our numerical experiments demonstrate the controllability of positive and negative dependence and the effectiveness of the computational methods for DKPPs.

Via

Access Paper or Ask Questions

Duality induced by an embedding structure of determinantal point process

Apr 17, 2024

Hideitsu Hino, Keisuke Yano

Figure 1 for Duality induced by an embedding structure of determinantal point process

Figure 2 for Duality induced by an embedding structure of determinantal point process

Figure 3 for Duality induced by an embedding structure of determinantal point process

Abstract:This paper investigates the information geometrical structure of a determinantal point process (DPP). It demonstrates that a DPP is embedded in the exponential family of log-linear models. The extent of deviation from an exponential family is analyzed using the $\mathrm{e}$-embedding curvature tensor, which identifies partially flat parameters of a DPP. On the basis of this embedding structure, the duality related to a marginal kernel and an $L$-ensemble kernel is discovered.

Via

Access Paper or Ask Questions

A Short Survey on Importance Weighting for Machine Learning

Mar 15, 2024

Masanari Kimura, Hideitsu Hino

Figure 1 for A Short Survey on Importance Weighting for Machine Learning

Abstract:Importance weighting is a fundamental procedure in statistics and machine learning that weights the objective function or probability distribution based on the importance of the instance in some sense. The simplicity and usefulness of the idea has led to many applications of importance weighting. For example, it is known that supervised learning under an assumption about the difference between the training and test distributions, called distribution shift, can guarantee statistically desirable properties through importance weighting by their density ratio. This survey summarizes the broad applications of importance weighting in machine learning and related research.

Via

Access Paper or Ask Questions

Scalable Counterfactual Distribution Estimation in Multivariate Causal Models

Nov 02, 2023

Thong Pham, Shohei Shimizu, Hideitsu Hino, Tam Le

Figure 1 for Scalable Counterfactual Distribution Estimation in Multivariate Causal Models

Figure 2 for Scalable Counterfactual Distribution Estimation in Multivariate Causal Models

Figure 3 for Scalable Counterfactual Distribution Estimation in Multivariate Causal Models

Figure 4 for Scalable Counterfactual Distribution Estimation in Multivariate Causal Models

Abstract:We consider the problem of estimating the counterfactual joint distribution of multiple quantities of interests (e.g., outcomes) in a multivariate causal model extended from the classical difference-in-difference design. Existing methods for this task either ignore the correlation structures among dimensions of the multivariate outcome by considering univariate causal models on each dimension separately and hence produce incorrect counterfactual distributions, or poorly scale even for moderate-size datasets when directly dealing with such multivariate causal model. We propose a method that alleviates both issues simultaneously by leveraging a robust latent one-dimensional subspace of the original high-dimension space and exploiting the efficient estimation from the univariate causal model on such space. Since the construction of the one-dimensional subspace uses information from all the dimensions, our method can capture the correlation structures and produce good estimates of the counterfactual distribution. We demonstrate the advantages of our approach over existing methods on both synthetic and real-world data.

Via

Access Paper or Ask Questions

Information Geometrically Generalized Covariate Shift Adaptation

Apr 19, 2023

Masanari Kimura, Hideitsu Hino

Abstract:Many machine learning methods assume that the training and test data follow the same distribution. However, in the real world, this assumption is very often violated. In particular, the phenomenon that the marginal distribution of the data changes is called covariate shift, one of the most important research topics in machine learning. We show that the well-known family of covariate shift adaptation methods is unified in the framework of information geometry. Furthermore, we show that parameter search for geometrically generalized covariate shift adaptation method can be achieved efficiently. Numerical experiments show that our generalization can achieve better performance than the existing methods it encompasses.

Via

Access Paper or Ask Questions

Active Learning by Query by Committee with Robust Divergences

Nov 18, 2022

Hideitsu Hino, Shinto Eguchi

Abstract:Active learning is a widely used methodology for various problems with high measurement costs. In active learning, the next object to be measured is selected by an acquisition function, and measurements are performed sequentially. The query by committee is a well-known acquisition function. In conventional methods, committee disagreement is quantified by the Kullback--Leibler divergence. In this paper, the measure of disagreement is defined by the Bregman divergence, which includes the Kullback--Leibler divergence as an instance, and the dual $\gamma$-power divergence. As a particular class of the Bregman divergence, the $\beta$-divergence is considered. By deriving the influence function, we show that the proposed method using $\beta$-divergence and dual $\gamma$-power divergence are more robust than the conventional method in which the measure of disagreement is defined by the Kullback--Leibler divergence. Experimental results show that the proposed method performs as well as or better than the conventional method.

Via

Access Paper or Ask Questions

Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport

Sep 10, 2022

Toshimitsu Aritake, Hideitsu Hino

Figure 1 for Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport

Figure 2 for Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport

Figure 3 for Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport

Figure 4 for Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport

Abstract:Domain adaptation aims to transfer knowledge of labeled instances obtained from a source domain to a target domain to fill the gap between the domains. Most domain adaptation methods assume that the source and target domains have the same dimensionality. Methods that are applicable when the number of features is different in each domain have rarely been studied, especially when no label information is given for the test data obtained from the target domain. In this paper, it is assumed that common features exist in both domains and that extra (new additional) features are observed in the target domain; hence, the dimensionality of the target domain is higher than that of the source domain. To leverage the homogeneity of the common features, the adaptation between these source and target domains is formulated as an optimal transport (OT) problem. In addition, a learning bound in the target domain for the proposed OT-based method is derived. The proposed algorithm is validated using both simulated and real-world data.

Via

Access Paper or Ask Questions