Alexander Sietsema

Harmful Overfitting in Sobolev Spaces

Jan 31, 2026

Kedar Karhadkar, Alexander Sietsema, Deanna Needell, Guido Montufar

Abstract: Motivated by recent work on benign overfitting in overparameterized machine learning, we study the generalization behavior of functions in Sobolev spaces $W^{k, p}(\mathbb{R}^d)$ that perfectly fit a noisy training data set. Under assumptions of label noise and sufficient regularity in the data distribution, we show that approximately norm-minimizing interpolators, which are canonical solutions selected by smoothness bias, exhibit harmful overfitting: even as the training sample size $n \to \infty$, the generalization error remains bounded below by a positive constant with high probability. Our results hold for arbitrary values of $p \in [1, \infty)$, in contrast to prior results studying the Hilbert space case ($p = 2$) using kernel methods. Our proof uses a geometric argument which identifies harmful neighborhoods of the training data using Sobolev inequalities.
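
For readers unfamiliar with the setting, the objects named in the abstract can be written out as follows. This is the standard textbook Sobolev norm, not a formula quoted from the paper, which may use an equivalent normalization. The symbols $(x_i, y_i)$ for the training pairs and $\hat{f}$ for the minimum-norm interpolator are notation assumed here for illustration.

$$
\|f\|_{W^{k,p}(\mathbb{R}^d)} = \Big( \sum_{|\alpha| \le k} \|D^{\alpha} f\|_{L^p(\mathbb{R}^d)}^{p} \Big)^{1/p},
\qquad
\hat{f} \in \operatorname*{arg\,min}_{f \in W^{k,p}(\mathbb{R}^d)} \big\{ \|f\|_{W^{k,p}(\mathbb{R}^d)} : f(x_i) = y_i, \ i = 1, \dots, n \big\}.
$$

The abstract concerns interpolators whose norm is approximately, rather than exactly, minimal in this sense.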

Stratified Non-Negative Tensor Factorization

Nov 27, 2024

Alexander Sietsema, Zerrin Vural, James Chapman, Yotam Yaniv, Deanna Needell

Abstract: Non-negative matrix factorization (NMF) and non-negative tensor factorization (NTF) decompose non-negative high-dimensional data into non-negative low-rank components. NMF and NTF methods are popular for their intrinsic interpretability and effectiveness on large-scale data. Recent work developed Stratified-NMF, which applies NMF to regimes where data may come from different sources (strata) with different underlying distributions, and seeks to recover both strata-dependent information and global topics shared across strata. Applying Stratified-NMF to multi-modal data requires flattening across modes, and therefore loses geometric structure contained implicitly within the tensor. To address this problem, we extend Stratified-NMF to the tensor setting by developing a multiplicative update rule and demonstrating the method on text and image data. We find that Stratified-NTF can identify interpretable topics with lower memory requirements than Stratified-NMF. We also introduce a regularized version of the method and demonstrate its effects on image data.
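
As background for the multiplicative update rule mentioned in the abstract, here is a minimal sketch of the classical Lee-Seung multiplicative updates for plain NMF under the Frobenius loss. It is not the Stratified-NTF rule from the paper, whose details are not given here. The function name `nmf_multiplicative` and its parameters (`rank`, `n_iter`, `eps`, `seed`) are hypothetical choices for illustration.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Classical NMF, X ~ W @ H, via Lee-Seung multiplicative updates.

    Illustrative only: this is plain NMF, not the stratified tensor
    method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive initialization keeps the factors non-negative,
    # because each update multiplies by a non-negative ratio.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative data with exact rank 3 (an assumed toy example).
data_rng = np.random.default_rng(1)
X = data_rng.random((20, 3)) @ data_rng.random((3, 15))

W, H = nmf_multiplicative(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The updates never subtract, so non-negativity of `W` and `H` is preserved automatically without a projection step, which is the main appeal of multiplicative rules in this family of methods.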

* 5 pages. Will appear in IEEE Asilomar Conference on Signals, Systems, and Computers 2024
