Young-Jin Park

Identifying Reliable Predictions in Detection Transformers

Dec 02, 2024

Young-Jin Park, Carson Sobolewski, Navid Azizan

Abstract:DEtection TRansformer (DETR) has emerged as a promising architecture for object detection, offering an end-to-end prediction pipeline. In practice, however, DETR generates hundreds of predictions that far outnumber the actual number of objects present in an image. This raises the question: can we trust and use all of these predictions? Addressing this concern, we present empirical evidence highlighting how different predictions within the same image play distinct roles, resulting in varying reliability levels across those predictions. More specifically, while multiple predictions are often made for a single object, our findings show that most often one such prediction is well-calibrated, and the others are poorly calibrated. Based on these insights, we demonstrate identifying a reliable subset of DETR's predictions is crucial for accurately assessing the reliability of the model at both object and image levels. Building on this viewpoint, we first tackle the shortcomings of widely used performance and calibration metrics, such as average precision and various forms of expected calibration error. Specifically, they are inadequate for determining which subset of DETR's predictions should be trusted and utilized. In response, we present Object-level Calibration Error (OCE), which is capable of assessing the calibration quality both across different models and among various configurations within a specific model. As a final contribution, we introduce a post hoc Uncertainty Quantification (UQ) framework that predicts the accuracy of the model on a per-image basis. By contrasting the average confidence scores of positive (i.e., likely to be matched) and negative predictions determined by OCE, the framework assesses the reliability of the DETR model for each test image.
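For readers unfamiliar with the calibration metrics the abstract refers to, the following is a minimal sketch of the standard binned expected calibration error (ECE). It is not the Object-level Calibration Error (OCE) proposed in the paper, whose definition is not given in the abstract. The function name, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE (illustrative; NOT the paper's OCE).

    confidences: predicted confidence scores in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to last bin
        bins[idx].append((conf, ok))

    # Weighted average of |accuracy - mean confidence| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A model that reports 0.9 confidence on four predictions but gets only three right has a gap of about 0.15 in that bin, which is the value this function returns for that input.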

A Scalable and Transferable Time Series Prediction Framework for Demand Forecasting

Feb 29, 2024

Young-Jin Park, Donghyun Kim, Frédéric Odermatt, Juho Lee, Kyung-Min Kim

Abstract:Time series forecasting is one of the most essential and ubiquitous tasks in many business problems, including demand forecasting and logistics optimization. Traditional time series forecasting methods, however, have resulted in small models with limited expressive power because they have difficulty in scaling their model size up while maintaining high accuracy. In this paper, we propose Forecasting orchestra (Forchestra), a simple but powerful framework capable of accurately predicting future demand for a diverse range of items. We empirically demonstrate that the model size is scalable to up to 0.8 billion parameters. The proposed method not only outperforms existing forecasting models with a significant margin, but it could generalize well to unseen data points when evaluated in a zero-shot fashion on downstream datasets. Last but not least, we present extensive qualitative and quantitative studies to analyze how the proposed model outperforms baseline models and differs from conventional approaches. The original paper was presented as a full paper at ICDM 2022 and is available at: https://ieeexplore.ieee.org/document/10027662.

* Published as a full paper at ICDM 2022

Representation Reliability and Its Impact on Downstream Tasks

May 31, 2023

Young-Jin Park, Hao Wang, Shervin Ardeshir, Navid Azizan

Abstract:Self-supervised pre-trained models extract general-purpose representations from data, and quantifying how reliable they are is crucial because many downstream models use these representations as input for their own tasks. To this end, we first introduce a formal definition of representation reliability: the representation for a given test input is considered to be reliable if the downstream models built on top of that representation can consistently generate accurate predictions for that test point. It is desired to estimate the representation reliability without knowing the downstream tasks a priori. We provide a negative result showing that existing frameworks for uncertainty quantification in supervised learning are not suitable for this purpose. As an alternative, we propose an ensemble-based method for quantifying representation reliability, based on the concept of neighborhood consistency in the representation spaces across various pre-trained models. More specifically, the key insight is to use shared neighboring points as anchors to align different representation spaces. We demonstrate through comprehensive numerical experiments that our method is capable of predicting representation reliability with high accuracy.
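The idea of neighborhood consistency across representation spaces can be illustrated with a simplified sketch. The abstract does not give the exact estimator, so everything below is an assumption: the function names, the use of Euclidean k-nearest neighbors, and the averaging of pairwise Jaccard overlaps between neighbor sets. The only element taken from the abstract is the use of shared reference points as anchors, so that spaces with different scales or orientations can still be compared.

```python
import math
from itertools import combinations


def knn_indices(query, refs, k):
    """Indices of the k reference points closest to `query` (Euclidean)."""
    ranked = sorted((math.dist(query, r), i) for i, r in enumerate(refs))
    return {i for _, i in ranked[:k]}


def neighborhood_consistency(query_embs, ref_embs, k=2):
    """Hypothetical score: mean Jaccard overlap of neighbor sets across models.

    query_embs[m]: embedding of the test point under pre-trained model m
    ref_embs[m]:   embeddings of the same reference points under model m,
                   listed in the same order for every model
    """
    neighbor_sets = [
        knn_indices(q, refs, k) for q, refs in zip(query_embs, ref_embs)
    ]
    pairs = list(combinations(neighbor_sets, 2))
    if not pairs:
        raise ValueError("need at least two models to compare")
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

A score near 1 means the models agree on which reference points surround the test point; a low score means they disagree, which under this sketch's assumption signals a less reliable representation.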

VQ-AR: Vector Quantized Autoregressive Probabilistic Time Series Forecasting

May 31, 2022

Kashif Rasul, Young-Jin Park, Max Nihlén Ramström, Kyung-Min Kim

Abstract:Time series models aim for accurate predictions of the future given the past, where the forecasts are used for important downstream tasks like business decision making. In practice, deep learning based time series models come in many forms, but at a high level learn some continuous representation of the past and use it to output point or probabilistic forecasts. In this paper, we introduce a novel autoregressive architecture, VQ-AR, which instead learns a discrete set of representations that are used to predict the future. Extensive empirical comparison with other competitive deep learning models shows that, surprisingly, such a discrete set of representations gives state-of-the-art or equivalent results on a wide variety of time series datasets. We also highlight the shortcomings of this approach, explore its zero-shot generalization capabilities, and present an ablation study on the number of representations. The full source code of the method will be available at the time of publication with the hope that researchers can further investigate this important but overlooked inductive bias for the time series domain.
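What "a discrete set of representations" means in practice can be shown with a generic vector-quantization lookup: a continuous vector is replaced by the nearest entry of a finite codebook. This is a sketch of the general technique only, not the VQ-AR architecture; the function name, the Euclidean metric, and the toy codebook in the usage note are assumptions.

```python
import math


def quantize(vector, codebook):
    """Return (index, code) of the codebook entry nearest to `vector`.

    Generic nearest-neighbor vector quantization; the codebook here is
    fixed, whereas a trained model would learn its entries.
    """
    if not codebook:
        raise ValueError("codebook must not be empty")
    idx = min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))
    return idx, codebook[idx]
```

With the toy codebook `[(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]`, the input `(0.9, 1.2)` maps to index 1, so downstream components see only one of three possible codes rather than an arbitrary continuous vector.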

Global-Local Item Embedding for Temporal Set Prediction

Sep 05, 2021

Seungjae Jung, Young-Jin Park, Jisu Jeong, Kyung-Min Kim, Hiun Kim, Minkyu Kim, Hanock Kwak

Abstract:Temporal set prediction is becoming increasingly important as many companies employ recommender systems in their online businesses, e.g., personalized purchase prediction of shopping baskets. While most previous techniques have focused on leveraging a user's history, the study of combining it with others' histories remains untapped potential. This paper proposes Global-Local Item Embedding (GLOIE) that learns to utilize the temporal properties of sets across whole users as well as within a user by coining the names as global and local information to distinguish the two temporal patterns. GLOIE uses Variational Autoencoder (VAE) and dynamic graph-based model to capture global and local information and then applies attention to integrate resulting item embeddings. Additionally, we propose to use Tweedie output for the decoder of VAE as it can easily model zero-inflated and long-tailed distribution, which is more suitable for several real-world data distributions than Gaussian or multinomial counterparts. When evaluated on three public benchmarks, our algorithm consistently outperforms previous state-of-the-art methods in most ranking metrics.

* 8 pages, 3 figures. To appear in RecSys 2021 LBR

One4all User Representation for Recommender Systems in E-commerce

May 24, 2021

Kyuyong Shin, Hanock Kwak, Kyung-Min Kim, Minkyu Kim, Young-Jin Park, Jisu Jeong, Seungjae Jung

Abstract:General-purpose representation learning through large-scale pre-training has shown promising results in the various machine learning fields. For an e-commerce domain, the objective of general-purpose, i.e., one for all, representations would be efficient applications for extensive downstream tasks such as user profiling, targeting, and recommendation tasks. In this paper, we systematically compare the generalizability of two learning strategies, i.e., transfer learning through the proposed model, ShopperBERT, vs. learning from scratch. ShopperBERT learns nine pretext tasks with 79.2M parameters from 0.8B user behaviors collected over two years to produce user embeddings. As a result, the MLPs that employ our embedding method outperform more complex models trained from scratch for five out of six tasks. Specifically, the pre-trained embeddings have superiority over the task-specific supervised features and the strong baselines, which learn the auxiliary dataset for the cold-start problem. We also show the computational efficiency and embedding visualization of the pre-trained features.

A Worrying Analysis of Probabilistic Time-series Models for Sales Forecasting

Nov 21, 2020

Seungjae Jung, Kyung-Min Kim, Hanock Kwak, Young-Jin Park

Abstract:Probabilistic time-series models have become popular in the forecasting field as they help to make optimal decisions under uncertainty. Despite the growing interest, a lack of thorough analysis hinders choosing what is worth applying for the desired task. In this paper, we analyze the performance of three prominent probabilistic time-series models for sales forecasting. To remove the role of random chance in an architecture's performance, we adopt two experimental principles: (1) a large-scale dataset with various cross-validation sets, and (2) standardized training and hyperparameter selection. The experimental results show that a simple Multi-layer Perceptron and Linear Regression outperform the probabilistic models on RMSE without any feature engineering. Overall, the probabilistic models fail to achieve better performance on point-estimation metrics, such as RMSE and MAPE, than comparably simple baselines. We analyze and discuss the performances of probabilistic time-series models.

* NeurIPS 2020 workshop (I Can't Believe It's Not Better, ICBINB@NeurIPS 2020). All authors contributed equally to this research

Distilling a Hierarchical Policy for Planning and Control via Representation and Reinforcement Learning

Nov 16, 2020

Jung-Su Ha, Young-Jin Park, Hyeok-Joo Chae, Soon-Seo Park, Han-Lim Choi

Abstract:We present a hierarchical planning and control framework that enables an agent to perform various tasks and adapt to a new task flexibly. Rather than learning an individual policy for each particular task, the proposed framework, DISH, distills a hierarchical policy from a set of tasks by representation and reinforcement learning. The framework is based on the idea of latent variable models that represent high-dimensional observations using low-dimensional latent variables. The resulting policy consists of two levels of hierarchy: (i) a planning module that reasons a sequence of latent intentions that would lead to an optimistic future and (ii) a feedback control policy, shared across the tasks, that executes the inferred intention. Because the planning is performed in low-dimensional latent space, the learned policy can immediately be used to solve or adapt to new tasks without additional training. We demonstrate the proposed framework can learn compact representations (3- and 1-dimensional latent states and commands for a humanoid with 197- and 36-dimensional state features and actions) while solving a small number of imitation tasks, and the resulting policy is directly applicable to other types of tasks, i.e., navigation in cluttered environments.

* the first two authors contributed equally

div2vec: Diversity-Emphasized Node Embedding

Sep 21, 2020

Jisu Jeong, Jeong-Min Yun, Hongi Keam, Young-Jin Park, Zimin Park, Junki Cho

Abstract:Recently, interest in graph representation learning has been rapidly increasing in recommender systems. Most existing studies have focused on improving accuracy, but in real-world systems recommendation diversity should be considered as well to improve user experience. In this paper, we propose the diversity-emphasized node embedding div2vec, which is a random walk-based unsupervised learning method like DeepWalk and node2vec. When generating random walks, DeepWalk and node2vec sample nodes of higher degree more and nodes of lower degree less. On the other hand, div2vec samples each node with probability inversely proportional to its degree so that every node can evenly belong to the collection of random walks. This strategy improves the diversity of recommendation models. Offline experiments on the MovieLens dataset showed that our new method improves the recommendation performance in terms of both accuracy and diversity. Moreover, we evaluated the proposed model on two real-world services, WATCHA and LINE Wallet Coupon, and observed that div2vec improves the recommendation quality by diversifying the system.
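The degree-based sampling described in the abstract can be sketched as a random walk in which each candidate next node is weighted by the inverse of its degree. This is a minimal illustration under stated assumptions: the graph is undirected and stored as an adjacency dictionary, the weight is exactly `1 / degree`, and the function name is hypothetical. The weighting function used in the paper may differ in detail.

```python
import random


def degree_penalized_walk(adj, start, length, rng=None):
    """Random walk that favors low-degree neighbors (illustrative sketch).

    adj:    dict mapping node -> list of neighbors (undirected graph assumed,
            so every listed neighbor has degree >= 1)
    length: number of nodes in the returned walk, including `start`
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        # Weight each neighbor by the inverse of its own degree.
        weights = [1.0 / len(adj[v]) for v in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

In a graph where node `a` is adjacent to a hub of degree 3 and a node `b` of degree 2, the walk leaves `a` toward `b` with probability 0.6, whereas uniform sampling would give each neighbor 0.5.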

* To appear in the ImpactRS Workshop at ACM RecSys 2020

Multi-Manifold Learning for Large-scale Targeted Advertising System

Jul 08, 2020

Kyuyong Shin, Young-Jin Park, Kyung-Min Kim, Sunyoung Kwon

Abstract:Messenger advertisements (ads) give direct and personal user experience yielding high conversion rates and sales. However, people are skeptical about ads and sometimes perceive them as spam, which eventually leads to a decrease in user satisfaction. Targeted advertising, which serves ads to individuals who may exhibit interest in a particular advertising message, is strongly required. The key to the success of precise user targeting lies in learning the accurate user and ad representation in the embedding space. Most of the previous studies have limited the representation learning in the Euclidean space, but recent studies have suggested hyperbolic manifold learning for the distinct projection of complex network properties emerging from real-world datasets such as social networks, recommender systems, and advertising. We propose a framework that can effectively learn the hierarchical structure in users and ads on the hyperbolic space, and extend to the Multi-Manifold Learning. Our method constructs multiple hyperbolic manifolds with learnable curvatures and maps the representation of user and ad to each manifold. The origin of each manifold is set as the centroid of each user cluster. The user preference for each ad is estimated using the distance between two entities in the hyperbolic space, and the final prediction is determined by aggregating the values calculated from the learned multiple manifolds. We evaluate our method on public benchmark datasets and a large-scale commercial messenger system LINE, and demonstrate its effectiveness through improved performance.
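As an illustration of scoring a user-ad pair by hyperbolic distance, the sketch below computes the geodesic distance in a Poincaré ball of curvature `-c`. The abstract does not say which model of hyperbolic space the authors use, so the Poincaré ball, the function name, and the parameter `c` (standing in for a learnable curvature) are assumptions. The closed form used is d(x, y) = (1/sqrt(c)) * arcosh(1 + 2c||x - y||^2 / ((1 - c||x||^2)(1 - c||y||^2))).

```python
import math


def poincare_distance(x, y, c=1.0):
    """Geodesic distance in the Poincare ball of curvature -c (c > 0).

    Points must lie strictly inside the ball of radius 1/sqrt(c).
    """
    if c <= 0:
        raise ValueError("c must be positive")
    sq_norm_x = sum(v * v for v in x)
    sq_norm_y = sum(v * v for v in y)
    if c * sq_norm_x >= 1.0 or c * sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_norm_x) * (1.0 - c * sq_norm_y))
    return math.acosh(max(arg, 1.0)) / math.sqrt(c)
```

A smaller distance would correspond to a stronger estimated preference. Calling the function once per manifold, each with its own `c`, and combining the results (for example by averaging) would be one way to realize the aggregation step the abstract mentions, though the actual aggregation rule is not specified there.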

* Accepted at AdKDD 2020
