Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Seulki Lee

TransPL: VQ-Code Transition Matrices for Pseudo-Labeling of Time Series Unsupervised Domain Adaptation

May 15, 2025

Jaeho Kim, Seulki Lee

Abstract:Unsupervised domain adaptation (UDA) for time series data remains a critical challenge in deep learning, with traditional pseudo-labeling strategies failing to capture temporal patterns and channel-wise shifts between domains, producing sub-optimal pseudo-labels. As such, we introduce TransPL, a novel approach that addresses these limitations by modeling the joint distribution $P(\mathbf{X}, y)$ of the source domain through code transition matrices, where the codes are derived from vector quantization (VQ) of time series patches. Our method constructs class- and channel-wise code transition matrices from the source domain and employs Bayes' rule for target domain adaptation, generating pseudo-labels based on channel-wise weighted class-conditional likelihoods. TransPL offers three key advantages: explicit modeling of temporal transitions and channel-wise shifts between different domains, versatility towards different UDA scenarios (e.g., weakly-supervised UDA), and explainable pseudo-label generation. We validate TransPL's effectiveness through extensive analysis on four time series UDA benchmarks and confirm that it consistently outperforms state-of-the-art pseudo-labeling methods by a strong margin (6.1% accuracy improvement, 4.9% F1 improvement), while providing interpretable insights into the domain adaptation process through its learned code transition matrices.

* ICML 2025 Accept

Via

Access Paper or Ask Questions

Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards

May 08, 2025

Jaeho Kim, Yunseok Lee, Seulki Lee

Abstract:The peer review process in major artificial intelligence (AI) conferences faces unprecedented challenges with the surge of paper submissions (exceeding 10,000 submissions per venue), accompanied by growing concerns over review quality and reviewer responsibility. This position paper argues for the need to transform the traditional one-way review system into a bi-directional feedback loop where authors evaluate review quality and reviewers earn formal accreditation, creating an accountability framework that promotes a sustainable, high-quality peer review system. The current review system can be viewed as an interaction between three parties: the authors, reviewers, and system (i.e., conference), where we posit that all three parties share responsibility for the current problems. However, issues with authors can only be addressed through policy enforcement and detection tools, and ethical concerns can only be corrected through self-reflection. As such, this paper focuses on reforming reviewer accountability with systematic rewards through two key mechanisms: (1) a two-stage bi-directional review system that allows authors to evaluate reviews while minimizing retaliatory behavior, (2)a systematic reviewer reward system that incentivizes quality reviewing. We ask for the community's strong interest in these problems and the reforms that are needed to enhance the peer review process.

* ICML2025 Position Track Oral

Via

Access Paper or Ask Questions

On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices

Apr 01, 2025

Bosung Kim, Kyuhwan Lee, Isu Jeong, Jungmin Cheon, Yeojin Lee, Seulki Lee

Abstract:We present On-device Sora, the first model training-free solution for diffusion-based on-device text-to-video generation that operates efficiently on smartphone-grade devices. To address the challenges of diffusion-based text-to-video generation on computation- and memory-limited mobile devices, the proposed On-device Sora applies three novel techniques to pre-trained video generative models. First, Linear Proportional Leap (LPL) reduces the excessive denoising steps required in video diffusion through an efficient leap-based approach. Second, Temporal Dimension Token Merging (TDTM) minimizes intensive token-processing computation in attention layers by merging consecutive tokens along the temporal dimension. Third, Concurrent Inference with Dynamic Loading (CI-DL) dynamically partitions large models into smaller blocks and loads them into memory for concurrent model inference, effectively addressing the challenges of limited device memory. We implement On-device Sora on the iPhone 15 Pro, and the experimental evaluations show that it is capable of generating high-quality videos on the device, comparable to those produced by high-end GPUs. These results show that On-device Sora enables efficient and high-quality video generation on resource-constrained mobile devices. We envision the proposed On-device Sora as a significant first step toward democratizing state-of-the-art generative technologies, enabling video generation on commodity mobile and embedded devices without resource-intensive re-training for model optimization (compression). The code implementation is available at a GitHub repository(https://github.com/eai-lab/On-device-Sora).

* Replicated Submission. arXiv:2502.04363 submitted as second version of the paper

Via

Access Paper or Ask Questions

On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Feb 05, 2025

Bosung Kim, Kyuhwan Lee, Isu Jeong, Jungmin Cheon, Yeojin Lee, Seulki Lee

Figure 1 for On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Figure 2 for On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Figure 3 for On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Figure 4 for On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Abstract:We present On-device Sora, a first pioneering solution for diffusion-based on-device text-to-video generation that operates efficiently on smartphone-grade devices. Building on Open-Sora, On-device Sora applies three novel techniques to address the challenges of diffusion-based text-to-video generation on computation- and memory-limited mobile devices. First, Linear Proportional Leap (LPL) reduces the excessive denoising steps required in video diffusion through an efficient leap-based approach. Second, Temporal Dimension Token Merging (TDTM) minimizes intensive token-processing computation in attention layers by merging consecutive tokens along the temporal dimension. Third, Concurrent Inference with Dynamic Loading (CI-DL) dynamically partitions large models into smaller blocks and loads them into memory for concurrent model inference, effectively addressing the challenges of limited device memory. We implement On-device Sora on the iPhone 15 Pro, and the experimental evaluations demonstrate that it is capable of generating high-quality videos on the device, comparable to those produced by Open-Sora running on high-end GPUs. These results show that On-device Sora enables efficient and high-quality video generation on resource-constrained mobile devices, expanding accessibility, ensuring user privacy, reducing dependence on cloud infrastructure, and lowering associated costs. We envision the proposed On-device Sora as a significant first step toward democratizing state-of-the-art generative technologies, enabling video generation capabilities on commodity mobile and embedded devices. The code implementation is publicly available at an GitHub repository: https://github.com/eai-lab/On-device-Sora.

Via

Access Paper or Ask Questions

SMMF: Square-Matricized Momentum Factorization for Memory-Efficient Optimization

Dec 12, 2024

Kwangryeol Park, Seulki Lee

Abstract:We propose SMMF (Square-Matricized Momentum Factorization), a memory-efficient optimizer that reduces the memory requirement of the widely used adaptive learning rate optimizers, such as Adam, by up to 96%. SMMF enables flexible and efficient factorization of an arbitrary rank (shape) of the first and second momentum tensors during optimization, based on the proposed square-matricization and one-time single matrix factorization. From this, it becomes effectively applicable to any rank (shape) of momentum tensors, i.e., bias, matrix, and any rank-d tensors, prevalent in various deep model architectures, such as CNNs (high rank) and Transformers (low rank), in contrast to existing memory-efficient optimizers that applies only to a particular (rank-2) momentum tensor, e.g., linear layers. We conduct a regret bound analysis of SMMF, which shows that it converges similarly to non-memory-efficient adaptive learning rate optimizers, such as AdamNC, providing a theoretical basis for its competitive optimization capability. In our experiment, SMMF takes up to 96% less memory compared to state-of-the-art memory efficient optimizers, e.g., Adafactor, CAME, and SM3, while achieving comparable model performance on various CNN and Transformer tasks.

Via

Access Paper or Ask Questions

Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks

Aug 07, 2024

Jaewook Lee, Yoel Park, Seulki Lee

Figure 1 for Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks

Figure 2 for Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks

Figure 3 for Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks

Figure 4 for Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks

Abstract:In this paper, we introduce a memory-efficient CNN (convolutional neural network), which enables resource-constrained low-end embedded and IoT devices to perform on-device vision tasks, such as image classification and object detection, using extremely low memory, i.e., only 63 KB on ImageNet classification. Based on the bottleneck block of MobileNet, we propose three design principles that significantly curtail the peak memory usage of a CNN so that it can fit the limited KB memory of the low-end device. First, 'input segmentation' divides an input image into a set of patches, including the central patch overlapped with the others, reducing the size (and memory requirement) of a large input image. Second, 'patch tunneling' builds independent tunnel-like paths consisting of multiple bottleneck blocks per patch, penetrating through the entire model from an input patch to the last layer of the network, maintaining lightweight memory usage throughout the whole network. Lastly, 'bottleneck reordering' rearranges the execution order of convolution operations inside the bottleneck block such that the memory usage remains constant regardless of the size of the convolution output channels. The experiment result shows that the proposed network classifies ImageNet with extremely low memory (i.e., 63 KB) while achieving competitive top-1 accuracy (i.e., 61.58\%). To the best of our knowledge, the memory usage of the proposed network is far smaller than state-of-the-art memory-efficient networks, i.e., up to 89x and 3.1x smaller than MobileNet (i.e., 5.6 MB) and MCUNet (i.e., 196 KB), respectively.

Via

Access Paper or Ask Questions

CAFO: Feature-Centric Explanation on Time Series Classification

Jun 03, 2024

Jaeho Kim, Seok-Ju Hahn, Yoontae Hwang, Junghye Lee, Seulki Lee

Abstract:In multivariate time series (MTS) classification, finding the important features (e.g., sensors) for model performance is crucial yet challenging due to the complex, high-dimensional nature of MTS data, intricate temporal dynamics, and the necessity for domain-specific interpretations. Current explanation methods for MTS mostly focus on time-centric explanations, apt for pinpointing important time periods but less effective in identifying key features. This limitation underscores the pressing need for a feature-centric approach, a vital yet often overlooked perspective that complements time-centric analysis. To bridge this gap, our study introduces a novel feature-centric explanation and evaluation framework for MTS, named CAFO (Channel Attention and Feature Orthgonalization). CAFO employs a convolution-based approach with channel attention mechanisms, incorporating a depth-wise separable channel attention module (DepCA) and a QR decomposition-based loss for promoting feature-wise orthogonality. We demonstrate that this orthogonalization enhances the separability of attention distributions, thereby refining and stabilizing the ranking of feature importance. This improvement in feature-wise ranking enhances our understanding of feature explainability in MTS. Furthermore, we develop metrics to evaluate global and class-specific feature importance. Our framework's efficacy is validated through extensive empirical analyses on two major public benchmarks and real-world datasets, both synthetic and self-collected, specifically designed to highlight class-wise discriminative features. The results confirm CAFO's robustness and informative capacity in assessing feature importance in MTS classification tasks. This study not only advances the understanding of feature-centric explanations in MTS but also sets a foundation for future explorations in feature-centric explanations.

* Accepted to KDD 2024 Research Track

Via

Access Paper or Ask Questions

Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Apr 21, 2019

Seulki Lee, Bashima Islam, Yubo Luo, Shahriar Nirjon

Figure 1 for Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Figure 2 for Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Figure 3 for Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Figure 4 for Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Abstract:In this paper, we introduce the concept of intermittent learning, which enables energy harvested computing platforms to execute certain classes of machine learning tasks. We identify unique challenges to intermittent learning relating to the data and application semantics of machine learning tasks. To address these challenges, we devise an algorithm that determines a sequence of actions to achieve the desired learning objective under tight energy constraints. We further increase the energy efficiency of the system by proposing three heuristics that help an intermittent learner decide whether to learn or discard training examples at run-time. In order to provide a probabilistic bound on the completion of a learning task, we perform an energy event-based analysis that helps us analyze intermittent learning systems where the uncertainty lies in both energy and training example generation. We implement and evaluate three intermittent learning applications that learn the air quality, human presence, and vibration using solar, RF, and kinetic energy harvesters, respectively. We demonstrate that the proposed framework improves the energy efficiency of a learner by up to 100% and cuts down the number of learning examples by up to 50% when compared to state-of-the-art intermittent computing systems without our framework.

Via

Access Paper or Ask Questions