Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tsubasa Takahashi

One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression

Jan 17, 2025

Keita Miwa, Kento Sasaki, Hidehisa Arai, Tsubasa Takahashi, Yu Yamaguchi

Abstract:Current image tokenization methods require a large number of tokens to capture the information contained within images. Although the amount of information varies across images, most image tokenizers only support fixed-length tokenization, leading to inefficiency in token allocation. In this study, we introduce One-D-Piece, a discrete image tokenizer designed for variable-length tokenization, achieving quality-controllable mechanism. To enable variable compression rate, we introduce a simple but effective regularization mechanism named "Tail Token Drop" into discrete one-dimensional image tokenizers. This method encourages critical information to concentrate at the head of the token sequence, enabling support of variadic tokenization, while preserving state-of-the-art reconstruction quality. We evaluate our tokenizer across multiple reconstruction quality metrics and find that it delivers significantly better perceptual quality than existing quality-controllable compression methods, including JPEG and WebP, at smaller byte sizes. Furthermore, we assess our tokenizer on various downstream computer vision tasks, including image classification, object detection, semantic segmentation, and depth estimation, confirming its adaptability to numerous applications compared to other variable-rate methods. Our approach demonstrates the versatility of variable-length discrete image tokenization, establishing a new paradigm in both compression efficiency and reconstruction performance. Finally, we validate the effectiveness of tail token drop via detailed analysis of tokenizers.

* Our Project Page: https://turingmotors.github.io/one-d-piece-tokenizer

Via

Access Paper or Ask Questions

ACT-Bench: Towards Action Controllable World Models for Autonomous Driving

Dec 06, 2024

Hidehisa Arai, Keishi Ishihara, Tsubasa Takahashi, Yu Yamaguchi

Abstract:World models have emerged as promising neural simulators for autonomous driving, with the potential to supplement scarce real-world data and enable closed-loop evaluations. However, current research primarily evaluates these models based on visual realism or downstream task performance, with limited focus on fidelity to specific action instructions - a crucial property for generating targeted simulation scenes. Although some studies address action fidelity, their evaluations rely on closed-source mechanisms, limiting reproducibility. To address this gap, we develop an open-access evaluation framework, ACT-Bench, for quantifying action fidelity, along with a baseline world model, Terra. Our benchmarking framework includes a large-scale dataset pairing short context videos from nuScenes with corresponding future trajectory data, which provides conditional input for generating future video frames and enables evaluation of action fidelity for executed motions. Furthermore, Terra is trained on multiple large-scale trajectory-annotated datasets to enhance action fidelity. Leveraging this framework, we demonstrate that the state-of-the-art model does not fully adhere to given instructions, while Terra achieves improved action fidelity. All components of our benchmark framework will be made publicly available to support future research.

Via

Access Paper or Ask Questions

Self-Preference Bias in LLM-as-a-Judge

Oct 29, 2024

Koki Wataoka, Tsubasa Takahashi, Ryokan Ri

Figure 1 for Self-Preference Bias in LLM-as-a-Judge

Figure 2 for Self-Preference Bias in LLM-as-a-Judge

Figure 3 for Self-Preference Bias in LLM-as-a-Judge

Figure 4 for Self-Preference Bias in LLM-as-a-Judge

Abstract:Automated evaluation leveraging large language models (LLMs), commonly referred to as LLM evaluators or LLM-as-a-judge, has been widely used in measuring the performance of dialogue systems. However, the self-preference bias in LLMs has posed significant risks, including promoting specific styles or policies intrinsic to the LLMs. Despite the importance of this issue, there is a lack of established methods to measure the self-preference bias quantitatively, and its underlying causes are poorly understood. In this paper, we introduce a novel quantitative metric to measure the self-preference bias. Our experimental results demonstrate that GPT-4 exhibits a significant degree of self-preference bias. To explore the causes, we hypothesize that LLMs may favor outputs that are more familiar to them, as indicated by lower perplexity. We analyze the relationship between LLM evaluations and the perplexities of outputs. Our findings reveal that LLMs assign significantly higher evaluations to outputs with lower perplexity than human evaluators, regardless of whether the outputs were self-generated. This suggests that the essence of the bias lies in perplexity and that the self-preference bias exists because LLMs prefer texts more familiar to them.

Via

Access Paper or Ask Questions

MergePrint: Robust Fingerprinting against Merging Large Language Models

Oct 11, 2024

Shojiro Yamabe, Tsubasa Takahashi, Futa Waseda, Koki Wataoka

Figure 1 for MergePrint: Robust Fingerprinting against Merging Large Language Models

Figure 2 for MergePrint: Robust Fingerprinting against Merging Large Language Models

Figure 3 for MergePrint: Robust Fingerprinting against Merging Large Language Models

Figure 4 for MergePrint: Robust Fingerprinting against Merging Large Language Models

Abstract:As the cost of training large language models (LLMs) rises, protecting their intellectual property has become increasingly critical. Model merging, which integrates multiple expert models into a single model capable of performing multiple tasks, presents a growing risk of unauthorized and malicious usage. While fingerprinting techniques have been studied for asserting model ownership, existing methods have primarily focused on fine-tuning, leaving model merging underexplored. To address this gap, we propose a novel fingerprinting method MergePrint that embeds robust fingerprints designed to preserve ownership claims even after model merging. By optimizing against a pseudo-merged model, which simulates post-merged model weights, MergePrint generates fingerprints that remain detectable after merging. Additionally, we optimize the fingerprint inputs to minimize performance degradation, enabling verification through specific outputs from targeted inputs. This approach provides a practical fingerprinting strategy for asserting ownership in cases of misappropriation through model merging.

* Under review

Via

Access Paper or Ask Questions

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

Apr 19, 2024

Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka

Abstract:Diffusion Models (DMs) have shown remarkable capabilities in various image-generation tasks. However, there are growing concerns that DMs could be used to imitate unauthorized creations and thus raise copyright issues. To address this issue, we propose a novel framework that embeds personal watermarks in the generation of adversarial examples. Such examples can force DMs to generate images with visible watermarks and prevent DMs from imitating unauthorized images. We construct a generator based on conditional adversarial networks and design three losses (adversarial loss, GAN loss, and perturbation loss) to generate adversarial examples that have subtle perturbation but can effectively attack DMs to prevent copyright violations. Training a generator for a personal watermark by our method only requires 5-10 samples within 2-3 minutes, and once the generator is trained, it can generate adversarial examples with that watermark significantly fast (0.2s per image). We conduct extensive experiments in various conditional image-generation scenarios. Compared to existing methods that generate images with chaotic textures, our method adds visible watermarks on the generated images, which is a more straightforward way to indicate copyright violations. We also observe that our adversarial examples exhibit good transferability across unknown generative models. Therefore, this work provides a simple yet powerful way to protect copyright from DM-based imitation.

* updated references

Via

Access Paper or Ask Questions

Understanding Likelihood of Normalizing Flow and Image Complexity through the Lens of Out-of-Distribution Detection

Feb 16, 2024

Genki Osada, Tsubasa Takahashi, Takashi Nishide

Abstract:Out-of-distribution (OOD) detection is crucial to safety-critical machine learning applications and has been extensively studied. While recent studies have predominantly focused on classifier-based methods, research on deep generative model (DGM)-based methods have lagged relatively. This disparity may be attributed to a perplexing phenomenon: DGMs often assign higher likelihoods to unknown OOD inputs than to their known training data. This paper focuses on explaining the underlying mechanism of this phenomenon. We propose a hypothesis that less complex images concentrate in high-density regions in the latent space, resulting in a higher likelihood assignment in the Normalizing Flow (NF). We experimentally demonstrate its validity for five NF architectures, concluding that their likelihood is untrustworthy. Additionally, we show that this problem can be alleviated by treating image complexity as an independent variable. Finally, we provide evidence of the potential applicability of our hypothesis in another DGM, PixelCNN++.

* Accepted at AAAI-24

Via

Access Paper or Ask Questions

Scaling Private Deep Learning with Low-Rank and Sparse Gradients

Jul 06, 2022

Ryuichi Ito, Seng Pei Liew, Tsubasa Takahashi, Yuya Sasaki, Makoto Onizuka

Figure 1 for Scaling Private Deep Learning with Low-Rank and Sparse Gradients

Figure 2 for Scaling Private Deep Learning with Low-Rank and Sparse Gradients

Figure 3 for Scaling Private Deep Learning with Low-Rank and Sparse Gradients

Figure 4 for Scaling Private Deep Learning with Low-Rank and Sparse Gradients

Abstract:Applying Differentially Private Stochastic Gradient Descent (DPSGD) to training modern, large-scale neural networks such as transformer-based models is a challenging task, as the magnitude of noise added to the gradients at each iteration scales with model dimension, hindering the learning capability significantly. We propose a unified framework, $\textsf{LSG}$, that fully exploits the low-rank and sparse structure of neural networks to reduce the dimension of gradient updates, and hence alleviate the negative impacts of DPSGD. The gradient updates are first approximated with a pair of low-rank matrices. Then, a novel strategy is utilized to sparsify the gradients, resulting in low-dimensional, less noisy updates that are yet capable of retaining the performance of neural networks. Empirical evaluation on natural language processing and computer vision tasks shows that our method outperforms other state-of-the-art baselines.

Via

Access Paper or Ask Questions

Shuffle Gaussian Mechanism for Differential Privacy

Jul 04, 2022

Seng Pei Liew, Tsubasa Takahashi

Figure 1 for Shuffle Gaussian Mechanism for Differential Privacy

Figure 2 for Shuffle Gaussian Mechanism for Differential Privacy

Abstract:We study Gaussian mechanism in the shuffle model of differential privacy (DP). Particularly, we characterize the mechanism's R\'enyi differential privacy (RDP), showing that it is of the form: $$ \epsilon(\lambda) \leq \frac{1}{\lambda-1}\log\left(\frac{e^{-\lambda/2\sigma^2}}{n^\lambda}\sum_{\substack{k_1+\dotsc+k_n=\lambda;\\k_1,\dotsc,k_n\geq 0}}\binom{\lambda}{k_1,\dotsc,k_n}e^{\sum_{i=1}^nk_i^2/2\sigma^2}\right) $$ We further prove that the RDP is strictly upper-bounded by the Gaussian RDP without shuffling. The shuffle Gaussian RDP is advantageous in composing multiple DP mechanisms, where we demonstrate its improvement over the state-of-the-art approximate DP composition theorems in privacy guarantees of the shuffle model. Moreover, we extend our study to the subsampled shuffle mechanism and the recently proposed shuffled check-in mechanism, which are protocols geared towards distributed/federated learning. Finally, an empirical study of these mechanisms is given to demonstrate the efficacy of employing shuffle Gaussian mechanism under the distributed learning framework to guarantee rigorous user privacy.

* Fixed typos. The source code of our implementation is available at http://github.com/spliew/shuffgauss

Via

Access Paper or Ask Questions

Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Jun 07, 2022

Seng Pei Liew, Satoshi Hasegawa, Tsubasa Takahashi

Figure 1 for Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Figure 2 for Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Figure 3 for Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Figure 4 for Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Abstract:Recent studies of distributed computation with formal privacy guarantees, such as differentially private (DP) federated learning, leverage random sampling of clients in each round (privacy amplification by subsampling) to achieve satisfactory levels of privacy. Achieving this however requires strong assumptions which may not hold in practice, including precise and uniform subsampling of clients, and a highly trusted aggregator to process clients' data. In this paper, we explore a more practical protocol, shuffled check-in, to resolve the aforementioned issues. The protocol relies on client making independent and random decision to participate in the computation, freeing the requirement of server-initiated subsampling, and enabling robust modelling of client dropouts. Moreover, a weaker trust model known as the shuffle model is employed instead of using a trusted aggregator. To this end, we introduce new tools to characterize the R\'enyi differential privacy (RDP) of shuffled check-in. We show that our new techniques improve at least three times in privacy guarantee over those using approximate DP's strong composition at various parameter regimes. Furthermore, we provide a numerical approach to track the privacy of generic shuffled check-in mechanism including distributed stochastic gradient descent (SGD) with Gaussian mechanism. To the best of our knowledge, this is also the first evaluation of Gaussian mechanism within the local/shuffle model under the distributed setting in the literature, which can be of independent interest.

* 16 pages, 4 figures

Via

Access Paper or Ask Questions

Network Shuffling: Privacy Amplification via Random Walks

Apr 08, 2022

Seng Pei Liew, Tsubasa Takahashi, Shun Takagi, Fumiyuki Kato, Yang Cao, Masatoshi Yoshikawa

Figure 1 for Network Shuffling: Privacy Amplification via Random Walks

Figure 2 for Network Shuffling: Privacy Amplification via Random Walks

Figure 3 for Network Shuffling: Privacy Amplification via Random Walks

Figure 4 for Network Shuffling: Privacy Amplification via Random Walks

Abstract:Recently, it is shown that shuffling can amplify the central differential privacy guarantees of data randomized with local differential privacy. Within this setup, a centralized, trusted shuffler is responsible for shuffling by keeping the identities of data anonymous, which subsequently leads to stronger privacy guarantees for systems. However, introducing a centralized entity to the originally local privacy model loses some appeals of not having any centralized entity as in local differential privacy. Moreover, implementing a shuffler in a reliable way is not trivial due to known security issues and/or requirements of advanced hardware or secure computation technology. Motivated by these practical considerations, we rethink the shuffle model to relax the assumption of requiring a centralized, trusted shuffler. We introduce network shuffling, a decentralized mechanism where users exchange data in a random-walk fashion on a network/graph, as an alternative of achieving privacy amplification via anonymity. We analyze the threat model under such a setting, and propose distributed protocols of network shuffling that is straightforward to implement in practice. Furthermore, we show that the privacy amplification rate is similar to other privacy amplification techniques such as uniform shuffling. To our best knowledge, among the recently studied intermediate trust models that leverage privacy amplification techniques, our work is the first that is not relying on any centralized entity to achieve privacy amplification.

* 15 pages, 9 figures; SIGMOD 2022 version

Via

Access Paper or Ask Questions