Ting Cai

RakutenAI-7B: Extending Large Language Models for Japanese

Mar 21, 2024

Rakuten Group, Aaron Levine, Connie Huang, Chenguang Wang, Eduardo Batista, Ewa Szymanska, Hongyi Ding, Hou Wei Chou, Jean-François Pessiot, Johanes Effendi(+20 more)

Abstract: We introduce RakutenAI-7B, a suite of Japanese-oriented large language models that achieve the best performance on the Japanese LM Harness benchmarks among the open 7B models. Along with the foundation model, we release instruction- and chat-tuned models, RakutenAI-7B-instruct and RakutenAI-7B-chat respectively, under the Apache 2.0 license.

Active Cost-aware Labeling of Streaming Data

Apr 13, 2023

Ting Cai, Kirthevasan Kandasamy

Abstract: We study actively labeling streaming data, where an active learner is faced with a stream of data points and must carefully choose which of these points to label via an expensive experiment. Such problems frequently arise in applications such as healthcare and astronomy. We first study a setting where the data's inputs belong to one of $K$ discrete distributions and formalize this problem via a loss that captures the labeling cost and the prediction error. When the labeling cost is $B$, our algorithm, which chooses to label a point if the uncertainty is larger than a time- and cost-dependent threshold, achieves a worst-case upper bound of $O(B^{\frac{1}{3}} K^{\frac{1}{3}} T^{\frac{2}{3}})$ on the loss after $T$ rounds. We also provide a more nuanced upper bound which demonstrates that the algorithm can adapt to the arrival pattern, and achieves better performance when the arrival pattern is more favorable. We complement both upper bounds with matching lower bounds. We next study this problem when the inputs belong to a continuous domain and the output of the experiment is a smooth function with bounded RKHS norm. After $T$ rounds in $d$ dimensions, we show that the loss is bounded by $O(B^{\frac{1}{d+3}} T^{\frac{d+2}{d+3}})$ in an RKHS with a squared exponential kernel and by $O(B^{\frac{1}{2d+3}} T^{\frac{2d+2}{2d+3}})$ in an RKHS with a Matérn kernel. Our empirical evaluation demonstrates that our method outperforms other baselines in several synthetic experiments and two real experiments in medicine and astronomy.
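
The labeling rule described in the abstract (label a point when its uncertainty exceeds a time- and cost-dependent threshold) might be sketched for the discrete setting as below. This is only an illustration: the abstract does not give the uncertainty measure or the threshold, so the proxy `1 / sqrt(n)` and the threshold `(cost * num_types / t) ** (1/3)` are assumptions, and the names `should_label` and `run_stream` are hypothetical.

```python
import math
import random


def should_label(n_labels: int, t: int, cost: float, num_types: int) -> bool:
    """Decide whether to pay for a label at round t (t >= 1).

    ASSUMPTION: the uncertainty proxy 1/sqrt(n_labels) and the threshold
    (cost * num_types / t) ** (1/3) are illustrative stand-ins, not the
    quantities used in the paper.
    """
    if n_labels == 0:
        return True  # nothing is known about this type yet
    uncertainty = 1.0 / math.sqrt(n_labels)
    threshold = (cost * num_types / t) ** (1.0 / 3.0)
    return uncertainty > threshold


def run_stream(num_rounds: int, num_types: int, cost: float, seed: int = 0) -> list:
    """Simulate a stream whose points arrive uniformly from num_types types.

    Returns the number of labels bought for each type.
    """
    rng = random.Random(seed)
    label_counts = [0] * num_types
    for t in range(1, num_rounds + 1):
        k = rng.randrange(num_types)  # type of the arriving point
        if should_label(label_counts[k], t, cost, num_types):
            label_counts[k] += 1
    return label_counts


if __name__ == "__main__":
    counts = run_stream(num_rounds=1000, num_types=3, cost=5.0)
    print("labels bought per type:", counts)
```

With this hypothetical threshold, a higher labeling cost raises the bar for buying a label, while later rounds lower it, so the rule depends on both time and cost in the way the abstract describes.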

MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3D CT Lesions

Aug 18, 2021

Penghua Zhai, Huaiwei Cong, Gangming Zhao, Chaowei Fang, Jinpeng Li, Ting Cai, Huiguang He

Abstract: Objective and Impact Statement. With the renaissance of deep learning, automatic diagnostic systems for computed tomography (CT) have achieved many successful applications. However, these successes are mostly attributed to careful expert annotations, which are often scarce in practice. This drives our interest in unsupervised representation learning. Introduction. Recent studies have shown that self-supervised learning is an effective approach for learning representations, but most of them rely on the empirical design of transformations and pretext tasks. Methods. To avoid the subjectivity associated with these methods, we propose MVCNet, a novel unsupervised three-dimensional (3D) representation learning method that works in a transformation-free manner. We view each 3D lesion from different orientations to collect multiple two-dimensional (2D) views. Then, an embedding function is learned by minimizing a contrastive loss so that the 2D views of the same 3D lesion are aggregated, and the 2D views of different lesions are separated. We evaluate the representations by training a simple classification head upon the embedding layer. Results. Experimental results show that MVCNet achieves state-of-the-art accuracies on the LIDC-IDRI (89.55%), LNDb (77.69%) and TianChi (79.96%) datasets for unsupervised representation learning. When fine-tuned on 10% of the labeled data, the accuracies are comparable to the supervised learning model (89.46% vs. 85.03%, 73.85% vs. 73.44%, 83.56% vs. 83.34% on the three datasets, respectively). Conclusion. Results indicate the superiority of MVCNet in learning representations with limited annotations.

* This 16-page manuscript has been submitted to Medical Image Analysis for possible publication
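
The contrastive objective in the abstract (pull together the 2D views of the same 3D lesion, push apart views of different lesions) could be illustrated with the sketch below. The abstract does not state the exact loss, so the InfoNCE-style form, the cosine similarity, the `temperature` value, and the function names are all assumptions made for illustration; the embeddings are plain Python lists rather than network outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(embeddings, lesion_ids, temperature=0.1):
    """InfoNCE-style loss over 2D-view embeddings (an assumed form).

    Views sharing a lesion id are positives; all other views are negatives.
    The loss is averaged over every (anchor, positive) pair.
    """
    n = len(embeddings)
    total, num_pairs = 0.0, 0
    for i in range(n):
        # Scaled similarities from anchor i to every other view.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denominator = math.log(sum(math.exp(s) for s in logits.values()))
        for j, s in logits.items():
            if lesion_ids[j] == lesion_ids[i]:
                total += log_denominator - s  # -log softmax of the positive
                num_pairs += 1
    if num_pairs == 0:
        raise ValueError("each lesion needs at least two views")
    return total / num_pairs


if __name__ == "__main__":
    views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(contrastive_loss(views, lesion_ids=[0, 0, 1, 1]))
```

In this toy example the loss is small when views of the same lesion already point in similar directions, and it grows if the lesion ids are shuffled so that positives lie far apart.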

Adversarially-Trained Nonnegative Matrix Factorization

Apr 10, 2021

Ting Cai, Vincent Y. F. Tan, Cédric Févotte

Abstract: We consider an adversarially-trained version of nonnegative matrix factorization, a popular latent dimensionality reduction technique. In our formulation, an attacker adds an arbitrary matrix of bounded norm to the given data matrix. We design efficient algorithms inspired by adversarial training to optimize for dictionary and coefficient matrices with enhanced generalization abilities. Extensive simulations on synthetic and benchmark datasets demonstrate the superior predictive performance on matrix completion tasks of our proposed method compared to state-of-the-art competitors, including other variants of adversarial nonnegative matrix factorization.

* 5 pages, 4 figures

Investigating Critical Risk Factors in Liver Cancer Prediction

Feb 03, 2021

Jinpeng Li, Yaling Tao, Ting Cai

Abstract: We explore a liver cancer prediction model using machine learning algorithms, based on epidemiological data from over 55 thousand people collected from 2014 to the present. The best performance is an AUC of 0.71. We analyze the model parameters to investigate the critical risk factors that contribute the most to prediction.

* 8 pages, 4 figures, conference paper
