Abstract: The rapid advancement of Large Language Models (LLMs), particularly those trained on multilingual corpora, has intensified the need for a deeper understanding of their performance across a diverse range of languages and model sizes. Our research addresses this need by studying the performance and scaling behavior of multilingual LLMs in text classification and machine translation tasks across 204 languages. We systematically examine both seen and unseen languages across three model families of varying sizes in zero-shot and few-shot settings. Our findings show significant differences in scaling behavior between zero-shot and two-shot scenarios, with striking disparities in performance between seen and unseen languages. Model scale has little effect on zero-shot performance, which remains mostly flat. In two-shot settings, however, larger models show clear linear improvements in multilingual text classification. For translation tasks, only the instruction-tuned model shows clear benefits from scaling. Our analysis also suggests that overall resource levels, not just the proportions of pretraining languages, are better predictors of model performance, shedding light on what drives multilingual LLM effectiveness.
Abstract: This paper examines the performance of a 2x2 Line of Sight (LoS) Multiple Input Multiple Output (MIMO) channel at three terahertz frequencies: 340 GHz, 410 GHz, and 460 GHz. While theoretical models predict very high channel capacities, we observe lower capacity, which is explained by asymmetric transmit-to-receive signal strengths and by signal attenuation over longer distances. Overall, however, we note that at 460 GHz, a channel capacity higher than 12 bps/Hz is possible even at sub-optimal inter-antenna spacings (for different distances). An important observation is that appropriate receive signal levels must be maintained at the receive antennas in order to improve capacity.
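To make the capacity discussion in the abstract above concrete, the following is a minimal sketch, not the authors' measurement pipeline. It assumes an idealized 2x2 LoS channel whose entries are pure phase rotations set by path length, two parallel broadside arrays with the same element spacing, equal power per transmit antenna, and the standard equal-power capacity expression log2 det(I + (SNR/2) H H^H). The function names, the 0.5 m link distance, and the 20 dB SNR are illustrative assumptions that do not come from the paper.

```python
import cmath
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def los_channel_2x2(freq_hz, distance_m, spacing_m):
    """Idealized 2x2 LoS channel: unit-magnitude entries, phase from path length."""
    wavelength = C_LIGHT / freq_hz
    direct = distance_m                            # antenna i to the facing antenna i
    diagonal = math.hypot(distance_m, spacing_m)   # antenna i to the other antenna

    def phase(path_m):
        return cmath.exp(-2j * math.pi * path_m / wavelength)

    return [[phase(direct), phase(diagonal)],
            [phase(diagonal), phase(direct)]]


def capacity_2x2(h, snr_linear):
    """Equal-power capacity in bps/Hz: log2 det(I + (snr/2) * H * H^H)."""
    # Gram matrix G = H H^H, written out for the 2x2 case.
    g = [[sum(h[i][k] * h[j][k].conjugate() for k in range(2))
          for j in range(2)] for i in range(2)]
    a = 1 + snr_linear / 2 * g[0][0]
    b = snr_linear / 2 * g[0][1]
    c = snr_linear / 2 * g[1][0]
    d = 1 + snr_linear / 2 * g[1][1]
    det = (a * d - b * c).real  # Hermitian positive definite, so the determinant is real
    return math.log2(det)


def optimal_spacing(freq_hz, distance_m):
    """Spacing that makes the two columns of H orthogonal (paraxial approximation)."""
    return math.sqrt(C_LIGHT / freq_hz * distance_m / 2)


if __name__ == "__main__":
    snr = 10 ** (20 / 10)         # assumed 20 dB SNR, for illustration only
    freq, dist = 460e9, 0.5       # 460 GHz; the 0.5 m distance is an assumption
    s_opt = optimal_spacing(freq, dist)
    for scale in (0.25, 0.5, 1.0):
        h = los_channel_2x2(freq, dist, scale * s_opt)
        print(f"spacing = {scale:.2f} x optimal: {capacity_2x2(h, snr):.2f} bps/Hz")
```

Under these assumptions, a spacing of sqrt(lambda * D / 2) makes the columns of H orthogonal, so capacity approaches 2 log2(1 + SNR), while shrinking the spacing toward zero collapses H to rank one. Unequal receive signal levels, which the abstract identifies as a cause of reduced capacity, are not modeled here.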
Abstract: Terahertz frequencies are an untapped resource for providing high-speed short-range communications. As a result, it is of interest to study the propagation characteristics of terahertz waves and to develop channel models. In previous work, we used a measurement-based approach to develop an accurate channel model for line of sight (LoS) links. In this paper, we extend that work by developing channel models for non-line of sight (NLoS) links where the signal undergoes one reflection. We study reflections off a metal plate as well as off a piece of wood. Our model for received magnitude includes the effects of standing waves that develop between the transmitter and receiver. Measurements show excellent agreement between the empirical data and the model. In addition, we have analyzed the received phase of the reflected signal at frequencies in the range 320-480 GHz. We observed a linear error between the predicted and actual phase and developed a model to account for that discrepancy. The final model we have developed for predicting received phase is very accurate over the entire 320-480 GHz range and for both materials.
Abstract: There is a growing interest in exploiting the terahertz frequency band for future communication systems that demand high data rates. Given the complex propagation behavior of this frequency band, various researchers have developed channel models that can be utilized in the development of communication systems. These models, however, do not include a crucial aspect of terahertz propagation at short distances: the presence of standing waves. Our measurements show that at specific distances, the effect of standing waves is significant. In this paper, we extend previous terahertz channel models to include the effect of standing waves and show a good fit with our measurements. Our measurements and modeling cover the five most promising terahertz frequency bands: 140, 220, 340, 410, and 460 GHz.
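The effect of a standing wave on received magnitude can be illustrated with a toy two-path model. This is a sketch under stated assumptions, not the channel model developed in the paper. It assumes a single extra path that bounces once off the receiver and once off the transmitter, adding 2d of path length, and superimposes it on a direct path with 1/d amplitude decay. The real coefficient `gamma` is a hypothetical parameter that lumps together both reflections and the extra spreading loss, and is treated as constant over a short range.

```python
import cmath
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def relative_magnitude(freq_hz, distance_m, gamma):
    """Toy received magnitude: 1/d decay multiplied by a standing-wave ripple.

    gamma is a hypothetical round-trip reflection coefficient, 0 <= gamma < 1.
    """
    wavenumber = 2 * math.pi * freq_hz / C_LIGHT
    # The doubly reflected wave travels an extra 2*d, so its phase lags by 2*k*d.
    ripple = abs(1 + gamma * cmath.exp(-2j * wavenumber * distance_m))
    return ripple / distance_m


def ripple_period_m(freq_hz):
    """Distance between successive ripple peaks: half a wavelength."""
    return C_LIGHT / freq_hz / 2


if __name__ == "__main__":
    # The five bands named in the abstract.
    for f_ghz in (140, 220, 340, 410, 460):
        period_mm = ripple_period_m(f_ghz * 1e9) * 1e3
        print(f"{f_ghz} GHz: ripple period = {period_mm:.3f} mm")
```

In this toy model the ripple swings between 1 + gamma and 1 - gamma around the free-space trend and repeats every half wavelength, which is why the effect shows up only at specific distances and becomes harder to resolve as the link lengthens and the reflected contribution weakens.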
Abstract: We study anomaly detection for the case when the normal class consists of more than one object category. This is an obvious generalization of the standard one-class anomaly detection problem. However, we show that jointly using multiple one-class anomaly detectors to solve this problem yields poorer results than training a single one-class anomaly detector on all normal object categories together. We further develop a new anomaly detector called DeepMAD that learns compact distinguishing features by exploiting the multiple normal object categories. This algorithm achieves higher AUC values on different datasets than two top-performing one-class algorithms that are either trained on each normal object category separately or trained jointly on all normal object categories combined. In addition to theoretical results, we present empirical results using CIFAR-10, fMNIST, CIFAR-100, and a new dataset we developed called RECYCLE.
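The comparison in the abstract above is reported in AUC. As a minimal sketch (not the DeepMAD algorithm, which the abstract does not specify), the block below computes AUC from anomaly scores using the pairwise ranking definition and shows one possible way to use several one-class detectors jointly: a sample's anomaly score is its minimum score across the per-category detectors, so it counts as normal if any detector accepts it. That combination rule, the function names, and the toy scores are assumptions for illustration; the paper may combine detectors differently.

```python
def roc_auc(scores, labels):
    """AUC as the probability that an anomaly outscores a normal sample.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomaly, 0 for normal. Ties count as half a win.
    """
    anomalies = [s for s, y in zip(scores, labels) if y == 1]
    normals = [s for s, y in zip(scores, labels) if y == 0]
    if not anomalies or not normals:
        raise ValueError("need at least one anomaly and one normal sample")
    wins = sum((a > n) + 0.5 * (a == n) for a in anomalies for n in normals)
    return wins / (len(anomalies) * len(normals))


def combine_min(per_detector_scores):
    """Assumed combination rule: minimum anomaly score over all detectors.

    per_detector_scores: one list of scores per detector, aligned by sample.
    """
    return [min(sample_scores) for sample_scores in zip(*per_detector_scores)]


if __name__ == "__main__":
    # Toy scores from two hypothetical per-category detectors on four samples.
    detector_a = [0.1, 0.9, 0.8, 0.7]
    detector_b = [0.8, 0.2, 0.9, 0.6]
    labels = [0, 0, 1, 1]  # the last two samples are anomalies
    combined = combine_min([detector_a, detector_b])
    print("combined scores:", combined)
    print("AUC:", roc_auc(combined, labels))
```

The same `roc_auc` function can score a single detector trained on all normal categories together, which makes the two setups described in the abstract directly comparable on one metric.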