Abstract:In this paper, we introduce our millimeter-wave (mmWave) radio channel measurements for integrated sensing and communication (ISAC) scenarios with distributed links at dual bands in an indoor cavity; we also characterize the channel in the delay and azimuth-angular domains for scenarios in which one person is present at varying locations and with varying facing orientations. In our setting of distributed links with two transmitters and two receivers, where each transceiver operates at two bands, we can measure two links in which each transmitter faces one receiver and is thus capable of line-of-sight (LOS) communication; these two links have crossing Fresnel zones. Another two links capture the reflectivity of the target present in the test area (as well as of the background). The numerical results in this paper focus on analyzing the channel in the presence of one person. It is evident that not only the human location but also the human facing orientation should be taken into account when modeling the ISAC channel.
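For reference, the delay and azimuth-angular characterization mentioned above is commonly summarized by the power angular-delay profile and its marginals; the notation below is generic and is not taken from the paper:
\[
P(\tau,\phi)=\mathbb{E}\!\left[\,|h(\tau,\phi)|^{2}\,\right],\qquad
P(\tau)=\int P(\tau,\phi)\,d\phi,\qquad
P(\phi)=\int P(\tau,\phi)\,d\tau,
\]
where $h(\tau,\phi)$ denotes the (estimated) directional channel impulse response, $P(\tau)$ the power delay profile, and $P(\phi)$ the azimuth power spectrum.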
Abstract:A site-specific radio channel representation considers the surroundings of the communication system through the environment geometry, such as buildings, vegetation, and mobile objects including their material and surface properties. In this article, we focus on communication technologies for 5G and beyond that are increasingly able to exploit the specific environment geometry for both communication and sensing. We present methods for a site-specific radio channel representation that is spatially consistent, such that mobile transmitters and receivers cause a correlated time-varying channel impulse response. When modelled as random, this channel impulse response has non-stationary statistical properties, i.e., a time-variant Doppler spectrum, power delay profile, K-factor and spatial correlation. A site-specific radio channel representation will enable research into emerging 5G and beyond technologies such as distributed multiple-input multiple-output systems, reconfigurable intelligent surfaces, multi-band communication, and joint communication and sensing. These 5G and beyond technologies will be deployed for a wide range of environments, from dense urban areas to railways, road transportation, industrial automation, and unmanned aerial vehicles.
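As a brief illustration of the non-stationary statistics listed above (the notation is generic, not taken from the article), the time-variant power delay profile and Ricean K-factor can be written as
\[
P_h(t,\tau)=\mathbb{E}\!\left[\,|h(t,\tau)|^{2}\,\right],\qquad
K(t)=\frac{P_{\mathrm{LOS}}(t)}{P_{\mathrm{diffuse}}(t)},
\]
with the time-variant Doppler spectrum obtained, for example, from a short-time Fourier transform of $h(t,\tau)$ over the time variable $t$.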
Abstract:Pre-trained large-scale language models (LLMs) excel at producing coherent articles, yet their outputs may be untruthful, toxic, or fail to align with user expectations. Current approaches focus on using reinforcement learning from human feedback (RLHF) to improve model alignment, which works by transforming coarse human preferences over LLM outputs into a feedback signal that guides the model learning process. However, because this approach operates on sequence-level feedback, it lacks the precision to identify the exact parts of the output affecting user preferences. To address this gap, we propose a method to enhance LLM alignment through fine-grained token-level supervision. Specifically, we ask annotators to minimally edit less preferred responses within the standard reward modeling dataset to make them more favorable, ensuring changes are made only where necessary while retaining most of the original content. The refined dataset is used to train a token-level reward model, which is then used for training our fine-grained Proximal Policy Optimization (PPO) model. Our experimental results demonstrate that this approach can achieve an absolute improvement of up to $5.1\%$ in LLM performance, in terms of win rate against the reference model, compared with the traditional PPO model.
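A minimal sketch of how token-level supervision could be derived from the minimally edited pairs described above; the exact labeling scheme and tokenization are assumptions for illustration, not details from the paper:

```python
# Sketch (not the authors' code): align a less-preferred response with its
# minimally edited version and flag which original tokens were changed.
import difflib

def token_level_labels(original_tokens, edited_tokens):
    """Return one 0/1 label per token of the original response:
    1 = token kept by the annotator, 0 = token removed/replaced
    (i.e. likely responsible for the lower preference)."""
    labels = [0] * len(original_tokens)
    matcher = difflib.SequenceMatcher(a=original_tokens, b=edited_tokens)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":            # span retained verbatim in the edit
            for i in range(i1, i2):
                labels[i] = 1
    return labels

# Hypothetical example: "is" -> "was" is the only minimal edit.
orig = "the answer is wrong".split()
edit = "the answer was wrong".split()
print(token_level_labels(orig, edit))   # [1, 1, 0, 1]
```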
Abstract:We propose novel methodologies aimed at accelerating the grokking phenomenon, which refers to the rapid increase in test accuracy after a long period of overfitting, as reported in~\cite{power2022grokking}. Focusing on the grokking phenomenon that arises in learning arithmetic binary operations via the transformer model, we begin with a discussion of data augmentation in the case of commutative binary operations. To accelerate grokking further, we elucidate arithmetic operations through the lens of the Kolmogorov-Arnold (KA) representation theorem, revealing its correspondence to the transformer architecture: embedding, decoder block, and classifier. Observing the shared structure between KA representations associated with binary operations, we suggest various transfer learning mechanisms that expedite grokking. This interpretation is substantiated through a series of rigorous experiments. In addition, our approach is successful in learning two nonstandard arithmetic tasks: composition of operations and a system of equations. Furthermore, we reveal that the model is capable of learning arithmetic operations using a limited number of tokens under embedding transfer, which is likewise supported by a set of experiments.
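One natural realization of the data augmentation mentioned for commutative binary operations is to mirror each training example across its operands; this is a sketch under that assumption, and the scheme used in the paper may differ:

```python
# Sketch: for a commutative binary operation, every training example
# (a, b, a*b) can be mirrored to (b, a, a*b), enlarging the covered
# operand pairs without new labels.
def augment_commutative(examples):
    """examples: list of (a, b, result) triples for a commutative operation."""
    augmented = list(examples)
    for a, b, result in examples:
        if a != b:
            augmented.append((b, a, result))
    return augmented

# Hypothetical example: addition modulo 7.
train = [(2, 5, 0), (3, 3, 6)]
print(augment_commutative(train))   # [(2, 5, 0), (3, 3, 6), (5, 2, 0)]
```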
Abstract:Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, even though speech, as a human-originated sound, would intuitively bear a closer resemblance to lung sounds. This paper explores the efficacy of pretrained speech models for respiratory sound classification. We find that there is a characterization gap between speech and lung sound samples, and that data augmentation is essential to bridge this gap. However, the most widely used augmentation technique for audio and speech, SpecAugment, requires a 2-dimensional spectrogram format and cannot be applied to models pretrained on speech waveforms. To address this, we propose RepAugment, an input-agnostic representation-level augmentation technique that not only outperforms SpecAugment but is also suitable for respiratory sound classification with waveform-pretrained models. Experimental results show that our approach outperforms SpecAugment, demonstrating a substantial improvement in the accuracy of minority disease classes of up to 7.14%.
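The abstract does not specify the operations used by RepAugment, so the following is only a sketch of what representation-level augmentation might look like, assuming random masking and additive noise applied to the pretrained encoder's output rather than to the input spectrogram:

```python
# Sketch (assumed operations, not the RepAugment algorithm): augment the
# encoder's feature representation instead of the input signal.
import numpy as np

def represent_augment(rep, mask_prob=0.1, noise_std=0.05, rng=None):
    """rep: (dim,) or (time, dim) array of pretrained-encoder features."""
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(rep.shape) >= mask_prob            # zero out a few dimensions
    noise = rng.normal(0.0, noise_std, size=rep.shape)   # small additive Gaussian noise
    return rep * mask + noise

features = np.random.randn(128)        # hypothetical waveform-encoder embedding
augmented = represent_augment(features)
```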
Abstract:Named Entity Recognition (NER) plays a pivotal role in medical Natural Language Processing (NLP). Yet, no open-source medical NER dataset has been available specifically for the Korean language. To address this, we utilized ChatGPT to assist in constructing the KBMC (Korean Bio-Medical Corpus), which we now present to the public. With the KBMC dataset, we observe an impressive 20% increase in medical NER performance compared to models trained on general Korean NER datasets. This research underscores the significant benefits and importance of using specialized tools and datasets, such as ChatGPT, to enhance language processing in specialized fields such as healthcare.
Abstract:Metasurfaces have recently emerged as an economical solution for expanding mmWave coverage. However, their pervasive deployment remains a challenge, mainly due to the difficulty of meeting the tight 260 ns NR synchronization requirement and achieving real-time wireless reconfiguration while maintaining a multi-year battery life. This paper presents NR-Surface, the first real-time reconfigurable metasurface fully compliant with the NR standard, operating at 242.7 $\mu$W for a 2.1-year lifetime on an AA battery. NR-Surface incorporates (i) a new extremely low-power (14 kHz sampling) reconfiguration interface, the NarrowBand Packet Unit (NBPU), for synchronization and real-time reconfiguration, and (ii) a highly responsive and low-leakage metasurface designed for low-duty-cycle operation by carefully leveraging the structure and periodicity of the beam management procedure in the NR standard. NR-Surface is prototyped and evaluated end-to-end with an NR BS built on srsRAN to demonstrate diverse usage scenarios, including multiple NR-Surfaces per BS, multiple UEs per NR-Surface, and 3D beamforming. Around-the-corner UE evaluations showcase NR-Surface's efficacy under different user mobility patterns (20.3 dB gain) and dynamic blockage (22.2 dB gain).
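As a rough sanity check on the reported lifetime (the assumed capacity of an alkaline AA cell, about 3000 mAh at 1.5 V, is our assumption, not a figure from the paper), the stated 242.7 $\mu$W draw is consistent with roughly 2.1 years of operation:
\[
E \approx 3000\,\mathrm{mAh}\times 1.5\,\mathrm{V} = 4.5\,\mathrm{Wh}\approx 1.62\times10^{4}\,\mathrm{J},\qquad
\frac{1.62\times10^{4}\,\mathrm{J}}{242.7\,\mu\mathrm{W}}\approx 6.7\times10^{7}\,\mathrm{s}\approx 2.1\ \text{years}.
\]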
Abstract:This paper explores the challenges posed by aspect-based sentiment classification (ABSC) for pretrained language models (PLMs), with a particular focus on contextualization and hallucination issues. To tackle these challenges, we introduce CARBD-Ko (a Contextually Annotated Review Benchmark Dataset for Aspect-Based Sentiment Classification in Korean), a benchmark dataset that incorporates aspects and dual-tagged polarities to distinguish between aspect-specific and aspect-agnostic sentiment classification. The dataset consists of sentences annotated with specific aspects, aspect polarity, aspect-agnostic polarity, and aspect intensity. To address the issue of dual-tagged aspect polarities, we propose a novel approach employing a Siamese network. Our experimental findings highlight the inherent difficulty of accurately predicting dual polarities and underscore the significance of contextualized sentiment analysis models. The CARBD-Ko dataset serves as a valuable resource for future research in aspect-level sentiment classification.
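A minimal sketch of one way a Siamese network could handle the dual-tagged polarities; the architecture below is an assumption for illustration (a HuggingFace-style encoder returning `last_hidden_state`), not the authors' model:

```python
# Sketch: one shared encoder scores the review twice -- once with the aspect
# attached and once without -- yielding aspect-specific and aspect-agnostic
# polarity logits from two heads.
import torch.nn as nn

class SiamesePolarityModel(nn.Module):
    def __init__(self, encoder, hidden_dim, num_classes=3):
        super().__init__()
        self.encoder = encoder                                    # assumed pretrained PLM
        self.aspect_head = nn.Linear(hidden_dim, num_classes)     # aspect-specific polarity
        self.agnostic_head = nn.Linear(hidden_dim, num_classes)   # aspect-agnostic polarity

    def forward(self, with_aspect_inputs, without_aspect_inputs):
        # Both branches share the same encoder weights (Siamese structure).
        h_aspect = self.encoder(**with_aspect_inputs).last_hidden_state[:, 0]
        h_plain = self.encoder(**without_aspect_inputs).last_hidden_state[:, 0]
        return self.aspect_head(h_aspect), self.agnostic_head(h_plain)
```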
Abstract:Multi-label classification poses challenges due to imbalanced and noisy labels in training data. We propose a unified data augmentation method, named BalanceMix, to address these challenges. Our approach includes two samplers to handle imbalanced labels, generating minority-augmented instances with high diversity. It also refines multi-labels at label-wise granularity, categorizing noisy labels as clean, re-labeled, or ambiguous for robust optimization. Extensive experiments on three benchmark datasets demonstrate that BalanceMix outperforms existing state-of-the-art methods. We release the code at https://github.com/DISL-Lab/BalanceMix.
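A minimal sketch of how the two samplers might be combined; the Mixup-style interpolation and the inverse-frequency minority sampler below are assumptions based on the method's name, and the released code at the URL above is authoritative:

```python
# Sketch: interpolate an instance drawn by a random sampler with one drawn
# by a minority-oriented sampler to form a minority-augmented example.
import numpy as np

def minority_oriented_sample(X, Y, rng):
    """Favor instances carrying rare labels (inverse-frequency weighting)."""
    label_freq = Y.sum(axis=0) + 1e-6
    weights = (Y / label_freq).sum(axis=1) + 1e-6
    idx = rng.choice(len(X), p=weights / weights.sum())
    return X[idx], Y[idx]

def balance_mix_batch(X, Y, rng, alpha=0.4):
    i = rng.integers(len(X))                            # random sampler
    x_r, y_r = X[i], Y[i]
    x_m, y_m = minority_oriented_sample(X, Y, rng)      # minority sampler
    lam = rng.beta(alpha, alpha)                        # Mixup coefficient
    return lam * x_r + (1 - lam) * x_m, lam * y_r + (1 - lam) * y_m

rng = np.random.default_rng(0)
X = np.random.randn(100, 16)                            # hypothetical features
Y = (np.random.rand(100, 5) < 0.2).astype(float)        # hypothetical multi-labels
x_mix, y_mix = balance_mix_batch(X, Y, rng)
```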
Abstract:This paper presents the DaG LLM (David and Goliath Large Language Model), a language model specialized for Korean and fine-tuned through Instruction Tuning across 41 tasks within 13 distinct categories.