Abstract:The Global Positioning System (GPS) plays a critical role in navigation by utilizing satellite signals, but its accuracy in urban environments is often compromised by signal obstructions. Previous research has categorized GPS reception conditions into line-of-sight (LOS), non-line-of-sight (NLOS), and LOS+NLOS scenarios to enhance accuracy. This paper introduces a novel approach using quantum support vector machines (QSVM) with a ZZ feature map and fidelity quantum kernel to classify urban GPS signal reception conditions, comparing its performance against classical SVM methods. While classical SVM has been previously explored for this purpose, our study is the first to apply QSVM to this classification task. We conducted experiments using datasets from two distinct urban locations to train and evaluate SVM and QSVM models. Our results demonstrate that QSVM achieves superior classification accuracy compared to classical SVM for urban GPS signal datasets. Additionally, we emphasize the importance of appropriately scaling raw data when utilizing QSVM.
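As a rough illustration of the fidelity quantum kernel named in the abstract above, the sketch below classically simulates a two-feature ZZ feature map with a single repetition and computes k(x, y) = |<psi(x)|psi(y)>|^2. The abstract does not state the number of qubits, the number of repetitions, or the scaling range; the two-qubit, one-repetition circuit, the phase convention (single-feature phase 2x_i and pair phase 2(pi - x_0)(pi - x_1)), and the [0, pi] scaling interval are all assumptions chosen for illustration, not the authors' configuration.

```python
import cmath
import math


def zz_phases(x):
    """Phases of the four basis-state amplitudes for a 2-feature input.

    Assumed circuit: Hadamards on both qubits followed by one layer of
    diagonal phase gates, so every amplitude has magnitude 1/2.
    """
    x0, x1 = x
    pair = (math.pi - x0) * (math.pi - x1)  # assumed pairwise (ZZ) term
    phases = []
    for b0 in (0, 1):
        for b1 in (0, 1):
            # The pair phase applies only when the two bits differ.
            phases.append(2.0 * (x0 * b0 + x1 * b1 + pair * (b0 ^ b1)))
    return phases


def fidelity_kernel(x, y):
    """Squared overlap |<psi(x)|psi(y)>|^2 of the two encoded states."""
    px, py = zz_phases(x), zz_phases(y)
    overlap = sum(cmath.exp(1j * (b - a)) for a, b in zip(px, py)) / 4.0
    return abs(overlap) ** 2


def min_max_scale(column, lo=0.0, hi=math.pi):
    """Rescale one raw feature column to [lo, hi] (interval is an assumption)."""
    cmin, cmax = min(column), max(column)
    if cmax == cmin:
        return [lo for _ in column]
    return [lo + (v - cmin) / (cmax - cmin) * (hi - lo) for v in column]


if __name__ == "__main__":
    raw = [10.0, 20.0, 30.0]  # hypothetical raw feature values
    print(min_max_scale(raw))
    print(fidelity_kernel((0.3, 1.2), (0.5, 1.0)))
```

Because the features enter the circuit as rotation angles, raw values spanning very different ranges can produce kernel entries that bear little relation to the distance between samples, which is one plausible reading of why the abstract stresses scaling.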
Abstract:In this study, we propose a novel positioning and searching system for emergency location services, namely the hyper-enhanced local positioning system (HELPS), which is applicable to all mobile phone users, including legacy feature phone users. In the case of an emergency, rescuers are dispatched with portable signal measurement equipment around the estimated location of the emergency caller. Each signal measurement device measures the uplink signal from the mobile phone of the caller. After calculating the rough location of the caller's mobile phone based on these measurements, rescuers can efficiently search for the caller using the received uplink signal strength. Thus, the positioning accuracy in a conventional sense is not a limitation for rescuers in finding the caller. HELPS is not a traditional positioning system but rather a system with humans in the loop designed to reduce search time in emergencies. HELPS can provide emergency location information even in environments where the GPS or Wi-Fi is not functional. Furthermore, for HELPS operation, no hardware changes or software installations are required on the caller's mobile phone.
Abstract:Creating multilingual task-oriented dialogue (TOD) agents is challenging due to the high cost of training data acquisition. Following the research trend of improving training data efficiency, we show for the first time that in-context learning is sufficient to tackle multilingual TOD. To handle the challenging dialogue state tracking (DST) subtask, we break it down into simpler steps that are more compatible with in-context learning, where only a handful of few-shot examples are used. We test our approach on the multilingual TOD dataset X-RiSAWOZ, which has 12 domains in Chinese, English, French, Korean, Hindi, and code-mixed Hindi-English. Our turn-by-turn DST accuracy on the 6 languages ranges from 55.6% to 80.3%, seemingly worse than the SOTA results from fine-tuned models, which range from 60.7% to 82.8%; our BLEU scores in the response generation (RG) subtask are also significantly lower than SOTA. However, after manual evaluation of the validation set, we find that by correcting gold label errors and improving dataset annotation schema, GPT-4 with our prompts can achieve (1) 89.6%-96.8% accuracy in DST, and (2) more than 99% correct response generation across different languages. This leads us to conclude that current automatic metrics heavily underestimate the effectiveness of in-context learning.
Abstract:Several sensing techniques have been proposed for silent speech recognition (SSR); however, many of these methods require invasive processes or sensor attachment to the skin using adhesive tape or glue, rendering them unsuitable for frequent use in daily life. By contrast, impulse radio ultra-wideband (IR-UWB) radar can operate without physical contact with users' articulators and related body parts, offering several advantages for SSR. These advantages include high range resolution, high penetrability, low power consumption, robustness to external light or sound interference, and the ability to be embedded in space-constrained handheld devices. This study demonstrated IR-UWB radar-based contactless SSR using four types of speech stimuli (vowels, consonants, words, and phrases). To achieve this, a novel speech feature extraction algorithm specifically designed for IR-UWB radar-based SSR is proposed. Each speech stimulus is recognized by applying a classification algorithm to the extracted speech features. Two different algorithms, multidimensional dynamic time warping (MD-DTW) and deep neural network-hidden Markov model (DNN-HMM), were compared for the classification task. Additionally, a favorable radar antenna position, either in front of the user's lips or below the user's chin, was determined to achieve higher recognition accuracy. Experimental results demonstrated the efficacy of the proposed speech feature extraction algorithm combined with DNN-HMM for classifying vowels, consonants, words, and phrases. Notably, this study represents the first demonstration of phoneme-level SSR using contactless radar.
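To make the multidimensional dynamic time warping (MD-DTW) classifier mentioned in the abstract above more concrete, here is a minimal sketch of template matching with DTW over multi-channel feature sequences. It assumes a Euclidean frame-to-frame distance, the standard three-way step pattern, and nearest-template classification; the feature vectors and labels are hypothetical and are not the radar features extracted in the study.

```python
import math


def md_dtw(a, b):
    """DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance across all dimensions
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: md_dtw(sample, templates[label]))


if __name__ == "__main__":
    # Hypothetical two-dimensional feature sequences for two stimuli.
    templates = {
        "a": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
        "b": [(2.0, 2.0), (1.0, 1.0), (0.0, 0.0)],
    }
    sample = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.1), (2.0, 2.0)]
    print(classify(sample, templates))
```

DTW tolerates differences in speaking rate because repeated or stretched frames can be aligned to a single template frame at little extra cost.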
Abstract:Task-oriented dialogue research has mainly focused on a few popular languages like English and Chinese, due to the high dataset creation cost for a new language. To reduce the cost, we apply manual editing to automatically translated data. We create a new multilingual benchmark, X-RiSAWOZ, by translating the Chinese RiSAWOZ to 4 languages: English, French, Hindi, Korean; and a code-mixed English-Hindi language. X-RiSAWOZ has more than 18,000 human-verified dialogue utterances for each language, and unlike most multilingual prior work, is an end-to-end dataset for building fully-functioning agents. The many difficulties we encountered in creating X-RiSAWOZ led us to develop a toolset to accelerate the post-editing of a new language dataset after translation. This toolset improves machine translation with a hybrid entity alignment technique that combines neural with dictionary-based methods, along with many automated and semi-automated validation checks. We establish strong baselines for X-RiSAWOZ by training dialogue agents in the zero- and few-shot settings where limited gold data is available in the target language. Our results suggest that our translation and post-editing methodology and toolset can be used to create new high-quality multilingual dialogue agents cost-effectively. Our dataset, code, and toolkit are released open-source.
Abstract:In regions where global navigation satellite systems (GNSS) signals are unavailable, such as underground areas and tunnels, GNSS simulators can be deployed for transmitting simulated GNSS signals. Then, a GNSS receiver in the simulator coverage outputs the position based on the received GNSS signals (e.g., Global Positioning System (GPS) L1 signals in this study) transmitted by the corresponding simulator. This approach provides periodic position updates to GNSS users while deploying a small number of simulators without modifying the hardware and software of user receivers. However, the simulator clock should be synchronized to the GNSS satellite clock to generate almost identical signals to the live-sky GNSS signals, which is necessary for seamless indoor and outdoor positioning handover. The conventional clock synchronization method based on the wired connection between each simulator and an outdoor GNSS antenna causes practical difficulty and increases the cost of deploying the simulators. This study proposes a wireless clock synchronization method based on a private time server and time delay calibration. Additionally, we derived the constraints for determining the optimal simulator coverage and separation between adjacent simulators. The positioning performance of the proposed GPS simulator-based indoor positioning system was demonstrated in the underground testbed for a driving vehicle with a GPS receiver and a pedestrian with a smartphone. The average position errors were 3.7 m for the vehicle and 9.6 m for the pedestrian during the field tests with successful indoor and outdoor positioning handovers. Since those errors are within the coverage of each deployed simulator, it is confirmed that the proposed system with wireless clock synchronization can effectively provide periodic position updates to users where live-sky GNSS signals are unavailable.
Abstract:In urban areas, dense buildings frequently block and reflect global positioning system (GPS) signals, resulting in the reception of a few visible satellites with many multipath signals. This is a significant problem that results in unreliable positioning in urban areas. If a signal reception condition from a certain satellite can be detected, the positioning performance can be improved by excluding or de-weighting the multipath contaminated satellite signal. Thus, we developed a machine-learning-based method of classifying GPS signal reception conditions using a dual-polarized antenna. We employed a decision tree algorithm for classification using three features, one of which can be obtained only from a dual-polarized antenna. A machine-learning model was trained using GPS signals collected from various locations. When the features extracted from the GPS raw signal are input, the generated machine-learning model outputs one of the three signal reception conditions: non-line-of-sight (NLOS) only, line-of-sight (LOS) only, or LOS+NLOS. Multiple testing datasets were used to analyze the classification accuracy, which was then compared with an existing method using dual single-polarized antennas. Consequently, when the testing dataset was collected at different locations from the training dataset, a classification accuracy of 64.47% was obtained, which was slightly higher than the accuracy of the existing method using dual single-polarized antennas. Therefore, the dual-polarized antenna solution is more beneficial than the dual single-polarized antenna solution because it has a more compact form factor and its performance is similar to that of the other solution.
Abstract:The maximum likelihood (ML) estimator can be applied to localize a target mobile device using the received signal strength (RSS) and time of arrival (TOA). However, the ML estimator for the RSS-TOA-based target localization problem is nonconvex and nonlinear, and it has no analytical solution. Therefore, the ML estimator must be solved numerically, unless it is relaxed into a convex or linear form. This study investigates the target localization performance and computational complexity of numerical methods for solving the ML estimator. Three widely used numerical methods are considered: grid search, gradient descent, and particle swarm optimization. In the experimental evaluation, the grid search yielded the lowest target localization root-mean-squared error; however, the 95th percentile error of the grid search was larger than those of the other two algorithms. The average computation time of the grid search was much larger than those of the other two algorithms, and gradient descent exhibited the lowest computation time.
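The grid-search approach described in the abstract above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes independent Gaussian errors on both measurement types, a log-distance path-loss model for RSS, and hypothetical values for the reference power `p0`, path-loss exponent `n`, noise standard deviations, sensor layout, and grid spacing.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def model_rss(d, p0=-40.0, n=3.0):
    """Assumed log-distance path-loss model: RSS in dBm at distance d (m)."""
    return p0 - 10.0 * n * math.log10(d)


def distance(p, q, d_min=1.0):
    """Euclidean distance, clamped so log10 is never evaluated at zero."""
    return max(math.dist(p, q), d_min)


def neg_log_likelihood(pos, sensors, rss, toa, sigma_rss=6.0, sigma_range=10.0):
    """Negative log-likelihood (up to a constant) under Gaussian errors."""
    cost = 0.0
    for s, r, t in zip(sensors, rss, toa):
        d = distance(pos, s)
        cost += (r - model_rss(d)) ** 2 / (2.0 * sigma_rss ** 2)
        cost += (C * t - d) ** 2 / (2.0 * sigma_range ** 2)
    return cost


def grid_search(sensors, rss, toa, size=100.0, step=10.0):
    """Evaluate every grid point and return the one with the lowest cost."""
    steps = int(size / step) + 1
    candidates = [(i * step, j * step) for i in range(steps) for j in range(steps)]
    return min(candidates, key=lambda p: neg_log_likelihood(p, sensors, rss, toa))


if __name__ == "__main__":
    sensors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
    target = (30.0, 70.0)
    # Noise-free measurements generated from the same assumed models.
    rss = [model_rss(distance(target, s)) for s in sensors]
    toa = [distance(target, s) / C for s in sensors]
    print(grid_search(sensors, rss, toa))
```

The cost is evaluated at every candidate, so the run time grows with the number of grid points, which is consistent with the abstract's observation that grid search was by far the slowest of the three methods.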
Abstract:Neural networks have complex structures, and thus it is hard to understand their inner workings and ensure correctness. To understand and debug convolutional neural networks (CNNs), we propose techniques for testing the channels of CNNs. We design FtGAN, an extension to GAN, that can generate test data by varying the intensity (i.e., sum of the neurons) of a channel of a target CNN. We also propose a channel selection algorithm to find representative channels for testing. To efficiently inspect the target CNN's inference computations, we define an unexpectedness score, which estimates how similar the inference computation of the test data is to that of the training data. We evaluated FtGAN with five public datasets and showed that our techniques successfully identify defective channels in five different CNN models.
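The channel intensity defined in the abstract above (the sum of the neurons of a channel) can be computed as in the short sketch below. The nested-list layout `[channel][row][column]` and the example activation values are assumptions for illustration; the sketch does not reproduce FtGAN, the channel selection algorithm, or the unexpectedness score.

```python
def channel_intensities(feature_maps):
    """Sum the activations of each channel.

    feature_maps is assumed to be laid out as [channel][row][column].
    """
    return [sum(sum(row) for row in channel) for channel in feature_maps]


if __name__ == "__main__":
    # Hypothetical activations of a layer with two 2x2 channels.
    feature_maps = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 1.0]],
    ]
    print(channel_intensities(feature_maps))
```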
Abstract:This study investigates unmanned aerial vehicle (UAV) trajectory planning strategies for localizing a target mobile device in emergency situations. Accurate global navigation satellite system (GNSS)-based position information of a target mobile device in an emergency may not always be available to first responders. For example, 1) GNSS positioning accuracy may be degraded in harsh signal environments and 2) in countries where emergency positioning service is not mandatory, some mobile devices may not report their locations. In such cases, one way to find the target mobile device is to use UAVs. Dispatched UAVs may search for the target directly on the emergency site by measuring the strength of the signal (e.g., LTE wireless communication signal) from the target mobile device. To accurately localize the target mobile device in the shortest time possible, UAVs should fly in the most efficient way possible. The two popular trajectory optimization strategies of UAVs are greedy and predictive approaches. However, the localization performance of the two approaches has been evaluated only under favorable settings (i.e., under good UAV geometries and small received signal strength (RSS) errors); more realistic scenarios remain unexplored. In this study, we compare the localization performance of the greedy and predictive approaches under realistic RSS errors (i.e., up to 6 dB according to the ITU-R channel model).