Abstract:In this paper, we study cost-effective home robot solutions which are designed for home health monitoring. The recent advancements in Artificial Intelligence (AI) have significantly advanced the capabilities of the robots, enabling them to better and efficiently understand and interact with their surroundings. The most common robots currently used in homes are toy robots and cleaning robots. While these are relatively affordable, their functionalities are very limited. On the other hand, humanoid and quadruped robots offer more sophisticated features and capabilities, albeit at a much higher cost. Another category is educational robots, which provide educators with the flexibility to attach various sensors and integrate different design methods with the integrated operating systems. However, the challenge still exists in bridging the gap between affordability and functionality. Our research aims to address this by exploring the potential of developing advanced yet affordable and accessible robots for home robots, aiming for health monitoring, by using edge computing techniques and taking advantage of existing computing resources for home robots, such as mobile phones.
Abstract:Recognizing unauthorized Unmanned Aerial Vehicles (UAVs) within designated no-fly zones throughout the day and night is of paramount importance, where the unauthorized UAVs pose a substantial threat to both civil and military aviation safety. However, recognizing UAVs day and night with dual-vision cameras is nontrivial, since red-green-blue (RGB) images suffer from a low detection rate under an insufficient light condition, such as on cloudy or stormy days, while black-and-white infrared (IR) images struggle to capture UAVs that overlap with the background at night. In this paper, we propose a new optical flow-assisted graph-pooling residual network (OF-GPRN), which significantly enhances the UAV detection rate in day and night dual visions. The proposed OF-GPRN develops a new optical fusion to remove superfluous backgrounds, which improves RGB/IR imaging clarity. Furthermore, OF-GPRN extends optical fusion by incorporating a graph residual split attention network and a feature pyramid, which refines the perception of UAVs, leading to a higher success rate in UAV detection. A comprehensive performance evaluation is conducted using a benchmark UAV catch dataset. The results indicate that the proposed OF-GPRN elevates the UAV mean average precision (mAP) detection rate to 87.8%, marking a 17.9% advancement compared to the residual graph neural network (ResGCN)-based approach.
Abstract:Text-to-3D content creation is a rapidly evolving research area. Given the scarcity of 3D data, current approaches often adapt pre-trained 2D diffusion models for 3D synthesis. Among these approaches, Score Distillation Sampling (SDS) has been widely adopted. However, the issue of over-smoothing poses a significant limitation on the high-fidelity generation of 3D models. To address this challenge, LucidDreamer replaces the Denoising Diffusion Probabilistic Model (DDPM) in SDS with the Denoising Diffusion Implicit Model (DDIM) to construct Interval Score Matching (ISM). However, ISM inevitably inherits inconsistencies from DDIM, causing reconstruction errors during the DDIM inversion process. This results in poor performance in the detailed generation of 3D objects and loss of content. To alleviate these problems, we propose a novel method named Exact Score Matching (ESM). Specifically, ESM leverages auxiliary variables to mathematically guarantee exact recovery in the DDIM reverse process. Furthermore, to effectively capture the dynamic changes of the original and auxiliary variables, the LoRA of a pre-trained diffusion model implements these exact paths. Extensive experiments demonstrate the effectiveness of ESM in text-to-3D generation, particularly highlighting its superiority in detailed generation.
Abstract:With impressive achievements made, artificial intelligence is on the path forward to artificial general intelligence. Sora, developed by OpenAI, which is capable of minute-level world-simulative abilities can be considered as a milestone on this developmental path. However, despite its notable successes, Sora still encounters various obstacles that need to be resolved. In this survey, we embark from the perspective of disassembling Sora in text-to-video generation, and conducting a comprehensive review of literature, trying to answer the question, \textit{From Sora What We Can See}. Specifically, after basic preliminaries regarding the general algorithms are introduced, the literature is categorized from three mutually perpendicular dimensions: evolutionary generators, excellent pursuit, and realistic panorama. Subsequently, the widely used datasets and metrics are organized in detail. Last but more importantly, we identify several challenges and open problems in this domain and propose potential future directions for research and development.
Abstract:Measuring the complex permittivity of material is essential in many scenarios such as quality check and component analysis. Generally, measurement methods for characterizing the material are based on the usage of vector network analyzer, which is large and not easy for on-site measurement, especially in high frequency range such as millimeter wave (mmWave). In addition, some measurement methods require the destruction of samples, which is not suitable for non-destructive inspection. In this work, a small distance increment (SDI) method is proposed to non-destructively measure the complex permittivity of material. In SDI, the transmitter and receiver are formed as the monostatic radar, which is facing towards the material under test (MUT). During the measurement, the distance between radar and MUT changes with small increments and the signals are recorded at each position. A mathematical model is formulated to depict the relationship among the complex permittivity, distance increment, and measured signals. By fitting the model, the complex permittivity of MUT is estimated. To implement and evaluate the proposed SDI method, a commercial off-the-shelf mmWave radar is utilized and the measurement system is developed. Then, the evaluation was carried out on the acrylic plate. With the proposed method, the estimated complex permittivity of acrylic plate shows good agreement with the literature values, demonstrating the efficacy of SDI method for characterizing the complex permittivity of material.
Abstract:Large language models such as ChatGPT are increasingly explored in medical domains. However, the absence of standard guidelines for performance evaluation has led to methodological inconsistencies. This study aims to summarize the available evidence on evaluating ChatGPT's performance in medicine and provide direction for future research. We searched ten medical literature databases on June 15, 2023, using the keyword "ChatGPT". A total of 3520 articles were identified, of which 60 were reviewed and summarized in this paper and 17 were included in the meta-analysis. The analysis showed that ChatGPT displayed an overall integrated accuracy of 56% (95% CI: 51%-60%, I2 = 87%) in addressing medical queries. However, the studies varied in question resource, question-asking process, and evaluation metrics. Moreover, many studies failed to report methodological details, including the version of ChatGPT and whether each question was used independently or repeatedly. Our findings revealed that although ChatGPT demonstrated considerable potential for application in healthcare, the heterogeneity of the studies and insufficient reporting may affect the reliability of these results. Further well-designed studies with comprehensive and transparent reporting are needed to evaluate ChatGPT's performance in medicine.
Abstract:Human activity recognition (HAR) is an essential research field that has been used in different applications including home and workplace automation, security and surveillance as well as healthcare. Starting from conventional machine learning methods to the recently developing deep learning techniques and the Internet of things, significant contributions have been shown in the HAR area in the last decade. Even though several review and survey studies have been published, there is a lack of sensor-based HAR overview studies focusing on summarising the usage of wearable sensors and smart home sensors data as well as applications of HAR and deep learning techniques. Hence, we overview sensor-based HAR, discuss several important applications that rely on HAR, and highlight the most common machine learning methods that have been used for HAR. Finally, several challenges of HAR are explored that should be addressed to further improve the robustness of HAR.
Abstract:While the rollout of the fifth-generation mobile network (5G) is underway across the globe with the intention to deliver 4K/8K UHD videos, Augmented Reality (AR), and Virtual Reality (VR) content to the mass amounts of users, the coverage and throughput are still one of the most significant issues, especially in the rural areas, where only 5G in the low-frequency band are being deployed. This called for a high-performance adaptive bitrate (ABR) algorithm that can maximize the user quality of experience given 5G network characteristics and data rate of UHD contents. Recently, many of the newly proposed ABR techniques were machine-learning based. Among that, Pensieve is one of the state-of-the-art techniques, which utilized reinforcement-learning to generate an ABR algorithm based on observation of past decision performance. By incorporating the context of the 5G network and UHD content, Pensieve has been optimized into Pensieve 5G. New QoE metrics that more accurately represent the QoE of UHD video streaming on the different types of devices were proposed and used to evaluate Pensieve 5G against other ABR techniques including the original Pensieve. The results from the simulation based on the real 5G Standalone (SA) network throughput shows that Pensieve 5G outperforms both conventional algorithms and Pensieve with the average QoE improvement of 8.8% and 14.2%, respectively. Additionally, Pensieve 5G also performed well on the commercial 5G NR-NR Dual Connectivity (NR-DC) Network, despite the training being done solely using the data from the 5G Standalone (SA) network.
Abstract:Internet traffic is dramatically increasing with the development of network technologies. Within the total traffic, video streaming traffic accounts for a large amount, which reveals the importance to guarantee the quality of content delivery service. Based on the network conditions, adaptive bitrate (ABR) control is utilized as a common technique which can choose the proper bitrate to ensure the video streaming quality. In this paper, a new bitrate control method, QuDASH is proposed by taking advantage of the emerging quantum technology. In QuDASH, the adaptive control model is developed using the quadratic unconstrained binary optimization (QUBO), which aims at increasing the average bitrate and decreasing the video rebuffering events to maximize the user quality of experience (QoE). Then, the control model is solved by Digital Annealer, which is a quantum-Inspired computing technology. The evaluation of the proposed method is carried out by simulation with the measured throughput traces in real world. Experiment results demonstrated that the proposed QuDASH method has better performance in terms of QoE compared with other advanced ABR methods. In 68.2% of the examined cases, QuDASH achieves the highest QoE results, which shows the superiority of the QuDASH over conventional methods.
Abstract:Owing to the plentiful information released by the commodity devices, WiFi signals have been widely studied for various wireless sensing applications. In many works, both received signal strength indicator (RSSI) and the channel state information (CSI) are utilized as the key factors for precise sensing. However, the calculation and relationship between RSSI and CSI is not explained in detail. Furthermore, there are few works focusing on the measurement variation of the WiFi signal which impacts the sensing results. In this paper, the relationship between RSSI and CSI is studied in detail and the measurement variation of amplitude and phase information is investigated by extensive experiments. In the experiments, transmitter and receiver are directly connected by power divider and RF cables and the signal transmission is quantitatively controlled by RF attenuators. By changing the intensity of attenuation, the measurement of RSSI and CSI is carried out under different conditions. From the results, it is found that in order to get a reliable measurement of the signal amplitude and phase by commodity WiFi, the attenuation of the channels should not exceed 60 dB. Meanwhile, the difference between two channels should be lower than 10 dB. An active control mechanism is suggested to ensure the measurement stability. The findings and criteria of this work is promising to facilitate more precise sensing technologies with WiFi signal.