Abstract:In this paper, we introduce SpINR, a novel framework for volumetric reconstruction using Frequency-Modulated Continuous-Wave (FMCW) radar data. Traditional radar imaging techniques, such as backprojection, often assume ideal signal models and require dense aperture sampling, which limits their resolution and generalization. To address these challenges, SpINR integrates a fully differentiable forward model that operates natively in the frequency domain with implicit neural representations (INRs). This integration leverages the linear relationship between beat frequency and scatterer distance inherent in FMCW radar systems, facilitating more efficient and accurate learning of scene geometry. Additionally, by computing outputs for only the relevant frequency bins, our forward model is more computationally efficient than time-domain approaches that process the entire signal before transformation. Through extensive experiments, we demonstrate that SpINR significantly outperforms classical backprojection methods and existing learning-based approaches, achieving higher resolution and more accurate reconstructions of complex scenes. This work represents the first application of neural volumetric reconstruction in the radar domain, offering a promising direction for future research in radar-based imaging and perception systems.
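The linear relationship between beat frequency and scatterer distance mentioned in the SpINR abstract is the standard FMCW relation f_b = 2·S·R / c, where S is the chirp slope. The sketch below illustrates that relation and how a range interval maps to a small set of frequency bins. It is not SpINR's forward model: the function names and all chirp parameters (4 GHz bandwidth, 100 µs chirp, 2 MHz sampling) are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def beat_frequency(range_m, bandwidth_hz, chirp_s):
    """Beat frequency (Hz) of a point scatterer at range_m for a linear chirp."""
    slope = bandwidth_hz / chirp_s  # chirp slope S in Hz/s
    return 2.0 * slope * range_m / C

def range_from_beat(beat_hz, bandwidth_hz, chirp_s):
    """Invert the linear relation: range (m) that produces beat_hz."""
    slope = bandwidth_hz / chirp_s
    return C * beat_hz / (2.0 * slope)

def relevant_bins(r_min, r_max, bandwidth_hz, chirp_s, n_samples, fs_hz):
    """Indices of the FFT bins that cover ranges in [r_min, r_max]."""
    bin_width = fs_hz / n_samples  # Hz per bin
    lo = math.floor(beat_frequency(r_min, bandwidth_hz, chirp_s) / bin_width)
    hi = math.ceil(beat_frequency(r_max, bandwidth_hz, chirp_s) / bin_width)
    # Clamp to the non-negative half of the spectrum.
    return range(max(lo, 0), min(hi, n_samples // 2) + 1)

# Hypothetical chirp: 4 GHz bandwidth, 100 us duration, 200 samples at 2 MHz.
bins = relevant_bins(0.5, 1.5, 4e9, 1e-4, 200, 2e6)
print(f"{len(bins)} of {200 // 2 + 1} bins cover the 0.5 m to 1.5 m interval")
```

Under these assumed parameters, a scene confined to a limited range interval touches only a fraction of the spectrum, which is the intuition behind evaluating only the relevant bins.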
Abstract:To deepen our understanding of optical astronomy, we must advance imaging technology to overcome the limited dynamic range and temporal resolution of conventional frame-based cameras. Our Perspective paper examines how neuromorphic cameras can effectively address these challenges. Drawing inspiration from the human retina, neuromorphic cameras excel in speed and dynamic range by utilizing asynchronous pixel operation and logarithmic photocurrent conversion, making them highly effective for celestial imaging. We use a 1300 mm terrestrial telescope to demonstrate the neuromorphic camera's ability to simultaneously capture faint and bright celestial sources while preventing saturation effects. We illustrate its photometric capabilities through aperture photometry of a star field with faint stars. Detection of the faint gas cloud structure of the Trapezium cluster during a full-moon night highlights the camera's high dynamic range, which effectively mitigates static glare from lunar illumination. Our investigations also include detecting a meteorite passing near the Moon and Earth, as well as imaging satellites and anthropogenic debris with exceptionally high temporal resolution using a 200 mm telescope. Our observations show the immense potential of neuromorphic cameras for advancing astronomical optical imaging and pushing the boundaries of observational astronomy.
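To make the role of asynchronous pixel operation and logarithmic photocurrent conversion concrete, the following sketch models a single idealized event-camera pixel: it emits an event whenever the logarithm of the intensity has changed by a fixed contrast threshold since the last event. This is a textbook simplification, not the camera used in the paper; the function name and the threshold value are assumptions.

```python
import math

def pixel_events(intensities, threshold=0.2):
    """Return (sample_index, polarity) events for one idealized pixel.

    An event fires each time log-intensity moves by `threshold` away from
    the reference level stored at the previous event (hypothetical value).
    """
    events = []
    ref = math.log(intensities[0])  # reference log-intensity
    for t, value in enumerate(intensities[1:], start=1):
        delta = math.log(value) - ref
        # A large change can trigger several events at the same sample.
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append((t, polarity))
            ref += polarity * threshold
            delta = math.log(value) - ref
    return events

# The same 50% brightening, for a faint and a bright source.
faint = pixel_events([1.0, 1.5, 1.5])
bright = pixel_events([1000.0, 1500.0, 1500.0])
print(faint == bright)
```

Because the pixel responds to relative rather than absolute change, a faint source and a source a thousand times brighter produce the same event stream in this model, and a constant glare level produces no events at all.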
Abstract:This article covers pupillography and its potential use in a number of ophthalmological diagnostic applications in the biomedical space. With the ever-increasing incorporation of technology into our daily lives and growing research into smart devices and technologies, we make a case for a health ecosystem that revolves around continuous eye monitoring. We summarize the design constraints and requirements for an IoT-based continuous pupil detection system and attempt to develop a pipeline for a wearable pupillographic device, comparing two compact mini-camera modules currently available on the market. We use a lightweight algorithm that can be directly adapted to current microcontrollers, and share our results for different lighting conditions and scenarios. Lastly, we present our findings, along with an analysis of the challenges faced and a way forward toward successfully building this ecosystem.
Abstract:This paper presents a novel approach for predicting human poses using IMU data, diverging from previous studies such as DIP-IMU, IMUPoser, and TransPose, which use up to 6 IMUs in conjunction with bidirectional RNNs. We introduce two main innovations: a data-driven strategy for optimal IMU placement and a transformer-based model architecture for time series analysis. Our findings indicate that our approach not only outperforms traditional 6 IMU-based biRNN models but also that the transformer architecture significantly enhances pose reconstruction from data obtained from 24 IMU locations, with equivalent performance to biRNNs when using only 6 IMUs. The enhanced accuracy provided by our optimally chosen locations, when coupled with the parallelizability and performance of transformers, provides significant improvements to the field of IMU-based pose estimation.
Abstract:Localization of networked nodes is an essential problem in emerging applications, including first-responder navigation, automated manufacturing lines, vehicular and drone navigation, asset navigation and tracking, the Internet of Things, and 5G communication networks. In this paper, we present Locate3D, a novel system for peer-to-peer node localization and orientation estimation in large networks. Unlike traditional range-only methods, Locate3D introduces angle-of-arrival (AoA) data as an added network topology constraint. The system solves three key challenges: it uses angles to reduce the number of measurements required by 4x and jointly uses range and angle data for location estimation. We develop a spanning-tree approach for fast location updates and to ensure that the output graphs are rigid and uniquely realizable, even in occluded or weakly connected areas. Locate3D cuts latency by up to 75% without compromising accuracy, surpassing standard range-only solutions. It achieves a median localization error of 10.2 meters for large-scale networks (30,000 nodes, 15 anchors spread across 14 km square) and 0.5 meters for small-scale networks (10 nodes).
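One way to see why AoA data reduces the number of measurements needed: a single range plus a single angle pair already fixes a peer's position relative to the observer, whereas a range alone only constrains it to a sphere. The sketch below shows that geometry only. It is not Locate3D's algorithm; the function name is hypothetical, and it assumes the observer's measurement frame is already aligned with the global frame, which the real system has to estimate.

```python
import math

def peer_position(observer_xyz, range_m, azimuth_rad, elevation_rad):
    """Position of a peer from one range and one angle-of-arrival measurement.

    Assumes (for illustration) that azimuth is measured from the +x axis in
    the x-y plane and elevation from that plane toward +z, in a frame
    aligned with the global coordinates.
    """
    x0, y0, z0 = observer_xyz
    horizontal = range_m * math.cos(elevation_rad)  # projection onto x-y plane
    return (
        x0 + horizontal * math.cos(azimuth_rad),
        y0 + horizontal * math.sin(azimuth_rad),
        z0 + range_m * math.sin(elevation_rad),
    )

# A peer 2 m away, 90 degrees to the left, at the same height as the observer.
print(peer_position((1.0, 2.0, 3.0), 2.0, math.pi / 2, 0.0))
```

Chaining such relative positions along the edges of a spanning tree is one simple way to propagate locations from anchors through a network, under the alignment assumption stated above.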
Abstract:A speaker's direction and head orientation, estimated from binaural recordings, can be a critical piece of information in many real-world applications with emerging 'earable' devices, including smart headphones and AR/VR headsets. However, estimating them requires predicting the mutual head orientations of both the speaker and the listener, which is challenging in practice. This paper presents a system for jointly predicting speaker-listener head orientations by leveraging inherent human voice directivity and the listener's head-related transfer function (HRTF) as perceived by the ear-mounted microphones on the listener. We propose a convolutional neural network model that, given a binaural speech recording, can predict the orientation of both speaker and listener with respect to the line joining the two. The system builds on the core observation that the recordings from the left and right ears are differentially affected by the voice directivity as well as the HRTF. We also incorporate the fact that voice is more directional at higher frequencies than at lower frequencies.
Abstract:This paper presents the design and implementation of WhisperWand, a comprehensive voice and motion tracking interface for voice assistants. Distinct from prior works, WhisperWand is a precise tracking interface that can co-exist with the voice interface on low-sampling-rate voice assistants. Taking handwriting as a specific application, it can also capture natural strokes and the individualized style of writing while occupying only a single frequency. The core technique includes an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, which enables voice assistants to use ultrasound as a ranging signal while using their regular microphone system as a receiver. We also design a new optimization algorithm that requires only a single frequency for time difference of arrival. The WhisperWand prototype achieves a median error of 73 µm for 1D ranging and 1.4 mm for 3D tracking of an acoustic beacon using the microphone array used in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, similar to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication only increases from 6.26% to 8.28%.