Knut Peterson

ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations

Jun 10, 2025

Amirreza Rouhi, Solmaz Arezoomandan, Knut Peterson, Joseph T. Woods, David K. Han

Abstract: Object detection models typically rely on predefined categories, limiting their ability to identify novel objects in open-world scenarios. To overcome this constraint, we introduce ADAM: Autonomous Discovery and Annotation Model, a training-free, self-refining framework for open-world object labeling. ADAM leverages large language models (LLMs) to generate candidate labels for unknown objects based on contextual information from known entities within a scene. These labels are paired with visual embeddings from CLIP to construct an Embedding-Label Repository (ELR) that enables inference without category supervision. For a newly encountered unknown object, ADAM retrieves visually similar instances from the ELR and applies frequency-based voting and cross-modal re-ranking to assign a robust label. To further enhance consistency, we introduce a self-refinement loop that re-evaluates repository labels using visual cohesion analysis and k-nearest-neighbor-based majority re-labeling. Experimental results on the COCO and PASCAL datasets demonstrate that ADAM effectively annotates novel categories using only visual and contextual signals, without requiring any fine-tuning or retraining.
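The retrieval-and-voting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine`, `vote_label`, `refine_labels`), the tie-breaking rule, and the toy 2-D vectors standing in for CLIP embeddings are all assumptions made for illustration, and the cross-modal re-ranking and visual cohesion analysis steps are left out.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def vote_label(query, repository, k=3):
    """Label a query embedding by majority vote over its k most similar
    repository entries. `repository` is a list of (embedding, label) pairs.
    Ties go to the tied label whose entry is most similar to the query
    (an assumed rule, not taken from the paper)."""
    ranked = sorted(repository, key=lambda item: cosine(query, item[0]), reverse=True)
    top = ranked[:k]
    if not top:
        return None
    counts = Counter(label for _, label in top)
    best = max(counts.values())
    for _, label in top:  # top is in descending similarity order
        if counts[label] == best:
            return label


def refine_labels(repository, k=3):
    """One pass of k-nearest-neighbor majority re-labeling: each entry is
    re-labeled from its neighbors' original labels, excluding itself."""
    refined = []
    for i, (embedding, _) in enumerate(repository):
        others = repository[:i] + repository[i + 1:]
        refined.append((embedding, vote_label(embedding, others, k)))
    return refined


# Hypothetical toy repository; real entries would hold CLIP embeddings
# paired with LLM-proposed labels.
toy_repository = [
    ([1.0, 0.0], "kite"),
    ([0.9, 0.1], "kite"),
    ([0.8, 0.2], "bird"),
    ([0.0, 1.0], "bench"),
    ([0.1, 0.9], "bench"),
]

if __name__ == "__main__":
    print(vote_label([0.95, 0.05], toy_repository))
    print([label for _, label in refine_labels(toy_repository)])
```

In this toy data the entry labeled "bird" sits among "kite" embeddings, so the re-labeling pass pulls it toward its neighbors' label, which is the kind of consistency correction the self-refinement loop is described as providing.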

Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation

May 02, 2024

Seungyeop Lee, Knut Peterson, Solmaz Arezoomandan, Bill Cai, Peihan Li, Lifeng Zhou, David Han

Abstract: A major obstacle to the development of effective monocular depth estimation algorithms is the difficulty in obtaining high-quality depth data that corresponds to collected RGB images. Collecting this data is time-consuming and costly, and even data collected by modern sensors has limited range or resolution, and is subject to inconsistencies and noise. To combat this, we propose a method of data generation in simulation using 3D synthetic environments and CycleGAN domain transfer. We compare this method of data generation to the popular NYUDepth V2 dataset by training a depth estimation model based on the DenseDepth structure using different training sets of real and simulated data. We evaluate the performance of the models on newly collected images and LiDAR depth data from a Husky robot to verify the generalizability of the approach and show that GAN-transformed data can serve as an effective alternative to real-world data, particularly in depth estimation.

Exploring Consequential Robot Sound: Should We Make Robots Quiet and Kawaii-et?

Apr 05, 2021

Brian J. Zhang, Knut Peterson, Christopher A. Sanchez, Naomi T. Fitter

Abstract: All robots create consequential sound -- sound produced as a result of the robot's mechanisms -- yet little work has explored how sound impacts human-robot interaction. Recent work shows that the sound of different robot mechanisms affects perceived competence, trust, human-likeness, and discomfort. However, the physical sound characteristics responsible for these perceptions have not been clearly identified. In this paper, we aim to explore key characteristics of robot sound that might influence perceptions. A pilot study from our past work showed that quieter and higher-pitched robots may be perceived as more competent and less discomforting. To better understand how variance in these attributes affects perception, we performed audio manipulations on two sets of industrial robot arm videos within a series of four new studies presented in this paper. Results confirmed that quieter robots were perceived as less discomforting. In addition, higher-pitched robots were perceived as more energetic, happy, warm, and competent. Despite the robot's industrial purpose and appearance, participants seemed to prefer more "cute" (or "kawaii") sound profiles, which could have implications for the design of more acceptable and fulfilling sound profiles for human-robot interactions with practical collaborative robots.

* 7 pages, 4 figures. Submitted for review to the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
