Abstract:Underwater robot perception is crucial in scientific subsea exploration and commercial operations. The key challenges include non-uniform lighting and poor visibility in turbid environments. High-frequency forward-look sonar cameras address these issues, by providing high-resolution imagery at maximum range of tens of meters, despite complexities posed by high degree of speckle noise, and lack of color and texture. In particular, robust feature detection is an essential initial step for automated object recognition, localization, navigation, and 3-D mapping. Various local feature detectors developed for RGB images are not well-suited for sonar data. To assess their performances, we evaluate a number of feature detectors using real sonar images from five different sonar devices. Performance metrics such as detection accuracy, false positives, and robustness to variations in target characteristics and sonar devices are applied to analyze the experimental results. The study would provide a deeper insight into the bottlenecks of feature detection for sonar data, and developing more effective methods
Abstract:The use of generative AI in video game development is on the rise, and as the conversational and other capabilities of large language models continue to improve, we expect LLM-driven non-player characters (NPCs) to become widely deployed. In this paper, we seek to understand how human players collaborate with LLM-driven NPCs to accomplish in-game goals. We design a minigame within Minecraft where a player works with two GPT4-driven NPCs to complete a quest. We perform a user study in which 28 Minecraft players play this minigame and share their feedback. On analyzing the game logs and recordings, we find that several patterns of collaborative behavior emerge from the NPCs and the human players. We also report on the current limitations of language-only models that do not have rich game-state or visual understanding. We believe that this preliminary study and analysis will inform future game developers on how to better exploit these rapidly improving generative AI models for collaborative roles in games.
Abstract:We explore how interaction with large language models (LLMs) can give rise to emergent behaviors, empowering players to participate in the evolution of game narratives. Our testbed is a text-adventure game in which players attempt to solve a mystery under a fixed narrative premise, but can freely interact with non-player characters generated by GPT-4, a large language model. We recruit 28 gamers to play the game and use GPT-4 to automatically convert the game logs into a node-graph representing the narrative in the player's gameplay. We find that through their interactions with the non-deterministic behavior of the LLM, players are able to discover interesting new emergent nodes that were not a part of the original narrative but have potential for being fun and engaging. Players that created the most emergent nodes tended to be those that often enjoy games that facilitate discovery, exploration and experimentation.
Abstract:Dialogue-based Role Playing Games (RPGs) require powerful storytelling. The narratives of these may take years to write and typically involve a large creative team. In this work, we demonstrate the potential of large generative text models to assist this process. \textbf{GRIM}, a prototype \textbf{GR}aph-based \textbf{I}nteractive narrative visualization system for ga\textbf{M}es, generates a rich narrative graph with branching storylines that match a high-level narrative description and constraints provided by the designer. Game designers can interactively edit the graph by automatically generating new sub-graphs that fit the edits within the original narrative and constraints. We illustrate the use of \textbf{GRIM} in conjunction with GPT-4, generating branching narratives for four well-known stories with different contextual constraints.
Abstract:Here, we report a case study implementation of reinforcement learning (RL) to automate operations in the scanning transmission electron microscopy (STEM) workflow. To do so, we design a virtual, prototypical RL environment to test and develop a network to autonomously align the electron beam without prior knowledge. Using this simulator, we evaluate the impact of environment design and algorithm hyperparameters on alignment accuracy and learning convergence, showing robust convergence across a wide hyperparameter space. Additionally, we deploy a successful model on the microscope to validate the approach and demonstrate the value of designing appropriate virtual environments. Consistent with simulated results, the on-microscope RL model achieves convergence to the goal alignment after minimal training. Overall, the results highlight that by taking advantage of RL, microscope operations can be automated without the need for extensive algorithm design, taking another step towards augmenting electron microscopy with machine learning methods.