Ayush Agrawal

Language Models' Factuality Depends on the Language of Inquiry

Feb 25, 2025

Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang

Abstract: Multilingual language models (LMs) are expected to recall factual knowledge consistently across languages, yet they often fail to transfer knowledge between languages even when they possess the correct information in one of the languages. For example, we find that an LM may correctly identify Rashed Al Shashai as being from Saudi Arabia when asked in Arabic, but consistently fails to do so when asked in English or Swahili. To systematically investigate this limitation, we introduce a benchmark of 10,000 country-related facts across 13 languages and propose three novel metrics (Factual Recall Score, Knowledge Transferability Score, and Cross-Lingual Factual Knowledge Transferability Score) to quantify factual recall and knowledge transferability in LMs across different languages. Our results reveal fundamental weaknesses in today's state-of-the-art LMs, particularly in cross-lingual generalization, where models fail to transfer knowledge effectively across different languages, leading to inconsistent performance sensitive to the language used. Our findings emphasize the need for LMs to recognize language-specific factual reliability and leverage the most trustworthy information across languages. We release our benchmark and evaluation framework to drive future research in multilingual knowledge transfer.

Physical Reasoning and Object Planning for Household Embodied Agents

Nov 22, 2023

Ayush Agrawal, Raghav Prabhakar, Anirudh Goyal, Dianbo Liu

Abstract: In this study, we explore the sophisticated domain of task planning for robust household embodied agents, with a particular emphasis on the intricate task of selecting substitute objects. We introduce the CommonSense Object Affordance Task (COAT), a novel framework designed to analyze reasoning capabilities in commonsense scenarios. This approach is centered on understanding how these agents can effectively identify and utilize alternative objects when executing household tasks, thereby offering insights into the complexities of practical decision-making in real-world environments. Drawing inspiration from human decision-making, we explore how large language models tackle this challenge through three meticulously crafted commonsense question-and-answer datasets, featuring refined rules and human annotations. Our evaluation of state-of-the-art language models on these datasets sheds light on three pivotal considerations: 1) aligning an object's inherent utility with the task at hand, 2) navigating contextual dependencies (societal norms, safety, appropriateness, and efficiency), and 3) accounting for the current physical state of the object. To maintain accessibility, we introduce five abstract variables reflecting an object's physical condition, modulated by human insights to simulate diverse household scenarios. Our contributions include insightful Object-Utility mappings addressing the first consideration and two extensive QA datasets (15k and 130k questions) probing the intricacies of contextual dependencies and object states. The datasets, along with our findings, are accessible at: https://github.com/com-phy-affordance/COAT. This research not only advances our understanding of physical commonsense reasoning in language models but also paves the way for future improvements in household agent intelligence.

* Total: 32 pages (16 pages main content, 11 figures)

CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

Jun 02, 2023

Ayush Agrawal, Raghav Arora, Ahana Datta, Snehasis Banerjee, Brojeshwar Bhowmick, Krishna Murthy Jatavallabhula, Mohan Sridharan, Madhava Krishna

Abstract: This paper introduces a novel method for determining the best room to place an object in, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a) encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories in comparison with state-of-the-art baselines.

* RO-MAN 2023 Conference

Do Language Models Know When They're Hallucinating References?

May 29, 2023

Ayush Agrawal, Lester Mackey, Adam Tauman Kalai

Abstract: Current state-of-the-art language models (LMs) are notorious for generating text with "hallucinations," a primary example being book and paper references that lack any solid basis in their training data. However, we find that many of these fabrications can be identified using the same LM, using only black-box queries without consulting any external resources. Consistency checks done with direct queries about whether the generated reference title is real (inspired by Kadavath et al. 2022, Lin et al. 2022, Manakul et al. 2023) are compared to consistency checks with indirect queries which ask for ancillary details such as the authors of the work. These consistency checks are found to be partially reliable indicators of whether or not the reference is a hallucination. In particular, we find that LMs in the GPT series will hallucinate differing authors of hallucinated references when queried in independent sessions, while they will consistently identify authors of real references. This suggests that the hallucination may be more a result of generation techniques than the underlying representation.
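The indirect-query check described in this abstract can be illustrated with a minimal sketch. It assumes the author lists have already been collected from several independent sessions of the same LM for one reference title. The function names, the Jaccard overlap measure, and the 0.5 threshold are illustrative assumptions and are not taken from the paper, whose actual scoring may differ.

```python
from itertools import combinations

def normalize(authors):
    """Lower-case and strip names so trivially different spellings match."""
    return {a.strip().lower() for a in authors if a.strip()}

def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as no agreement."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def author_consistency(samples):
    """Mean pairwise overlap of author lists from independent sessions.

    `samples` is a list of author lists, one per session (hypothetical input).
    """
    sets = [normalize(s) for s in samples]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def flag_hallucination(samples, threshold=0.5):
    """Flag a reference when sessions disagree about its authors.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return author_consistency(samples) < threshold

# Made-up session outputs for two reference titles.
stable = [["Ann Lee", "Bo Chen"], ["ann lee", "Bo Chen"], ["Ann Lee", "Bo Chen "]]
unstable = [["John Smith"], ["Mary Jones"], ["Li Wei"]]
print(author_consistency(stable), flag_hallucination(stable))
print(author_consistency(unstable), flag_hallucination(unstable))
```

Low agreement across sessions is treated here as a signal that the reference may be fabricated, which mirrors the abstract's observation that authors of real references are identified consistently.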

Sequence-Agnostic Multi-Object Navigation

May 10, 2023

Nandiraju Gireesh, Ayush Agrawal, Ahana Datta, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna

Abstract: The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the object classes are to be explored is provided in advance. This is a strong limitation in practical applications characterized by dynamic changes. This paper describes a deep reinforcement learning framework for sequence-agnostic MultiON based on an actor-critic architecture and a suitable reward specification. Our framework leverages past experiences and seeks to reward progress toward individual as well as multiple target object classes. We use photo-realistic scenes from the Gibson benchmark dataset in the AI Habitat 3D simulation environment to experimentally show that our method performs better than a pre-sequenced approach and a state-of-the-art ON method extended to MultiON.

* ICRA 2023 conference

Towards a Mathematics Formalisation Assistant using Large Language Models

Nov 14, 2022

Ayush Agrawal, Siddhartha Gadgil, Navin Goyal, Ashvni Narayanan, Anand Tadipatri

Abstract: Mathematics formalisation is the task of writing mathematics (i.e., definitions, theorem statements, proofs) in natural language, as found in books and papers, into a formal language that can then be checked for correctness by a program. It is a thriving activity today; however, formalisation remains cumbersome. In this paper, we explore the abilities of a large language model (Codex) to help with formalisation in the Lean theorem prover. We find that with careful input-dependent prompt selection and postprocessing, Codex is able to formalise short mathematical statements at undergrad level with nearly 75% accuracy for 120 theorem statements. For proofs, quantitative analysis is infeasible and we undertake a detailed case study. We choose a diverse set of 13 theorems at undergrad level with proofs that fit in two to three paragraphs. We show that with a new prompting strategy Codex can formalise these proofs in natural language with at least one out of twelve Codex completions being easy to repair into a complete proof. This is surprising as essentially no aligned data exists for formalised mathematics, particularly for proofs. These results suggest that large language models are a promising avenue towards fully or partially automating formalisation.

Lyapunov Design for Robust and Efficient Robotic Reinforcement Learning

Aug 13, 2022

Tyler Westenbroek, Fernando Castaneda, Ayush Agrawal, Shankar Sastry, Koushil Sreenath

Abstract: Recent advances in the reinforcement learning (RL) literature have enabled roboticists to automatically train complex policies in simulated environments. However, due to the poor sample complexity of these methods, solving reinforcement learning problems using real-world data remains a challenging problem. This paper introduces a novel cost-shaping method which aims to reduce the number of samples needed to learn a stabilizing controller. The method adds a term involving a control Lyapunov function (CLF), an 'energy-like' function from the model-based control literature, to typical cost formulations. Theoretical results demonstrate the new costs lead to stabilizing controllers when smaller discount factors are used, which is well known to reduce sample complexity. Moreover, the addition of the CLF term 'robustifies' the search for a stabilizing controller by ensuring that even highly sub-optimal policies will stabilize the system. We demonstrate our approach with two hardware examples where we learn stabilizing controllers for a cartpole and an A1 quadruped with only seconds and a few minutes of fine-tuning data, respectively.
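The idea of adding a CLF term to a running cost can be sketched as follows. The abstract does not give the exact form of the term, so everything specific here is an assumption made for illustration: the quadratic candidate function, the quadratic base cost, the use of the one-step change V(x') - V(x) as the added term, and the `weight` parameter.

```python
def clf(x, p=(1.0, 1.0)):
    """Hypothetical 'energy-like' function V(x) = sum_i p_i * x_i^2."""
    return sum(pi * xi * xi for pi, xi in zip(p, x))

def base_cost(x, u):
    """A typical quadratic running cost on state and (scalar) input."""
    return sum(xi * xi for xi in x) + 0.1 * u * u

def shaped_cost(x, u, x_next, weight=1.0):
    """Base cost plus an assumed CLF term: the one-step change in V.

    Transitions that decrease V are made cheaper and transitions that
    increase V are made more expensive. This is one simple choice, not
    necessarily the term used in the paper.
    """
    return base_cost(x, u) + weight * (clf(x_next) - clf(x))

# One step toward the origin versus one step away from it.
x = (1.0, 0.0)
toward = shaped_cost(x, 0.0, (0.5, 0.0))
away = shaped_cost(x, 0.0, (2.0, 0.0))
print(base_cost(x, 0.0), toward, away)
```

Under these assumptions the shaped cost ranks the transition that lowers the energy-like function below the unshaped cost, and the transition that raises it above, which is the kind of preference a stabilizing controller needs.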

Computation of Regions of Attraction for Hybrid Limit Cycles Using Reachability: An Application to Walking Robots

Feb 09, 2022

Jason J. Choi, Ayush Agrawal, Koushil Sreenath, Claire J. Tomlin, Somil Bansal

Abstract: Contact-rich robotic systems, such as legged robots and manipulators, are often represented as hybrid systems. However, the stability analysis and region-of-attraction computation for these systems are often challenging because of the discontinuous state changes upon contact (also referred to as state resets). In this work, we cast the computation of the region of attraction as a Hamilton-Jacobi (HJ) reachability problem. This enables us to leverage HJ reachability tools that are compatible with general nonlinear system dynamics, and can formally deal with state and input constraints as well as bounded disturbances. Our main contribution is the generalization of the HJ reachability framework to account for the discontinuous state changes originating from state resets, which has remained a challenge until now. We apply our approach to computing regions of attraction for several underactuated walking robots and demonstrate that the proposed approach can (a) recover a bigger region of attraction than state-of-the-art approaches, (b) handle state resets, nonlinear dynamics, external disturbances, and input constraints, and (c) provide a stabilizing controller for the system that can leverage the state resets for enhancing system stability.

* Accepted to IEEE RA-L & ICRA, 2022

Vision-aided Dynamic Quadrupedal Locomotion on Discrete Terrain using Motion Libraries

Oct 02, 2021

Ayush Agrawal, Shuxiao Chen, Akshara Rai, Koushil Sreenath

Abstract: In this paper, we present a framework rooted in control and planning that enables quadrupedal robots to traverse challenging terrains with discrete footholds using visual feedback. Navigating discrete terrain is challenging for quadrupeds because the motion of the robot can be aperiodic, highly dynamic, and blind for the hind legs of the robot. Additionally, the robot needs to reason over both the feasible footholds and the robot velocity by speeding up and slowing down at different parts of the terrain. We build an offline library of periodic gaits which span two trotting steps on the robot, and switch between different motion primitives to achieve aperiodic motions of different step lengths on an A1 robot. The motion library is used to provide targets to a geometric model predictive controller which controls stance. To incorporate visual feedback, we use terrain mapping tools to build a local height map of the terrain around the robot using RGB and depth cameras, and extract feasible foothold locations around both the front and hind legs of the robot. Our experiments show a Unitree A1 robot navigating multiple unknown, challenging and discrete terrains in the real world.

* Submitted to ICRA 2022

Autonomous Navigation for Quadrupedal Robots with Optimized Jumping through Constrained Obstacles

Jul 01, 2021

Scott Gilroy, Derek Lau, Lizhi Yang, Ed Izaguirre, Kristen Biermayer, Anxing Xiao, Mengti Sun, Ayush Agrawal, Jun Zeng, Zhongyu Li (+1 more)

Abstract: Quadrupeds are strong candidates for navigating challenging environments because of their agile and dynamic designs. This paper presents a methodology that extends the range of exploration for quadrupedal robots by creating an end-to-end navigation framework that exploits walking and jumping modes. To obtain a dynamic jumping maneuver while avoiding obstacles, dynamically-feasible trajectories are optimized offline through collocation-based optimization where safety constraints are imposed. This optimization scheme allows the robot to jump through window-shaped obstacles by considering both obstacles in the air and on the ground. The resulting jumping mode is utilized in an autonomous navigation pipeline that leverages a search-based global planner and a local planner to enable the robot to reach the goal location by walking. A state machine together with a decision-making strategy allows the system to switch behaviors between walking around obstacles or jumping through them. The proposed framework is experimentally deployed and validated on a quadrupedal robot, a Mini Cheetah, to enable the robot to autonomously navigate through an environment while avoiding obstacles and jumping over a maximum height of 13 cm to pass through a window-shaped opening in order to reach its goal.

* Accepted to 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE 2021)
