Abstract:Prompting Large Language Models (LLMs), or providing context on the expected mode of operation, is an effective way to steer the outputs of such models to satisfy human desiderata after they have been trained. But in rapidly evolving domains, there is often a need to fine-tune LLMs to improve either the kind of knowledge in their memory or their ability to perform open-ended reasoning in new domains. When humans learn new concepts, we often do so by linking the new material we are studying to concepts we have learned before. To that end, we ask, "Can prompting help us teach LLMs how to learn?" In this work, we study a novel generalization of instruction tuning, called contextual fine-tuning, to fine-tune LLMs. Our method leverages instructional prompts designed to mimic human cognitive strategies in learning and problem-solving to guide the learning process during training, aiming to improve the model's interpretation and understanding of domain-specific knowledge. We empirically demonstrate that this simple yet effective modification improves the ability of LLMs to be rapidly fine-tuned on new datasets in both the medical and financial domains.
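As a rough illustration of the idea described in this abstract, and not the authors' implementation, contextual fine-tuning can be sketched as prepending an instructional prompt to each training document and masking the prompt tokens out of the next-token loss, so the prompt shapes the context without contributing gradient terms. The prompt text, model name, and loss masking below are all assumptions made for the sketch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical instructional prompt mimicking a human learning strategy
# (placeholder wording; the actual prompts come from the paper, not this sketch).
CONTEXT_PROMPT = (
    "Relate the following material to concepts you already know, "
    "and focus on the key ideas as you read:\n"
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")

def contextual_fine_tuning_loss(document: str) -> torch.Tensor:
    """Next-token loss on a domain document, conditioned on the instructional prompt.

    One plausible setup: prompt tokens are masked out of the loss (label = -100),
    so gradients reflect only the domain text while the prompt still conditions it.
    """
    prompt_ids = tokenizer(CONTEXT_PROMPT, return_tensors="pt").input_ids
    doc_ids = tokenizer(document, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, doc_ids], dim=1)

    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # ignore prompt positions in the loss

    return model(input_ids=input_ids, labels=labels).loss

# Usage inside a standard training loop (optimizer setup omitted):
# loss = contextual_fine_tuning_loss("Insulin regulates blood glucose by ...")
# loss.backward()
```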
Abstract:Many existing learning-based grasping approaches concentrate on a single embodiment, generalize poorly to higher-DoF end-effectors, and cannot capture a diverse set of grasp modes. We tackle the problem of grasping with multiple embodiments by learning rich geometric representations for both objects and end-effectors using Graph Neural Networks. Our novel method, GeoMatch, applies supervised learning on grasping data from multiple embodiments, learning end-to-end contact-point likelihood maps as well as conditional autoregressive predictions of grasps keypoint-by-keypoint. We compare our method against baselines that support multiple embodiments. Our approach performs better across three end-effectors while also producing diverse grasps. Examples, including real robot demos, can be found at geo-match.github.io.
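A heavily simplified, hypothetical sketch of the keypoint-by-keypoint autoregressive idea mentioned above, not the GeoMatch architecture: per-vertex object features and per-keypoint end-effector features are assumed to come from graph encoders, and each keypoint's contact is scored over object vertices conditioned on previously placed keypoints. The scoring network, recurrent history, and greedy decoding are placeholders for illustration only.

```python
import torch
import torch.nn as nn

class AutoregressiveKeypointMatcher(nn.Module):
    """Toy stand-in for keypoint-by-keypoint contact prediction.

    Assumes `obj_feats` (V x D) and `kp_feats` (K x D) are per-vertex object
    embeddings and per-keypoint end-effector embeddings, e.g. from GNN encoders.
    """

    def __init__(self, dim: int = 64):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )
        self.history = nn.GRUCell(dim, dim)  # summarizes previously chosen contacts

    def forward(self, obj_feats: torch.Tensor, kp_feats: torch.Tensor) -> torch.Tensor:
        V, D = obj_feats.shape
        h = torch.zeros(D)                    # running summary of placed keypoints
        choices = []
        for k in range(kp_feats.shape[0]):    # one keypoint at a time
            ctx = torch.cat(
                [obj_feats, kp_feats[k].expand(V, D), h.expand(V, D)], dim=-1
            )
            logits = self.score(ctx).squeeze(-1)   # likelihood map over object vertices
            idx = torch.argmax(logits)             # greedy choice, just for the sketch
            h = self.history(obj_feats[idx].unsqueeze(0), h.unsqueeze(0)).squeeze(0)
            choices.append(idx)
        return torch.stack(choices)           # chosen object-vertex index per keypoint

# Usage with random features standing in for GNN outputs:
# matcher = AutoregressiveKeypointMatcher()
# contacts = matcher(torch.randn(500, 64), torch.randn(6, 64))
```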
Abstract:With the growth of large language models, now incorporating billions of parameters, the hardware prerequisites for their training and deployment have seen a corresponding increase. Although existing tools facilitate model parallelization and distributed training, deeper model interactions, crucial for interpretability and responsible AI techniques, still demand thorough knowledge of distributed computing. This often hinders contributions from researchers with machine learning expertise but limited distributed computing background. Addressing this challenge, we present FlexModel, a software package providing a streamlined interface for engaging with models distributed across multi-GPU and multi-node configurations. The library is compatible with existing model distribution libraries and encapsulates PyTorch models. It exposes user-registerable HookFunctions to facilitate straightforward interaction with distributed model internals, bridging the gap between distributed and single-device model paradigms. Primarily, FlexModel enhances accessibility by democratizing model interactions and promotes more inclusive research in the domain of large-scale neural networks. The package can be found at https://github.com/VectorInstitute/flex_model.
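To illustrate the kind of interaction this abstract describes without guessing FlexModel's actual API (the linked repository documents the real HookFunction interface), here is a plain single-device PyTorch analogue: a user-defined forward hook that captures and edits an intermediate activation. The model, layer choice, and editing rule are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# Single-device analogue of the hook-based workflow described above;
# FlexModel generalizes this pattern to models sharded across GPUs/nodes.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
captured = {}  # activations retrieved from inside the model

def hook_fn(module, inputs, output):
    captured["relu_out"] = output.detach().clone()  # inspect the activation
    return output * 0.5                             # optionally edit it in flight

handle = model[1].register_forward_hook(hook_fn)    # attach to the ReLU layer
_ = model(torch.randn(2, 8))
handle.remove()

print(captured["relu_out"].shape)  # torch.Size([2, 16])
```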