Harshad Khadilkar

$TAR^2$: Temporal-Agent Reward Redistribution for Optimal Policy Preservation in Multi-Agent Reinforcement Learning

Feb 07, 2025

Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning

Dec 19, 2024

DeepClean: Integrated Distortion Identification and Algorithm Selection for Rectifying Image Corruptions

Jul 23, 2024

Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization

Feb 23, 2024

Transformers are Expressive, But Are They Expressive Enough for Regression?

Feb 23, 2024

Reinforcement Replaces Supervision: Query focused Summarization using Deep Reinforcement Learning

Nov 29, 2023

Multi-Agent Learning of Efficient Fulfilment and Routing Strategies in E-Commerce

Nov 20, 2023

Using General Value Functions to Learn Domain-Backed Inventory Management Policies

Nov 03, 2023

Using Linear Regression for Iteratively Training Neural Networks

Jul 14, 2023

DCT: Dual Channel Training of Action Embeddings for Reinforcement Learning with Large Discrete Action Spaces

Jun 28, 2023