Naloxone, an opioid antagonist, has been widely used to save lives from opioid overdose, a leading cause for death in the opioid epidemic. However, naloxone has short brain retention ability, which limits its therapeutic efficacy. Developing better opioid antagonists is critical in combating the opioid epidemic.Instead of exhaustively searching in a huge chemical space for better opioid antagonists, we adopt reinforcement learning which allows efficient gradient-based search towards molecules with desired physicochemical and/or biological properties. Specifically, we implement a deep reinforcement learning framework to discover potential lead compounds as better opioid antagonists with enhanced brain retention ability. A customized multi-objective reward function is designed to bias the generation towards molecules with both sufficient opioid antagonistic effect and enhanced brain retention ability. Thorough evaluation demonstrates that with this framework, we are able to identify valid, novel and feasible molecules with multiple desired properties, which has high potential in drug discovery.