Interference continues to be a key limiting factor in cellular radio access network (RAN) deployments. Effective, data-driven, self-adapting radio resource management (RRM) solutions are essential for tackling interference and thus for achieving the desired performance levels, particularly at the cell edge. In future network architectures, the RAN intelligent controller (RIC), which runs near-real-time applications called xApps, is considered a potential component for enabling RRM. In this paper, a joint sub-band masking and power management scheme, implemented as a deep reinforcement learning (RL) xApp, is proposed for smart interference management. The sub-band resource masking problem is formulated as a Markov Decision Process (MDP), which is solved using deep RL to approximate the policy functions and thereby avoid the extremely high computational and storage costs of conventional tabular approaches. The developed xApp is scalable in both storage and computation. Simulation results demonstrate the advantages of the proposed approach over decentralized baselines in terms of the trade-off between cell-centre and cell-edge user rates, energy efficiency, and computational efficiency.