Opportunistic routing relies on the broadcast capability of wireless networks. It offers higher reliability and robustness in highly dynamic or severe environments such as mobile and vehicular ad-hoc networks (MANETs/VANETs). To reduce the cost of broadcast, multicast routing schemes use a connected dominating set (CDS) or a multi-point relaying (MPR) set to decrease network overhead, which makes the algorithms that select these sets critical. Common MPR selection algorithms are heuristic, rely on coordination between nodes, require high computational power in large networks, and are difficult to tune for network uncertainties. In this paper, we use multi-agent deep reinforcement learning to design a novel MPR multicast routing technique, DeepMPR, which outperforms the OLSR MPR selection algorithm without requiring MPR announcement messages from neighbors. Our evaluation results demonstrate the performance gains of the trained DeepMPR multicast forwarding policy over other popular techniques.
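
To make the heuristic baseline concrete, the following is a minimal sketch of a greedy MPR selection in the spirit of the OLSR heuristic: a node picks a subset of its one-hop neighbors that together reach every strict two-hop neighbor. It is a simplified illustration, not the exact OLSR procedure (it ignores node willingness and degree-based tie-breaking) and not DeepMPR itself. The names `select_mprs`, `neighbors_of`, and `example_topology` are hypothetical, and links are assumed to be symmetric.

```python
def select_mprs(node, neighbors_of):
    """Greedy MPR selection for `node` (simplified, illustrative only).

    neighbors_of: dict mapping each node id to the set of its one-hop
    neighbors. Links are assumed symmetric.
    """
    one_hop = set(neighbors_of[node])

    # Strict two-hop neighbors: reachable via a one-hop neighbor,
    # excluding the node itself and its one-hop neighbors.
    two_hop = set()
    for n in one_hop:
        two_hop |= neighbors_of.get(n, set())
    two_hop -= one_hop | {node}

    # Two-hop nodes that each one-hop neighbor can reach.
    cover = {n: neighbors_of.get(n, set()) & two_hop for n in one_hop}

    # Step 1: a neighbor that is the only way to reach some two-hop
    # node must be selected.
    mprs = set()
    for t in two_hop:
        providers = [n for n in one_hop if t in cover[n]]
        if len(providers) == 1:
            mprs.add(providers[0])

    uncovered = set(two_hop)
    for m in mprs:
        uncovered -= cover[m]

    # Step 2: repeatedly add the neighbor that reaches the most
    # still-uncovered two-hop nodes (ties broken by sorted node id).
    while uncovered:
        best = max(sorted(one_hop - mprs),
                   key=lambda n: len(cover[n] & uncovered))
        mprs.add(best)
        uncovered -= cover[best]

    return mprs


# Hypothetical 7-node topology used only for illustration.
example_topology = {
    "A": {"B", "C", "D"},
    "B": {"A", "E", "F"},
    "C": {"A", "F", "G"},
    "D": {"A", "G"},
    "E": {"B"},
    "F": {"B", "C"},
    "G": {"C", "D"},
}

mprs = select_mprs("A", example_topology)
print(sorted(mprs))
```

In this toy topology, node A needs only two of its three neighbors as relays to reach all of its two-hop neighbors. Even this simplified version requires each node to know its neighbors' neighbor sets, which is the kind of coordination between nodes that the abstract identifies as a limitation of heuristic approaches.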