There is growing interest in the development of learning architectures for digital communication systems. Herein, we consider the detection/decoding problem, and we aim to develop an optimal neural architecture for this task. Defining the optimality criterion is a fundamental step. We propose to use the mutual information (MI) of the channel input-output signal pair. Computing the MI is a formidable task, and for the majority of communication channels the MI is unknown. Therefore, it has to be learned. To this end, we propose a novel neural MI estimator based on a discriminative formulation. This leads to the derivation of the mutual information neural decoder (MIND). The developed neural architecture is capable not only of solving the decoding problem in unknown channels, but also of returning an estimate of the average MI achieved with the coding scheme, as well as of the decoding error probability. Several numerical results are reported and compared with maximum a posteriori (MAP) and maximum likelihood (MaxL) decoding strategies.
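To make the three quantities mentioned above concrete (decoding decisions, average MI, and decoding error probability), the following is a minimal sketch on a channel where the posterior is known in closed form. It is not the MIND architecture: the BPSK-over-AWGN channel, the uniform prior, and the function names `simulate_bpsk_awgn`, `posterior`, and `demo` are all assumptions made for illustration. The sketch only shows how a posterior p(x|y) yields MAP decisions, a Monte Carlo MI estimate, and an error probability estimate; in an unknown channel, a learned posterior estimate would presumably take the place of the closed-form expression used here.

```python
import numpy as np

def simulate_bpsk_awgn(n, sigma, rng):
    """Draw n uniform bits, map them to BPSK (0 -> +1, 1 -> -1), add Gaussian noise."""
    x = rng.integers(0, 2, size=n)
    s = 1.0 - 2.0 * x
    y = s + sigma * rng.normal(size=n)
    return x, y

def posterior(y, sigma):
    """Closed-form posterior [p(x=0|y), p(x=1|y)] for BPSK over AWGN, uniform prior."""
    llr = 2.0 * y / sigma**2            # log p(y|x=0) - log p(y|x=1)
    p0 = 1.0 / (1.0 + np.exp(-llr))
    return np.stack([p0, 1.0 - p0], axis=1)

def demo(n=200_000, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x, y = simulate_bpsk_awgn(n, sigma, rng)
    post = posterior(y, sigma)

    # MAP decision: pick the symbol with the largest posterior.
    x_hat = np.argmax(post, axis=1)

    # MI estimate in bits: sample average of log2( p(x|y) / p(x) ), with p(x) = 1/2.
    mi_bits = np.mean(np.log2(post[np.arange(n), x] / 0.5))

    # Error probability from the posterior alone: E[1 - max_x p(x|y)].
    pe_from_posterior = np.mean(1.0 - np.max(post, axis=1))

    # Empirical error rate, available here only because the true bits are known.
    pe_empirical = np.mean(x_hat != x)
    return mi_bits, pe_from_posterior, pe_empirical

if __name__ == "__main__":
    mi, pe_post, pe_emp = demo()
    print(f"MI estimate: {mi:.3f} bits, Pe (posterior): {pe_post:.4f}, Pe (empirical): {pe_emp:.4f}")
```

With a uniform prior, the MAP and maximum likelihood decisions coincide in this toy setting, and the two error probability estimates should agree closely since the posterior used here is exact.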