Grant-free random access has the potential to support massive connectivity in Internet of Things (IoT) networks, where joint activity detection and channel estimation (JADCE) is a key problem. Existing JADCE methods usually suffer from at least one of the following limitations: high computational complexity, ineffectiveness in inducing sparsity, and inability to handle complex matrix estimation. To overcome these limitations, we develop an effective unfolding neural network framework, built upon the proximal operator method, to tackle the JADCE problem in IoT networks where the base station is equipped with multiple antennas. Specifically, we formulate JADCE as a group-sparse matrix estimation problem regularized by the non-convex minimax concave penalty (MCP). This problem can be solved iteratively with the proximal operator method, based on which we develop an unfolding neural network structure by parameterizing the algorithmic iterations. By further exploiting the coupling structure among the training parameters as well as analytical computation, we develop two additional unfolding structures that reduce the training complexity. We prove that the proposed algorithm achieves a linear convergence rate. Results show that the three proposed unfolding structures not only converge faster but also achieve higher estimation accuracy than the baseline methods.
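As an illustration of the kind of iteration that such an unfolding network parameterizes, the following is a minimal NumPy sketch of a proximal gradient loop with a row-wise (group) MCP thresholding step. It is not the implementation from this work. The signal model Y = AX + noise (with A a pilot matrix and each row of X holding one device's channel across the antennas), the least-squares data term, and all names (`group_mcp_prox`, `proximal_gradient`, `lam`, `gamma`, `step`) are assumptions made for illustration only.

```python
import numpy as np

def group_mcp_prox(Z, lam, gamma, step=1.0):
    """Row-wise proximal operator of step * MCP(||row||_2; lam, gamma).

    Assumed closed form (group firm thresholding); requires gamma > step.
    Rows with norm <= step*lam are set to zero, rows with norm > gamma*lam
    are left unchanged, and rows in between are shrunk.
    """
    if gamma <= step:
        raise ValueError("gamma must be larger than step")
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    safe = np.maximum(norms, 1e-12)  # avoid division by zero for all-zero rows
    shrink = gamma / (gamma - step) * np.maximum(1.0 - step * lam / safe, 0.0)
    scale = np.where(norms > gamma * lam, 1.0, shrink)
    return scale * Z

def proximal_gradient(Y, A, lam, gamma, num_iters=100):
    """Hypothetical solver for 0.5*||Y - A X||_F^2 + sum_n MCP(||x_n||_2)."""
    lipschitz = np.linalg.norm(A, 2) ** 2   # largest singular value squared
    step = 1.0 / lipschitz
    X = np.zeros((A.shape[1], Y.shape[1]), dtype=complex)
    for _ in range(num_iters):
        grad = A.conj().T @ (A @ X - Y)      # gradient of the data term
        X = group_mcp_prox(X - step * grad, lam, gamma, step)
    return X
```

In an unfolded version of such a loop, each iteration would become one network layer, and quantities such as `step`, `lam`, and `gamma` (or the matrices applied to `X` and `Y`) would be learned per layer rather than fixed by hand. Rows of the returned `X` that are exactly zero would correspond to devices estimated as inactive.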