Reconfigurable intelligent surfaces (RISs) have been envisioned as a promising technique to enable and enhance future wireless communications because they can engineer wireless channels in a cost-effective manner. Extensive research attention has been devoted to the conventional RIS 1.0, which is characterized by a diagonal phase shift matrix: each RIS element is connected to ground through its own load and is not connected to any other element. However, this simple architecture limits the flexibility of RIS 1.0 in passive beamforming. To fully exploit the benefits of RIS, in this paper we introduce RIS 2.0, which goes beyond diagonal phase shift matrices and is referred to as beyond diagonal RIS (BD-RIS). We first explain the modeling of BD-RIS based on scattering parameter network analysis and classify BD-RIS by the mathematical characteristics of the scattering matrix, the supported modes, and the architectures. Then, we provide simulations to evaluate the sum-rate performance of BD-RIS under different modes and architectures. We summarize the benefits of BD-RIS: high flexibility in wave manipulation, enlarged coverage, easier deployment, and low complexity in terms of the number of resolution bits and the number of elements. Motivated by these benefits, we also discuss potential applications of BD-RIS in various wireless systems. Finally, we list key challenges in modeling, designing, and implementing BD-RIS in practice and point to possible future research directions.
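As a rough illustration of why a non-diagonal scattering matrix can offer more flexibility than a diagonal one, the sketch below compares the two on a toy link. Everything in it is an assumption made for illustration and is not taken from the paper: a narrowband single-antenna transmitter and receiver with no direct path, hypothetical channel vectors `g` (transmitter to RIS) and `h` (RIS to receiver), a lossless surface modeled by a unitary matrix, and no reciprocity (symmetry) constraint on that matrix. The function names are likewise hypothetical.

```python
import numpy as np


def diagonal_ris_gain(h, g):
    """Channel gain |h^H Theta g|^2 with a diagonal, unit-modulus Theta.

    Each element applies only its own phase shift, chosen here so that
    all per-element terms add up in phase.
    """
    phases = np.angle(h) - np.angle(g)
    Theta = np.diag(np.exp(1j * phases))
    return np.abs(np.vdot(h, Theta @ g)) ** 2


def unitary_ris_matrix(h, g):
    """Build a full unitary Theta that rotates g/||g|| onto h/||h||.

    Uses a complex Householder reflection. This is one convenient
    construction for the sketch, not a design method from the paper,
    and it does not enforce Theta = Theta^T.
    """
    a = g / np.linalg.norm(g)
    v = h / np.linalg.norm(h)
    # Rotate v by a global phase so that a^H b is real and non-negative,
    # which the Householder construction requires.
    b = v * np.exp(-1j * np.angle(np.vdot(a, v)))
    w = a - b
    return np.eye(len(h)) - 2.0 * np.outer(w, w.conj()) / np.vdot(w, w).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 16  # number of RIS elements (arbitrary choice)
    h = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    g = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    Theta_bd = unitary_ris_matrix(h, g)
    gain_diag = diagonal_ris_gain(h, g)
    gain_bd = np.abs(np.vdot(h, Theta_bd @ g)) ** 2

    print("diagonal gain:       ", gain_diag)
    print("beyond-diagonal gain:", gain_bd)
```

Under these assumptions the diagonal matrix reaches a gain of (Σ|h_i||g_i|)², while the full unitary matrix reaches ‖h‖²‖g‖², which is at least as large by the Cauchy-Schwarz inequality. The toy comparison only illustrates the extra degrees of freedom; it says nothing about the sum-rate results or the specific architectures evaluated in the paper.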