The use of deep learning (DL) for channel state information (CSI) feedback has garnered widespread attention across academia and industry. The mainstream DL architectures, e.g., CsiNet, deploy DL models on the base station (BS) side and the user equipment (UE) side, which are highly coupled and need to be trained jointly. However, two-sided DL models require collaborations between different network vendors and UE vendors, which entails considerable challenges in order to achieve consensus, e.g., model maintenance and responsibility. Furthermore, DL-based CSI feedback design invokes DL to reduce only the CSI feedback error, whereas jointly optimizing several modules at the transceivers would provide more significant gains. This article presents DL-based CSI feedback from the perspectives of one-sided model and joint multi-module learning. We herein introduce various novel one-sided CSI feedback architectures. In particular, the recently proposed CSI-PPPNet provides a one-sided one-for-all framework, which allows a DL model to deal with arbitrary CSI compression ratios. We review different joint multi-module learning methods, where the CSI feedback module is learned jointly with other modules including channel coding, channel estimation, pilot design and precoding design. Finally, future directions and challenges for DL-based CSI feedback are discussed, from the perspectives of inherent limitations of artificial intelligence (AI) and practical deployment issues.