Deep Learning (DL) has enabled a paradigm shift in wireless communication system with data driven end-to-end (E2E) learning and optimization of the Physical Layer (PHY). By leveraging the representation learning of DL, E2E systems exhibit enhanced adaptability and performance in complex wireless environments, fulfilling the demands of 5G and beyond network systems and applications. The evolution of data-driven techniques in the PHY has enabled advanced semantic applications across various modalities including text, image, audio, video, and multi-modal transmissions. These applications transcend from traditional bit-level communication to semantic-level intelligent communication systems, which are capable of understanding and adapting to the context and intent of the data transmission. Although PHY as a DL architecture for data-driven E2E communication is a key factor in enabling semantic communication systems (SemCom), and various studies in recent years have surveyed them separately, their combination has not been thoroughly reviewed. Additionally, these are emerging fields that are still in their infancy, with several techniques having been developed and evolved in recent years. Therefore, this article provides a holistic review of data-driven PHY for E2E communication system, and their enabling semantic applications across different modalities. Furthermore, it identifies critical challenges and prospective research directions, providing a pivotal reference for future development of DL in PHY and SemCom.