In this paper, we propose two deep joint source and channel coding (DJSCC) structures with attention modules for the multi-input multi-output (MIMO) channel, including a serial structure and a parallel structure. With singular value decomposition (SVD)-based precoding scheme, the MIMO channel can be decomposed into various sub-channels, and the feature outputs will experience sub-channels with different channel qualities. In the serial structure, one single network is used at both the transmitter and the receiver to jointly process data streams of all MIMO subchannels, while data steams of different MIMO subchannels are processed independently via multiple sub-networks in the parallel structure. The attention modules in both serial and parallel architectures enable the system to adapt to varying channel qualities and adjust the quantity of information outputs in accordance with the channel qualities. Experimental results demonstrate the proposed DJSCC structures have improved image transmission performance, and reveal the phenomenon via non-parameter entropy estimation that the learned DJSCC transceivers tend to transmit more information over better sub-channels.