We propose an approach to channel state information (CSI) compression in multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems in which the frequency-domain channel matrix is treated as a high-dimensional complex-valued image. Our method uses transformer-based nonlinear transform coding (NTC), a deep-learning-based image compression technique, to generate a compact binary representation of the CSI. Unlike conventional autoencoder-based CSI compression, NTC learns a nonlinear mapping to a latent vector and simultaneously estimates the probability distribution of that vector, so that the latent vector can be entropy coded efficiently. Exploiting the statistical independence of the latent vector entries, we combine a transformer-based deep neural network with a scalar nested-lattice uniform quantization scheme, which enables low-complexity, multi-rate CSI feedback that adapts dynamically to varying feedback channel conditions. The proposed multi-rate CSI compression scheme achieves state-of-the-art rate-distortion performance, outperforming existing techniques that use the same number of neural network parameters. Simulation results further show that it provides a superior rate-distortion trade-off while requiring only 6% of the neural network parameters of existing methods, which makes it well suited to practical deployment.
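To make the multi-rate idea concrete, the following is a minimal Python sketch of scalar nested-lattice uniform quantization; it is an illustration under stated assumptions, not the implementation used in this work. The names `BASE_STEP`, `RATIOS`, `quantize`, `coarsen`, and `entropy_bits` are hypothetical, i.i.d. Gaussian samples stand in for the learned latent entries, and the empirical entropy of the indices is used as a rough proxy for the rate of an ideal entropy coder (in NTC the distribution would instead come from the learned probability model).

```python
import math
import random
from collections import Counter

BASE_STEP = 0.125       # finest quantization step (assumed value)
RATIOS = (1, 2, 4, 8)   # nesting ratios: lattice k is (BASE_STEP * ratio) * Z


def quantize(x: float, step: float) -> int:
    """Index of the nearest point of the scalar lattice step * Z."""
    return math.floor(x / step + 0.5)


def dequantize(index: int, step: float) -> float:
    """Reconstruction value for a lattice index."""
    return index * step


def coarsen(index: int, ratio: int) -> int:
    """Map a fine-lattice index to the nearest point of the coarser,
    nested lattice (ratio * step) * Z, using the index alone."""
    return math.floor(index / ratio + 0.5)


def entropy_bits(indices) -> float:
    """Empirical entropy in bits per entry (proxy for the coded rate)."""
    counts = Counter(indices)
    total = len(indices)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Stand-in for the latent vector entries (assumption: i.i.d. Gaussian).
random.seed(0)
latent = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Quantize once on the finest lattice; lower rates are derived from it.
fine_indices = [quantize(z, BASE_STEP) for z in latent]

results = {}
for ratio in RATIOS:
    step = BASE_STEP * ratio
    coarse = [coarsen(i, ratio) for i in fine_indices]
    recon = [dequantize(j, step) for j in coarse]
    mse = sum((z - r) ** 2 for z, r in zip(latent, recon)) / len(latent)
    results[ratio] = (entropy_bits(coarse), mse)
    print(f"step={step:.3f}  rate~{results[ratio][0]:.2f} bits/entry  mse={mse:.5f}")
```

Because every coarse lattice is contained in the finest one, a single encoder pass suffices and each lower rate follows from the fine indices without re-running the transform. Note that coarsening a fine index is not always identical to quantizing the original value directly at the coarse step; in this sketch the reconstruction error is bounded by half the sum of the fine and coarse step sizes.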