Chromosomes exhibit non-rigid and non-articulated nature with varying degrees of curvature. Chromosome straightening is an essential step for subsequent karyotype construction, pathological diagnosis and cytogenetic map development. However, robust chromosome straightening remains challenging, due to the unavailability of training images, distorted chromosome details and shapes after straightening, as well as poor generalization capability. We propose a novel architecture, ViT-Patch GAN, consisting of a motion transformation generator and a Vision Transformer-based patch (ViT-Patch) discriminator. The generator learns the motion representation of chromosomes for straightening. With the help of the ViT-Patch discriminator, the straightened chromosomes retain more shape and banding pattern details. The proposed framework is trained on a small dataset and is able to straighten chromosome images with state-of-the-art performance for two large datasets.