One critical challenge for applying today's Artificial Intelligence (AI) technologies to real-world applications is the common existence of data silos across different organizations. Due to legal, privacy and other practical constraints, data from different organizations cannot be easily integrated. Federated learning (FL), especially the vertical FL (VFL), allows multiple parties having different sets of attributes about the same user collaboratively build models while preserving user privacy. However, communication overhead is a principal bottleneck since the existing VFL protocols require per-iteration communications among all parties. In this paper, we propose the Federated Stochastic Block Coordinate Descent (FedBCD) to effectively reduce the communication rounds for VFL. We show that when the batch size, sample size, and the local iterations are selected appropriately, the algorithm requires $\mathcal{O}(\sqrt{T})$ communication rounds to achieve $\mathcal{O}(1/\sqrt{T})$ accuracy. Finally, we demonstrate the performance of FedBCD on several models and datasets, and on a large-scale industrial platform for VFL.