We present a short and elementary proof of the duality for Wasserstein distributionally robust optimization, which holds for any arbitrary Kantorovich transport distance, any arbitrary measurable loss function, and any arbitrary nominal probability distribution, as long as certain interchangeability principle holds.