Filtering multichannel images is challenging in terms of both efficiency and effectiveness. By grouping similar patches to exploit the self-similarity and sparse linear approximation of natural images, recent nonlocal and transform-domain methods have been widely used in color and multispectral image (MSI) denoising. Many related methods focus on modeling group-level correlation to enhance sparsity, which often relies on a recursive strategy with a large number of similar patches. The importance of the patch-level representation, by contrast, has been underestimated. In this paper, we mainly investigate the influence and potential of the patch-level representation by considering a general formulation with a block diagonal matrix. We further show that by training a proper global patch basis, together with a local principal component analysis (PCA) transform in the grouping dimension, a simple transform-threshold-inverse method can produce very competitive results. A fast implementation is also developed to reduce computational complexity. Extensive experiments on both simulated and real datasets demonstrate the robustness, effectiveness, and efficiency of the proposed method.
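As a rough illustration of the transform-threshold-inverse idea, the following minimal sketch processes a single group of similar patches. It is not the implementation described in this paper: the function name `denoise_group`, the threshold `tau`, the use of hard thresholding, the uncentered PCA, and the random orthogonal matrix standing in for the trained global patch basis are all assumptions made for illustration.

```python
import numpy as np


def denoise_group(group, patch_basis, tau):
    """Transform-threshold-inverse on one group of similar patches.

    group:       (K, d) array, K similar patches, each flattened to d values.
    patch_basis: (d, d) orthogonal matrix; a stand-in for a trained global
                 patch basis (assumption, any orthogonal matrix works here).
    tau:         hard-threshold level (hypothetical parameter).
    """
    # Local PCA in the grouping dimension: eigenvectors of the K x K
    # Gram matrix of the group (uncentered, for simplicity).
    _, group_basis = np.linalg.eigh(group @ group.T)

    # Forward transform: group-dimension basis on the left,
    # patch-dimension basis on the right.
    coeffs = group_basis.T @ group @ patch_basis

    # Hard thresholding: zero out small coefficients.
    coeffs[np.abs(coeffs) < tau] = 0.0

    # Inverse transform (both bases are orthogonal).
    return group_basis @ coeffs @ patch_basis.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, d = 16, 27  # e.g. 16 patches of size 3 x 3 x 3 (assumed sizes)

    # Synthetic low-rank "clean" group plus Gaussian noise.
    clean = rng.standard_normal((K, 2)) @ rng.standard_normal((2, d))
    noisy = clean + 0.1 * rng.standard_normal((K, d))

    # Random orthogonal matrix as a placeholder for the trained basis.
    basis, _ = np.linalg.qr(rng.standard_normal((d, d)))

    estimate = denoise_group(noisy, basis, tau=0.3)
    print("noisy error:   ", np.linalg.norm(noisy - clean))
    print("estimate error:", np.linalg.norm(estimate - clean))
```

Because both bases are orthogonal, setting `tau` to zero returns the input group unchanged, so all of the filtering effect in this sketch comes from the thresholding step. A full pipeline would additionally need patch grouping and aggregation of the filtered patches back into the image, which are omitted here.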