Many measurement modalities which perform imaging by probing an object pixel-by-pixel, such as via Photoacoustic Microscopy, produce a multi-dimensional feature (typically a time-domain signal) at each pixel. In principle, the many degrees of freedom in the time-domain signal would admit the possibility of significant multi-modal information being implicitly present, much more than a single scalar "brightness", regarding the underlying targets being observed. However, the measured signal is neither a weighted-sum of basis functions (such as principal components) nor one of a set of prototypes (K-means), which has motivated the novel clustering method proposed here, capable of learning centroids (signal shapes) that are related to the underlying, albeit unknown, target characteristics in a scalable and noise-robust manner.