Representing data by means of graph structures is one of the most effective approaches to extracting information in many data analysis applications. This is especially true when multimodal datasets are investigated, since records collected by means of diverse sensing strategies are taken into account and explored together. Nevertheless, classic graph signal processing is based on a model of information propagation that follows the heat diffusion mechanism. This model imposes several constraints and assumptions on the data properties that might not be valid for multimodal data analysis, especially when large-scale datasets collected from heterogeneous sources are considered, so that the accuracy and robustness of the outcomes might be severely jeopardized. In this paper, we introduce a novel model for graph definition based on fluid diffusion. The proposed approach improves the ability of graph-based data analysis to address several issues of modern data analysis in operational scenarios, so as to provide a platform for precise, versatile, and efficient understanding of the phenomena underlying the records under examination, and to fully exploit the diversity of the records in obtaining a thorough characterization of the data and their significance. In this work, we focus on using this fluid diffusion model to drive a community detection scheme, i.e., to divide multimodal datasets into multiple groups according to the similarity among nodes in an unsupervised fashion. Experimental results obtained on real multimodal datasets in diverse application scenarios show that our method strongly outperforms state-of-the-art schemes for community detection in multimodal data analysis.