The scale of systems employed in industrial environments demands a large number of sensors to facilitate meticulous monitoring and functioning. These requirements could potentially lead to inefficient system designs. The data coming from various sensors are often correlated due to the underlying relations in the system parameters that the sensors monitor. In theory, it should be possible to emulate the output of certain sensors based on other sensors. Tapping into such possibilities holds tremendous advantages in terms of reducing system design complexity. In order to identify the subset of sensors whose readings can be emulated, the sensors must be grouped into clusters. Complex systems generally have a large quantity of sensors that collect and store data over prolonged periods of time. This leads to the accumulation of massive amounts of data. In this paper we propose an end-to-end algorithmic solution, to realise virtual sensors in such systems. This algorithm splits the dataset into blocks and clusters each of them individually. It then fuses these clustering solutions to obtain a global solution using an Ant Colony inspired technique, FAC2T. Having grouped the sensors into clusters, we select representative sensors from each cluster. These sensors are retained in the system while the other sensors readings are emulated by applying supervised learning algorithms.