High-throughput data collection techniques and large-scale (cloud) computing are transforming our understanding of ecosystems at all scales by allowing multimodal data from fields such as physics, chemistry, biology, ecology, fishing, economics and other social sciences to be integrated in a common computational framework. In this paper, we focus on a large-scale data assimilation and prediction backbone based on Deep Stacking Networks (DSN), within the framework of the IDEA (Island Digital Ecosystem Avatars) project on Moorea Island, and built on the subdivision of the island into watersheds and lagoon units. We also describe several kinds of raw data that can train and constrain such an ecosystem avatar model, as well as second-level data such as ecological or physical indices and indicators.