Uncertainties in a structure is inevitable, which generally lead to variation in dynamic response predictions. For a complex structure, brute force Monte Carlo simulation for response variation analysis is infeasible since one single run may already be computationally costly. Data driven meta-modeling approaches have thus been explored to facilitate efficient emulation and statistical inference. The performance of a meta-model hinges upon both the quality and quantity of training dataset. In actual practice, however, high-fidelity data acquired from high-dimensional finite element simulation or experiment are generally scarce, which poses significant challenge to meta-model establishment. In this research, we take advantage of the multi-level response prediction opportunity in structural dynamic analysis, i.e., acquiring rapidly a large amount of low-fidelity data from reduced-order modeling, and acquiring accurately a small amount of high-fidelity data from full-scale finite element analysis. Specifically, we formulate a composite neural network fusion approach that can fully utilize the multi-level, heterogeneous datasets obtained. It implicitly identifies the correlation of the low- and high-fidelity datasets, which yields improved accuracy when compared with the state-of-the-art. Comprehensive investigations using frequency response variation characterization as case example are carried out to demonstrate the performance.