This paper contributes to the challenge of learning a function on streamed multimodal data through evaluation. The core of the result of our paper is the combination of two quite different approaches to this problem. One comes from the mathematically principled technology of signatures and log-signatures as representations for streamed data, while the other draws on the techniques of recurrent neural networks (RNN). The ability of the former to manage high sample rate streams and the latter to manage large scale nonlinear interactions allows hybrid algorithms that are easy to code, quicker to train, and of lower complexity for a given accuracy. We illustrate the approach by approximating the unknown functional as a controlled differential equation. Linear functionals on solutions of controlled differential equations are the natural universal class of functions on data streams. Following this approach, we propose a hybrid Logsig-RNN algorithm that learns functionals on streamed data. By testing on various datasets, i.e. synthetic data, NTU RGB+D 120 skeletal action data, and Chalearn2013 gesture data, our algorithm achieves the outstanding accuracy with superior efficiency and robustness.