Abstract: Controlling and analyzing cyberphysical and robotics systems is increasingly becoming a Big Data challenge. Pushing this data to the cloud and processing it there is more efficient than on-board processing. However, current cloud-based solutions cannot meet the latency requirements of these applications. We present a new concept, Discretized Streams or D-Streams, that enables massively scalable computations on streaming data with latencies as short as a second. We experiment with an implementation of D-Streams on top of the Spark computing framework. We demonstrate the usefulness of this concept with a novel algorithm to estimate vehicular traffic in urban networks. Our online expectation-maximization (EM) algorithm can estimate traffic on a very large city network (the San Francisco Bay Area) by processing tens of thousands of observations per second, with a latency of a few seconds.