Abstract: Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains and has recently attracted growing interest. In this visionary paper, we explore representative architectures that address this requirement. We highlight three pivotal paradigms: The state-of-the-art DL-Centric architecture offloads DL computations to dedicated DL frameworks. The potential UDF-Centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the database system. The potential Relation-Centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures shows promise in specific use scenarios, we identify urgent requirements for the seamless integration of these architectures and for the middle ground between them. We examine the gaps that impede this integration and explore strategies to close them. We present a pathway toward a novel database system that enables a broad class of data-intensive DL inference applications.
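To make the Relation-Centric idea concrete, the following minimal sketch expresses a matrix multiplication as a join followed by a grouped sum. It is only an illustration: the abstract does not specify an encoding, so the (row, col, value) tuple layout and the names `to_relation` and `relational_matmul` are assumptions made here.

```python
from collections import defaultdict


def to_relation(matrix):
    """Encode a dense matrix as (row, col, value) tuples, skipping zeros.

    Assumed encoding for illustration; the abstract does not prescribe one.
    """
    return [
        (i, j, v)
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if v != 0
    ]


def relational_matmul(a_rel, b_rel):
    """Compute A @ B as: JOIN ON A.col = B.row, then GROUP BY (A.row, B.col) SUM."""
    # Build side of a hash join: index B's tuples by their row id.
    b_by_row = defaultdict(list)
    for k, j, b_val in b_rel:
        b_by_row[k].append((j, b_val))

    # Probe side: match each A tuple on its column id and aggregate products.
    out = defaultdict(float)
    for i, k, a_val in a_rel:
        for j, b_val in b_by_row.get(k, []):
            out[(i, j)] += a_val * b_val
    return dict(out)


A = [[1, 2], [0, 3]]
B = [[4, 0], [5, 6]]
result = relational_matmul(to_relation(A), to_relation(B))
```

Because the operands are ordinary tuple sets, the same computation could in principle be handed to a relational engine's join and aggregation operators rather than to a DL framework, which is the property this paradigm relies on.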
Abstract: Motivated by extreme multi-label classification applications, we consider training deep learning models over sparse data on multi-GPU servers. The variance in the number of non-zero features across training batches and the intrinsic GPU heterogeneity combine to limit accuracy and increase the time to convergence. We address these challenges with Adaptive SGD, an adaptive elastic model averaging stochastic gradient descent algorithm for heterogeneous multi-GPU servers that is characterized by dynamic scheduling, adaptive batch size scaling, and normalized model merging. Instead of statically partitioning batches across GPUs, batches are routed based on relative processing speed. Batch size scaling assigns larger batches to the faster GPUs and smaller batches to the slower ones, with the goal of arriving at a steady state in which all GPUs perform the same number of model updates. Normalized model merging computes optimal weights for every GPU based on the assigned batches such that the combined model achieves better accuracy. We show experimentally that Adaptive SGD outperforms four state-of-the-art solutions in time-to-accuracy and scales with the number of GPUs.
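The two mechanisms described above, batch size scaling and weighted model merging, can be sketched roughly as follows. This is not the paper's algorithm: the abstract gives no formulas, so the proportional-to-throughput sizing rule, the sample-share merging weights, and the function names are all assumptions chosen for illustration.

```python
def scale_batch_sizes(throughputs, total_batch):
    """Split total_batch across GPUs in proportion to throughput (samples/sec).

    Assumed rule: proportional sizing makes per-batch time roughly equal,
    so every GPU completes about the same number of model updates.
    """
    total = sum(throughputs)
    sizes = [int(total_batch * t / total) for t in throughputs]
    # Hand any rounding remainder to the fastest GPU so sizes sum to total_batch.
    fastest = throughputs.index(max(throughputs))
    sizes[fastest] += total_batch - sum(sizes)
    return sizes


def merge_models(models, samples_seen):
    """Weighted average of per-GPU parameter vectors.

    Assumed weighting: each GPU's weight is its share of processed samples,
    normalized so that the weights sum to 1.
    """
    total = sum(samples_seen)
    weights = [s / total for s in samples_seen]
    dim = len(models[0])
    return [sum(w * m[d] for w, m in zip(weights, models)) for d in range(dim)]


# Hypothetical two-GPU server where GPU 0 is three times faster than GPU 1.
sizes = scale_batch_sizes([300.0, 100.0], total_batch=512)
merged = merge_models([[1.0, 2.0], [3.0, 6.0]], samples_seen=sizes)
```

With these hypothetical throughputs, the faster GPU receives 384 samples and the slower one 128, so both take the same time per batch (384/300 = 128/100 seconds), and the merged model weights the two replicas 0.75 and 0.25.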
Abstract: Residential customers have traditionally not been treated as individual entities, owing to the high volatility of residential consumption patterns and a historic focus on aggregated loads from the utility and system feeder perspective. Large-scale deployment of smart meters has motivated a growing number of studies of disaggregated daily load patterns, which can reveal important heterogeneity across different time scales and weather conditions, as well as within and across individual households. This paper aims to shed light on the mechanisms by which electricity consumption patterns exhibit variability and on the different constraints that may affect demand-response (DR) flexibility. We systematically evaluate how daily time-of-use patterns and their variability relate to external and internal influencing factors, including time scales of interest, meteorological conditions, and household characteristics, by applying an improved version of the adaptive K-means clustering method to profile "household-days" of a summer-peaking utility. We find that, for this summer-peaking utility, outdoor temperature is the most important external driver of load shape variability relative to seasonality and day-of-week. The top three consumption patterns represent approximately 50% of usage on the highest-temperature days. The variability in summer load shapes across customers can be explained by the responsiveness of households to outside temperature. Our results suggest that, depending on the influencing factors, not all consumption variability can be readily translated into consumption flexibility. Such information needs to be further explored when segmenting customers, so that programs can be better targeted and tailored to meet the needs of the rapidly evolving electricity grid.