Abstract:Diffusion models are pivotal for generating high-quality images and videos. Inspired by the success of OpenAI's Sora, the backbone of diffusion models is evolving from U-Net to Transformer, known as Diffusion Transformers (DiTs). However, generating high-quality content necessitates longer sequence lengths, exponentially increasing the computation required for the attention mechanism, and escalating DiTs inference latency. Parallel inference is essential for real-time DiTs deployments, but relying on a single parallel method is impractical due to poor scalability at large scales. This paper introduces xDiT, a comprehensive parallel inference engine for DiTs. After thoroughly investigating existing DiTs parallel approaches, xDiT chooses Sequence Parallel (SP) and PipeFusion, a novel Patch-level Pipeline Parallel method, as intra-image parallel strategies, alongside CFG parallel for inter-image parallelism. xDiT can flexibly combine these parallel approaches in a hybrid manner, offering a robust and scalable solution. Experimental results on two 8xL40 GPUs (PCIe) nodes interconnected by Ethernet and an 8xA100 (NVLink) node showcase xDiT's exceptional scalability across five state-of-the-art DiTs. Notably, we are the first to demonstrate DiTs scalability on Ethernet-connected GPU clusters. xDiT is available at https://github.com/xdit-project/xDiT.
Abstract:The vast adoption of Wi-Fi and/or Bluetooth capabilities in Internet of Things (IoT) devices, along with the rapid growth of deployed smart devices, has caused significant interference and congestion in the industrial, scientific, and medical (ISM) bands. Traditional Wi-Fi Medium Access Control (MAC) design faces significant challenges in managing increasingly complex wireless environments while ensuring network Quality of Service (QoS) performance. This paper explores the potential integration of advanced Artificial Intelligence (AI) methods into the design of Wi-Fi MAC protocols. We propose AI-MAC, an innovative approach that employs machine learning algorithms to dynamically adapt to changing network conditions, optimize channel access, mitigate interference, and ensure deterministic latency. By intelligently predicting and managing interference, AI-MAC aims to provide a robust solution for next generation of Wi-Fi networks, enabling seamless connectivity and enhanced QoS. Our experimental results demonstrate that AI-MAC significantly reduces both interference and latency, paving the way for more reliable and efficient wireless communications in the increasingly crowded ISM band.