https://github.com/BNL-DAQ-LDRD/NeuralCompression_v2.
High-energy large-scale particle colliders produce data at high speed in the order of 1 terabytes per second in nuclear physics and petabytes per second in high-energy physics. Developing real-time data compression algorithms to reduce such data at high throughput to fit permanent storage has drawn increasing attention. Specifically, at the newly constructed sPHENIX experiment at the Relativistic Heavy Ion Collider (RHIC), a time projection chamber is used as the main tracking detector, which records particle trajectories in a volume of a three-dimensional (3D) cylinder. The resulting data are usually very sparse with occupancy around 10.8%. Such sparsity presents a challenge to conventional learning-free lossy compression algorithms, such as SZ, ZFP, and MGARD. The 3D convolutional neural network (CNN)-based approach, Bicephalous Convolutional Autoencoder (BCAE), outperforms traditional methods both in compression rate and reconstruction accuracy. BCAE can also utilize the computation power of graphical processing units suitable for deployment in a modern heterogeneous high-performance computing environment. This work introduces two BCAE variants: BCAE++ and BCAE-2D. BCAE++ achieves a 15% better compression ratio and a 77% better reconstruction accuracy measured in mean absolute error compared with BCAE. BCAE-2D treats the radial direction as the channel dimension of an image, resulting in a 3x speedup in compression throughput. In addition, we demonstrate an unbalanced autoencoder with a larger decoder can improve reconstruction accuracy without significantly sacrificing throughput. Lastly, we observe both the BCAE++ and BCAE-2D can benefit more from using half-precision mode in throughput (76-79% increase) without loss in reconstruction accuracy. The source code and links to data and pretrained models can be found at