Abstract: This work introduces StageSAT, a new approach to floating-point satisfiability that bridges SMT solving with numerical optimization. StageSAT reframes a floating-point formula as a series of optimization problems across three stages of increasing precision: it begins with a fast, projection-aided descent objective that guides the search toward a feasible region, proceeds to bit-level accuracy with ULP$^2$ optimization, and finishes with an $n$-ULP lattice refinement. By construction, the final stage uses a representing function that is zero if and only if a candidate satisfies all constraints; thus, when optimization drives the objective to zero, the resulting assignment is a valid solution, providing a built-in soundness guarantee. To improve the search, StageSAT introduces a partial monotone descent property on linear constraints via orthogonal projection, preventing the optimizer from stalling on flat or misleading landscapes. Critically, the solver requires no heavy bit-level reasoning or specialized abstractions; it treats complex arithmetic as a black box, using runtime evaluations to navigate the input space. We implement StageSAT and evaluate it on extensive benchmarks, including SMT-COMP'25 suites and difficult cases from prior work. StageSAT proved more scalable and accurate than state-of-the-art optimization-based alternatives: it solved strictly more formulas than any competing solver under the same time budget and found most satisfiable instances without producing spurious models, amounting to 99.4% recall on satisfiable cases with 0% false SAT and exceeding the reliability of prior optimization-based solvers. StageSAT also delivered significant speedups (often 5--10$\times$) over traditional bit-precise SMT and numeric solvers. These results demonstrate that staged optimization significantly improves both the performance and the correctness of floating-point satisfiability solving.
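As a rough illustration of the representing-function idea described above (a minimal sketch, not StageSAT's implementation; the example formula, the penalty functions, and the use of SciPy's Nelder-Mead optimizer are illustrative assumptions):

```python
# Minimal sketch of the representing-function idea (not the StageSAT
# implementation): the objective is zero if and only if every constraint of a
# hypothetical formula holds, so a zero-valued minimum is a sound model.
import numpy as np
from scipy.optimize import minimize

def penalty_le(lhs, rhs):
    # Zero iff lhs <= rhs; otherwise strictly positive by the amount of violation.
    return max(float(lhs) - float(rhs), 0.0)

def representing_function(xs):
    # Hypothetical formula: 1.5 <= x AND x * x <= 4.0, evaluated in float32 so
    # the objective reflects the actual floating-point semantics of the formula.
    x = np.float32(xs[0])
    return penalty_le(np.float32(1.5), x) + penalty_le(x * x, np.float32(4.0))

# The formula is treated as a black box: only runtime evaluations guide the search.
res = minimize(representing_function, x0=[0.0], method="Nelder-Mead")
model = np.float32(res.x[0])
if representing_function([model]) == 0.0:
    print("SAT, model x =", model)   # objective reached zero => valid solution
else:
    print("unknown")                 # zero not reached within the budget
```

Because the objective is exactly zero only when every constraint holds under float32 semantics, any zero-valued result can be reported as satisfiable without further checking, which mirrors the soundness argument in the abstract.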
Abstract: Reducing the number of bits needed to encode the weights and activations of neural networks is highly desirable, as it speeds up training and inference while reducing memory consumption. For these reasons, research in this area has attracted significant attention, with approaches such as mixed-precision training developed to leverage lower-precision computing. Interestingly, none of the existing approaches has investigated pure 16-bit floating-point settings. In this paper, we shed light on the overlooked efficiency of pure 16-bit floating-point neural networks. We provide a comprehensive theoretical analysis of the factors contributing to the differences observed between 16-bit and 32-bit models, formalizing the concepts of floating-point error and tolerance, which enables us to quantitatively explain the conditions under which a 16-bit model can closely approximate the results of its 32-bit counterpart. This theoretical exploration offers a perspective distinct from the literature, which attributes the success of low-precision neural networks to their regularization effect. The analysis is supported by an extensive series of experiments. Our findings demonstrate that pure 16-bit floating-point neural networks can achieve similar or even better performance than their mixed-precision and 32-bit counterparts. We believe the results presented in this paper have significant implications for machine learning practitioners, offering an opportunity to reconsider using pure 16-bit networks in various applications.
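To make the notions of floating-point error and tolerance concrete (an informal sketch of my own; the dot-product stand-in for a neuron's pre-activation and the specific sizes are assumptions, not the paper's analysis or experiments):

```python
# Illustrative only: compare float16 and float32 evaluations of the same dot
# product against a float64 reference. Machine epsilon bounds the relative
# rounding error of each individual operation in the given precision.
import numpy as np

print("float16 eps:", np.finfo(np.float16).eps)   # about 9.8e-04
print("float32 eps:", np.finfo(np.float32).eps)   # about 1.2e-07

rng = np.random.default_rng(0)
w = rng.standard_normal(1024)          # hypothetical weight vector (float64)
x = rng.standard_normal(1024)          # hypothetical input activation (float64)

ref   = np.dot(w, x)                                          # float64 reference
out32 = np.dot(w.astype(np.float32), x.astype(np.float32))    # 32-bit evaluation
out16 = np.dot(w.astype(np.float16), x.astype(np.float16))    # pure 16-bit evaluation

print(f"relative error, float32: {abs(out32 - ref) / abs(ref):.2e}")
print(f"relative error, float16: {abs(out16 - ref) / abs(ref):.2e}")
```

If the 16-bit error stays within the task's tolerance (for instance, it never flips the predicted class), the pure 16-bit model reproduces the decisions of its 32-bit counterpart; this is the kind of condition the error-and-tolerance analysis in the abstract is said to quantify.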
Abstract: Lowering the precision of neural networks from the prevalent 32-bit format has long been considered harmful to performance, despite the gains in space and time. Many works propose techniques for implementing half-precision neural networks, but none study pure 16-bit settings. This paper investigates the unexpected performance gain of pure 16-bit neural networks over 32-bit networks in classification tasks. We present extensive experimental results that favorably compare the performance of various 16-bit neural networks to that of their 32-bit counterparts. In addition, we provide a theoretical analysis of the efficiency of 16-bit models, coupled with empirical evidence that supports it. Finally, we discuss situations in which low-precision training is indeed detrimental.
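As a minimal illustration of what a pure 16-bit setting means in practice (a toy sketch of my own, not the paper's experimental setup; the logistic-regression task, hyperparameters, and NumPy-based training loop are assumptions):

```python
# Toy sketch: train a logistic-regression classifier with every tensor kept in
# a single dtype, so the float16 run is a "pure 16-bit" setting rather than a
# mixed-precision one, and compare it to the same run in float32.
import numpy as np

def train(dtype, steps=100, lr=0.1):
    rng = np.random.default_rng(0)
    X = rng.standard_normal((256, 8)).astype(dtype)    # toy inputs
    y = (X[:, 0] > 0).astype(dtype)                    # toy binary labels
    w = np.zeros(8, dtype=dtype)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))             # forward pass, kept in dtype
        grad = X.T @ (p - y) / dtype(len(y))           # gradient, kept in dtype
        w = w - dtype(lr) * grad                       # update, kept in dtype
    p = 1.0 / (1.0 + np.exp(-(X @ w)))                 # final predictions
    return np.mean((p > 0.5) == (y > 0.5))             # training accuracy

print("pure float16 accuracy:", train(np.float16))
print("float32 accuracy:     ", train(np.float32))
```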