Daniel Ritter

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Apr 14, 2025

Junxiong Wang, Wen-Ding Li, Daniele Paliotta, Daniel Ritter, Alexander M. Rush, Tri Dao

Abstract: Effective reasoning is crucial to solving complex mathematical problems. Recent large language models (LLMs) have boosted performance by scaling test-time computation through long chain-of-thought reasoning. However, transformer-based models are inherently limited in extending context length due to their quadratic computational complexity and linear memory requirements. In this paper, we introduce a novel hybrid linear RNN reasoning model, M1, built on the Mamba architecture, which allows memory-efficient inference. Our approach leverages a distillation process from existing reasoning models and is further enhanced through RL training. Experimental results on the AIME and MATH benchmarks show that M1 not only outperforms previous linear RNN models but also matches the performance of state-of-the-art DeepSeek R1 distilled reasoning models at a similar scale. We also compare our generation speed with a highly performant general-purpose inference engine, vLLM, and observe more than a 3x speedup compared to a transformer of the same size. With this throughput speedup, we are able to achieve higher accuracy than DeepSeek R1 distilled transformer reasoning models under a fixed generation time budget using self-consistency voting. Overall, we introduce a hybrid Mamba reasoning model and provide a more effective approach to scaling test-time generation using self-consistency or long chain-of-thought reasoning.
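
The self-consistency voting mentioned in the abstract can be illustrated with a minimal sketch: sample several final answers for the same problem and keep the most frequent one. The function name `self_consistency_vote` and the sample answers are hypothetical and are not taken from the M1 code base; how answers are sampled and extracted from generations is assumed to happen elsewhere.

```python
from collections import Counter


def self_consistency_vote(answers):
    """Return the most frequent final answer among sampled generations.

    `answers` is a list of already-extracted final answers (strings),
    one per sampled chain of thought. This is an illustrative helper,
    not the paper's implementation.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common sorts by count; ties keep first-seen order.
    answer, _ = counts.most_common(1)[0]
    return answer


# Hypothetical final answers from five sampled generations.
samples = ["42", "41", "42", "17", "42"]
print(self_consistency_vote(samples))
```

Under a fixed generation time budget, a model with higher throughput can draw more samples before voting, which is the setting the abstract describes.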

* Code is available at https://github.com/jxiw/M1

Learning Finite Linear Temporal Logic Specifications with a Specialized Neural Operator

Nov 21, 2021

Homer Walke, Daniel Ritter, Carl Trimbach, Michael Littman

Abstract: Finite linear temporal logic ($\mathsf{LTL}_f$) is a powerful formal representation for modeling temporal sequences. We address the problem of learning a compact $\mathsf{LTL}_f$ formula from labeled traces of system behavior. We propose a novel neural network operator and evaluate the resulting architecture, Neural$\mathsf{LTL}_f$. Our approach includes a specialized recurrent filter, designed to subsume $\mathsf{LTL}_f$ temporal operators, to learn a highly accurate classifier for traces. Then, it discretizes the activations and extracts the truth table represented by the learned weights. This truth table is converted to symbolic form and returned as the learned formula. Experiments on randomly generated $\mathsf{LTL}_f$ formulas show Neural$\mathsf{LTL}_f$ scales to larger formula sizes than existing approaches and maintains high accuracy even in the presence of noise.
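
To make concrete what it means for a finite trace to satisfy an $\mathsf{LTL}_f$ formula, here is a small sketch of a recursive evaluator using the standard finite-trace semantics. It only checks a given formula against traces; it is not the neural learning procedure from the paper. The tuple encoding of formulas and the atom names `req` and `grant` are assumptions made for illustration.

```python
def holds(formula, trace, i=0):
    """Check whether `formula` holds at position `i` of a finite trace.

    `trace` is a non-empty list of sets of atomic propositions.
    Formulas are nested tuples, e.g. ("always", ("atom", "p")).
    """
    op = formula[0]
    n = len(trace)
    if op == "atom":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "next":
        # Strong next: a successor position must exist.
        return i + 1 < n and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, n))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, n))
    if op == "until":
        # formula[1] must hold at every step until formula[2] holds.
        for j in range(i, n):
            if holds(formula[2], trace, j):
                return True
            if not holds(formula[1], trace, j):
                return False
        return False
    raise ValueError(f"unknown operator: {op}")


# "Every request is eventually granted": G(req -> F grant).
spec = ("always", ("or", ("not", ("atom", "req")),
                   ("eventually", ("atom", "grant"))))

positive = [{"req"}, set(), {"grant"}]
negative = [{"req"}, set()]
print(holds(spec, positive), holds(spec, negative))
```

A learner in this setting would search for a formula that evaluates to true on the positively labeled traces and false on the negatively labeled ones.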

* 10 pages, 5 figures

Formalizing Integration Patterns with Multimedia Data (Extended Version)

Sep 09, 2020

Marco Montali, Andrey Rivkin, Daniel Ritter

Abstract: Previous work on formalizing enterprise application integration (EAI) scenarios revealed an emerging need for formal foundations for integration patterns, the building blocks of EAI, in order to facilitate model-driven development and ensure its correctness. So far, formalization requirements have focused on more "conventional" integration scenarios, which consider control flow, transactional persistent data, and time aspects. However, none of these works considered another emerging EAI trend: social and multimedia computing. In this work, we propose a Petri net-based formalism that addresses requirements arising from the multimedia domain. We also demonstrate realizations of one of the most frequently used multimedia patterns and discuss the implications of our formal proposal for multimedia EAI development.
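
For readers unfamiliar with Petri nets, the following sketch shows the basic firing rule of a plain place/transition net: a transition is enabled when its input places hold enough tokens, and firing moves tokens from input to output places. The formalism proposed in the paper is presumably richer than this; the place names `inbox` and `resized` and the transition `resize_image` are hypothetical examples, not taken from the paper.

```python
def enabled(marking, transition):
    """A transition is enabled if every input place has enough tokens."""
    return all(marking.get(place, 0) >= count
               for place, count in transition["consume"].items())


def fire(marking, transition):
    """Return the new marking after firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, count in transition["consume"].items():
        new_marking[place] = new_marking.get(place, 0) - count
    for place, count in transition["produce"].items():
        new_marking[place] = new_marking.get(place, 0) + count
    return new_marking


# Hypothetical pattern step: take one message and emit a resized image.
resize_image = {"consume": {"inbox": 1}, "produce": {"resized": 1}}

marking = {"inbox": 2}
marking = fire(marking, resize_image)
print(marking)
```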

A Logic Programming Approach to Integration Network Inference

Jan 08, 2013

Daniel Ritter

Abstract: The discovery, representation, and reconstruction of (technical) integration networks from Network Mining (NM) raw data is a difficult problem for enterprises. This is due to large and complex IT landscapes within and across enterprise boundaries, heterogeneous technology stacks, and fragmented data. To remain competitive, enterprises need visibility into their own and their partners' IT networks on different, interrelated abstraction levels. We present an approach to represent and reconstruct the integration networks from NM raw data using logic programming based on first-order logic. The raw data, expressed as an integration network model, is represented as facts, to which rules are applied to reconstruct the network. We have built a system that applies this approach to real-world enterprise landscapes, and we report on our experience with this system.
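
The idea of representing raw data as facts and applying rules to reconstruct a network can be sketched with a tiny bottom-up (forward-chaining) evaluation. The `sends` facts, the derived `connected` relation, and the system names are hypothetical; the abstract does not specify the paper's actual model or rules, so this only illustrates the general technique.

```python
def reconstruct_links(sends):
    """Derive all (source, target) pairs connected via message flows.

    Facts: sends(X, Y). Rules, in logic-programming style:
        connected(X, Y) :- sends(X, Y).
        connected(X, Z) :- connected(X, Y), sends(Y, Z).
    The loop applies the rules until no new fact is derived.
    """
    connected = set(sends)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(connected):
            for (y2, z) in sends:
                if y == y2 and (x, z) not in connected:
                    connected.add((x, z))
                    changed = True
    return connected


# Hypothetical observed message flows between systems.
facts = {("crm", "middleware"), ("middleware", "erp")}
print(sorted(reconstruct_links(facts)))
```

In a logic programming system the two rules would be written declaratively and the engine would perform this fixpoint computation itself.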

* 15 pages, The 26th Workshop on Logic Programming (WLP), Bonn, 2012
