Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiaqi Shang

Basis Transformers for Multi-Task Tabular Regression

Jun 07, 2025

Wei Min Loh, Jiaqi Shang, Pascal Poupart

Abstract:Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number of columns, and unseen data without metadata besides column names. We propose a novel architecture, \textit{basis transformers}, specifically designed to tackle these challenges while respecting inherent invariances in tabular data, including hierarchical structure and the representation of numeric values. We evaluate our design on a multi-task tabular regression benchmark, achieving an improvement of 0.338 in the median $R^2$ score and the lowest standard deviation across 34 tasks from the OpenML-CTR23 benchmark. Furthermore, our model has five times fewer parameters than the best-performing baseline and surpasses pretrained large language model baselines -- even when initialized from randomized weights.

Via

Access Paper or Ask Questions

Unraveling the geometry of visual relational reasoning

Feb 24, 2025

Jiaqi Shang, Gabriel Kreiman, Haim Sompolinsky

Figure 1 for Unraveling the geometry of visual relational reasoning

Figure 2 for Unraveling the geometry of visual relational reasoning

Figure 3 for Unraveling the geometry of visual relational reasoning

Figure 4 for Unraveling the geometry of visual relational reasoning

Abstract:Humans and other animals readily generalize abstract relations, such as recognizing constant in shape or color, whereas neural networks struggle. To investigate how neural networks generalize abstract relations, we introduce SimplifiedRPM, a novel benchmark for systematic evaluation. In parallel, we conduct human experiments to benchmark relational difficulty, enabling direct model-human comparisons. Testing four architectures--ResNet-50, Vision Transformer, Wild Relation Network, and Scattering Compositional Learner (SCL)--we find that SCL best aligns with human behavior and generalizes best. Building on a geometric theory of neural representations, we show representational geometries that predict generalization. Layer-wise analysis reveals distinct relational reasoning strategies across models and suggests a trade-off where unseen rule representations compress into training-shaped subspaces. Guided by our geometric perspective, we propose and evaluate SNRloss, a novel objective balancing representation geometry. Our findings offer geometric insights into how neural networks generalize abstract relations, paving the way for more human-like visual reasoning in AI.

* 47 pages, 7 figures, 8 SI figures, 2 SI tables

Via

Access Paper or Ask Questions

Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Nov 12, 2024

Binxu Wang, Jiaqi Shang, Haim Sompolinsky

Figure 1 for Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Figure 2 for Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Figure 3 for Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Figure 4 for Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Abstract:Humans excel at discovering regular structures from limited samples and applying inferred rules to novel settings. We investigate whether modern generative models can similarly learn underlying rules from finite samples and perform reasoning through conditional sampling. Inspired by Raven's Progressive Matrices task, we designed GenRAVEN dataset, where each sample consists of three rows, and one of 40 relational rules governing the object position, number, or attributes applies to all rows. We trained generative models to learn the data distribution, where samples are encoded as integer arrays to focus on rule learning. We compared two generative model families: diffusion (EDM, DiT, SiT) and autoregressive models (GPT2, Mamba). We evaluated their ability to generate structurally consistent samples and perform panel completion via unconditional and conditional sampling. We found diffusion models excel at unconditional generation, producing more novel and consistent samples from scratch and memorizing less, but performing less well in panel completion, even with advanced conditional sampling methods. Conversely, autoregressive models excel at completing missing panels in a rule-consistent manner but generate less consistent samples unconditionally. We observe diverse data scaling behaviors: for both model families, rule learning emerges at a certain dataset size - around 1000s examples per rule. With more training data, diffusion models improve both their unconditional and conditional generation capabilities. However, for autoregressive models, while panel completion improves with more training data, unconditional generation consistency declines. Our findings highlight complementary capabilities and limitations of diffusion and autoregressive models in rule learning and reasoning tasks, suggesting avenues for further research into their mechanisms and potential for human-like reasoning.

* 12 pages, 5 figures. Accepted to NeurIPS2024 Workshop on System 2 Reasoning At Scale as long paper

Via

Access Paper or Ask Questions

Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Oct 29, 2024

Yinyi Lai, Anbo Cao, Yuan Gao, Jiaqi Shang, Zongyu Li, Jia Guo

Figure 1 for Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Figure 2 for Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Figure 3 for Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Figure 4 for Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Abstract:Early and accurate diagnosis of brain tumors is crucial for improving patient survival rates. However, the detection and classification of brain tumors are challenging due to their diverse types and complex morphological characteristics. This study investigates the application of pre-trained models for brain tumor classification, with a particular focus on deploying the Mamba model. We fine-tuned several mainstream transfer learning models and applied them to the multi-class classification of brain tumors. By comparing these models to those trained from scratch, we demonstrated the significant advantages of transfer learning, especially in the medical imaging field, where annotated data is often limited. Notably, we introduced the Vision Mamba (Vim), a novel network architecture, and applied it for the first time in brain tumor classification, achieving exceptional classification accuracy. Experimental results indicate that the Vim model achieved 100% classification accuracy on an independent test set, emphasizing its potential for tumor classification tasks. These findings underscore the effectiveness of transfer learning in brain tumor classification and reveal that, compared to existing state-of-the-art models, the Vim model is lightweight, efficient, and highly accurate, offering a new perspective for clinical applications. Furthermore, the framework proposed in this study for brain tumor classification, based on transfer learning and the Vision Mamba model, is broadly applicable to other medical imaging classification problems.

Via

Access Paper or Ask Questions

**Deep-Sea A+: An Advanced Path Planning Method Integrating Enhanced A and Dynamic Window Approach for Autonomous Underwater Vehicles**

Oct 22, 2024

Yinyi Lai, Jiaqi Shang, Zenghui Liu, Zheyu Jiang, Yuyang Li, Longchao Chen

Figure 1 for Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles

Figure 2 for Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles

Figure 3 for Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles

Figure 4 for Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles

Abstract:As terrestrial resources become increasingly depleted, the demand for deep-sea resource exploration has intensified. However, the extreme conditions in the deep-sea environment pose significant challenges for underwater operations, necessitating the development of robust detection robots. In this paper, we propose an advanced path planning methodology that integrates an improved A* algorithm with the Dynamic Window Approach (DWA). By optimizing the search direction of the traditional A* algorithm and introducing an enhanced evaluation function, our improved A* algorithm accelerates path searching and reduces computational load. Additionally, the path-smoothing process has been refined to improve continuity and smoothness, minimizing sharp turns. This method also integrates global path planning with local dynamic obstacle avoidance via DWA, improving the real-time response of underwater robots in dynamic environments. Simulation results demonstrate that our proposed method surpasses the traditional A* algorithm in terms of path smoothness, obstacle avoidance, and real-time performance. The robustness of this approach in complex environments with both static and dynamic obstacles highlights its potential in autonomous underwater vehicle (AUV) navigation and obstacle avoidance.

* Accepted by 2024 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE 2024)

Via

Access Paper or Ask Questions

Jiaqi Shang

Basis Transformers for Multi-Task Tabular Regression

Unraveling the geometry of visual relational reasoning

Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules

Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning

Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles

**Deep-Sea A+: An Advanced Path Planning Method Integrating Enhanced A and Dynamic Window Approach for Autonomous Underwater Vehicles**