Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiang Jiang

DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models

May 13, 2025

Xiaoyang Chen, Xinan Dai, Yu Du, Qian Feng, Naixu Guo, Tingshuo Gu, Yuting Gao, Yingyi Gao, Xudong Han, Xiang Jiang(+21 more)

Abstract:To advance the mathematical proficiency of large language models (LLMs), the DeepMath team has launched an open-source initiative aimed at developing an open mathematical LLM and systematically evaluating its mathematical creativity. This paper represents the initial contribution of this initiative. While recent developments in mathematical LLMs have predominantly emphasized reasoning skills, as evidenced by benchmarks on elementary to undergraduate-level mathematical tasks, the creative capabilities of these models have received comparatively little attention, and evaluation datasets remain scarce. To address this gap, we propose an evaluation criteria for mathematical creativity and introduce DeepMath-Creative, a novel, high-quality benchmark comprising constructive problems across algebra, geometry, analysis, and other domains. We conduct a systematic evaluation of mainstream LLMs' creative problem-solving abilities using this dataset. Experimental results show that even under lenient scoring criteria -- emphasizing core solution components and disregarding minor inaccuracies, such as small logical gaps, incomplete justifications, or redundant explanations -- the best-performing model, O3 Mini, achieves merely 70% accuracy, primarily on basic undergraduate-level constructive tasks. Performance declines sharply on more complex problems, with models failing to provide substantive strategies for open problems. These findings suggest that, although current LLMs display a degree of constructive proficiency on familiar and lower-difficulty problems, such performance is likely attributable to the recombination of memorized patterns rather than authentic creative insight or novel synthesis.

* 14 pages, 4 figures

Via

Access Paper or Ask Questions

NeoQA: Evidence-based Question Answering with Generated News Events

May 09, 2025

Max Glockner, Xiang Jiang, Leonardo F. R. Ribeiro, Iryna Gurevych, Markus Dreyer

Abstract:Evaluating Retrieval-Augmented Generation (RAG) in large language models (LLMs) is challenging because benchmarks can quickly become stale. Questions initially requiring retrieval may become answerable from pretraining knowledge as newer models incorporate more recent information during pretraining, making it difficult to distinguish evidence-based reasoning from recall. We introduce NeoQA (News Events for Out-of-training Question Answering), a benchmark designed to address this issue. To construct NeoQA, we generated timelines and knowledge bases of fictional news events and entities along with news articles and Q\&A pairs to prevent LLMs from leveraging pretraining knowledge, ensuring that no prior evidence exists in their training data. We propose our dataset as a new platform for evaluating evidence-based question answering, as it requires LLMs to generate responses exclusively from retrieved evidence and only when sufficient evidence is available. NeoQA enables controlled evaluation across various evidence scenarios, including cases with missing or misleading details. Our findings indicate that LLMs struggle to distinguish subtle mismatches between questions and evidence, and suffer from short-cut reasoning when key information required to answer a question is missing from the evidence, underscoring key limitations in evidence-based reasoning.

Via

Access Paper or Ask Questions

Fuzzy Cluster-Aware Contrastive Clustering for Time Series

Mar 28, 2025

Congyu Wang, Mingjing Du, Xiang Jiang, Yongquan Dong

Abstract:The rapid growth of unlabeled time series data, driven by the Internet of Things (IoT), poses significant challenges in uncovering underlying patterns. Traditional unsupervised clustering methods often fail to capture the complex nature of time series data. Recent deep learning-based clustering approaches, while effective, struggle with insufficient representation learning and the integration of clustering objectives. To address these issues, we propose a fuzzy cluster-aware contrastive clustering framework (FCACC) that jointly optimizes representation learning and clustering. Our approach introduces a novel three-view data augmentation strategy to enhance feature extraction by leveraging various characteristics of time series data. Additionally, we propose a cluster-aware hard negative sample generation mechanism that dynamically constructs high-quality negative samples using clustering structure information, thereby improving the model's discriminative ability. By leveraging fuzzy clustering, FCACC dynamically generates cluster structures to guide the contrastive learning process, resulting in more accurate clustering. Extensive experiments on 40 benchmark datasets show that FCACC outperforms the selected baseline methods (eight in total), providing an effective solution for unsupervised time series learning.

Via

Access Paper or Ask Questions

Hypothesis Disparity Regularized Mutual Information Maximization

Dec 15, 2020

Qicheng Lao, Xiang Jiang, Mohammad Havaei

Figure 1 for Hypothesis Disparity Regularized Mutual Information Maximization

Figure 2 for Hypothesis Disparity Regularized Mutual Information Maximization

Figure 3 for Hypothesis Disparity Regularized Mutual Information Maximization

Figure 4 for Hypothesis Disparity Regularized Mutual Information Maximization

Abstract:We propose a hypothesis disparity regularized mutual information maximization~(HDMI) approach to tackle unsupervised hypothesis transfer -- as an effort towards unifying hypothesis transfer learning (HTL) and unsupervised domain adaptation (UDA) -- where the knowledge from a source domain is transferred solely through hypotheses and adapted to the target domain in an unsupervised manner. In contrast to the prevalent HTL and UDA approaches that typically use a single hypothesis, HDMI employs multiple hypotheses to leverage the underlying distributions of the source and target hypotheses. To better utilize the crucial relationship among different hypotheses -- as opposed to unconstrained optimization of each hypothesis independently -- while adapting to the unlabeled target domain through mutual information maximization, HDMI incorporates a hypothesis disparity regularization that coordinates the target hypotheses jointly learn better target representations while preserving more transferable source knowledge with better-calibrated prediction uncertainty. HDMI achieves state-of-the-art adaptation performance on benchmark datasets for UDA in the context of HTL, without the need to access the source data during the adaptation.

* Accepted to AAAI 2021

Via

Access Paper or Ask Questions

Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation

Jun 09, 2020

Xiang Jiang, Qicheng Lao, Stan Matwin, Mohammad Havaei

Figure 1 for Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation

Figure 2 for Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation

Figure 3 for Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation

Figure 4 for Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation

Abstract:We present an approach for unsupervised domain adaptation---with a strong focus on practical considerations of within-domain class imbalance and between-domain class distribution shift---from a class-conditioned domain alignment perspective. Current methods for class-conditioned domain alignment aim to explicitly minimize a loss function based on pseudo-label estimations of the target domain. However, these methods suffer from pseudo-label bias in the form of error accumulation. We propose a method that removes the need for explicit optimization of model parameters from pseudo-labels directly. Instead, we present a sampling-based implicit alignment approach, where the sample selection procedure is implicitly guided by the pseudo-labels. Theoretical analysis reveals the existence of a domain-discriminator shortcut in misaligned classes, which is addressed by the proposed implicit alignment approach to facilitate domain-adversarial learning. Empirical results and ablation studies confirm the effectiveness of the proposed approach, especially in the presence of within-domain class imbalance and between-domain class distribution shift.

* Accepted at ICML2020. For code, see https://github.com/xiangdal/implicit_alignment

Via

Access Paper or Ask Questions

Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay

Mar 09, 2020

Qicheng Lao, Xiang Jiang, Mohammad Havaei, Yoshua Bengio

Figure 1 for Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay

Figure 2 for Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay

Figure 3 for Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay

Figure 4 for Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay

Abstract:Learning in non-stationary environments is one of the biggest challenges in machine learning. Non-stationarity can be caused by either task drift, i.e., the drift in the conditional distribution of labels given the input data, or the domain drift, i.e., the drift in the marginal distribution of the input data. This paper aims to tackle this challenge in the context of continuous domain adaptation, where the model is required to learn new tasks adapted to new domains in a non-stationary environment while maintaining previously learned knowledge. To deal with both drifts, we propose variational domain-agnostic feature replay, an approach that is composed of three components: an inference module that filters the input data into domain-agnostic representations, a generative module that facilitates knowledge transfer, and a solver module that applies the filtered and transferable knowledge to solve the queries. We address the two fundamental scenarios in continuous domain adaptation, demonstrating the effectiveness of our proposed approach for practical usage.

Via

Access Paper or Ask Questions

On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

Jun 03, 2018

Xiang Jiang, Mohammad Havaei, Gabriel Chartrand, Hassan Chouaib, Thomas Vincent, Andrew Jesson, Nicolas Chapados, Stan Matwin

Figure 1 for On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

Figure 2 for On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

Figure 3 for On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

Figure 4 for On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

Abstract:Current deep learning based text classification methods are limited by their ability to achieve fast learning and generalization when the data is scarce. We address this problem by integrating a meta-learning procedure that uses the knowledge learned across many tasks as an inductive bias towards better natural language understanding. Based on the Model-Agnostic Meta-Learning framework (MAML), we introduce the Attentive Task-Agnostic Meta-Learning (ATAML) algorithm for text classification. The essential difference between MAML and ATAML is in the separation of task-agnostic representation learning and task-specific attentive adaptation. The proposed ATAML is designed to encourage task-agnostic representation learning by way of task-agnostic parameterization and facilitate task-specific adaptation via attention mechanisms. We provide evidence to show that the attention mechanism in ATAML has a synergistic effect on learning performance. In comparisons with models trained from random initialization, pretrained models and meta trained MAML, our proposed ATAML method generalizes better on single-label and multi-label classification tasks in miniRCV1 and miniReuters-21578 datasets.

* 13 pages, 4 figures, submitted to NIPS

Via

Access Paper or Ask Questions

TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks

Aug 30, 2017

Xiang Jiang, Erico N de Souza, Ahmad Pesaranghader, Baifan Hu, Daniel L. Silver, Stan Matwin

Figure 1 for TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks

Figure 2 for TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks

Figure 3 for TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks

Figure 4 for TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks

Abstract:Understanding and discovering knowledge from GPS (Global Positioning System) traces of human activities is an essential topic in mobility-based urban computing. We propose TrajectoryNet-a neural network architecture for point-based trajectory classification to infer real world human transportation modes from GPS traces. To overcome the challenge of capturing the underlying latent factors in the low-dimensional and heterogeneous feature space imposed by GPS data, we develop a novel representation that embeds the original feature space into another space that can be understood as a form of basis expansion. We also enrich the feature space via segment-based information and use Maxout activations to improve the predictive power of Recurrent Neural Networks (RNNs). We achieve over 98% classification accuracy when detecting four types of transportation modes, outperforming existing models without additional sensory data or location-based prior knowledge.

Via

Access Paper or Ask Questions

Camera Fingerprint: A New Perspective for Identifying User's Identity

Oct 25, 2016

Xiang Jiang, Shikui Wei, Ruizhen Zhao, Yao Zhao, Xindong Wu

Figure 1 for Camera Fingerprint: A New Perspective for Identifying User's Identity

Figure 2 for Camera Fingerprint: A New Perspective for Identifying User's Identity

Figure 3 for Camera Fingerprint: A New Perspective for Identifying User's Identity

Figure 4 for Camera Fingerprint: A New Perspective for Identifying User's Identity

Abstract:Identifying user's identity is a key problem in many data mining applications, such as product recommendation, customized content delivery and criminal identification. Given a set of accounts from the same or different social network platforms, user identification attempts to identify all accounts belonging to the same person. A commonly used solution is to build the relationship among different accounts by exploring their collective patterns, e.g., user profile, writing style, similar comments. However, this kind of method doesn't work well in many practical scenarios, since the information posted explicitly by users may be false due to various reasons. In this paper, we re-inspect the user identification problem from a novel perspective, i.e., identifying user's identity by matching his/her cameras. The underlying assumption is that multiple accounts belonging to the same person contain the same or similar camera fingerprint information. The proposed framework, called User Camera Identification (UCI), is based on camera fingerprints, which takes fully into account the problems of multiple cameras and reposting behaviors.

* 12 pages, 7 figures

Via

Access Paper or Ask Questions