Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xingcheng Xu

DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents

Mar 13, 2025

Haixing Gong, Hui Zou, Xingzhou Liang, Shiyuan Meng, Pinlong Cai, Xingcheng Xu, Jingjing Qu

Abstract:In the rapidly evolving field of artificial intelligence (AI), mapping innovation patterns and understanding effective technology transfer from research to applications are essential for economic growth. However, existing data infrastructures suffer from fragmentation, incomplete coverage, and insufficient evaluative capacity. Here, we present DeepInnovationAI, a comprehensive global dataset containing three structured files. DeepPatentAI.csv: Contains 2,356,204 patent records with 8 field-specific attributes. DeepDiveAI.csv: Encompasses 3,511,929 academic publications with 13 metadata fields. These two datasets leverage large language models, multilingual text analysis and dual-layer BERT classifiers to accurately identify AI-related content, while utilizing hypergraph analysis to create robust innovation metrics. Additionally, DeepCosineAI.csv: By applying semantic vector proximity analysis, this file presents approximately one hundred million calculated paper-patent similarity pairs to enhance understanding of how theoretical advancements translate into commercial technologies. DeepInnovationAI enables researchers, policymakers, and industry leaders to anticipate trends and identify collaboration opportunities. With extensive temporal and geographical scope, it supports detailed analysis of technological development patterns and international competition dynamics, establishing a foundation for modeling AI innovation and technology transfer processes.

* 32 pages and 8 figures

Via

Access Paper or Ask Questions

Optimal Trade and Industrial Policies in the Global Economy: A Deep Learning Framework

Jul 25, 2024

Zi Wang, Xingcheng Xu, Yanqing Yang, Xiaodong Zhu

Figure 1 for Optimal Trade and Industrial Policies in the Global Economy: A Deep Learning Framework

Figure 2 for Optimal Trade and Industrial Policies in the Global Economy: A Deep Learning Framework

Figure 3 for Optimal Trade and Industrial Policies in the Global Economy: A Deep Learning Framework

Figure 4 for Optimal Trade and Industrial Policies in the Global Economy: A Deep Learning Framework

Abstract:We propose a deep learning framework, DL-opt, designed to efficiently solve for optimal policies in quantifiable general equilibrium trade models. DL-opt integrates (i) a nested fixed point (NFXP) formulation of the optimization problem, (ii) automatic implicit differentiation to enhance gradient descent for solving unilateral optimal policies, and (iii) a best-response dynamics approach for finding Nash equilibria. Utilizing DL-opt, we solve for non-cooperative tariffs and industrial subsidies across 7 economies and 44 sectors, incorporating sectoral external economies of scale. Our quantitative analysis reveals significant sectoral heterogeneity in Nash policies: Nash industrial subsidies increase with scale elasticities, whereas Nash tariffs decrease with trade elasticities. Moreover, we show that global dual competition, involving both tariffs and industrial subsidies, results in lower tariffs and higher welfare outcomes compared to a global tariff war. These findings highlight the importance of considering sectoral heterogeneity and policy combinations in understanding global economic competition.

Via

Access Paper or Ask Questions

Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Jul 25, 2024

Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang

Figure 1 for Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Figure 2 for Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Figure 3 for Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Figure 4 for Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

Abstract:Large language models (LLMs) have demonstrated impressive versatility across numerous tasks, yet their generalization capabilities remain poorly understood. To investigate these behaviors, arithmetic tasks serve as important venues. In previous studies, seemingly unrelated mysteries still exist -- (1) models with appropriate positional embeddings can correctly perform longer unseen arithmetic operations such as addition, but their effectiveness varies in more complex tasks like multiplication; (2) models perform well for longer unseen cases in modular addition under specific moduli (e.g., modulo 100) but struggle under very close moduli (e.g., modulo 101), regardless of the positional encoding used. We believe previous studies have been treating the symptoms rather than addressing the root cause -- they have paid excessive attention to improving model components, while overlooking the differences in task properties that may be the real drivers. This is confirmed by our unified theoretical framework for different arithmetic scenarios. For example, unlike multiplication, the digital addition task has the property of translation invariance which naturally aligns with the relative positional encoding, and this combination leads to successful generalization of addition to unseen longer domains. The discrepancy in operations modulo 100 and 101 arises from the base. Modulo 100, unlike 101, is compatible with the decimal system (base 10), such that unseen information in digits beyond the units digit and the tens digit is actually not needed for the task. Extensive experiments with GPT-like models validate our theoretical predictions. These findings deepen our understanding of the generalization mechanisms, and facilitate more data-efficient model training and objective-oriented AI alignment.

Via

Access Paper or Ask Questions

Machine Learning for Economic Forecasting: An Application to China's GDP Growth

Jul 04, 2024

Yanqing Yang, Xingcheng Xu, Jinfeng Ge, Yan Xu

Figure 1 for Machine Learning for Economic Forecasting: An Application to China's GDP Growth

Figure 2 for Machine Learning for Economic Forecasting: An Application to China's GDP Growth

Figure 3 for Machine Learning for Economic Forecasting: An Application to China's GDP Growth

Figure 4 for Machine Learning for Economic Forecasting: An Application to China's GDP Growth

Abstract:This paper aims to explore the application of machine learning in forecasting Chinese macroeconomic variables. Specifically, it employs various machine learning models to predict the quarterly real GDP growth of China, and analyzes the factors contributing to the performance differences among these models. Our findings indicate that the average forecast errors of machine learning models are generally lower than those of traditional econometric models or expert forecasts, particularly in periods of economic stability. However, during certain inflection points, although machine learning models still outperform traditional econometric models, expert forecasts may exhibit greater accuracy in some instances due to experts' more comprehensive understanding of the macroeconomic environment and real-time economic variables. In addition to macroeconomic forecasting, this paper employs interpretable machine learning methods to identify the key attributive variables from different machine learning models, aiming to enhance the understanding and evaluation of their contributions to macroeconomic fluctuations.

Via

Access Paper or Ask Questions

Large Language Models at Work in China's Labor Market

Aug 17, 2023

Qin Chen, Jinfeng Ge, Huaqing Xie, Xingcheng Xu, Yanqing Yang

Figure 1 for Large Language Models at Work in China's Labor Market

Figure 2 for Large Language Models at Work in China's Labor Market

Figure 3 for Large Language Models at Work in China's Labor Market

Figure 4 for Large Language Models at Work in China's Labor Market

Abstract:This paper explores the potential impacts of large language models (LLMs) on the Chinese labor market. We analyze occupational exposure to LLM capabilities by incorporating human expertise and LLM classifications, following Eloundou et al. (2023)'s methodology. We then aggregate occupation exposure to the industry level to obtain industry exposure scores. The results indicate a positive correlation between occupation exposure and wage levels/experience premiums, suggesting higher-paying and experience-intensive jobs may face greater displacement risks from LLM-powered software. The industry exposure scores align with expert assessments and economic intuitions. We also develop an economic growth model incorporating industry exposure to quantify the productivity-employment trade-off from AI adoption. Overall, this study provides an analytical basis for understanding the labor market impacts of increasingly capable AI systems in China. Key innovations include the occupation-level exposure analysis, industry aggregation approach, and economic modeling incorporating AI adoption and labor market effects. The findings will inform policymakers and businesses on strategies for maximizing the benefits of AI while mitigating adverse disruption risks.

Via

Access Paper or Ask Questions

It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models

Aug 16, 2023

Xingcheng Xu, Zihao Pan, Haipeng Zhang, Yanqing Yang

Figure 1 for It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models

Figure 2 for It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models

Figure 3 for It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models

Figure 4 for It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models

Abstract:Generative Transformer-based models have achieved remarkable proficiency on solving diverse problems. However, their generalization ability is not fully understood and not always satisfying. Researchers take basic mathematical tasks like n-digit addition or multiplication as important perspectives for investigating their generalization behaviors. Curiously, it is observed that when training on n-digit operations (e.g., additions) in which both input operands are n-digit in length, models generalize successfully on unseen n-digit inputs (in-distribution (ID) generalization), but fail miserably and mysteriously on longer, unseen cases (out-of-distribution (OOD) generalization). Studies try to bridge this gap with workarounds such as modifying position embedding, fine-tuning, and priming with more extensive or instructive data. However, without addressing the essential mechanism, there is hardly any guarantee regarding the robustness of these solutions. We bring this unexplained performance drop into attention and ask whether it is purely from random errors. Here we turn to the mechanistic line of research which has notable successes in model interpretability. We discover that the strong ID generalization stems from structured representations, while behind the unsatisfying OOD performance, the models still exhibit clear learned algebraic structures. Specifically, these models map unseen OOD inputs to outputs with equivalence relations in the ID domain. These highlight the potential of the models to carry useful information for improved generalization.

Via

Access Paper or Ask Questions

Deep Generative Modeling with Backward Stochastic Differential Equations

Apr 08, 2023

Xingcheng Xu

Abstract:This paper proposes a novel deep generative model, called BSDE-Gen, which combines the flexibility of backward stochastic differential equations (BSDEs) with the power of deep neural networks for generating high-dimensional complex target data, particularly in the field of image generation. The incorporation of stochasticity and uncertainty in the generative modeling process makes BSDE-Gen an effective and natural approach for generating high-dimensional data. The paper provides a theoretical framework for BSDE-Gen, describes its model architecture, presents the maximum mean discrepancy (MMD) loss function used for training, and reports experimental results.

* 17 pages, 5 figures

Via

Access Paper or Ask Questions

GAMMT: Generative Ambiguity Modeling Using Multiple Transformers

Nov 16, 2022

Xingcheng Xu

Abstract:We introduce a new model based on sets of probabilities for sequential data. We name the model GAMMT, which stands for Generative Ambiguity Models using Multiple Transformers. We suppose that data generating process of a sequence is ambiguous and determined by a set of probabilities rather than one as in the conventional model. We use multiple parallel transformers connected by a selection mechanism to approximate ambiguous probabilities. The GAMMT allows for ambiguity modeling in a generative way and multiple representations of the input tokens and the input sequence. This work explores the combination of attention mechanism and ambiguity by deep neural networks. We expect that this framework will facilitate new research into machine learning, improving our understanding of the attention-ambiguity mechanism.

* 10 pages, 2 figures, 3 algorithms

Via

Access Paper or Ask Questions