Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Junming Huang

Is GPT-OSS Good? A Comprehensive Evaluation of OpenAI's Latest Open Source Models

Aug 17, 2025

Ziqian Bi, Keyu Chen, Chiung-Yi Tseng, Danyang Zhang, Tianyang Wang, Hongying Luo, Lu Chen, Junming Huang, Jibin Guan, Junfeng Hao(+1 more)

Abstract:In August 2025, OpenAI released GPT-OSS models, its first open weight large language models since GPT-2 in 2019, comprising two mixture of experts architectures with 120B and 20B parameters. We evaluated both variants against six contemporary open source large language models ranging from 14.7B to 235B parameters, representing both dense and sparse designs, across ten benchmarks covering general knowledge, mathematical reasoning, code generation, multilingual understanding, and conversational ability. All models were tested in unquantised form under standardised inference settings, with statistical validation using McNemars test and effect size analysis. Results show that gpt-oss-20B consistently outperforms gpt-oss-120B on several benchmarks, such as HumanEval and MMLU, despite requiring substantially less memory and energy per response. Both models demonstrate mid-tier overall performance within the current open source landscape, with relative strength in code generation and notable weaknesses in multilingual tasks. These findings provide empirical evidence that scaling in sparse architectures may not yield proportional performance gains, underscoring the need for further investigation into optimisation strategies and informing more efficient model selection for future open source deployments.

Via

Access Paper or Ask Questions

Uncovering inequalities in new knowledge learning by large language models across different languages

Mar 06, 2025

Chenglong Wang, Haoyu Tang, Xiyuan Yang, Yueqi Xie, Jina Suh, Sunayana Sitaram, Junming Huang, Yu Xie, Zhaoya Gong, Xing Xie(+1 more)

Abstract:As large language models (LLMs) gradually become integral tools for problem solving in daily life worldwide, understanding linguistic inequality is becoming increasingly important. Existing research has primarily focused on static analyses that assess the disparities in the existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to generate up-to-date, domain-specific responses. Investigating linguistic inequalities within this dynamic process is, therefore, also essential. In this paper, we explore inequalities in new knowledge learning by LLMs across different languages and four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments under two settings (in-context learning and fine-tuning) using both proprietary and open-source models, we demonstrate that low-resource languages consistently face disadvantages across all four dimensions. By shedding light on these disparities, we aim to raise awareness of linguistic inequalities in LLMs' new knowledge learning, fostering the development of more inclusive and equitable future LLMs.

Via

Access Paper or Ask Questions

A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

Oct 19, 2024

Wenyi Liu, Rui Wang, Yuanshuai Luo, Jianjun Wei, Zihao Zhao, Junming Huang

Figure 1 for A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

Figure 2 for A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

Figure 3 for A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

Figure 4 for A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

Abstract:With the explosive growth of Internet data, users are facing the problem of information overload, which makes it a challenge to efficiently obtain the required resources. Recommendation systems have emerged in this context. By filtering massive amounts of information, they provide users with content that meets their needs, playing a key role in scenarios such as advertising recommendation and product recommendation. However, traditional click-through rate prediction and TOP-K recommendation mechanisms are gradually unable to meet the recommendations needs in modern life scenarios due to high computational complexity, large memory consumption, long feature selection time, and insufficient feature interaction. This paper proposes a recommendations system model based on a separation embedding cross-network. The model uses an embedding neural network layer to transform sparse feature vectors into dense embedding vectors, and can independently perform feature cross operations on different dimensions, thereby improving the accuracy and depth of feature mining. Experimental results show that the model shows stronger adaptability and higher prediction accuracy in processing complex data sets, effectively solving the problems existing in existing models.

Via

Access Paper or Ask Questions

Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

Sep 13, 2024

Wenqing Zhang, Junming Huang, Ruotong Wang, Changsong Wei, Wenqian Huang, Yuxin Qiao

Figure 1 for Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

Figure 2 for Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

Figure 3 for Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

Figure 4 for Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

Abstract:Long-short range time series forecasting is essential for predicting future trends and patterns over extended periods. While deep learning models such as Transformers have made significant strides in advancing time series forecasting, they often encounter difficulties in capturing long-term dependencies and effectively managing sparse semantic features. The state-space model, Mamba, addresses these issues through its adept handling of selective input and parallel computing, striking a balance between computational efficiency and prediction accuracy. This article examines the advantages and disadvantages of both Mamba and Transformer models, and introduces a combined approach, MAT, which leverages the strengths of each model to capture unique long-short range dependencies and inherent evolutionary patterns in multivariate time series. Specifically, MAT harnesses the long-range dependency capabilities of Mamba and the short-range characteristics of Transformers. Experimental results on benchmark weather datasets demonstrate that MAT outperforms existing comparable methods in terms of prediction accuracy, scalability, and memory efficiency.

* 6 pages, 4 figures, to be presented at the 5th International Conference on Electrical, Communication and Computer Engineering (ICECCE)

Via

Access Paper or Ask Questions

Measuring Human Contribution in AI-Assisted Content Generation

Aug 27, 2024

Yueqi Xie, Tao Qi, Jingwei Yi, Ryan Whalen, Junming Huang, Qian Ding, Yu Xie, Xing Xie, Fangzhao Wu

Figure 1 for Measuring Human Contribution in AI-Assisted Content Generation

Figure 2 for Measuring Human Contribution in AI-Assisted Content Generation

Figure 3 for Measuring Human Contribution in AI-Assisted Content Generation

Figure 4 for Measuring Human Contribution in AI-Assisted Content Generation

Abstract:With the growing prevalence of generative artificial intelligence (AI), an increasing amount of content is no longer exclusively generated by humans but by generative AI models with human guidance. This shift presents notable challenges for the delineation of originality due to the varying degrees of human contribution in AI-assisted works. This study raises the research question of measuring human contribution in AI-assisted content generation and introduces a framework to address this question that is grounded in information theory. By calculating mutual information between human input and AI-assisted output relative to self-information of AI-assisted output, we quantify the proportional information contribution of humans in content generation. Our experimental results demonstrate that the proposed measure effectively discriminates between varying degrees of human contribution across multiple creative domains. We hope that this work lays a foundation for measuring human contributions in AI-assisted content generation in the era of generative AI.

Via

Access Paper or Ask Questions

Attention Mechanism and Context Modeling System for Text Mining Machine Translation

Aug 08, 2024

Shi Bo, Yuwei Zhang, Junming Huang, Sitong Liu, Zexi Chen, Zizheng Li

Figure 1 for Attention Mechanism and Context Modeling System for Text Mining Machine Translation

Figure 2 for Attention Mechanism and Context Modeling System for Text Mining Machine Translation

Figure 3 for Attention Mechanism and Context Modeling System for Text Mining Machine Translation

Figure 4 for Attention Mechanism and Context Modeling System for Text Mining Machine Translation

Abstract:This paper advances a novel architectural schema anchored upon the Transformer paradigm and innovatively amalgamates the K-means categorization algorithm to augment the contextual apprehension capabilities of the schema. The transformer model performs well in machine translation tasks due to its parallel computing power and multi-head attention mechanism. However, it may encounter contextual ambiguity or ignore local features when dealing with highly complex language structures. To circumvent this constraint, this exposition incorporates the K-Means algorithm, which is used to stratify the lexis and idioms of the input textual matter, thereby facilitating superior identification and preservation of the local structure and contextual intelligence of the language. The advantage of this combination is that K-Means can automatically discover the topic or concept regions in the text, which may be directly related to translation quality. Consequently, the schema contrived herein enlists K-Means as a preparatory phase antecedent to the Transformer and recalibrates the multi-head attention weights to assist in the discrimination of lexis and idioms bearing analogous semantics or functionalities. This ensures the schema accords heightened regard to the contextual intelligence embodied by these clusters during the training phase, rather than merely focusing on locational intelligence.

Via

Access Paper or Ask Questions

Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Apr 15, 2024

Chi Wang, Junming Huang, Rong Zhang, Qi Wang, Haotian Yang, Haibin Huang, Chongyang Ma, Weiwei Xu

Figure 1 for Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Figure 2 for Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Figure 3 for Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Figure 4 for Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Abstract:Automatic 3D facial texture generation has gained significant interest recently. Existing approaches may not support the traditional physically based rendering pipeline or rely on 3D data captured by Light Stage. Our key contribution is a progressive latent space refinement approach that can bootstrap from 3D Morphable Models (3DMMs)-based texture maps generated from facial images to generate high-quality and diverse PBR textures, including albedo, normal, and roughness. It starts with enhancing Generative Adversarial Networks (GANs) for text-guided and diverse texture generation. To this end, we design a self-supervised paradigm to overcome the reliance on ground truth 3D textures and train the generative model with only entangled texture maps. Besides, we foster mutual enhancement between GANs and Score Distillation Sampling (SDS). SDS boosts GANs with more generative modes, while GANs promote more efficient optimization of SDS. Furthermore, we introduce an edge-aware SDS for multi-view consistent facial structure. Experiments demonstrate that our method outperforms existing 3D texture generation methods regarding photo-realistic quality, diversity, and efficiency.

Via

Access Paper or Ask Questions

How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Aug 25, 2021

Gavin Cook, Junming Huang, Yu Xie

Figure 1 for How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Figure 2 for How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Figure 3 for How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Figure 4 for How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Abstract:Past research has studied social determinants of attitudes toward foreign countries. Confounded by potential endogeneity biases due to unobserved factors or reverse causality, the causal impact of these factors on public opinion is usually difficult to establish. Using social media data, we leverage the suddenness of the COVID-19 pandemic to examine whether a major global event has causally changed American views of another country. We collate a database of more than 297 million posts on the social media platform Twitter about China or COVID-19 up to June 2020, and we treat tweeting about COVID-19 as a proxy for individual awareness of COVID-19. Using regression discontinuity and difference-in-difference estimation, we find that awareness of COVID-19 causes a sharp rise in anti-China attitudes. Our work has implications for understanding how self-interest affects policy preference and how Americans view migrant communities.

* 26 pages, 8 figures

Via

Access Paper or Ask Questions

Do Mass Media Shape Public Opinion toward China? Quantitative Evidence on New York Times with Deep Learning

Dec 08, 2020

Junming Huang, Gavin Cook, Yu Xie

Figure 1 for Do Mass Media Shape Public Opinion toward China? Quantitative Evidence on New York Times with Deep Learning

Figure 2 for Do Mass Media Shape Public Opinion toward China? Quantitative Evidence on New York Times with Deep Learning

Figure 3 for Do Mass Media Shape Public Opinion toward China? Quantitative Evidence on New York Times with Deep Learning

Figure 4 for Do Mass Media Shape Public Opinion toward China? Quantitative Evidence on New York Times with Deep Learning

Abstract:Do mass media influence people's opinion of other countries? Using BERT, a deep neural network-based natural language processing model, we analyze a large corpus of 267,907 China-related articles published by The New York Times since 1970. We then compare our output from The New York Times to a longitudinal data set constructed from 101 cross-sectional surveys of the American public's views on China. We find that the reporting of The New York Times on China in one year explains 54% of the variance in American public opinion on China in the next. Our result confirms hypothesized links between media and public opinion and helps shed light on how mass media can influence public opinion of foreign countries.

* 31 pages, 4 figures, 2 tables

Via

Access Paper or Ask Questions

Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach

Sep 05, 2012

Junming Huang, Xue-Qi Cheng, Hua-Wei Shen, Xiaoming Sun, Tao Zhou, Xiaolong Jin

Figure 1 for Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach

Figure 2 for Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach

Figure 3 for Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach

Figure 4 for Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach

Abstract:As an important tool for information filtering in the era of socialized web, recommender systems have witnessed rapid development in the last decade. As benefited from the better interpretability, neighborhood-based collaborative filtering techniques, such as item-based collaborative filtering adopted by Amazon, have gained a great success in many practical recommender systems. However, the neighborhood-based collaborative filtering method suffers from the rating bound problem, i.e., the rating on a target item that this method estimates is bounded by the observed ratings of its all neighboring items. Therefore, it cannot accurately estimate the unobserved rating on a target item, if its ground truth rating is actually higher (lower) than the highest (lowest) rating over all items in its neighborhood. In this paper, we address this problem by formalizing rating estimation as a task of recovering a scalar rating function. With a linearity assumption, we infer all the ratings by optimizing the low-order norm, e.g., the $l_1/2$-norm, of the second derivative of the target scalar function, while remaining its observed ratings unchanged. Experimental results on three real datasets, namely Douban, Goodreads and MovieLens, demonstrate that the proposed approach can well overcome the rating bound problem. Particularly, it can significantly improve the accuracy of rating estimation by 37% than the conventional neighborhood-based methods.

* 10 pages, 4 figures

Via

Access Paper or Ask Questions