Yanming Guo

Group Reasoning Emission Estimation Networks

Feb 08, 2025

Yanming Guo, Xiao Qian, Kevin Credit, Jin Ma

Abstract: Accurate greenhouse gas (GHG) emission reporting is critical for governments, businesses, and investors. However, adoption remains limited, particularly among small and medium enterprises, due to high implementation costs, fragmented emission factor databases, and a lack of robust sector classification methods. To address these challenges, we introduce Group Reasoning Emission Estimation Networks (GREEN), an AI-driven carbon accounting framework that standardizes enterprise-level emission estimation, constructs a large-scale benchmark dataset, and leverages a novel reasoning approach with large language models (LLMs). Specifically, we compile textual descriptions for 20,850 companies with validated North American Industry Classification System (NAICS) labels and align these with an economic model of carbon intensity factors. By reframing sector classification as an information retrieval task, we fine-tune Sentence-BERT models using a contrastive learning loss. To overcome the limitations of single-stage models in handling thousands of hierarchical categories, we propose a Group Reasoning method that ensembles LLM classifiers based on the natural NAICS ontology, decomposing the task into multiple sub-classification steps. We theoretically prove that this approach reduces classification uncertainty and computational complexity. Experiments on 1,114 NAICS categories yield state-of-the-art performance (83.68% Top-1, 91.47% Top-10 accuracy), and case studies on 20 companies report a mean absolute percentage error (MAPE) of 45.88%. The project is available at: https://huggingface.co/datasets/Yvnminc/ExioNAICS.
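The decomposition into sub-classification steps can be illustrated with a minimal two-stage sketch. This is not the GREEN implementation: the word-overlap scorer `jaccard` stands in for an LLM or embedding-based classifier, the function names are hypothetical, and the label texts are simplified rather than official NAICS wording. It only shows how choosing a coarse sector first narrows the set of full codes that need to be compared.

```python
def jaccard(a: str, b: str) -> float:
    """Toy similarity: word-set overlap (a stand-in for an LLM or embedding score)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def classify_two_stage(description, sectors, codes):
    """Pick a coarse sector first, then a full code within that sector only."""
    # Stage 1: choose among a handful of coarse sectors.
    sector = max(sectors, key=lambda s: jaccard(description, sectors[s]))
    # Stage 2: compare only against codes that share the chosen sector prefix.
    candidates = {c: t for c, t in codes.items() if c.startswith(sector)}
    code = max(candidates, key=lambda c: jaccard(description, candidates[c]))
    return sector, code


# Illustrative label texts (simplified, not the official NAICS wording).
SECTORS = {
    "11": "agriculture farming forestry fishing hunting",
    "52": "finance insurance banking",
}
CODES = {
    "111110": "soybean farming",
    "111140": "wheat farming",
    "522110": "commercial banking",
    "524113": "direct life insurance carriers",
}

result = classify_two_stage("a farming business that grows wheat", SECTORS, CODES)
print(result)  # ('11', '111140')
```

With this toy data, the second stage scores two candidate codes instead of four; the saving grows with the number of categories per level.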

Multimodal Multilabel Classification by CLIP

Jun 23, 2024

Yanming Guo

Abstract: Multimodal multilabel classification (MMC) is a challenging task that aims to design a learning algorithm that handles two data sources, image and text, and learns a comprehensive semantic feature representation across the modalities. In this work, we review a broad range of state-of-the-art approaches to MMC and propose a technique that uses Contrastive Language-Image Pre-training (CLIP) as the feature extractor, fine-tuning the model while exploring different classification heads, fusion methods, and loss functions. Our best result achieves an F1 score above 90% on the public Kaggle competition leaderboard. This paper provides detailed descriptions of the training methods and a quantitative analysis of the experimental results.

ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability

Jun 11, 2024

Yanming Guo, Jin Ma

Abstract: Environmentally Extended Multi-Regional Input-Output analysis is the predominant framework in Ecological Economics for assessing the environmental impact of economic activities. This paper introduces ExioML, the first Machine Learning benchmark dataset designed for sustainability analysis, aimed at lowering barriers and fostering collaboration between Machine Learning and Ecological Economics research. A greenhouse gas emission regression task was conducted to evaluate sectoral sustainability and demonstrate the usability of the dataset. We compared the performance of traditional shallow models with deep learning models, using a diverse Factor Accounting table that incorporates various categorical and numerical features. Our findings show that ExioML enables deep and ensemble models to achieve low mean squared errors, establishing a baseline for future Machine Learning research. Through ExioML, we aim to build a foundational dataset that supports various Machine Learning applications and promotes climate action and sustainable investment decisions.

Augmentation is AUtO-Net: Augmentation-Driven Contrastive Multiview Learning for Medical Image Segmentation

Nov 02, 2023

Yanming Guo

Abstract: Deep learning segmentation algorithms, which learn complex organ and tissue patterns and extract essential regions of interest from noisy backgrounds to aid medical image diagnosis, have achieved impressive results in Medical Image Computing (MIC). This thesis focuses on retinal blood vessel segmentation, providing an extensive literature review of deep learning-based medical image segmentation approaches and comparing their methodologies and empirical performance. The work also identifies two significant limitations of current state-of-the-art methods: data size constraints and the dependency on high computational resources. To address these problems, this work proposes a novel, efficient, and simple multiview learning framework that contrastively learns invariant vessel feature representations by comparing multiple views augmented with various transformations, overcoming data shortage and improving generalisation ability. Moreover, the hybrid network architecture integrates the attention mechanism into a Convolutional Neural Network to further capture complex, continuous, curvilinear vessel structures. Validated on the CHASE-DB1 dataset, the proposed method attains the highest F1 score of 83.46% and the highest Intersection over Union (IOU) score of 71.62% with a UNet structure, surpassing existing benchmark UNet-based methods by 1.95% and 2.8%, respectively. Together, these metrics indicate that the model detects vessels accurately, at locations that closely coincide with the ground truth. Moreover, the proposed approach can be trained within 30 minutes using less than 3 GB of GPU RAM, which supports efficient implementation in real-world applications and deployments.
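For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1 and IoU are commonly computed from binary segmentation masks. The function name `f1_and_iou` and the flattened toy masks are illustrative assumptions, not taken from the thesis. When both metrics are computed from the same pixel counts, they satisfy F1 = 2·IoU / (1 + IoU); the reported 71.62% IoU and 83.46% F1 are consistent with that identity.

```python
def f1_and_iou(pred, truth):
    """Compute F1 and IoU for two flattened binary masks (1 = vessel, 0 = background)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives
    if tp + fp + fn == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou


# Toy masks, for illustration only.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
f1, iou = f1_and_iou(pred_mask, true_mask)
print(round(f1, 4), round(iou, 4))
```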

Deep Learning for Video-Text Retrieval: a Review

Feb 24, 2023

Cunjuan Zhu, Qi Jia, Wei Chen, Yanming Guo, Yu Liu

Abstract: Video-Text Retrieval (VTR) aims to search for the video most relevant to the semantics of a given sentence, and vice versa. In general, this retrieval task is composed of four successive steps: video feature extraction, textual feature extraction, feature embedding and matching, and objective functions. Finally, a list of samples retrieved from the dataset is ranked by matching similarity to the query. In recent years, deep learning techniques have achieved significant progress; however, VTR remains a challenging task due to problems such as how to learn efficient spatial-temporal video features and how to narrow the cross-modal gap. In this survey, we review and summarize over 100 research papers related to VTR, present state-of-the-art performance on several commonly benchmarked datasets, and discuss potential challenges and directions, with the aim of providing insights for researchers in the field of video-text retrieval.

* International Journal of Multimedia Information Retrieval (IJMIR)
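The final ranking step described in the abstract can be sketched as follows, assuming the query sentence and each video have already been embedded into a shared vector space. The vectors, identifiers, and function names here are hypothetical, and cosine similarity is one common choice of matching score rather than the only one surveyed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_videos(query_vec, video_vecs):
    """Return video ids ordered from most to least similar to the query embedding."""
    scored = [(cosine(query_vec, vec), vid) for vid, vec in video_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vid for _, vid in scored]


# Hypothetical 2-D embeddings, for illustration only.
videos = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_videos([1.0, 0.0], videos)
print(ranking)  # ['a', 'c', 'b']
```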

On the Exploration of Convolutional Fusion Networks for Visual Recognition

Nov 16, 2016

Yu Liu, Yanming Guo, Michael S. Lew

Abstract: Despite recent advances, multi-scale deep representations remain limited by expensive parameters and weak fusion modules. Hence, we propose an efficient approach to fusing multi-scale deep representations, called convolutional fusion networks (CFN). By using $1\times1$ convolution and global average pooling, CFN can efficiently generate the side branches while adding few parameters. In addition, we present a locally-connected fusion module, which can learn adaptive weights for the side branches and form a discriminatively fused feature. CFN models trained on the CIFAR and ImageNet datasets demonstrate remarkable improvements over plain CNNs. Furthermore, we generalize CFN to three new tasks: scene recognition, fine-grained recognition, and image retrieval. Our experiments show that it obtains consistent improvements on these transfer tasks.

* 23rd International Conference on MultiMedia Modeling (MMM 2017)
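As a rough illustration of why a side branch built from $1\times1$ convolution and global average pooling adds few parameters, the sketch below implements both operations on nested lists. It is a simplified stand-in, not the CFN code: the function names, the bias-free convolution, and the toy feature map are assumptions. A $1\times1$ convolution needs only `C_out * C_in` weights, and global average pooling has none.

```python
def conv1x1(fmap, weights):
    """Bias-free 1x1 convolution. fmap: [C_in][H][W]; weights: [C_out][C_in]."""
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [
            [sum(w * fmap[k][i][j] for k, w in enumerate(row)) for j in range(width)]
            for i in range(height)
        ]
        for row in weights  # one output channel per weight row
    ]


def global_avg_pool(fmap):
    """Average each channel over all spatial positions, giving one value per channel."""
    return [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in fmap]


# Toy 2-channel, 2x2 feature map and a single output channel (illustrative values).
feature_map = [
    [[1, 2], [3, 4]],
    [[0, 0], [0, 4]],
]
side_branch = global_avg_pool(conv1x1(feature_map, [[1, 1]]))
print(side_branch)  # [3.5]
```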
