Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Junfeng Li

FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Mar 07, 2025

Fengbin Zhu, Junfeng Li, Liangming Pan, Wenjie Wang, Fuli Feng, Chao Wang, Huanbo Luan, Tat-Seng Chua

Figure 1 for FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Figure 2 for FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Figure 3 for FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Figure 4 for FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Abstract:Finance decision-making often relies on in-depth data analysis across various data sources, including financial tables, news articles, stock prices, etc. In this work, we introduce FinTMMBench, the first comprehensive benchmark for evaluating temporal-aware multi-modal Retrieval-Augmented Generation (RAG) systems in finance. Built from heterologous data of NASDAQ 100 companies, FinTMMBench offers three significant advantages. 1) Multi-modal Corpus: It encompasses a hybrid of financial tables, news articles, daily stock prices, and visual technical charts as the corpus. 2) Temporal-aware Questions: Each question requires the retrieval and interpretation of its relevant data over a specific time period, including daily, weekly, monthly, quarterly, and annual periods. 3) Diverse Financial Analysis Tasks: The questions involve 10 different tasks, including information extraction, trend analysis, sentiment analysis and event detection, etc. We further propose a novel TMMHybridRAG method, which first leverages LLMs to convert data from other modalities (e.g., tabular, visual and time-series data) into textual format and then incorporates temporal information in each node when constructing graphs and dense indexes. Its effectiveness has been validated in extensive experiments, but notable gaps remain, highlighting the challenges presented by our FinTMMBench.

* Under review

Via

Access Paper or Ask Questions

Adversarial multi-task underwater acoustic target recognition: towards robustness against various influential factors

Nov 05, 2024

Yuan Xie, Ji Xu, Jiawei Ren, Junfeng Li

Abstract:Underwater acoustic target recognition based on passive sonar faces numerous challenges in practical maritime applications. One of the main challenges lies in the susceptibility of signal characteristics to diverse environmental conditions and data acquisition configurations, which can lead to instability in recognition systems. While significant efforts have been dedicated to addressing these influential factors in other domains of underwater acoustics, they are often neglected in the field of underwater acoustic target recognition. To overcome this limitation, this study designs auxiliary tasks that model influential factors (e.g., source range, water column depth, or wind speed) based on available annotations and adopts a multi-task framework to connect these factors to the recognition task. Furthermore, we integrate an adversarial learning mechanism into the multi-task framework to prompt the model to extract representations that are robust against influential factors. Through extensive experiments and analyses on the ShipsEar dataset, our proposed adversarial multi-task model demonstrates its capacity to effectively model the influential factors and achieve state-of-the-art performance on the 12-class recognition task.

Via

Access Paper or Ask Questions

Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts

Nov 05, 2024

Yuan Xie, Jiawei Ren, Junfeng Li, Ji Xu

Abstract:Underwater acoustic target recognition has emerged as a prominent research area within the field of underwater acoustics. However, the current availability of authentic underwater acoustic signal recordings remains limited, which hinders data-driven acoustic recognition models from learning robust patterns of targets from a limited set of intricate underwater signals, thereby compromising their stability in practical applications. To overcome these limitations, this study proposes a recognition framework called M3 (Multi-task, Multi-gate, Multi-expert) to enhance the model's ability to capture robust patterns by making it aware of the inherent properties of targets. In this framework, an auxiliary task that focuses on target properties, such as estimating target size, is designed. The auxiliary task then shares parameters with the recognition task to realize multi-task learning. This paradigm allows the model to concentrate on shared information across tasks and identify robust patterns of targets in a regularized manner, thereby enhancing the model's generalization ability. Moreover, M3 incorporates multi-expert and multi-gate mechanisms, allowing for the allocation of distinct parameter spaces to various underwater signals. This enables the model to process intricate signal patterns in a fine-grained and differentiated manner. To evaluate the effectiveness of M3, extensive experiments were implemented on the ShipsEar underwater ship-radiated noise dataset. The results substantiate that M3 has the ability to outperform the most advanced single-task recognition models, thereby achieving the state-of-the-art performance.

Via

Access Paper or Ask Questions

StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Aug 24, 2024

Zelin Qiu, Dingding Yao, Junfeng Li

Figure 1 for StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Figure 2 for StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Figure 3 for StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Figure 4 for StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Abstract:In this paper, we present our approach for the Track 1 of the Chinese Auditory Attention Decoding (Chinese AAD) Challenge at ISCSLP 2024. Most existing spatial auditory attention decoding (Sp-AAD) methods employ an isolated window architecture, focusing solely on global invariant features without considering relationships between different decision windows, which can lead to suboptimal performance. To address this issue, we propose a novel streaming decoding architecture, termed StreamAAD. In StreamAAD, decision windows are input to the network as a sequential stream and decoded in order, allowing for the modeling of inter-window relationships. Additionally, we employ a model ensemble strategy, achieving significant better performance than the baseline, ranking First in the challenge.

Via

Access Paper or Ask Questions

YAYI 2: Multilingual Open-Source Large Language Models

Dec 22, 2023

Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang(+43 more)

Abstract:As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and even have been regarded as a potential path to the artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and gained comparable performances to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performances in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similar sized open-source models.

Via

Access Paper or Ask Questions

Traffic Flow Simulation for Autonomous Driving

Jul 23, 2023

Junfeng Li, Changqing Yan

Abstract:A traffic system is a random and complex large system, which is difficult to conduct repeated modelling and control research in a real traffic environment. With the development of automatic driving technology, the requirements for testing and evaluating the development of automatic driving technology are getting higher and higher, so the application of computer technology for traffic simulation has become a very effective technical means. Based on the micro-traffic flow modelling, this paper adopts the vehicle motion model based on cellular automata and the theory of bicycle intelligence to build the simulation environment of autonomous vehicle flow. The architecture of autonomous vehicles is generally divided into a perception system, decision system and control system. The perception system is generally divided into many subsystems, responsible for autonomous vehicle positioning, obstacle recognition, traffic signal detection and recognition and other tasks. Decision systems are typically divided into many subsystems that are responsible for tasks such as path planning, path planning, behavior selection, motion planning, and control. The control system is the basis of the selfdriving car, and each control system of the vehicle needs to be connected with the decision-making system through the bus, and can accurately control the acceleration degree, braking degree, steering amplitude, lighting control and other driving actions according to the bus instructions issued by the decision-making system, so as to achieve the autonomous driving of the vehicle.

Via

Access Paper or Ask Questions

Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Jan 01, 2022

Di Wu, Tongxin Zhang, Yuan Zhong, Fanghua Jiang, Junfeng Li

Figure 1 for Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Figure 2 for Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Figure 3 for Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Figure 4 for Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Abstract:Preliminary mission design requires an efficient and accurate approximation to the low-thrust rendezvous trajectories, which might be generally three-dimensional and involve multiple revolutions. In this paper, a new shaping method using cubic spline functions is developed for the analytical approximation, which shows advantages in the optimality and computational efficiency. The rendezvous constraints on the boundary states and transfer time are all satisfied analytically, under the assumption that the boundary conditions and segment numbers of cubic spline functions are designated in advance. Two specific shapes are then formulated according to whether they have free optimization parameters. The shape without free parameters provides an efficient and robust estimation, while the other one allows a subsequent optimization for the satisfaction of additional constraints such as the constraint on the thrust magnitude. Applications of the proposed method in combination with the particle swarm optimization algorithm are discussed through two typical interplanetary rendezvous missions, that is, an inclined multi-revolution trajectory from the Earth to asteroid Dionysus and a multi-rendezvous trajectory of sample return. Simulation examples show that the proposed method is superior to existing methods in terms of providing good estimation for the global search and generating suitable initial guess for the subsequent trajectory optimization.

Via

Access Paper or Ask Questions

A Model Compression Method with Matrix Product Operators for Speech Enhancement

Oct 10, 2020

Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan

Figure 1 for A Model Compression Method with Matrix Product Operators for Speech Enhancement

Figure 2 for A Model Compression Method with Matrix Product Operators for Speech Enhancement

Figure 3 for A Model Compression Method with Matrix Product Operators for Speech Enhancement

Figure 4 for A Model Compression Method with Matrix Product Operators for Speech Enhancement

Abstract:The deep neural network (DNN) based speech enhancement approaches have achieved promising performance. However, the number of parameters involved in these methods is usually enormous for the real applications of speech enhancement on the device with the limited resources. This seriously restricts the applications. To deal with this issue, model compression techniques are being widely studied. In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement. In this method, the weight matrices in the linear transformations of neural network model are replaced by the MPO decomposition format before training. In experiment, this process is applied to the causal neural network models, such as the feedforward multilayer perceptron (MLP) and long short-term memory (LSTM) models. Both MLP and LSTM models with/without compression are then utilized to estimate the ideal ratio mask for monaural speech enhancement. The experimental results show that our proposed MPO-based method outperforms the widely-used pruning method for speech enhancement under various compression rates, and further improvement can be achieved with respect to low compression rates. Our proposal provides an effective model compression method for speech enhancement, especially in cloud-free application.

* IEEE Transactions on Audio, Speech and Language Processing.2020.3030495(2020)
* 11 pages, 6 figures, 7 tables

Via

Access Paper or Ask Questions

Iterative Residual Image Deconvolution

Nov 04, 2018

Li Si-Yao, Dongwei Ren, Furong Zhao, Zijian Hu, Junfeng Li, Qian Yin

Figure 1 for Iterative Residual Image Deconvolution

Figure 2 for Iterative Residual Image Deconvolution

Figure 3 for Iterative Residual Image Deconvolution

Figure 4 for Iterative Residual Image Deconvolution

Abstract:Image deblurring, a.k.a. image deconvolution, recovers a clear image from pixel superposition caused by blur degradation. Few deep convolutional neural networks (CNN) succeed in addressing this task. In this paper, we first demonstrate that the minimum-mean-square-error (MMSE) solution to image deblurring can be interestingly unfolded into a series of residual components. Based on this analysis, we propose a novel iterative residual deconvolution (IRD) algorithm. Further, IRD motivates us to take one step forward to design an explicable and effective CNN architecture for image deconvolution. Specifically, a sequence of residual CNN units are deployed, whose intermediate outputs are then concatenated and integrated, resulting in concatenated residual convolutional network (CRCNet). The experimental results demonstrate that proposed CRCNet not only achieves better quantitative metrics but also recovers more visually plausible texture details compared with state-of-the-art methods.

* rejected by AAAI 2019

Via

Access Paper or Ask Questions