Abstract:The proliferation of Internet of Things (IoT) devices and the advent of 6G technologies have introduced computationally intensive tasks that often surpass the processing capabilities of user devices. Efficient and secure resource allocation in serverless multi-cloud edge computing environments is essential for supporting these demands and advancing distributed computing. However, existing solutions frequently struggle with the complexity of multi-cloud infrastructures, robust security integration, and effective application of traditional deep reinforcement learning (DRL) techniques under system constraints. To address these challenges, we present SARMTO, a novel framework that integrates an action-constrained DRL model. SARMTO dynamically balances resource allocation, task offloading, security, and performance by utilizing a Markov decision process formulation, an adaptive security mechanism, and sophisticated optimization techniques. Extensive simulations across varying scenarios, including different task loads, data sizes, and MEC capacities, show that SARMTO consistently outperforms five baseline approaches, achieving up to a 40% reduction in system costs and a 41.5% improvement in energy efficiency over state-of-the-art methods. These enhancements highlight SARMTO's potential to revolutionize resource management in intricate distributed computing environments, opening the door to more efficient and secure IoT and edge computing applications.
Abstract:Retrieval Augmented Generation (RAG) has proven to be highly effective in boosting the generative performance of language model in knowledge-intensive tasks. However, existing RAG framework either indiscriminately perform retrieval or rely on rigid single-class classifiers to select retrieval methods, leading to inefficiencies and suboptimal performance across queries of varying complexity. To address these challenges, we propose a reinforcement learning-based framework that dynamically selects the most suitable retrieval strategy based on query complexity. % our solution Our approach leverages a multi-armed bandit algorithm, which treats each retrieval method as a distinct ``arm'' and adapts the selection process by balancing exploration and exploitation. Additionally, we introduce a dynamic reward function that balances accuracy and efficiency, penalizing methods that require more retrieval steps, even if they lead to a correct result. Our method achieves new state of the art results on multiple single-hop and multi-hop datasets while reducing retrieval costs. Our code are available at https://github.com/FUTUREEEEEE/MBA .
Abstract:Existing cross-document event coreference resolution models, which either compute mention similarity directly or enhance mention representation by extracting event arguments (such as location, time, agent, and patient), lacking the ability to utilize document-level information. As a result, they struggle to capture long-distance dependencies. This shortcoming leads to their underwhelming performance in determining coreference for the events where their argument information relies on long-distance dependencies. In light of these limitations, we propose the construction of document-level Rhetorical Structure Theory (RST) trees and cross-document Lexical Chains to model the structural and semantic information of documents. Subsequently, cross-document heterogeneous graphs are constructed and GAT is utilized to learn the representations of events. Finally, a pair scorer calculates the similarity between each pair of events and co-referred events can be recognized using standard clustering algorithm. Additionally, as the existing cross-document event coreference datasets are limited to English, we have developed a large-scale Chinese cross-document event coreference dataset to fill this gap, which comprises 53,066 event mentions and 4,476 clusters. After applying our model on the English and Chinese datasets respectively, it outperforms all baselines by large margins.
Abstract:Document-level event extraction aims to extract structured event information from unstructured text. However, a single document often contains limited event information and the roles of different event arguments may be biased due to the influence of the information source. This paper addresses the limitations of traditional document-level event extraction by proposing the task of cross-document event extraction (CDEE) to integrate event information from multiple documents and provide a comprehensive perspective on events. We construct a novel cross-document event extraction dataset, namely CLES, which contains 20,059 documents and 37,688 mention-level events, where over 70% of them are cross-document. To build a benchmark, we propose a CDEE pipeline that includes 5 steps, namely event extraction, coreference resolution, entity normalization, role normalization and entity-role resolution. Our CDEE pipeline achieves about 72% F1 in end-to-end cross-document event extraction, suggesting the challenge of this task. Our work builds a new line of information extraction research and will attract new research attention.
Abstract:Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images, which is crucial in improving image restoration algorithms and selecting high-quality face images for downstream tasks. We present a novel transformer-based method for GFIQA, which is aided by two unique mechanisms. First, a Dual-Set Degradation Representation Learning (DSL) mechanism uses facial images with both synthetic and real degradations to decouple degradation from content, ensuring generalizability to real-world scenarios. This self-supervised method learns degradation features on a global scale, providing a robust alternative to conventional methods that use local patch information in degradation learning. Second, our transformer leverages facial landmarks to emphasize visually salient parts of a face image in evaluating its perceptual quality. We also introduce a balanced and diverse Comprehensive Generic Face IQA (CGFIQA-40k) dataset of 40K images carefully designed to overcome the biases, in particular the imbalances in skin tone and gender representation, in existing datasets. Extensive analysis and evaluation demonstrate the robustness of our method, marking a significant improvement over prior methods.
Abstract:Deep neural networks have recently achieved considerable improvements in learning human behavioral patterns and individual preferences from massive spatial-temporal trajectories data. However, most of the existing research concentrates on fusing different semantics underlying sequential trajectories for mobility pattern learning which, in turn, yields a narrow perspective on comprehending human intrinsic motions. In addition, the inherent sparsity and under-explored heterogeneous collaborative items pertaining to human check-ins hinder the potential exploitation of human diverse periodic regularities as well as common interests. Motivated by recent advances in disentanglement learning, in this study we propose a novel disentangled solution called SSDL for tackling the next POI prediction problem. SSDL primarily seeks to disentangle the potential time-invariant and time-varying factors into different latent spaces from massive trajectories data, providing an interpretable view to understand the intricate semantics underlying human diverse mobility representations. To address the data sparsity issue, we present two realistic trajectory augmentation approaches to enhance the understanding of both the human intrinsic periodicity and constantly-changing intents. In addition, we devise a POI-centric graph structure to explore heterogeneous collaborative signals underlying historical check-ins. Extensive experiments conducted on four real-world datasets demonstrate that our proposed SSDL significantly outperforms the state-of-the-art approaches -- for example, it yields up to 8.57% improvements on ACC@1.
Abstract:Recent increases in computing power have enabled the numerical simulation of many complex flow problems that are of practical and strategic interest for naval applications. A noticeable area of advancement is the computation of turbulent, two-phase flows resulting from wave breaking and other multiphase flow processes such as cavitation that can generate underwater sound and entrain bubbles in ship wakes, among other effects. Although advanced flow solvers are sophisticated and are capable of simulating high Reynolds number flows on large numbers of grid points, challenges in data analysis remain. Specifically, there is a critical need to transform highly resolved flow fields described on fine grids at discrete time steps into physically resolved features for which the flow dynamics can be understood and utilized in naval applications. This paper presents our recent efforts in this field. In previous works, we developed a novel algorithm to track bubbles in breaking wave simulations and to interpret their dynamical behavior over time (Gao et al., 2021a). We also discovered a new physical mechanism driving bubble production within breaking wave crests (Gao et al., 2021b) and developed a model to relate bubble behaviors to underwater sound generation (Gao et al., 2021c). In this work, we applied our bubble tracking algorithm to the breaking waves simulations and investigated the bubble trajectories, bubble creation mechanisms, and bubble acoustics based on our previous works.
Abstract:Stock selection attempts to rank a list of stocks for optimizing investment decision making, aiming at minimizing investment risks while maximizing profit returns. Recently, researchers have developed various (recurrent) neural network-based methods to tackle this problem. Without exceptions, they primarily leverage historical market volatility to enhance the selection performance. However, these approaches greatly rely on discrete sampled market observations, which either fail to consider the uncertainty of stock fluctuations or predict continuous stock dynamics in the future. Besides, some studies have considered the explicit stock interdependence derived from multiple domains (e.g., industry and shareholder). Nevertheless, the implicit cross-dependencies among different domains are under-explored. To address such limitations, we present a novel stock selection solution -- StockODE, a latent variable model with Gaussian prior. Specifically, we devise a Movement Trend Correlation module to expose the time-varying relationships regarding stock movements. We design Neural Recursive Ordinary Differential Equation Networks (NRODEs) to capture the temporal evolution of stock volatility in a continuous dynamic manner. Moreover, we build a hierarchical hypergraph to incorporate the domain-aware dependencies among the stocks. Experiments conducted on two real-world stock market datasets demonstrate that StockODE significantly outperforms several baselines, such as up to 18.57% average improvement regarding Sharpe Ratio.
Abstract:Turbulence-free images cannot be produced by conventional computational ghost imaging because calculated light is not affected by the same atmospheric turbulence as real light. In this article, we first addressed this issue by measuring the photon number fluctuation autocorrelation of the signals generated by a conventional computational ghost imaging device. Our results illustrate how conventional computational ghost imaging without structural changes can be used to produce turbulence-free images.
Abstract:Trip recommendation is a significant and engaging location-based service that can help new tourists make more customized travel plans. It often attempts to suggest a sequence of point of interests (POIs) for a user who requests a personalized travel demand. Conventional methods either leverage the heuristic algorithms (e.g., dynamic programming) or statistical analysis (e.g., Markov models) to search or rank a POI sequence. These procedures may fail to capture the diversity of human needs and transitional regularities. They even provide recommendations that deviate from tourists' real travel intention when the trip data is sparse. Although recent deep recursive models (e.g., RNN) are capable of alleviating these concerns, existing solutions hardly recognize the practical reality, such as the diversity of tourist demands, uncertainties in the trip generation, and the complex visiting preference. Inspired by the advance in deep learning, we introduce a novel self-supervised representation learning framework for trip recommendation -- SelfTrip, aiming at tackling the aforementioned challenges. Specifically, we propose a two-step contrastive learning mechanism concerning the POI representation, as well as trip representation. Furthermore, we present four trip augmentation methods to capture the visiting uncertainties in trip planning. We evaluate our SelfTrip on four real-world datasets, and extensive results demonstrate the promising gain compared with several cutting-edge benchmarks, e.g., up to 4% and 12% on F1 and pair-F1, respectively.