Abstract:Histological artifacts pose challenges for both pathologists and Computer-Aided Diagnosis (CAD) systems, leading to errors in analysis. Current approaches for histological artifact restoration, based on Generative Adversarial Networks (GANs) and pixel-level Diffusion Models, suffer from performance limitations and computational inefficiencies. In this paper, we propose a novel framework, LatentArtiFusion, which leverages the latent diffusion model (LDM) to reconstruct histological artifacts with high performance and computational efficiency. Unlike traditional pixel-level diffusion frameworks, LatentArtiFusion executes the restoration process in a lower-dimensional latent space, significantly improving computational efficiency. Moreover, we introduce a novel regional artifact reconstruction algorithm in latent space to prevent mistransfer in non-artifact regions, distinguishing our approach from GAN-based methods. Through extensive experiments on real-world histology datasets, LatentArtiFusion demonstrates remarkable speed, outperforming state-of-the-art pixel-level diffusion frameworks by more than 30X. It also consistently surpasses GAN-based methods by at least 5% across multiple evaluation metrics. Furthermore, we evaluate the effectiveness of our proposed framework in downstream tissue classification tasks, showcasing its practical utility. Code is available at https://github.com/bugs-creator/LatentArtiFusion.
Abstract:In Weighted Model Counting (WMC), we assign weights to literals and compute the sum of the weights of the models of a given propositional formula where the weight of an assignment is the product of the weights of its literals. The current WMC solvers work on Conjunctive Normal Form (CNF) formulas. However, CNF is not a natural representation for human-being in many applications. Motivated by the stronger expressive power of pseudo-Boolean (PB) formulas than CNF, we propose to perform WMC on PB formulas. Based on a recent dynamic programming algorithm framework called ADDMC for WMC, we implement a weighted PB counting tool PBCounter. We compare PBCounter with the state-of-the-art weighted model counters SharpSAT-TD, ExactMC, D4, and ADDMC, where the latter tools work on CNF with encoding methods that convert PB constraints into a CNF formula. The experiments on three domains of benchmarks show that PBCounter is superior to the model counters on CNF formulas.
Abstract:In the realm of human mobility, the decision-making process for selecting the next-visit location is intricately influenced by a trade-off between spatial and temporal constraints, which are reflective of individual needs and preferences. This trade-off, however, varies across individuals, making the modeling of these spatial-temporal dynamics a formidable challenge. To address the problem, in this work, we introduce the "Spatial-temporal Induced Hierarchical Reinforcement Learning" (STI-HRL) framework, for capturing the interplay between spatial and temporal factors in human mobility decision-making. Specifically, STI-HRL employs a two-tiered decision-making process: the low-level focuses on disentangling spatial and temporal preferences using dedicated agents, while the high-level integrates these considerations to finalize the decision. To complement the hierarchical decision setting, we construct a hypergraph to organize historical data, encapsulating the multi-aspect semantics of human mobility. We propose a cross-channel hypergraph embedding module to learn the representations as the states to facilitate the decision-making cycle. Our extensive experiments on two real-world datasets validate the superiority of STI-HRL over state-of-the-art methods in predicting users' next visits across various performance metrics.
Abstract:Most session-based recommender systems (SBRSs) focus on extracting information from the observed items in the current session of a user to predict a next item, ignoring the causes outside the session (called outer-session causes, OSCs) that influence the user's selection of items. However, these causes widely exist in the real world, and few studies have investigated their role in SBRSs. In this work, we analyze the causalities and correlations of the OSCs in SBRSs from the perspective of causal inference. We find that the OSCs are essentially the confounders in SBRSs, which leads to spurious correlations in the data used to train SBRS models. To address this problem, we propose a novel SBRS framework named COCO-SBRS (COunterfactual COllaborative Session-Based Recommender Systems) to learn the causality between OSCs and user-item interactions in SBRSs. COCO-SBRS first adopts a self-supervised approach to pre-train a recommendation model by designing pseudo-labels of causes for each user's selection of the item in data to guide the training process. Next, COCO-SBRS adopts counterfactual inference to recommend items based on the outputs of the pre-trained recommendation model considering the causalities to alleviate the data sparsity problem. As a result, COCO-SBRS can learn the causalities in data, preventing the model from learning spurious correlations. The experimental results of our extensive experiments conducted on three real-world datasets demonstrate the superiority of our proposed framework over ten representative SBRSs.
Abstract:In this paper, we study the problem of MOOC quality evaluation which is essential for improving the course materials, promoting students' learning efficiency, and benefiting user services. While achieving promising performances, current works still suffer from the complicated interactions and relationships of entities in MOOC platforms. To tackle the challenges, we formulate the problem as a course representation learning task-based and develop an Information-aware Graph Representation Learning(IaGRL) for multi-view MOOC quality evaluation. Specifically, We first build a MOOC Heterogeneous Network (HIN) to represent the interactions and relationships among entities in MOOC platforms. And then we decompose the MOOC HIN into multiple single-relation graphs based on meta-paths to depict the multi-view semantics of courses. The course representation learning can be further converted to a multi-view graph representation task. Different from traditional graph representation learning, the learned course representations are expected to match the following three types of validity: (1) the agreement on expressiveness between the raw course portfolio and the learned course representations; (2) the consistency between the representations in each view and the unified representations; (3) the alignment between the course and MOOC platform representations. Therefore, we propose to exploit mutual information for preserving the validity of course representations. We conduct extensive experiments over real-world MOOC datasets to demonstrate the effectiveness of our proposed method.
Abstract:Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As the city continues to build, parts of the transportation network will be added or modified. How to accurately predict expanding and evolving long-term streaming networks is of great significance. To this end, we propose a new simulation-based criterion that considers teaching autonomous agents to mimic sensor patterns, planning their next visit based on the sensor's profile (e.g., traffic, speed, occupancy). The data recorded by the sensor is most accurate when the agent can perfectly simulate the sensor's activity pattern. We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network. Actions taken by the agent change the environment, which in turn forces the agent's mode to update, while the agent further explores changes in the dynamic traffic network, which helps the agent predict its next visit more accurately. Therefore, we develop a strategy in which sensors and traffic networks update each other and incorporate temporal context to quantify state representations evolving over time.
Abstract:The 0-1 multidimensional knapsack problem(MKP) is a classical NP-hard combinatorial optimization problem. In this paper, we propose a novel heuristic algorithm simulating evolutionary computation and large neighbourhood search for the MKP. It maintains a set of solutions and abstracts information from the solution set to generate good partial assignments. To find high-quality solutions, integer programming is employed to explore the promising search space specified by the good partial assignments. Extensive experimentation with commonly used benchmark sets shows that our approach outperforms the state of the art heuristic algorithms, TPTEA and DQPSO, in solution quality. It finds new lower bound for 8 large and hard instances
Abstract:Feature selection aims to select a subset of features to optimize the performances of downstream predictive tasks. Recently, multi-agent reinforced feature selection (MARFS) has been introduced to automate feature selection, by creating agents for each feature to select or deselect corresponding features. Although MARFS enjoys the automation of the selection process, MARFS suffers from not just the data complexity in terms of contents and dimensionality, but also the exponentially-increasing computational costs with regard to the number of agents. The raised concern leads to a new research question: Can we simplify the selection process of agents under reinforcement learning context so as to improve the efficiency and costs of feature selection? To address the question, we develop a single-agent reinforced feature selection approach integrated with restructured choice strategy. Specifically, the restructured choice strategy includes: 1) we exploit only one single agent to handle the selection task of multiple features, instead of using multiple agents. 2) we develop a scanning method to empower the single agent to make multiple selection/deselection decisions in each round of scanning. 3) we exploit the relevance to predictive labels of features to prioritize the scanning orders of the agent for multiple features. 4) we propose a convolutional auto-encoder algorithm, integrated with the encoded index information of features, to improve state representation. 5) we design a reward scheme that take into account both prediction accuracy and feature redundancy to facilitate the exploration process. Finally, we present extensive experimental results to demonstrate the efficiency and effectiveness of the proposed method.
Abstract:The non-local block is a popular module for strengthening the context modeling ability of a regular convolutional neural network. This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel. We also observe that the two terms trained alone tend to model different visual clues, e.g. the whitened pairwise term learns within-region relationships while the unary term learns salient boundaries. However, the two terms are tightly coupled in the non-local block, which hinders the learning of each. Based on these findings, we present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both terms. We demonstrate the effectiveness of the decoupled design on various tasks, such as semantic segmentation on Cityscapes, ADE20K and PASCAL Context, object detection on COCO, and action recognition on Kinetics.
Abstract:In this paper, we aim to understand Residual Network (ResNet) in a scientifically sound way by providing a bridge between ResNet and Feynman path integral. In particular, we prove that the effect of residual block is equivalent to partial differential equation, and the ResNet transforming process can be equivalently converted to Feynman path integral. These conclusions greatly help us mathematically understand the advantage of ResNet in addressing the gradient vanishing issue. More importantly, our analyses offer a path integral view of ResNet, and demonstrate that the output of certain network can be obtained by adding contributions of all paths. Moreover, the contribution of each path is proportional to e^{-S}, where S is the action given by time integral of Lagrangian L. This lays the solid foundation in the understanding of ResNet, and provides insights in the future design of convolutional neural network architecture. Based on these results, we have designed the network using partial differential operators, which further validates our theoritical analyses.