Abstract:Weakly supervised medical image segmentation (MIS) using generative models is crucial for clinical diagnosis. However, the accuracy of the segmentation results is often limited by insufficient supervision and the complex nature of medical imaging. Existing models also only provide a single outcome, which does not allow for the measurement of uncertainty. In this paper, we introduce DiffSeg, a segmentation model for skin lesions based on diffusion difference which exploits diffusion model principles to ex-tract noise-based features from images with diverse semantic information. By discerning difference between these noise features, the model identifies diseased areas. Moreover, its multi-output capability mimics doctors' annotation behavior, facilitating the visualization of segmentation result consistency and ambiguity. Additionally, it quantifies output uncertainty using Generalized Energy Distance (GED), aiding interpretability and decision-making for physicians. Finally, the model integrates outputs through the Dense Conditional Random Field (DenseCRF) algorithm to refine the segmentation boundaries by considering inter-pixel correlations, which improves the accuracy and optimizes the segmentation results. We demonstrate the effectiveness of DiffSeg on the ISIC 2018 Challenge dataset, outperforming state-of-the-art U-Net-based methods.
Abstract:The essential of navigation, perception, and decision-making which are basic tasks for intelligent robots, is to estimate necessary system states. Among them, navigation is fundamental for other upper applications, providing precise position and orientation, by integrating measurements from multiple sensors. With observations of each sensor appropriately modelled, multi-sensor fusion tasks for navigation are reduced to the state estimation problem which can be solved by two approaches: optimization and filtering. Recent research has shown that optimization-based frameworks outperform filtering-based ones in terms of accuracy. However, both methods are based on maximum likelihood estimation (MLE) and should be theoretically equivalent with the same linearization points, observation model, measurements, and Gaussian noise assumption. In this paper, we deeply dig into the theories and existing strategies utilized in both optimization-based and filtering-based approaches. It is demonstrated that the two methods are equal theoretically, but this equivalence corrupts due to different strategies applied in real-time operation. By adjusting existing strategies of the filtering-based approaches, the Monte-Carlo simulation and vehicular ablation experiments based on visual odometry (VO) indicate that the strategy adjusted filtering strictly equals to optimization. Therefore, future research on sensor-fusion problems should concentrate on their own algorithms and strategies rather than state estimation approaches.
Abstract:Individual objects, whether users or services, within a specific region often exhibit similar network states due to their shared origin from the same city or autonomous system (AS). Despite this regional network similarity, many existing techniques overlook its potential, resulting in subpar performance arising from challenges such as data sparsity and label imbalance. In this paper, we introduce the regional-based dual latent state learning network(R2SL), a novel deep learning framework designed to overcome the pitfalls of traditional individual object-based prediction techniques in Quality of Service (QoS) prediction. Unlike its predecessors, R2SL captures the nuances of regional network behavior by deriving two distinct regional network latent states: the city-network latent state and the AS-network latent state. These states are constructed utilizing aggregated data from common regions rather than individual object data. Furthermore, R2SL adopts an enhanced Huber loss function that adjusts its linear loss component, providing a remedy for prevalent label imbalance issues. To cap off the prediction process, a multi-scale perception network is leveraged to interpret the integrated feature map, a fusion of regional network latent features and other pertinent information, ultimately accomplishing the QoS prediction. Through rigorous testing on real-world QoS datasets, R2SL demonstrates superior performance compared to prevailing state-of-the-art methods. Our R2SL approach ushers in an innovative avenue for precise QoS predictions by fully harnessing the regional network similarities inherent in objects.
Abstract:Quality of Service (QoS) prediction is an essential task in recommendation systems, where accurately predicting unknown QoS values can improve user satisfaction. However, existing QoS prediction techniques may perform poorly in the presence of noise data, such as fake location information or virtual gateways. In this paper, we propose the Probabilistic Deep Supervision Network (PDS-Net), a novel framework for QoS prediction that addresses this issue. PDS-Net utilizes a Gaussian-based probabilistic space to supervise intermediate layers and learns probability spaces for both known features and true labels. Moreover, PDS-Net employs a condition-based multitasking loss function to identify objects with noise data and applies supervision directly to deep features sampled from the probability space by optimizing the Kullback-Leibler distance between the probability space of these objects and the real-label probability space. Thus, PDS-Net effectively reduces errors resulting from the propagation of corrupted data, leading to more accurate QoS predictions. Experimental evaluations on two real-world QoS datasets demonstrate that the proposed PDS-Net outperforms state-of-the-art baselines, validating the effectiveness of our approach.
Abstract:Graph representation learning based on graph neural networks (GNNs) can greatly improve the performance of downstream tasks, such as node and graph classification. However, the general GNN models do not aggregate node information in a hierarchical manner, and can miss key higher-order structural features of many graphs. The hierarchical aggregation also enables the graph representations to be explainable. In addition, supervised graph representation learning requires labeled data, which is expensive and error-prone. To address these issues, we present an unsupervised graph representation learning method, Unsupervised Hierarchical Graph Representation (UHGR), which can generate hierarchical representations of graphs. Our method focuses on maximizing mutual information between "local" and high-level "global" representations, which enables us to learn the node embeddings and graph embeddings without any labeled data. To demonstrate the effectiveness of the proposed method, we perform the node and graph classification using the learned node and graph embeddings. The results show that the proposed method achieves comparable results to state-of-the-art supervised methods on several benchmarks. In addition, our visualization of hierarchical representations indicates that our method can capture meaningful and interpretable clusters.
Abstract:Inspired by the recently remarkable successes of Sparse Representation (SR), Collaborative Representation (CR) and sparse graph, we present a novel hypergraph model named Regression-based Hypergraph (RH) which utilizes the regression models to construct the high quality hypergraphs. Moreover, we plug RH into two conventional hypergraph learning frameworks, namely hypergraph spectral clustering and hypergraph transduction, to present Regression-based Hypergraph Spectral Clustering (RHSC) and Regression-based Hypergraph Transduction (RHT) models for addressing the image clustering and classification issues. Sparse Representation and Collaborative Representation are employed to instantiate two RH instances and their RHSC and RHT algorithms. The experimental results on six popular image databases demonstrate that the proposed RH learning algorithms achieve promising image clustering and classification performances, and also validate that RH can inherit the desirable properties from both hypergraph models and regression models.
Abstract:Motivated by the remarkable successes of Graph-based Transduction (GT) and Sparse Representation (SR), we present a novel Classifier named Sparse Graph-based Classifier (SGC) for image classification. In SGC, SR is leveraged to measure the correlation (similarity) of each two samples and a graph is constructed for encoding these correlations. Then the Laplacian eigenmapping is adopted for deriving the graph Laplacian of the graph. Finally, SGC can be obtained by plugging the graph Laplacian into the conventional GT framework. In the image classification procedure, SGC utilizes the correlations, which are encoded in the learned graph Laplacian, to infer the labels of unlabeled images. SGC inherits the merits of both GT and SR. Compared to SR, SGC improves the robustness and the discriminating power of GT. Compared to GT, SGC sufficiently exploits the whole data. Therefore it alleviates the undercomplete dictionary issue suffered by SR. Four popular image databases are employed for evaluation. The results demonstrate that SGC can achieve a promising performance in comparison with the state-of-the-art classifiers, particularly in the small training sample size case and the noisy sample case.
Abstract:We further exploit the representational power of Haar wavelet and present a novel low-level face representation named Shape Primitives Histogram (SPH) for face recognition. Since human faces exist abundant shape features, we address the face representation issue from the perspective of the shape feature extraction. In our approach, we divide faces into a number of tiny shape fragments and reduce these shape fragments to several uniform atomic shape patterns called Shape Primitives. A convolution with Haar Wavelet templates is applied to each shape fragment to identify its belonging shape primitive. After that, we do a histogram statistic of shape primitives in each spatial local image patch for incorporating the spatial information. Finally, each face is represented as a feature vector via concatenating all the local histograms of shape primitives. Four popular face databases, namely ORL, AR, Yale-B and LFW-a databases, are employed to evaluate SPH and experimentally study the choices of the parameters. Extensive experimental results demonstrate that the proposed approach outperform the state-of-the-arts.
Abstract:We present an improved Locality Preserving Projections (LPP) method, named Gloablity-Locality Preserving Projections (GLPP), to preserve both the global and local geometric structures of data. In our approach, an additional constraint of the geometry of classes is imposed to the objective function of conventional LPP for respecting some more global manifold structures. Moreover, we formulate a two-dimensional extension of GLPP (2D-GLPP) as an example to show how to extend GLPP with some other statistical techniques. We apply our works to face recognition on four popular face databases, namely ORL, Yale, FERET and LFW-A databases, and extensive experimental results demonstrate that the considered global manifold information can significantly improve the performance of LPP and the proposed face recognition methods outperform the state-of-the-arts.