Abstract:With the rapid proliferation of autonomous driving, there has been a heightened focus on the research of lidar-based 3D semantic segmentation and object detection methodologies, aiming to ensure the safety of traffic participants. In recent decades, learning-based approaches have emerged, demonstrating remarkable performance gains in comparison to conventional algorithms. However, the segmentation and detection tasks have traditionally been examined in isolation to achieve the best precision. To this end, we propose an efficient multi-task learning framework named LiSD which can address both segmentation and detection tasks, aiming to optimize the overall performance. Our proposed LiSD is a voxel-based encoder-decoder framework that contains a hierarchical feature collaboration module and a holistic information aggregation module. Different integration methods are adopted to keep sparsity in segmentation while densifying features for query initialization in detection. Besides, cross-task information is utilized in an instance-aware refinement module to obtain more accurate predictions. Experimental results on the nuScenes dataset and Waymo Open Dataset demonstrate the effectiveness of our proposed model. It is worth noting that LiSD achieves the state-of-the-art performance of 83.3% mIoU on the nuScenes segmentation benchmark for lidar-only methods.
Abstract:Distributed Ledger Technologies (DLTs) have rapidly evolved, necessitating comprehensive insights into their diverse components. However, a systematic literature review that emphasizes the Environmental, Sustainability, and Governance (ESG) components of DLT remains lacking. To bridge this gap, we selected 107 seed papers to build a citation network of 63,083 references and refined it to a corpus of 24,539 publications for analysis. Then, we labeled the named entities in 46 papers according to twelve top-level categories derived from an established technology taxonomy and enhanced the taxonomy by pinpointing DLT's ESG elements. Leveraging transformer-based language models, we fine-tuned a pre-trained language model for a Named Entity Recognition (NER) task using our labeled dataset. We used our fine-tuned language model to distill the corpus to 505 key papers, facilitating a literature review via named entities and temporal graph analysis on DLT evolution in the context of ESG. Our contributions are a methodology to conduct a machine learning-driven systematic literature review in the DLT field, placing a special emphasis on ESG aspects. Furthermore, we present a first-of-its-kind NER dataset, composed of 54,808 named entities, designed for DLT and ESG-related explorations.
Abstract:Decentralized finance (DeFi) has seen a tremendous increase in interest in the past years with many types of protocols, such as lending protocols or automated market-makers (AMMs) These protocols are typically controlled using off-chain governance, where token holders can vote to modify different parameters of the protocol. Up till now, however, choosing these parameters has been a manual process, typically done by the core team behind the protocol. In this work, we model a DeFi environment and propose a semi-automatic parameter adjustment approach with deep Q-network (DQN) reinforcement learning. Our system automatically generates intuitive governance proposals to adjust these parameters with data-driven justifications. Our evaluation results demonstrate that a learning-based on-chain governance procedure is more reactive, objective, and efficient than the existing manual approach.
Abstract:Pulmonary cancer is one of the most commonly diagnosed and fatal cancers and is often diagnosed by incidental findings on computed tomography. Automated pulmonary nodule detection is an essential part of computer-aided diagnosis, which is still facing great challenges and difficulties to quickly and accurately locate the exact nodules' positions. This paper proposes a dual skip connection upsampling strategy based on Dual Path network in a U-Net structure generating multiscale feature maps, which aims to minimize the ratio of false positives and maximize the sensitivity for lesion detection of nodules. The results show that our new upsampling strategy improves the performance by having 85.3% sensitivity at 4 FROC per image compared to 84.2% for the regular upsampling strategy or 81.2% for VGG16-based Faster-R-CNN.
Abstract:In no-reference 360-degree image quality assessment (NR 360IQA), graph convolutional networks (GCNs), which model interactions between viewports through graphs, have achieved impressive performance. However, prevailing GCN-based NR 360IQA methods suffer from three main limitations. First, they only use high-level features of the distorted image to regress the quality score, while the human visual system (HVS) scores the image based on hierarchical features. Second, they simplify complex high-order interactions between viewports in a pairwise fashion through graphs. Third, in the graph construction, they only consider spatial locations of viewports, ignoring its content characteristics. Accordingly, to address these issues, we propose an adaptive hypergraph convolutional network for NR 360IQA, denoted as AHGCN. Specifically, we first design a multi-level viewport descriptor for extracting hierarchical representations from viewports. Then, we model interactions between viewports through hypergraphs, where each hyperedge connects two or more viewports. In the hypergraph construction, we build a location-based hyperedge and a content-based hyperedge for each viewport. Experimental results on two public 360IQA databases demonstrate that our proposed approach has a clear advantage over state-of-the-art full-reference and no-reference IQA models.
Abstract:360-degree/omnidirectional images (OIs) have achieved remarkable attentions due to the increasing applications of virtual reality (VR). Compared to conventional 2D images, OIs can provide more immersive experience to consumers, benefitting from the higher resolution and plentiful field of views (FoVs). Moreover, observing OIs is usually in the head mounted display (HMD) without references. Therefore, an efficient blind quality assessment method, which is specifically designed for 360-degree images, is urgently desired. In this paper, motivated by the characteristics of the human visual system (HVS) and the viewing process of VR visual contents, we propose a novel and effective no-reference omnidirectional image quality assessment (NR OIQA) algorithm by Multi-Frequency Information and Local-Global Naturalness (MFILGN). Specifically, inspired by the frequency-dependent property of visual cortex, we first decompose the projected equirectangular projection (ERP) maps into wavelet subbands. Then, the entropy intensities of low and high frequency subbands are exploited to measure the multi-frequency information of OIs. Besides, except for considering the global naturalness of ERP maps, owing to the browsed FoVs, we extract the natural scene statistics features from each viewport image as the measure of local naturalness. With the proposed multi-frequency information measurement and local-global naturalness measurement, we utilize support vector regression as the final image quality regressor to train the quality evaluation model from visual quality-related features to human ratings. To our knowledge, the proposed model is the first no-reference quality assessment method for 360-degreee images that combines multi-frequency information and image naturalness. Experimental results on two publicly available OIQA databases demonstrate that our proposed MFILGN outperforms state-of-the-art approaches.
Abstract:Objective quality assessment of stereoscopic omnidirectional images is a challenging problem since it is influenced by multiple aspects such as projection deformation, field of view (FoV) range, binocular vision, visual comfort, etc. Existing studies show that classic 2D or 3D image quality assessment (IQA) metrics are not able to perform well for stereoscopic omnidirectional images. However, very few research works have focused on evaluating the perceptual visual quality of omnidirectional images, especially for stereoscopic omnidirectional images. In this paper, based on the predictive coding theory of the human vision system (HVS), we propose a stereoscopic omnidirectional image quality evaluator (SOIQE) to cope with the characteristics of 3D 360-degree images. Two modules are involved in SOIQE: predictive coding theory based binocular rivalry module and multi-view fusion module. In the binocular rivalry module, we introduce predictive coding theory to simulate the competition between high-level patterns and calculate the similarity and rivalry dominance to obtain the quality scores of viewport images. Moreover, we develop the multi-view fusion module to aggregate the quality scores of viewport images with the help of both content weight and location weight. The proposed SOIQE is a parametric model without necessary of regression learning, which ensures its interpretability and generalization performance. Experimental results on our published stereoscopic omnidirectional image quality assessment database (SOLID) demonstrate that our proposed SOIQE method outperforms state-of-the-art metrics. Furthermore, we also verify the effectiveness of each proposed module on both public stereoscopic image datasets and panoramic image datasets.
Abstract:While pump-and-dump schemes have attracted the attention of cryptocurrency observers and regulators alike, this paper is the first detailed study of pump-and-dump activities in cryptocurrency markets. We present a case study of a recent pump-and-dump event, investigate 220 pump-and-dump activities organized in Telegram channels from July 21, 2018 to November 18, 2018, and discover patterns in crypto-markets associated with pump-and-dump schemes. We then build a model that predicts the pump likelihood of a given coin prior to a pump. The model exhibits high precision as well as robustness, and can be used to create a simple, yet very effective trading strategy, which we empirically demonstrate can generate a return as high as 80% within a span of only three weeks.