Institute of Information Science, Academia Sinica, Taipei, Taiwan
Abstract: By monitoring temporal contrast, event-based vision sensors provide high temporal resolution and low latency while maintaining low power consumption and a simple circuit structure. These characteristics have attracted significant attention in both academia and industry. In recent years, backside-illumination (BSI) technology, wafer stacking techniques, and industrial interfaces have created new opportunities for improving the performance of event-based vision sensors, as shown by substantial advances in noise reduction, resolution, and readout rate. The integration of these technologies has also improved the compatibility of event-based vision sensors with current and edge vision systems, widening their range of practical applications. This paper reviews the progression from neuromorphic engineering to state-of-the-art event-based vision sensor technologies, including their development trends, operating principles, and key features. We also examine the sensitivity of event-based vision sensors and the opportunities and challenges they face in infrared imaging, providing references for future research and applications.
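The temporal-contrast principle mentioned above can be illustrated for a single pixel. The following is a minimal sketch, assuming the commonly described model in which a pixel emits an ON or OFF event whenever its log intensity moves by at least a fixed contrast threshold away from a stored reference level. The function name `generate_events` and the threshold value are illustrative assumptions, not taken from the paper.

```python
import math


def generate_events(intensities, threshold=0.2):
    """Return (sample_index, polarity) events for one pixel.

    Assumed model (not from the paper): an event fires each time the log
    intensity differs from the stored reference by at least `threshold`.
    """
    events = []
    reference = math.log(intensities[0])  # reference log intensity
    for index, value in enumerate(intensities[1:], start=1):
        delta = math.log(value) - reference
        # A large change can cross the threshold several times.
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1  # +1 = ON, -1 = OFF
            events.append((index, polarity))
            reference += polarity * threshold  # move reference one step
            delta = math.log(value) - reference
    return events
```

Under this model a pixel that sees constant brightness produces no output at all, which is one way to see where the low data rate and low power consumption come from.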
Abstract: Varieties of Democracy (V-Dem) is a new approach to conceptualizing and measuring democracy and politics. It covers 200 countries and is one of the largest databases in political science. According to the V-Dem annual democracy report 2019, Taiwan is one of the two countries most affected by false information disseminated by foreign governments. The report also shows that this "made-up news" has caused a great deal of confusion in Taiwanese society and has serious impacts on global stability. Although several applications help distinguish false information, we found that the pre-processing step of categorizing the news is still done manually. Manual work is error-prone and cannot be sustained over long periods. The growing demand for automation in recent decades reflects that, where machines perform as well as humans or better, using them reduces human workload and cuts costs. Therefore, in this work, we build a predictive model to classify the category of news. The corpora we use contain 28,358 news articles and 200 news articles scraped from the website of the online newspaper Liberty Times Net (LTN), and cover 8 categories: Technology, Entertainment, Fashion, Politics, Sports, International, Finance, and Health. First, we use Bidirectional Encoder Representations from Transformers (BERT) to produce word embeddings, which transform each Chinese character into a vector of shape (1, 768). Then, we use a Long Short-Term Memory (LSTM) layer to transform the word embeddings into sentence embeddings, and a second LSTM layer to transform those into document embeddings. Each document embedding is the input to the final prediction model, which contains two Dense layers and one Activation layer. This model transforms each document embedding into a vector of 8 real numbers, and the position of the highest value gives the predicted category among the 8 news categories, with up to 99% accuracy.
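The data flow described above (character vectors, then sentence embeddings, then a document embedding, then 8 scores and a category) can be sketched in plain Python. This is only an illustration of the shapes involved: element-wise averaging is used here as a stand-in for the two LSTM layers, the Dense layers are omitted, and the order of `CATEGORIES` is assumed to follow the order listed in the abstract. None of the function names come from the paper.

```python
# Assumption: label order follows the order listed in the abstract.
CATEGORIES = [
    "Technology", "Entertainment", "Fashion", "Politics",
    "Sports", "International", "Finance", "Health",
]


def mean_pool(vectors):
    """Average equal-length vectors element-wise (stand-in for an LSTM)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def embed_document(document):
    """Collapse a document into one vector.

    `document` is a list of sentences; each sentence is a list of
    per-character vectors (768-dimensional in the abstract).
    """
    sentence_embeddings = [mean_pool(sentence) for sentence in document]
    return mean_pool(sentence_embeddings)


def predict_category(scores):
    """Map the 8 output scores to the category with the highest score."""
    if len(scores) != len(CATEGORIES):
        raise ValueError("expected one score per category")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return CATEGORIES[best_index]
```

In the model the abstract describes, the pooling steps would be learned LSTM layers and the scores would come from the two Dense layers; only the final argmax step is the same as in this sketch.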
Abstract: As technology advances rapidly, news increasingly spreads through social media. To attract more readers and gain additional profit, some news agencies republish large volumes of news in a more appealing form. It is therefore essential to predict accurately whether a news article comes from an official news agency. This work develops a headline classifier based on a Convolutional Neural Network to determine the credibility of a news article. The model focuses primarily on key factors extracted from headlines: word segmentation, part-of-speech tags, and sentiment features. By integrating these features into the proposed classification model, the evaluation achieves an accuracy of 93.99%.
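The abstract does not give the network architecture, so the following is only a generic sketch of the core operation in a convolutional text classifier: one filter slides over a sequence of per-token feature vectors, followed by ReLU and max-over-time pooling. It assumes each headline token has already been turned into a numeric feature vector (for example by concatenating word, part-of-speech, and sentiment features); the function name and the toy values are hypothetical.

```python
def conv1d_max_pool(token_features, kernel, bias=0.0):
    """Apply one 1-D convolution filter, ReLU, and max-over-time pooling.

    token_features: list of per-token feature vectors.
    kernel: list of `width` weight vectors, one per position in the window.
    """
    width = len(kernel)
    if len(token_features) < width:
        raise ValueError("headline is shorter than the filter width")
    activations = []
    for start in range(len(token_features) - width + 1):
        window = token_features[start:start + width]
        total = bias + sum(
            weight * feature
            for kernel_row, token in zip(kernel, window)
            for weight, feature in zip(kernel_row, token)
        )
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # strongest response anywhere in the headline
```

A full classifier would apply many such filters and feed the pooled values to a final classification layer.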
Abstract: Classifying confusing samples during RGBT tracking is a challenging problem that has not yet been solved satisfactorily. Existing methods focus only on enlarging the boundary between positive and negative samples; however, the structured information of the samples might be harmed, e.g., confusing positive samples end up closer to the anchor than normal positive samples. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking. In particular, we design a multi-margin structured loss to distinguish the confusing samples, which play a critical role in boosting tracking performance. To this end, we additionally enlarge the boundaries between confusing positive samples and normal positive ones, and between confusing negative samples and normal negative ones, with predefined margins, by exploiting the structured information of all samples in each modality. Moreover, a cross-modality constraint is employed to reduce the difference between modalities and to push positive samples closer to the anchor than negative ones from the two modalities. In addition, to achieve quality-aware fusion of RGB and thermal features, we introduce modality attentions and learn them with a feature fusion module in our network. Extensive experiments on large-scale datasets show that our framework clearly improves tracking performance and outperforms state-of-the-art RGBT trackers.
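The idea of several margins acting on an ordering of samples can be sketched as a sum of hinge terms. This is not the paper's loss: it is a minimal illustration that assumes the desired ordering of distances to the anchor is normal positive, confusing positive, confusing negative, normal negative, and it works on four precomputed distances in a single modality. The function names and margin values are hypothetical.

```python
def hinge(value):
    """Standard hinge: zero when the constraint is met, linear otherwise."""
    return max(0.0, value)


def multi_margin_loss(d_normal_pos, d_confusing_pos,
                      d_confusing_neg, d_normal_neg,
                      margin_pos=0.1, margin_cross=0.5, margin_neg=0.1):
    """Penalize violations of the assumed distance ordering.

    Each argument is a distance from one sample to the anchor.
    """
    return (
        # normal positives should sit closer than confusing positives
        hinge(d_normal_pos - d_confusing_pos + margin_pos)
        # every positive should sit closer than every negative
        + hinge(d_confusing_pos - d_confusing_neg + margin_cross)
        # confusing negatives should sit closer than normal negatives
        + hinge(d_confusing_neg - d_normal_neg + margin_neg)
    )
```

The loss is zero when all three orderings hold with their margins, and grows as soon as, for instance, a confusing positive sample ends up closer to the anchor than a normal positive one.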