Abstract:Pre-trained language models are increasingly being used in multi-document summarization tasks. However, these models need large-scale corpora for pre-training and are domain-dependent. Other non-neural unsupervised summarization approaches mostly rely on key sentence extraction, which can lead to information loss. To address these challenges, we propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach. It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts, thereby improving intra-cluster correlation and the fluency of generated sentences. Finally, it summarizes clusters into natural sentences. Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches. Furthermore, it surpasses state-of-the-art pre-trained multi-document summarization models (e.g. PEGASUS and PRIMERA) under zero-shot settings in terms of ROUGE scores. Additionally, human evaluations indicate that summaries generated by GLIMMER achieve high readability and informativeness scores. Our code is available at https://github.com/Oswald1997/GLIMMER.
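The first two stages described above (building a sentence graph, then grouping sentences into clusters) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which lexical features GLIMMER uses or how it identifies clusters, so this sketch substitutes Jaccard word overlap for the edge criterion and connected components for the clustering step. The function names and the 0.3 threshold are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_sentence_graph(sentences, threshold=0.3):
    """Connect two sentences when their Jaccard word overlap meets the threshold.

    Assumption: Jaccard overlap stands in for the lexical features
    mentioned in the abstract; the real features are not specified there.
    """
    tokens = [tokenize(s) for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        union = tokens[i] | tokens[j]
        if not union:
            continue  # both sentences are empty after tokenization
        jaccard = len(tokens[i] & tokens[j]) / len(union)
        if jaccard >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Group node indices into clusters with an iterative depth-first search."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


sentences = [
    "The storm hit the coast on Monday.",
    "On Monday the storm hit the coast hard.",
    "Officials opened three shelters.",
]
graph = build_sentence_graph(sentences)
clusters = connected_components(graph)
print(clusters)
```

In this toy input the first two sentences share most of their words and end up in one cluster, while the third stays on its own. The final stage, turning each cluster into a natural sentence, is not sketched here.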
Abstract:Posts, as important containers of user-generated content on social media, have tremendous social influence and commercial value. As an integral component of a post, the headline contributes decisively to the post's popularity. However, the current mainstream method for headline generation is still manual writing, which is unstable and requires extensive human effort. This drives us to explore a novel research question: Can we automate the generation of popular headlines on social media? We collect more than 1 million posts from 42,447 celebrities using public data from Xiaohongshu, a well-known social media platform in China. We then conduct careful observations on the headlines of these posts. The observations show that trends and personal styles are widespread in social media headlines and contribute significantly to posts' popularity. Motivated by these insights, we present MEBART, which combines Multiple preference-Extractors with Bidirectional and Auto-Regressive Transformers (BART), capturing trends and personal styles to generate popular headlines on social media. We perform extensive experiments on real-world datasets and achieve state-of-the-art performance compared with several advanced baselines. In addition, ablation and case studies demonstrate that MEBART is effective at capturing trends and personal styles.
Abstract:Electrical contact resistance or capacitance measured across a lubricated contact has been used in tribometers, where it only partially reflects the lubrication condition. In contrast, electrical impedance provides rich magnitude and phase information, which can be interpreted using equivalent circuit models. This enables more comprehensive measurements, including the variation of lubricant film thickness and the asperity (metal-to-metal) contact area. An accurate circuit model of the lubricated contact is critical for electrical impedance analysis. However, existing circuit models are hand-derived and suited to interfaces with simple geometry, such as parallel plates and concentric or eccentric cylinders. Circuit model identification for lubricated contacts with complex geometry is challenging. This work takes the ball-on-disc lubricated contact in a Mini Traction Machine (MTM) as an example, where screws on the ball, grooves on the disc, and contact close to the disc edge make the overall interface geometry complicated. Electrical impedance spectroscopy (EIS) is used to capture its frequency response, with load, speed, and temperature each varied and tested separately. The results enable the identification of equivalent circuit models by fitting parallel resistor-capacitor models. The dependence on oil film thickness is further calibrated using high-accuracy optical interferometry, operated under the same lubrication conditions as in the MTM. Overall, the proposed method is applicable to general lubricated interfaces for the identification of equivalent circuit models, which in turn facilitates in-situ measurement of oil film thickness in tribo-contacts via electrical impedance. It does not need transparent materials as optical techniques do, or structural modifications for piezoelectric sensor mounting as ultrasound techniques do.
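As a rough illustration of what a parallel resistor-capacitor model implies for the measured magnitude and phase, the sketch below evaluates the standard impedance of a single parallel RC element, Z = R / (1 + jωRC), and recovers R and C from one complex impedance value through the admittance 1/Z = 1/R + jωC. This is a minimal sketch under stated assumptions, not the fitting procedure used in the work: the abstract does not give the circuit topology beyond "parallel resistor-capacitor models", and the function names and component values are hypothetical.

```python
import cmath
import math


def parallel_rc_impedance(r_ohm, c_farad, freq_hz):
    """Complex impedance of one resistor in parallel with one capacitor."""
    omega = 2 * math.pi * freq_hz
    return r_ohm / (1 + 1j * omega * r_ohm * c_farad)


def estimate_rc(z, freq_hz):
    """Recover R and C from a single impedance value via the admittance.

    For a parallel RC element, 1/Z = 1/R + j*omega*C, so the real part
    of the admittance gives 1/R and the imaginary part gives omega*C.
    """
    omega = 2 * math.pi * freq_hz
    y = 1 / z
    return 1 / y.real, y.imag / omega


# Hypothetical values chosen only for illustration.
R_TRUE = 1.0e6    # ohms
C_TRUE = 1.0e-10  # farads
freq = 1.0e3      # hertz

z = parallel_rc_impedance(R_TRUE, C_TRUE, freq)
r_est, c_est = estimate_rc(z, freq)
magnitude = abs(z)
phase_deg = math.degrees(cmath.phase(z))

print(f"|Z| = {magnitude:.3e} ohm, phase = {phase_deg:.2f} deg")
print(f"R = {r_est:.3e} ohm, C = {c_est:.3e} F")
```

With real spectra, a fit over many frequencies (for example, least squares on the complex residual) would replace the single-point inversion shown here, since measurement noise and additional circuit elements make a one-frequency estimate unreliable.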