Yuanchi Ma

Long text outline generation: Chinese text outline based on unsupervised framework and large language model

Dec 01, 2024

Yan Yan, Yuanchi Ma

Abstract: Outline generation aims to reveal the internal structure of a document by identifying underlying chapter relationships and generating corresponding chapter summaries. Although existing deep learning methods and large models perform well on small- and medium-sized texts, they struggle to produce readable outlines for very long texts (such as fictional works), often failing to segment chapters coherently. In this paper, we propose a novel outline generation method for Chinese that combines an unsupervised framework with large models. Specifically, the method first generates chapter feature graph data based on entity and syntactic dependency relationships. Then, a representation module based on graph attention layers learns deep embeddings of the chapter graph data. Using these chapter embeddings, we design an operator based on Markov chain principles to segment plot boundaries. Finally, we employ a large model to generate summaries of each plot segment and produce the overall outline. We evaluate our model on segmentation accuracy and outline readability, and it outperforms several deep learning models and large models in comparative evaluations.
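To make the segmentation step concrete, here is a minimal Python sketch of splitting a sequence of chapter embeddings into plot segments. The abstract does not specify the Markov-chain-based operator, so this sketch swaps in a plain adjacent-similarity threshold: it shares only the first-order assumption that each chapter is compared with its immediate predecessor. The function names, the threshold value, and the toy 2-D embeddings are all assumptions for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def segment_chapters(embeddings, threshold=0.5):
    """Group consecutive chapters into plot segments.

    A boundary is placed between chapter i-1 and chapter i whenever their
    cosine similarity falls below `threshold` (a hypothetical stand-in for
    the paper's Markov-chain-based operator).
    Returns a list of segments, each a list of chapter indices.
    """
    if not embeddings:
        return []
    segments = [[0]]
    for i in range(1, len(embeddings)):
        if cosine(embeddings[i - 1], embeddings[i]) < threshold:
            segments.append([i])      # similarity dropped: start a new segment
        else:
            segments[-1].append(i)    # still the same plot segment
    return segments


# Toy chapter embeddings (assumed; real ones would come from the graph
# attention representation module described in the abstract).
chapter_embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.2, 1.0],
]
segments = segment_chapters(chapter_embeddings, threshold=0.5)
print(segments)
```

In a full pipeline along the lines of the abstract, each resulting segment would then be passed to a large model for summarization; that step is omitted here because it depends on the model and prompts used.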

Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k

Jan 09, 2024

Yuanchi Ma, Hui He, Zhongxiang Lei, Zhendong Niu

Abstract: Graph clustering algorithms with autoencoder structures have recently gained popularity due to their efficient performance and low training cost. However, existing graph autoencoder clustering algorithms based on GCN or GAT not only lack good generalization ability, but also make it difficult to determine the number of clusters automatically. To solve this problem, we propose a new framework called Graph Clustering with Masked Autoencoders (GCMA). It employs a fusion autoencoder, which we design around the graph masking method, for the fusion coding of the graph. It introduces our improved density-based clustering algorithm as a second decoder while decoding with multi-target reconstruction. By decoding the mask embedding, our model can capture more generalized and comprehensive knowledge. The number of clusters and the clustering results are output end-to-end, while generalization ability also improves. Extensive experiments demonstrate the superiority of GCMA, a method of the nonparametric class, over state-of-the-art baselines.
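The idea that the number of clusters is an output rather than an input can be illustrated with a density-based algorithm. The abstract does not describe the authors' improved density-based clustering algorithm, so the sketch below uses plain DBSCAN as a stand-in, applied to toy 2-D points in place of learned node embeddings. The parameter values `eps` and `min_pts` and the sample data are assumptions.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN.

    Returns (labels, n_clusters). A label of -1 marks noise. The number of
    clusters is discovered from the data, not supplied by the caller.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reachable from a core point
            if labels[j] is not UNVISITED:
                continue               # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels, cluster


# Toy "embeddings" (assumed): two dense groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels, n_clusters = dbscan(points, eps=1.5, min_pts=3)
print(labels, n_clusters)
```

Unlike k-means, nothing here fixes k in advance; the trade-off is that `eps` and `min_pts` must be chosen instead.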
