Xi Yu

Explanatory Summarization with Discourse-Driven Planning

Apr 27, 2025

Dongqi Liu, Xi Yu, Vera Demberg, Mirella Lapata

Abstract:Lay summaries for scientific documents typically include explanations to help readers grasp sophisticated concepts or arguments. However, current automatic summarization methods do not explicitly model explanations, which makes it difficult to align the proportion of explanatory content with human-written summaries. In this paper, we present a plan-based approach that leverages discourse frameworks to organize summary generation and guide explanatory sentences by prompting responses to the plan. Specifically, we propose two discourse-driven planning strategies, where the plan is conditioned as part of the input or part of the output prefix, respectively. Empirical experiments on three lay summarization datasets show that our approach outperforms existing state-of-the-art methods in terms of summary quality, enhances model robustness and controllability, and mitigates hallucination.

* Accepted by the Transactions of the Association for Computational Linguistics (TACL)

What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations

Feb 12, 2025

Dongqi Liu, Chenxi Whitehouse, Xi Yu, Louis Mahon, Rohit Saxena, Zheng Zhao, Yifu Qiu, Mirella Lapata, Vera Demberg

Abstract:Transforming recorded videos into concise and accurate textual summaries is a growing challenge in multimodal learning. This paper introduces VISTA, a dataset specifically designed for video-to-text summarization in scientific domains. VISTA contains 18,599 recorded AI conference presentations paired with their corresponding paper abstracts. We benchmark the performance of state-of-the-art large models and apply a plan-based framework to better capture the structured nature of abstracts. Both human and automated evaluations confirm that explicit planning enhances summary quality and factual consistency. However, a considerable gap remains between models and human performance, highlighting the challenges of scientific video summarization.

* arXiv admin note: text overlap with arXiv:2306.02873 by other authors

Multi-Task Semantic Communication With Graph Attention-Based Feature Correlation Extraction

Jan 02, 2025

Xi Yu, Tiejun Lv, Weicai Li, Wei Ni, Dusit Niyato, Ekram Hossain

Abstract:Multi-task semantic communication can serve multiple learning tasks using a shared encoder model. Existing models have overlooked the intricate relationships between the features extracted during the encoding process. This paper presents a new graph attention inter-block (GAI) module for the encoder/transmitter of a multi-task semantic communication system. Unlike existing techniques, the module enriches the features for multiple tasks by embedding the intermediate outputs of encoding in them. The key idea is that we interpret the outputs of the intermediate feature extraction blocks of the encoder as the nodes of a graph to capture the correlations of the intermediate features. Another important aspect is that we refine the node representation using a graph attention mechanism to extract the correlations and a multi-layer perceptron network to associate the node representations with different tasks. Consequently, the intermediate features are weighted and embedded into the features transmitted for executing multiple tasks at the receiver. Experiments demonstrate that the proposed model surpasses the most competitive publicly available models by 11.4% on the CityScapes 2Task dataset and outperforms the established state-of-the-art by 3.97% on the NYU V2 3Task dataset when the bandwidth ratio of the communication channel (i.e., the compression level for transmission over the channel) is as constrained as $\frac{1}{12}$.

* 18 pages, 11 figures, accepted by IEEE TMC

Cauchy-Schwarz Divergence Information Bottleneck for Regression

Apr 27, 2024

Shujian Yu, Xi Yu, Sigurd Løkse, Robert Jenssen, Jose C. Principe

Abstract:The information bottleneck (IB) approach is widely used to improve the generalization, robustness and explainability of deep neural networks. Essentially, it aims to find a minimum sufficient representation $\mathbf{t}$ by striking a trade-off between a compression term $I(\mathbf{x};\mathbf{t})$ and a prediction term $I(y;\mathbf{t})$, where $I(\cdot;\cdot)$ refers to the mutual information (MI). In IB methods, MI is for the most part expressed in terms of the Kullback-Leibler (KL) divergence, which in the regression case corresponds to prediction based on mean squared error (MSE) loss with a Gaussian assumption and compression approximated by variational inference. In this paper, we study the IB principle for the regression problem and develop a new way to parameterize the IB with deep neural networks by exploiting favorable properties of the Cauchy-Schwarz (CS) divergence. By doing so, we move away from MSE-based regression and ease estimation by avoiding variational approximations or distributional assumptions. We investigate the improved generalization ability of our proposed CS-IB and demonstrate strong adversarial robustness guarantees. We demonstrate its superior performance on six real-world regression tasks over other popular deep IB approaches. We additionally observe that the solutions discovered by CS-IB always achieve the best trade-off between prediction accuracy and compression ratio in the information plane. The code is available at https://github.com/SJYuCNEL/Cauchy-Schwarz-Information-Bottleneck.
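
For orientation, the trade-off described in this abstract is often written as a Lagrangian, and the CS divergence has a standard closed form. The block below is a sketch using textbook definitions, not the paper's exact objective: the weight $\beta$ and the densities $p$, $q$ are symbols introduced here for illustration, and other sign or weighting conventions exist.

```latex
% One common form of the IB objective: compress x while staying predictive of y.
\min_{p(\mathbf{t}\mid\mathbf{x})} \; I(\mathbf{x};\mathbf{t}) \;-\; \beta\, I(y;\mathbf{t}),
\qquad \beta > 0

% Standard definition of the Cauchy-Schwarz divergence between densities p and q.
D_{\mathrm{CS}}(p;q) \;=\;
-\log \frac{\left(\int p(\mathbf{x})\, q(\mathbf{x})\, d\mathbf{x}\right)^{2}}
           {\int p(\mathbf{x})^{2}\, d\mathbf{x}\;\int q(\mathbf{x})^{2}\, d\mathbf{x}}
```

By the Cauchy-Schwarz inequality the ratio inside the logarithm is at most 1, so $D_{\mathrm{CS}}(p;q) \ge 0$, with equality when $p = q$.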

* accepted by ICLR-24, project page: https://github.com/SJYuCNEL/Cauchy-Schwarz-Information-Bottleneck

Adversarial Reweighting with $\alpha$-Power Maximization for Domain Adaptation

Apr 26, 2024

Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu

Abstract:Practical Domain Adaptation (DA) tasks, e.g., Partial DA (PDA), open-set DA, universal DA, and test-time adaptation, have gained increasing attention in the machine learning community. In this paper, we propose a novel approach, dubbed Adversarial Reweighting with $\alpha$-Power Maximization (ARPM), for PDA, where the source domain contains private classes absent in the target domain. In ARPM, we propose a novel adversarial reweighting model that adversarially learns to reweight source domain data to identify source-private class samples by assigning smaller weights to them, mitigating potential negative transfer. Based on the adversarial reweighting, we train the transferable recognition model on the reweighted source distribution to be able to classify common class data. To reduce the prediction uncertainty of the recognition model on the target domain for PDA, we present an $\alpha$-power maximization mechanism in ARPM, which enriches the family of losses for reducing the prediction uncertainty for PDA. Extensive experimental results on five PDA benchmarks, i.e., Office-31, Office-Home, VisDA-2017, ImageNet-Caltech, and DomainNet, show that our method is superior to recent PDA methods. Ablation studies also confirm the effectiveness of components in our approach. To theoretically analyze our method, we deduce an upper bound of target domain expected error for PDA, which is approximately minimized in our approach. We further extend ARPM to open-set DA, universal DA, and test-time adaptation, and verify the usefulness through experiments.

* To appear in IJCV

INSURE: An Information Theory Inspired Disentanglement and Purification Model for Domain Generalization

Sep 08, 2023

Xi Yu, Huan-Hsin Tseng, Shinjae Yoo, Haibin Ling, Yuewei Lin

Abstract:Domain Generalization (DG) aims to learn a model that generalizes to the unseen target domain by training only on multiple observed source domains. Although a variety of DG methods have focused on extracting domain-invariant features, the domain-specific class-relevant features have attracted attention and been argued to benefit generalization to the unseen target domain. To take into account the class-relevant domain-specific information, in this paper we propose an Information theory iNspired diSentanglement and pURification modEl (INSURE) to explicitly disentangle the latent features to obtain sufficient and compact (necessary) class-relevant features for generalization to the unseen domain. Specifically, we first propose an information theory inspired loss function to ensure the disentangled class-relevant features contain sufficient class label information and the other disentangled auxiliary feature has sufficient domain information. We further propose a paired purification loss function to let the auxiliary feature discard all the class-relevant information and thus the class-relevant feature will contain sufficient and compact (necessary) class-relevant information. Moreover, instead of using multiple encoders, we propose to use a learnable binary mask as our disentangler to make the disentanglement more efficient and make the disentangled features complementary to each other. We conduct extensive experiments on four widely used DG benchmark datasets including PACS, OfficeHome, TerraIncognita, and DomainNet. The proposed INSURE outperforms the state-of-the-art methods. We also empirically show that domain-specific class-relevant features are beneficial for domain generalization.

* 10 pages, 4 figures

Receding Horizon Control on the Broadcast of Information in Stochastic Networks

Dec 19, 2022

Thales C. Silva, Li Shen, Xi Yu, M. Ani Hsieh

Abstract:This paper focuses on the broadcast of information on robot networks with stochastic network interconnection topologies. Problematic communication networks are almost unavoidable in areas where we wish to deploy multi-robotic systems, usually due to a lack of environmental consistency, accessibility, and structure. We tackle this problem by modeling the broadcast of information in a multi-robot communication network as a stochastic process with random arrival times, which can be produced by irregular robot movements, wireless attenuation, and other environmental factors. Using this model, we provide and analyze a receding horizon control strategy to control the statistics of the information broadcast. The resulting strategy compels the robots to re-direct their communication resources to different neighbors according to the current propagation process to fulfill global broadcast requirements. Based on this method, we provide an approach to compute the expected time to broadcast the message to all nodes. Numerical examples are provided to illustrate the results.

Air-Aided Communication Between Ground Assets in a Poisson Forest

Nov 19, 2022

Juan David Pabon, Shaikha Alkandari, Matthew C. Valenti, Xi Yu

Abstract:Ground assets deployed in a cluttered environment with randomized obstacles (e.g., a forest) may experience line of sight (LoS) obstruction due to those obstacles. Air assets can be deployed in the vicinity to aid the communication by establishing two-hop paths between the ground assets. Obstacles that are taller than a position-dependent critical height may still obstruct the LoS between a ground asset and an air asset. In this paper, we provide an analytical framework for computing the probability of obtaining a LoS path in a Poisson forest. Given the locations and heights of a ground asset and an air asset, we establish the critical height, which is a function of distance. To account for this dependence on distance, the blocking is modeled as an inhomogeneous Poisson point process, and the LoS probability is its void probability. Examples and closed-form expressions are provided for two obstruction height distributions: uniform and truncated Gaussian. The examples are validated through simulation. Additionally, the end-to-end throughput is determined and shown to be a metric that balances communication distance with the impact of LoS blockage. Throughput is used to determine the range at which it is better to relay communications through the air asset, and, when the air asset is deployed, its optimal height.
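
The void-probability idea in this abstract can be illustrated with a small numerical sketch. It is not the paper's model: it assumes a simplified one-dimensional blocking process in which obstacles cross the ground path at a hypothetical rate `lam` per metre, heights are uniform on `[0, h_max]`, and the critical height rises linearly from the ground asset to the air asset. The function name and all parameters are illustrative.

```python
import math


def los_probability(lam, d, h_ground, h_air, h_max, steps=10_000):
    """Void probability exp(-integral of the blocking intensity over [0, d]).

    Assumptions (illustrative, not taken from the paper): obstacles cross the
    path at rate `lam` per metre, obstacle heights are uniform on [0, h_max],
    and an obstacle at distance r blocks only if it exceeds the critical
    height of the straight LoS segment at r.
    """

    def blocking_intensity(r):
        # Critical height: height of the LoS segment above ground distance r.
        h_crit = h_ground + (h_air - h_ground) * r / d
        # P(height > h_crit) for uniform heights, clipped to [0, 1].
        p_taller = min(1.0, max(0.0, 1.0 - h_crit / h_max))
        # Thinned (distance-dependent) intensity of blocking obstacles.
        return lam * p_taller

    # Trapezoidal rule for the integral of the intensity over [0, d].
    dr = d / steps
    total = 0.5 * (blocking_intensity(0.0) + blocking_intensity(d))
    total += sum(blocking_intensity(i * dr) for i in range(1, steps))
    return math.exp(-total * dr)


if __name__ == "__main__":
    p = los_probability(lam=0.05, d=100.0, h_ground=0.0, h_air=50.0, h_max=20.0)
    print(f"LoS probability: {p:.4f}")
```

Under these assumptions, with `h_ground = 0` and `h_air >= h_max`, the integral has the closed form `lam * d * h_max / (2 * h_air)`, which gives a quick check on the numerical result. Raising the air asset lowers the blocking intensity along the path and so increases the LoS probability.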

* Military Communications Conference, MILCOM 2022

Flow-Based Control of Marine Robots in Gyre-Like Environments

Mar 01, 2022

Gedaliah Knizhnik, Peihan Li, Xi Yu, M. Ani Hsieh

Abstract:We present a flow-based control strategy that enables resource-constrained marine robots to patrol gyre-like flow environments on an orbital trajectory with a periodicity in a given range. The controller does not require a detailed model of the flow field and relies only on the robot's location relative to the center of the gyre. Instead of precisely tracking a pre-defined trajectory, the robots are tasked to stay in between two bounding trajectories with known periodicity. Furthermore, the proposed strategy leverages the surrounding flow field to minimize control effort. We prove that the proposed strategy enables robots to cycle in the flow while satisfying the desired periodicity requirements. Our method is tested and validated both in simulation and in experiments using a low-cost, underactuated, surface swimming robot, i.e., the Modboat.

* 7 pages. Accepted to the 2022 International Conference on Robotics and Automation (ICRA)

Gated Information Bottleneck for Generalization in Sequential Environments

Oct 12, 2021

Francesco Alesiani, Shujian Yu, Xi Yu

Abstract:Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set. By learning minimum sufficient representations from training data, the information bottleneck (IB) approach has demonstrated its effectiveness in improving generalization in different AI applications. In this work, we propose a new neural network-based IB approach, termed gated information bottleneck (GIB), that dynamically drops spurious correlations and progressively selects the most task-relevant features across different environments by a trainable soft mask (on raw features). GIB enjoys a simple and tractable objective, without any variational approximation or distributional assumption. We empirically demonstrate the superiority of GIB over other popular neural network-based IB approaches in adversarial robustness and out-of-distribution (OOD) detection. Meanwhile, we also establish the connection between IB theory and invariant causal representation learning, and observe that GIB demonstrates appealing performance when different environments arrive sequentially, a more practical scenario where invariant risk minimization (IRM) fails. Code of GIB is available at https://github.com/falesiani/GIB

* manuscript accepted by IEEE ICDM-21 (regular papers), code is available at https://github.com/falesiani/GIB
