Abstract:In this paper, we present the KG-FRUS dataset, comprising more than 300,000 US government diplomatic documents encoded in a knowledge graph (KG). We leverage the Foreign Relations of the United States (FRUS) data (available as XML files) to extract information about the documents and the individuals and countries mentioned within them. We use the extracted entities, and associated metadata, to create a graph-based dataset. Further, we supplement the created KG with additional entities and relations from Wikidata. The relations in the KG capture the synergies and dynamics required to study and understand the complex fields of diplomacy, foreign relations, and politics, going well beyond a simple collection of documents that neglects the relations between entities in the text. We showcase the possibilities of the dataset by illustrating different approaches to probing the KG. We exemplify how to use a query language to answer simple research questions, and how to use graph algorithms such as Node2Vec and PageRank that benefit from the complete graph structure. More importantly, the chosen structure provides full flexibility for continuously expanding and enriching the graph. Our solution is general, so the proposed pipeline for building the KG can encode other original corpora of time-dependent and complex phenomena. Overall, we present a mechanism to create KG databases that provide a more versatile representation of time-dependent, related text data, and a particular application to the all-important FRUS database.
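As a toy illustration of the kind of graph-algorithm probing the abstract mentions (not the paper's actual schema or data; entity names and relation labels below are hypothetical), one can build a small mention graph and rank entities with PageRank using networkx:

```python
# Minimal sketch: a toy document/person/country graph in the spirit of KG-FRUS.
# All node identifiers and relation names here are invented for illustration.
import networkx as nx

G = nx.DiGraph()
triples = [
    ("doc:frus-d10", "MENTIONS", "person:Henry_Kissinger"),
    ("doc:frus-d10", "MENTIONS", "country:USSR"),
    ("doc:frus-d12", "MENTIONS", "person:Henry_Kissinger"),
    ("doc:frus-d12", "MENTIONS", "country:China"),
    ("person:Henry_Kissinger", "CITIZEN_OF", "country:USA"),  # e.g. Wikidata enrichment
]
for head, rel, tail in triples:
    G.add_edge(head, tail, relation=rel)

# PageRank highlights central entities in the mention network.
scores = nx.pagerank(G, alpha=0.85)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {node}")
```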
Abstract:Language models (LMs) have grown non-stop in the last decade, from sequence-to-sequence architectures to the state-of-the-art, fully attention-based Transformers. In this work, we demonstrate how the inclusion of deep generative models within BERT can yield more versatile models, able to impute missing or noisy words with richer text and even improve the BLEU score. More precisely, we use a Gaussian Mixture Variational Autoencoder (GMVAE) as a regularizer layer and prove its effectiveness not only in Transformers but also in the most relevant encoder-decoder LMs: seq2seq with and without attention.
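A minimal sketch of the idea of a generative regularizer layer, simplified to a single-Gaussian VAE bottleneck rather than the full GMVAE the abstract describes (layer name, sizes, and placement are assumptions for illustration):

```python
# Sketch: a variational bottleneck that can sit between encoder and decoder
# hidden states; its KL term regularizes the LM. Simplified vs. a true GMVAE.
import torch
import torch.nn as nn

class GaussianBottleneck(nn.Module):
    """Maps hidden states to a latent Gaussian and samples from it."""
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.mu = nn.Linear(d_model, d_latent)
        self.logvar = nn.Linear(d_model, d_latent)
        self.out = nn.Linear(d_latent, d_model)

    def forward(self, h: torch.Tensor):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # KL divergence to a standard normal acts as the regularization term.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return self.out(z), kl

# Usage: add `kl` (scaled by a weight beta) to the usual cross-entropy loss.
layer = GaussianBottleneck(d_model=512, d_latent=64)
h = torch.randn(8, 20, 512)  # (batch, seq_len, d_model) encoder states
h_reg, kl = layer(h)
```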
Abstract:Deep learning requires regularization mechanisms to reduce overfitting and improve generalization. We address this problem with a new regularization method based on distributionally robust optimization. The key idea is to modify the contribution of each sample so as to tighten the empirical risk bound. During stochastic training, samples are selected according to their accuracy in such a way that the worst-performing samples are the ones that contribute the most to the optimization. We study different scenarios and identify those in which the method accelerates convergence or increases accuracy.
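A minimal sketch in the spirit of this idea (the specific softmax weighting and temperature below are assumptions, not necessarily the paper's scheme): reweight per-sample losses within a minibatch so that the worst-performing samples dominate the gradient update.

```python
# Sketch of a DRO-style reweighted loss: higher per-sample loss -> higher weight.
import torch
import torch.nn.functional as F

def robust_loss(per_sample_losses: torch.Tensor, temperature: float = 1.0):
    # Weights are detached so gradients flow only through the losses themselves;
    # the temperature controls how aggressively the worst samples are favored.
    weights = torch.softmax(per_sample_losses.detach() / temperature, dim=0)
    return torch.sum(weights * per_sample_losses)

logits = torch.randn(32, 10, requires_grad=True)
targets = torch.randint(0, 10, (32,))
losses = F.cross_entropy(logits, targets, reduction="none")  # one loss per sample
robust_loss(losses).backward()
```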
Abstract:We propose a new method to evaluate GANs, namely EvalGAN. EvalGAN relies on a test set to directly measure the reconstruction quality in the original sample space (no auxiliary networks are necessary), and it also computes the (log-)likelihood of the reconstructed samples in the test set. Further, EvalGAN is agnostic to both the GAN algorithm and the dataset. We demonstrate it on three state-of-the-art GANs over the well-known CIFAR-10 and CelebA datasets.
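A minimal sketch of the reconstruction step such an evaluation implies (the optimizer, step count, and latent initialization are assumptions; `generator` stands for any trained mapping from latent codes to samples): search for the latent code whose generated output best matches each test sample, measured directly in sample space.

```python
# Sketch: reconstruct a test batch x by optimizing latent codes z to minimize
# the error in the original sample space. The resulting errors can then feed
# reconstruction-quality and likelihood estimates over the test set.
import torch

def reconstruct(generator, x, z_dim=128, steps=500, lr=0.05):
    z = torch.zeros(x.shape[0], z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        err = ((generator(z) - x) ** 2).mean()  # error measured on raw samples
        err.backward()
        opt.step()
    return z.detach(), err.item()
```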
Abstract:Complex-valued RKHSs are usually presented as a straightforward application of the real-valued case. In this paper we prove that this procedure yields a limited solution for regression. We show that another kernel, here denoted the pseudo-kernel, is needed to learn any function over complex-valued fields. Accordingly, we derive a novel RKHS that includes it, the widely RKHS (WRKHS). When the pseudo-kernel cancels, WRKHS reduces to the complex-valued RKHS of previous approaches. We address kernel and pseudo-kernel design, paying particular attention to the case where both are complex-valued. In the included experiments we report remarkable improvements in simple scenarios where the real and imaginary parts have different similarity relations for given inputs, or where they are correlated. In the context of these novel results, we revisit the problem of non-linear channel equalization to show that WRKHS helps to design more efficient solutions.
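A toy numerical sketch of widely linear complex kernel ridge regression using both a kernel and a pseudo-kernel (this construction is mine, not the paper's: a valid pair is built from two real kernels k_u, k_v by modeling f = u + iv with u, v independent, giving kernel k = k_u + k_v and pseudo-kernel k~ = k_u - k_v; regularization and kernel widths are arbitrary choices):

```python
# Sketch: augmented (widely linear) kernel ridge regression for complex targets
# whose real and imaginary parts have different smoothness, as in the abstract.
import numpy as np

def rbf(X, Y, gamma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_predict(X, y, Xs, gamma_u=1.0, gamma_v=5.0, lam=1e-3):
    Ku, Kv = rbf(X, X, gamma_u), rbf(X, X, gamma_v)
    K, Kt = Ku + Kv, Ku - Kv                      # kernel and pseudo-kernel
    # Augmented Gram matrix and targets ybar = [y; conj(y)].
    Kbar = np.block([[K, Kt], [Kt.conj(), K.conj()]]).astype(complex)
    ybar = np.concatenate([y, y.conj()])
    alpha = np.linalg.solve(Kbar + lam * np.eye(Kbar.shape[0]), ybar)
    ku, kv = rbf(Xs, X, gamma_u), rbf(Xs, X, gamma_v)
    kbar = np.hstack([ku + kv, ku - kv]).astype(complex)
    return kbar @ alpha                            # widely linear prediction

X = np.random.randn(50, 1)
y = np.sin(X[:, 0]) + 1j * np.sin(5 * X[:, 0])     # parts with different smoothness
print(fit_predict(X, y, X)[:3])
```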
Abstract:Gaussian processes (GPs) are versatile tools that have been successfully employed to solve nonlinear estimation problems in machine learning, but that are rarely used in signal processing. In this tutorial, we present GPs for regression as a natural nonlinear extension to optimal Wiener filtering. After establishing their basic formulation, we discuss several important aspects and extensions, including recursive and adaptive algorithms for dealing with non-stationarity, low-complexity solutions, non-Gaussian noise models and classification scenarios. Furthermore, we provide a selection of relevant applications to wireless digital communications.
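For reference, a minimal sketch of the standard GP regression equations this tutorial builds on (kernel choice, lengthscale, and noise level are illustrative assumptions): the posterior mean is a linear smoother of the observations, which is what makes GP regression a natural nonlinear extension of Wiener filtering.

```python
# Sketch: GP regression posterior mean and variance under an RBF kernel.
import numpy as np

def rbf(X, Y, lengthscale=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise_var=0.1):
    K = rbf(X, X) + noise_var * np.eye(len(X))     # noisy-observation covariance
    Ks, Kss = rbf(Xs, X), rbf(Xs, Xs)
    alpha = np.linalg.solve(K, y)
    mean = Ks @ alpha                              # E[f(x*) | data]
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)      # posterior covariance
    return mean, np.diag(cov)

X = np.linspace(-3, 3, 25)[:, None]
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(25)
mean, var = gp_posterior(X, y, np.linspace(-3, 3, 100)[:, None])
```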