Sinan G. Aksoy

Pacific Northwest National Laboratory

Talking to GDELT Through Knowledge Graphs

Mar 10, 2025

Audun Myers, Max Vargas, Sinan G. Aksoy, Cliff Joslyn, Benjamin Wilson, Tom Grimes

Abstract: In this work we study several Retrieval-Augmented Generation (RAG) approaches to understand the strengths and weaknesses of each in a question-answering analysis. We use a case-study subset of the Global Database of Events, Language, and Tone (GDELT) dataset as well as a corpus of raw text scraped from the online news articles. To retrieve information from the text corpus we implement a traditional vector store RAG as well as state-of-the-art large language model (LLM) based approaches for automatically constructing knowledge graphs (KGs) and retrieving the relevant subgraphs. In addition to these corpus approaches, we develop a novel ontology-based framework for constructing KGs from GDELT directly, which leverages the underlying schema of GDELT to create structured representations of global events. For retrieving relevant information from the ontology-based KGs we implement both direct graph queries and state-of-the-art graph retrieval approaches. We compare the performance of each method in a question-answering task. We find that while our ontology-based KGs are valuable for question-answering, automated extraction of the relevant subgraphs is challenging. Conversely, LLM-generated KGs, while capturing event summaries, often lack consistency and interpretability. Our findings suggest benefits of a synergistic approach between ontology and LLM-based KG construction, with proposed avenues toward that end.

HyperMagNet: A Magnetic Laplacian based Hypergraph Neural Network

Feb 15, 2024

Tatyana Benko, Martin Buck, Ilya Amburg, Stephen J. Young, Sinan G. Aksoy

Abstract: In data science, hypergraphs are natural models for data exhibiting multi-way relations, whereas graphs capture only pairwise relations. Nonetheless, many proposed hypergraph neural networks effectively reduce hypergraphs to undirected graphs via symmetrized matrix representations, potentially losing important information. We propose an alternative approach to hypergraph neural networks in which the hypergraph is represented as a non-reversible Markov chain. We use this Markov chain to construct a complex Hermitian Laplacian matrix - the magnetic Laplacian - which serves as the input to our proposed hypergraph neural network. We study HyperMagNet for the task of node classification, and demonstrate its effectiveness over graph-reduction based hypergraph neural networks.

* 9 pages, 1 figure
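
As a rough illustration of the "complex Hermitian Laplacian" mentioned in the abstract, the sketch below builds the standard magnetic Laplacian of a weighted directed graph. It assumes the asymmetric matrix `A` stands in for whatever flow or transition matrix is derived from the hypergraph Markov chain; the function name `magnetic_laplacian`, the charge parameter `q = 0.25`, and the unnormalized form `D - H` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def magnetic_laplacian(A: np.ndarray, q: float = 0.25) -> np.ndarray:
    """Unnormalized magnetic Laplacian of a nonnegative, possibly asymmetric matrix A.

    Assumption: A plays the role of a directed weight matrix; q is the
    'charge' parameter controlling how strongly direction is encoded.
    """
    A = np.asarray(A, dtype=float)
    A_sym = 0.5 * (A + A.T)               # symmetrized weights (magnitude)
    theta = 2.0 * np.pi * q * (A - A.T)   # antisymmetric phase (direction)
    H = A_sym * np.exp(1j * theta)        # complex Hermitian adjacency
    D = np.diag(A_sym.sum(axis=1))        # degree matrix of symmetrized graph
    return D - H

# Toy example: a directed 3-cycle 0 -> 1 -> 2 -> 0.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
L = magnetic_laplacian(A)
eigvals = np.linalg.eigvalsh(L)  # real, since L is Hermitian
```

Edge direction survives in the complex phase rather than being averaged away by symmetrization, which is the information a purely symmetrized representation would lose.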

Randomized Algorithms for Symmetric Nonnegative Matrix Factorization

Feb 13, 2024

Koby Hayashi, Sinan G. Aksoy, Grey Ballard, Haesun Park

Abstract: Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data analysis and machine learning that approximates a symmetric matrix with a product of a nonnegative, low-rank matrix and its transpose. To design faster and more scalable algorithms for SymNMF we develop two randomized algorithms for its computation. The first algorithm uses randomized matrix sketching to compute an initial low-rank input matrix and proceeds to use this input to rapidly compute a SymNMF. The second algorithm uses randomized leverage score sampling to approximately solve constrained least squares problems. Many successful methods for SymNMF rely on (approximately) solving sequences of constrained least squares problems. We prove theoretically that leverage score sampling can approximately solve nonnegative least squares problems to a chosen accuracy with high probability. Finally, we demonstrate that both methods work well in practice by applying them to graph clustering tasks on large real-world data sets. These experiments show that our methods approximately maintain solution quality and achieve significant speedups for both large dense and large sparse problems.
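
To make the SymNMF objective concrete, here is a minimal sketch that approximates a symmetric matrix `X` by `H @ H.T` with `H >= 0`. It uses a simple damped multiplicative update as a baseline and is not one of the randomized algorithms described in the abstract; the function names, the damping value `beta = 0.5`, and the toy block matrix are all assumptions made for illustration.

```python
import numpy as np

def symnmf_multiplicative(X, rank, n_iter=300, beta=0.5, seed=0):
    """Baseline SymNMF: find H >= 0 with X ~= H @ H.T (not the paper's method)."""
    rng = np.random.default_rng(seed)
    H = rng.random((X.shape[0], rank)) + 1e-3   # strictly positive start
    for _ in range(n_iter):
        numer = X @ H
        denom = H @ (H.T @ H) + 1e-12           # small constant avoids division by zero
        # Damped multiplicative step; the factor is >= 1 - beta, so H stays nonnegative.
        H *= (1.0 - beta) + beta * numer / denom
    return H

def symnmf_error(X, H):
    """Frobenius norm of the residual X - H H^T."""
    return np.linalg.norm(X - H @ H.T, "fro")

# Toy similarity matrix with two obvious clusters of three nodes each.
block = np.ones((3, 3))
zeros = np.zeros((3, 3))
X = np.block([[block, zeros], [zeros, block]])

H0 = symnmf_multiplicative(X, rank=2, n_iter=0)  # initial factor only
H = symnmf_multiplicative(X, rank=2)             # factor after the updates
```

In graph clustering use, each row of `H` is typically read as soft cluster memberships for one node, for example by taking the largest entry in the row.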

Scalable tensor methods for nonuniform hypergraphs

Jun 30, 2023

Sinan G. Aksoy, Ilya Amburg, Stephen J. Young

Abstract: While multilinear algebra appears natural for studying the multiway interactions modeled by hypergraphs, tensor methods for general hypergraphs have been stymied by theoretical and practical barriers. A recently proposed adjacency tensor is applicable to nonuniform hypergraphs, but is prohibitively costly to form and analyze in practice. We develop tensor-times-same-vector (TTSV) algorithms for this tensor which improve complexity from $O(n^r)$ to a low-degree polynomial in $r$, where $n$ is the number of vertices and $r$ is the maximum hyperedge size. Our algorithms are implicit, avoiding formation of the order-$r$ adjacency tensor. We demonstrate the flexibility and utility of our approach in practice by developing tensor-based hypergraph centrality and clustering algorithms. We also show these tensor measures offer complementary information to analogous graph-reduction approaches on data, and are also able to detect higher-order structure that many existing matrix-based approaches provably cannot.

Seven open problems in applied combinatorics

Mar 20, 2023

Sinan G. Aksoy, Ryan Bennink, Yuzhou Chen, José Frías, Yulia R. Gel, Bill Kay, Uwe Naumann, Carlos Ortiz Marrero, Anthony V. Petyuk, Sandip Roy (+3 more)

Abstract: We present and discuss seven different open problems in applied combinatorics. The application areas relevant to this compilation include quantum computing, algorithmic differentiation, topological data analysis, iterative methods, hypergraph cut algorithms, and power systems.

* 43 pages, 5 figures

Skew-Symmetric Adjacency Matrices for Clustering Directed Graphs

Mar 02, 2022

Koby Hayashi, Sinan G. Aksoy, Haesun Park

Abstract: Cut-based directed graph (digraph) clustering often focuses on finding dense within-cluster or sparse between-cluster connections, similar to cut-based undirected graph clustering methods. In contrast, for flow-based clusterings the edges between clusters tend to be oriented in one direction and have been found in migration data, food webs, and trade data. In this paper we introduce a spectral algorithm for finding flow-based clusterings. The proposed algorithm is based on recent work which uses complex-valued Hermitian matrices to represent digraphs. By establishing an algebraic relationship between a complex-valued Hermitian representation and an associated real-valued, skew-symmetric matrix, the proposed algorithm produces clusterings while remaining completely in the real field. Our algorithm uses less memory and asymptotically less computation while provably preserving solution quality. We also show the algorithm can be easily implemented using standard computational building blocks, possesses better numerical properties, and lends itself to a natural interpretation via an objective function relaxation argument.

* 21 pages, 7 figures
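
The relationship between a Hermitian digraph representation and a real skew-symmetric matrix can be checked numerically in a few lines. The sketch below assumes the Hermitian representation `H = i(A - A^T)`, a common choice in this literature that the abstract itself does not spell out; the random digraph and variable names are illustrative only.

```python
import numpy as np

# Random directed graph on 6 nodes (illustrative data, fixed seed).
rng = np.random.default_rng(0)
A = (rng.random((6, 6)) < 0.4).astype(float)
np.fill_diagonal(A, 0.0)

K = A - A.T    # real skew-symmetric matrix: K.T == -K
H = 1j * K     # assumed complex Hermitian representation of the digraph

# Eigenvalues of the Hermitian matrix are real ...
herm_eigs = np.sort(np.linalg.eigvalsh(H))
# ... and match the imaginary parts of the eigenvalues of K as a set.
skew_eigs = np.sort(np.linalg.eigvals(K).imag)
# Their magnitudes are the singular values of K, computable in real arithmetic.
sing_vals = np.sort(np.linalg.svd(K, compute_uv=False))
```

Because the spectral information of `H` is recoverable from a real SVD of `K`, a clustering pipeline built on it never needs complex arithmetic, which is consistent with the memory and computation savings the abstract describes.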

Hypergraph Random Walks, Laplacians, and Clustering

Jun 29, 2020

Koby Hayashi, Sinan G. Aksoy, Cheong Hee Park, Haesun Park

Abstract: We propose a flexible framework for clustering hypergraph-structured data based on recently proposed random walks utilizing edge-dependent vertex weights. When incorporating edge-dependent vertex weights (EDVW), a weight is associated with each vertex-hyperedge pair, yielding a weighted incidence matrix of the hypergraph. Such weightings have been utilized in term-document representations of text data sets. We explain how random walks with EDVW serve to construct different hypergraph Laplacian matrices, and then develop a suite of clustering methods that use these incidence matrices and Laplacians for hypergraph clustering. Using several data sets from real-life applications, we compare the performance of these clustering algorithms experimentally against a variety of existing hypergraph clustering methods. We show that the proposed methods produce higher-quality clusters and conclude by highlighting avenues for future work.
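
A hypergraph random walk with edge-dependent vertex weights can be sketched as a two-step process: from a vertex, pick an incident hyperedge in proportion to the hyperedge weight, then pick a vertex inside that hyperedge in proportion to its edge-dependent weight. The code below assumes this commonly used definition; the function name, the matrix layout (`R` with one row per hyperedge), and the toy weights are illustrative, and the Laplacians in the paper may apply further normalization.

```python
import numpy as np

def edvw_transition_matrix(R, edge_weights):
    """Transition matrix of a random walk with edge-dependent vertex weights.

    R[e, v] > 0 is the weight of vertex v in hyperedge e (0 if v is not in e).
    edge_weights[e] is the weight of hyperedge e.
    Assumes every vertex lies in at least one hyperedge.
    """
    R = np.asarray(R, dtype=float)                 # shape (n_edges, n_vertices)
    w = np.asarray(edge_weights, dtype=float)      # shape (n_edges,)
    incidence = (R > 0).astype(float)
    W = incidence.T * w                            # W[v, e] = w[e] if v in e
    d_v = W.sum(axis=1)                            # vertex degrees
    delta_e = R.sum(axis=1)                        # total vertex weight per hyperedge
    step_to_edge = W / d_v[:, None]                # vertex -> hyperedge probabilities
    step_to_vertex = R / delta_e[:, None]          # hyperedge -> vertex probabilities
    return step_to_edge @ step_to_vertex

# Toy hypergraph: 3 hyperedges over 4 vertices, e.g. term weights in 3 documents.
R = np.array([[2.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 3.0, 1.0],
              [1.0, 0.0, 0.0, 1.0]])
edge_weights = np.array([1.0, 2.0, 1.0])
P = edvw_transition_matrix(R, edge_weights)
```

Each row of `P` is a probability distribution over next vertices. In general `P` is not symmetric, so the walk can behave differently from a walk on any undirected graph reduction of the hypergraph.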

Relative Hausdorff Distance for Network Analysis

Jun 12, 2019

Sinan G. Aksoy, Kathleen E. Nowak, Emilie Purvine, Stephen J. Young

Abstract: Similarity measures are used extensively in machine learning and data science algorithms. The newly proposed graph Relative Hausdorff (RH) distance is a lightweight yet nuanced similarity measure for quantifying the closeness of two graphs. In this work we study the effectiveness of RH distance as a tool for detecting anomalies in time-evolving graph sequences. We apply RH to cyber data with given red team events, as well as to synthetically generated sequences of graphs with planted attacks. In our experiments, the performance of RH distance is at times comparable, and sometimes superior, to graph edit distance in detecting anomalous phenomena. Our results suggest that in appropriate contexts, RH distance has advantages over more computationally intensive similarity measures.

* 20 pages
