
Generative AI and Topological Data Analysis of Longitudinal Panel Data

Published online by Cambridge University Press: 10 September 2025

Badredine Arfi*
Affiliation:
Department of Political Science, University of Florida, Gainesville, FL, USA

Abstract

This article develops an approach to analyzing longitudinal panel data that combines topological data analysis (TDA) with generative AI applied to graph neural networks (GNNs). TDA is used to identify and analyze unobserved topological heterogeneities in a dataset. The TDA-extracted information is quantified as a set of measures, called functional principal components, which are used to analyze the data in four ways. First, the measures are treated as moderators of the data, and their statistical effects are estimated within a Bayesian framework. Second, the measures are used as factors to classify the data into topological classes using generative AI applied to GNNs constructed by transforming the data into graphs. This classification uncovers patterns in the data that are not otherwise accessible through statistical approaches. Third, the measures are used as factors that condition the extraction of latent variables from the data by a generative AI model. Fourth, the measures are used as labels for classifying the graphs into classes, which in turn provide a GNN-based effective dimensionality reduction of the original data. The article uses a portion of the militarized international disputes (MIDs) dataset (from 1946 to 2010) as a running example to briefly illustrate its ideas and steps.
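As a small illustration of how TDA-extracted information can be quantified into a scalar measure, the sketch below computes persistence entropy, a summary named in the article's figure captions, from a persistence diagram. It uses the standard definition (the Shannon entropy of the normalized bar lifetimes) and is not taken from the article's own code; the function name `persistence_entropy` and the two toy diagrams are assumptions made here for illustration only.

```python
import math


def persistence_entropy(diagram):
    """Shannon entropy of the normalized lifetimes in a persistence diagram.

    `diagram` is an iterable of (birth, death) pairs. Bars with infinite
    death or non-positive lifetime are dropped (an assumption of this sketch).
    """
    lifetimes = [
        death - birth
        for birth, death in diagram
        if math.isfinite(death) and death > birth
    ]
    total = sum(lifetimes)
    if total == 0:
        return 0.0  # no finite bars: define the entropy as zero
    probs = [length / total for length in lifetimes]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical diagrams, not data from the article.
equal_bars = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
one_dominant = [(0.0, 10.0), (0.0, 0.1), (0.0, 0.1), (0.0, 0.1)]

entropy_equal = persistence_entropy(equal_bars)
entropy_dominant = persistence_entropy(one_dominant)
```

With four bars of equal length the entropy reaches its maximum for four bars, log 4, while a diagram dominated by one long bar scores lower. Tracking such a value year by year is one way a diagram can be reduced to a single number per panel wave.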

Information

Type
Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Political Methodology
Table 1 Bayesian estimation with sociomatrix expansion.

Figure 1 Examples of Čech and Rips–Vietoris simplicial complex.

Figure 2 PDs for four years.

Figure 3 Persistence entropy, total MIDs and normalized mean-FPCA.

Table 2 Model with sociomatrix and topological moderator effects.

Figure 4 GCN model (Kipf and Welling 2017).

Figure 5 Proportions of clusters over the years.

Figure 6 Max attention weights, persistence entropy, and MIDs.

Figure 7 Subgraphs with top eight attention and IOs-sharing weights.

Figure 8 Conditioned ARGVA model with the topological conditioning done through embedding: $Z \times \boldsymbol{\zeta} \to Z$.

Figure 9 KDE of latent space PCA components in year 2010.

Figure 10 Cluster cohesion over time.

Figure 11 Probabilities of state membership in four clusters.

Table 3 All encompassing model.

Figure 12 Most important subgraphs for four years.

Table 4 Bayesian estimates with SubGraphX-generated data.

Figure 13 ROC curve using empirical data as testing data.

Figure 14 Flowchart steps.

Supplementary material: File

Arfi supplementary material

File 1.8 MB