Continuous Latent Position Models for Instantaneous Interactions

We create a framework to analyse the timing and frequency of instantaneous interactions between pairs of entities. This type of interaction data is especially common nowadays, and easily available. Examples of instantaneous interactions include email networks, phone call networks and some common types of technological and transportation networks. Our framework relies on a novel extension of the latent position network model: we assume that the entities are embedded in a latent Euclidean space, and that they move along individual trajectories which are continuous over time. These trajectories are used to characterize the timing and frequency of the pairwise interactions. We discuss an inferential framework where we estimate the individual trajectories from the observed interaction data, and propose applications on artificial and real data.


Introduction
The Latent Position Model (LPM, Hoff et al. 2002) is a widely used statistical model that characterizes a network through a latent space representation. The model embeds the nodes of the network as points in the real plane, and then uses these latent features to explain the observed interactions between the entities. This provides a neat and easy-to-interpret graphical representation of the observed interaction data, which is able to capture common empirical features such as transitivity and homophily.
In this paper, we propose a new LPM that can be used to model repeated instantaneous interactions between entities over an arbitrary time interval. The time dimension is continuous, and an interaction between any two nodes may happen at any point in time.
We propose a data generative mechanism which is inspired by the extensive literature on LPMs, and we define an efficient estimation framework to fit our model.
Since the foundational work of Hoff et al. (2002), the literature on LPMs has developed in many directions, from both the methodological and the applied points of view. Recent review papers on the topic include Salter-Townshend et al. (2012), Rastelli et al. (2016), Raftery (2017), and Sosa and Buitrago (2020).
As regards statistical methodology, the original paper of Hoff et al. (2002) defined a framework to infer and interpret an LPM for binary interactions. The authors introduced two types of LPMs: the projection model and the distance model.
The projection model postulates that the probability that an edge appears between any two nodes is determined by the dot product of the latent coordinates of the two respective nodes. As a consequence, a crucial contribution to the edge probability is given by the direction in which the nodes point. By contrast, the distance model defines the connection probability as a function of the Euclidean distance between the two nodes.
Nodes that are located close to each other are more likely to connect than nodes that are located far apart. Both models provide a clear representation of the interaction data which can be used to study the network's topology, or to construct model-based summaries, visualizations, or predictions.
More recently, the projection model and its variations have been extensively studied and used in a variety of applications (see Hoff 2005; Hoff 2018 and references therein). This model also has clear connections to a rich machine learning literature on spatial embeddings, which includes Rahimi and Recht (2007), Lee and Seung (1999), and Halko et al. (2011). Variations of the projection model have been extended to dynamic settings (Durante and Dunson 2014; Durante and Dunson 2016), and to other types of network frameworks (Durante et al. 2017).
As regards the distance model, this has been extended by Handcock et al. (2007) and Krivitsky et al. (2009) to represent clustering of the nodes and more flexible degree distributions. In the context of networks evolving over time, dynamic extensions of the model have been considered in Sarkar and Moore (2006), and more recently in several works including Sewell and Chen (2015b) and Friel et al. (2016) for binary interactions.
The recent review paper of Kim et al. (2018) provides additional references on dynamic network modeling. Other relevant and interesting works that revolve around the distance models in either static or dynamic settings include Gollini and Murphy (2014) and Salter-Townshend and McCormick (2017) for multi-view networks, Sewell and Chen (2016) for dynamic weighted networks, and Gormley and Murphy (2007) and Sewell and Chen (2015a) for networks of rankings. We also mention Raftery et al. (2012), Fosdick et al. (2019), Rastelli et al. (2018), and Tafakori et al. (2019), which introduce original and closely related modeling or computational ideas.
Crucially, we note that all existing dynamic LPMs consider a discrete time dimension, whereby the interactions are observed at a number of different points in time. By contrast, a fundamental original aspect of this paper is that it considers a fully continuous LPM whereby the interactions are instantaneous, and they can happen at any point in time. Continuous networks of this type are especially common and widely available, as they include email networks (Klimt and Yang 2004), functional brain networks (Park and Friston 2013), and other networks of human interactions (see Cattuto et al. 2010; Barrat and Cattuto 2013 and references therein). Some of the approaches that have been proposed in the statistics literature to model instantaneous interactions include Corneli et al. (2017) and Matias et al. (2018); however, we note that these approaches rely on extensions of the stochastic blockmodel (Nowicki and Snijders 2001), and not on the LPM. Another relevant strand of literature focuses instead on modeling this type of data using Hawkes processes (see Junuthula et al. 2019 and references therein).
We propose our new Continuous Latent Position Model (CLPM) both for the projection model framework and for the distance model framework. In our approach, each of the nodes is characterized by a latent trajectory on the real plane, which is assumed to be a piece-wise linear curve. The interactions between any two nodes are modeled as events of an inhomogeneous Poisson point process, whose rate is determined by the instantaneous positions of the nodes at each point in time. The piece-wise linear curve assumption gives sufficient flexibility regarding the possible trajectories, while not affecting the purely continuous nature of the framework, in that the rate of the Poisson process is not piece-wise constant. This is a major difference with respect to other approaches that have been considered (Corneli et al. 2017 and one of the approaches of Matias et al. 2018).
We propose a penalized likelihood approach to perform inference, and we use optimization via gradient descent to obtain optimal estimates of the model parameters.
We have created software that implements our estimation method, which is publicly available from the CLPM GitHub repository (2021).
The paper is structured as follows: in Section 2 we introduce our new model and its two variants (i.e. the projection and distance model), and we derive the main equations that are used in the paper; in Section 3 we describe our approach to estimate the model parameters; in Section 4 we illustrate our procedure on three synthetic datasets, whereas in Section 5 we propose real data applications. We give final comments and conclusions in Section 6.

Modeling the interaction times
The data that we observe is stored as a list of interactions (or edge list) in the format $E := \{(\tau_e, i_e, j_e)\}_{e \in \mathbb{N}}$, where, for all $e$, $\tau_e \in [0, T]$ is the time of the interaction between the nodes $i_e$ and $j_e$, with $i_e, j_e \in \{1, \dots, N\}$. We consider undirected interactions without self loops, although extensions to the directed case are straightforward. We emphasize that all interactions are instantaneous, i.e. their length is not relevant or not recorded. An interaction between two nodes may occur at any point in time $\tau_e \in [0, T]$. Let us now formally introduce the list of the interaction times between two arbitrary nodes $i$ and $j$:

$$\tau^{(i,j)} := \left\{ \tau^{(i,j)}_1, \dots, \tau^{(i,j)}_{E_{ij}} \right\}, \qquad (1)$$

where $E_{ij}$ is the total number of times $i$ interacts with $j$ before $T$. We assume that the interaction times in the above equation are the realization of an inhomogeneous Poisson point process with instantaneous rate function denoted with $\lambda_{ij}(t) \geq 0$, for all $t \in [0, T]$ and nodes $i$ and $j$. Using a more convenient (but equivalent) characterization, we state that the waiting time for a new interaction event between $i$ and $j$ is exponential with a variable rate that changes over time. Then, if we assume that the inhomogeneous point processes are independent for all pairs $i$ and $j$, the likelihood function for the rates can be written as:

$$L(\lambda) = \prod_{i < j} \left[ \prod_{e=1}^{E_{ij}} \lambda_{ij}(\tau_e) \right] \exp\left\{ -\int_0^T \lambda_{ij}(t)\, dt \right\}, \qquad (2)$$

where, for simplicity, we have removed the superscript $(i, j)$ from $\tau^{(i,j)}_e$. In the sections below we will specify the conditions that make the processes independent.
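As an illustration of the likelihood of a single dyad, the following minimal sketch evaluates the sum of the log-rates at the interaction times minus the integrated rate over $[0, T]$. The function name, the generic `rate_fn` argument and the trapezoidal quadrature are assumptions made for this example only; they are not taken from the CLPM software.

```python
import numpy as np

def poisson_process_loglik(event_times, rate_fn, T, n_grid=10_001):
    """Log-likelihood of one dyad under an inhomogeneous Poisson process.

    event_times: interaction times in [0, T] for the dyad.
    rate_fn: vectorized function t -> lambda(t) > 0 (hypothetical helper).
    """
    event_times = np.asarray(event_times, dtype=float)
    grid = np.linspace(0.0, T, n_grid)
    values = rate_fn(grid)
    # Trapezoidal approximation of the integral of the rate over [0, T]
    integral = np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0
    # Sum of log-rates at the observed interaction times
    log_rates = np.sum(np.log(rate_fn(event_times)))
    return float(log_rates - integral)
```

Under the independence assumption, the full log-likelihood would be the sum of this quantity over all pairs $i < j$.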

Latent positions
Our goal is to embed the nodes of the network into a latent space, such that the latent positions are the primary driving factor behind the frequency and timing of the interactions between the nodes. Crucially, since the time dimension is continuous and interactions can happen at any point in time, we aim at creating a modeling framework which also evolves continuously over time. Thus, the fundamental assumption of our model is that, at any point in time, the Poisson rate function $\lambda_{ij}(t)$ is determined by the latent positions of the corresponding nodes, which we denote $z_i(t) \in \mathbb{R}^2$ and $z_j(t) \in \mathbb{R}^2$.
Remark. We assume that the number of dimensions of the latent space is equal to 2, because the main interest of the proposed approach is in latent space visualization of the network. However, we note that the generative model presented in this section can be easily extended to the case $z_i(t) \in \mathbb{R}^d$, with $d > 2$.
To facilitate the inference task, the trajectories are assumed to be piece-wise linear curves, characterized by a number of user-defined change points in the time dimension.
These change points are common to the trajectories of all nodes, and they determine the points in time when the linear motions of the nodes change direction and speed. This means that we must define a grid on the time dimension through $K$ change points that are common across all trajectories. We stress that this modeling choice is only meant to restrict the variety of continuous trajectories that we may consider, as it allows us to use a tractable parametric structure while keeping a high flexibility regarding the trajectories that can be obtained. Also, we make the assumption that, between any two consecutive change points, the speed at which any given node moves remains constant.
As a consequence, we only need to store the coordinates of the nodes at the change points, since all the intermediate positions can then be obtained with:

$$z_i(t) = \left( 1 - \frac{t - \eta_k}{\eta_{k+1} - \eta_k} \right) z_i(\eta_k) + \frac{t - \eta_k}{\eta_{k+1} - \eta_k}\, z_i(\eta_{k+1}), \qquad t \in [\eta_k, \eta_{k+1}], \qquad (3)$$

for any change points $\eta_k$ and $\eta_{k+1}$ and node $i$.
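The interpolation of the trajectories between change points can be sketched as follows. The array layout (change points along the first axis) and the function name are assumptions made for this example.

```python
import numpy as np

def interpolate_positions(t, change_points, positions):
    """Positions of all nodes at time t under piece-wise linear trajectories.

    change_points: increasing array of K times (eta_1, ..., eta_K).
    positions: array of shape (K, N, 2), node coordinates at the change points.
    Returns an array of shape (N, 2).
    """
    eta = np.asarray(change_points, dtype=float)
    Z = np.asarray(positions, dtype=float)
    if not (eta[0] <= t <= eta[-1]):
        raise ValueError("t must lie within the change-point grid")
    # Index k of the segment [eta_k, eta_{k+1}] that contains t
    k = int(np.clip(np.searchsorted(eta, t, side="right") - 1, 0, len(eta) - 2))
    # Fraction of the segment already travelled (constant speed within a segment)
    w = (t - eta[k]) / (eta[k + 1] - eta[k])
    return (1.0 - w) * Z[k] + w * Z[k + 1]
```

Only the `K` stored configurations are needed; every intermediate position is a convex combination of two consecutive ones.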

Projection Model
Similarly to the foundational paper of Hoff et al. (2002), we introduce two possible characterizations of the rates through the latent positions: one is inspired by the projection model, the other is inspired by the distance model. In our projection model, we assume that: i) the latent positions are constrained within the first quadrant of $\mathbb{R}^2$, i.e. $z_i(t) \in \mathbb{R}_+ \times \mathbb{R}_+$, for all $t \in [0, T]$ and node $i$; ii) the rate is exactly equal to the dot product between the positions, i.e.

$$\lambda_{ij}(t) = \langle z_i(t), z_j(t) \rangle,$$

for all $t \in [0, T]$ and nodes $i$ and $j$.
As a consequence, the further the nodes are positioned from the origin, the more frequent their interactions will be, especially towards other nodes that are aligned in the same direction. Conversely, we do not expect frequent interactions for nodes that are located too close to the origin, or between pairs of nodes forming an angle which is close to 90 degrees. The restriction of the latent space to the first quadrant guarantees that the rate always remains non-negative, and we argue that this assumption diminishes neither the flexibility nor the interpretability of the model.
By taking the logarithm of Eq. (2) and replacing $\lambda_{ij}(t)$, the log-likelihood for the projection model is

$$\log L(Z) = \sum_{i < j} \left[ \sum_{e=1}^{E_{ij}} \log \langle z_i(\tau_e), z_j(\tau_e) \rangle - \int_0^T \langle z_i(t), z_j(t) \rangle\, dt \right]. \qquad (4)$$

Remark. When expressing the Poisson rate $\lambda_{ij}(\cdot)$ as a function of the latent trajectories, we move from the unconditional independence assumption leading to Eq. (2) to a conditional one. The timelines of events for all pairs of nodes are independent given the latent trajectories.
As proven in Appendix A, the integral term appearing in Eq. (4) can be calculated analytically, thus leading to a closed form expression for the log-likelihood of the projection model.
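To illustrate why the integral is tractable, note that on a single segment between two change points the dot product of two linearly moving points is a quadratic polynomial in time, so it integrates exactly. The sketch below is our own elementary derivation under that observation; it is not a transcription of Appendix A, and the function and argument names are hypothetical.

```python
import numpy as np

def dot_integral_segment(a_i, b_i, a_j, b_j, delta):
    """Integral of <z_i(t), z_j(t)> over one segment of length delta.

    Node i moves linearly from a_i to b_i, node j from a_j to b_j.
    Integrating the quadratic in the travelled fraction s over [0, 1] gives
    delta / 6 * (2<a_i,a_j> + <a_i,b_j> + <b_i,a_j> + 2<b_i,b_j>).
    """
    a_i, b_i, a_j, b_j = (np.asarray(x, dtype=float) for x in (a_i, b_i, a_j, b_j))
    return float(delta / 6.0 * (2 * (a_i @ a_j) + a_i @ b_j + b_i @ a_j + 2 * (b_i @ b_j)))
```

Summing this quantity over all segments would give the integral over $[0, T]$ for one pair of nodes.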

Distance Model
Here, we introduce a version of the latent position model that uses the latent Euclidean distances between the nodes, rather than the dot products. We argue that, in this context, the distance model provides both more flexibility and easier interpretability.
The simulation studies that we perform in Section 4 will highlight these advantages.
In the distance model, we assume that:

$$\log \lambda_{ij}(t) = \beta - \| z_i(t) - z_j(t) \|^2, \qquad (5)$$

where the last term corresponds to the squared Euclidean distance between nodes $i$ and $j$ at time $t$. We also introduce an intercept term $\beta$, as per the original LPM by Hoff et al. (2002). The intercept term affects all nodes, but extensions of the model where it becomes specific to each node can also be considered for both the projection and the distance models. By taking the logarithm of Eq. (2) and using Eq. (5), the log-likelihood of the distance model becomes:

$$\log L(\beta, Z) = \sum_{i < j} \left[ \sum_{e=1}^{E_{ij}} \left( \beta - \| z_i(\tau_e) - z_j(\tau_e) \|^2 \right) - \int_0^T \exp\left\{ \beta - \| z_i(t) - z_j(t) \|^2 \right\} dt \right]. \qquad (6)$$

Similarly to the projection model, the above log-likelihood also has a closed form, since the integral inside the brackets can be calculated analytically (proof in Appendix B).
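One way to see that the integral of the distance-model rate is available analytically: on a single segment the difference between two linearly moving nodes is itself linear in time, so the squared distance is a quadratic and the integrand is a Gaussian-type function, which integrates to a difference of error functions. The sketch below follows this elementary argument; it is our own derivation rather than a transcription of Appendix B, and all names are hypothetical.

```python
import math
import numpy as np

def distance_rate_integral_segment(a_i, b_i, a_j, b_j, delta, beta):
    """Integral of exp(beta - ||z_i(t) - z_j(t)||^2) over one segment of length delta.

    Node i moves linearly from a_i to b_i, node j from a_j to b_j.
    """
    a_i, b_i, a_j, b_j = (np.asarray(x, dtype=float) for x in (a_i, b_i, a_j, b_j))
    u = a_i - a_j                   # separation at the start of the segment
    v = (b_i - a_i) - (b_j - a_j)   # change in separation over the segment
    uu, uv, vv = float(u @ u), float(u @ v), float(v @ v)
    if vv < 1e-14:
        # The separation does not change: the rate is constant on the segment
        return delta * math.exp(beta - uu)
    # Complete the square: ||u + s v||^2 = vv * (s + m)^2 + uu - uv^2 / vv
    m = uv / vv
    norm_v = math.sqrt(vv)
    gauss = math.sqrt(math.pi) / (2.0 * norm_v) * (
        math.erf(norm_v * (1.0 + m)) - math.erf(norm_v * m)
    )
    return delta * math.exp(beta - uu + uv ** 2 / vv) * gauss
```

For very distant and fast-moving pairs the exponential factor can become large while the error-function difference becomes tiny, so a production implementation would probably need a more careful numerical treatment than this sketch.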

Penalized likelihood
Due to the piece-wise linearity assumption in Eq. (3), for each node we only need to estimate its positions at times $\{\eta_k\}_{k \in [K]}$. In order to avoid overfitting and obtain more interpretable and meaningful results, we use prior distributions over the latent positions, as a means to penalize large velocities of the nodes in the latent space.
Projection Model. As a penalization, in this case we require that:

$$\frac{\langle z_i(\eta_k), z_i(\eta_{k+1}) \rangle}{\| z_i(\eta_k) \|\, \| z_i(\eta_{k+1}) \|} \sim \mathcal{N}_{[0,1]}(\mu, \sigma^2), \qquad (7)$$

for all nodes, where $\mu$ and $\sigma^2$ are hyper-parameters to be fixed by the user. The above assumption states that the cosine of the angle between the position of a node at a change point and the position of the same node at the following change point follows a truncated Gaussian distribution (in $[0, 1]$). In the applications we choose $\mu \approx 1$ and a small value of $\sigma^2$, so that any node rotates around the origin as little as necessary.
Distance Model. We define Gaussian random walk priors on the change points of the latent trajectories:

$$z_i(\eta_1) \sim \mathcal{N}\left( 0, \sigma_0^2 I_2 \right), \qquad z_i(\eta_k) \mid z_i(\eta_{k-1}) \sim \mathcal{N}\left( z_i(\eta_{k-1}),\, \sigma^2 (\eta_k - \eta_{k-1}) I_2 \right), \qquad (8)$$

for every node $i$, where $I_2$ is the identity matrix of order two. The equation above (with $\sigma^2 = 1$) would correspond to a Brownian motion process for the $i$-th latent trajectory, except that we would only observe it at the change points, where the latent positions are estimated. However, as the number of change points increases, the prior that we specify tends to a scaled Brownian motion on the plane. The parameters $\sigma_0^2$ and $\sigma^2$ are user-defined; hence, similarly to the projection model, the Gaussian priors can be used as penalizations. In order to obtain sensible penalizations, we choose small values of the variance parameters, so as to ensure that the nodes are scattered around the origin of the space, and that the speed of the nodes along the trajectories is not too large. In this way, the nodes are forced to move as little as necessary, making the latent visualization of the network easier to read and interpret.
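The Gaussian random walk penalization can be sketched as a function of the stored positions. This example assumes that the variance of each increment is proportional to the time elapsed between consecutive change points, which is what the Brownian motion interpretation suggests; the function name and array layout are hypothetical.

```python
import numpy as np

def random_walk_log_prior(positions, change_points, sigma0_sq, sigma_sq):
    """Log-density (up to an additive constant) of a Gaussian random walk prior.

    positions: array of shape (K, N, 2), node coordinates at the K change points.
    Assumption: increments have variance sigma_sq * (eta_k - eta_{k-1}).
    """
    Z = np.asarray(positions, dtype=float)
    eta = np.asarray(change_points, dtype=float)
    dt = np.diff(eta)[:, None]                           # shape (K-1, 1)
    # Initial positions are pulled towards the origin
    start = -np.sum(Z[0] ** 2) / (2.0 * sigma0_sq)
    # Squared displacement of each node over each segment, shape (K-1, N)
    sq_steps = np.sum((Z[1:] - Z[:-1]) ** 2, axis=2)
    walk = -np.sum(sq_steps / dt) / (2.0 * sigma_sq)
    return float(start + walk)
```

Smaller values of `sigma_sq` make large displacements more costly, which is how the prior acts as a penalty on the speed of the nodes.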
Remark. The likelihood function of the original latent distance model of Hoff et al. (2002) is not identifiable with respect to translations, rotations, and reflections of the latent positions. This is a challenging issue in a Bayesian setting that relies on sampling from the posterior distribution. In fact, the posterior samples become non-interpretable, since the affine transformations may have occurred during the collection of the sample (Shortreed et al. 2006). These non-identifiabilities are not especially relevant in an optimization setting, since usually the equivalent configurations of model parameters lead to the same qualitative results and interpretations. However, a case for non-identifiability can be made for dynamic networks, since translations, rotations, and reflections can occur across time, thus affecting results and interpretation. The penalizations that we introduce in this paper ensure that the nodes move as little as necessary, thus disfavouring any rotations, translations and reflections of the space. As a consequence, the penalizations directly address the identifiability issues and the latent point process remains comparable across time.

Inference
In this section, we discuss the inference for the distance model described in Section 2.2.2, but an analogous procedure is considered for the projection model.
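Recalling that we work with undirected graphs, and thanks to Eq. (8), the penalized log-likelihood is

$$\ell(\beta, Z) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j \neq i} \left[ \sum_{e=1}^{E_{ij}} \left( \beta - \| z_i(\tau_e) - z_j(\tau_e) \|^2 \right) - \int_0^T \exp\left\{ \beta - \| z_i(t) - z_j(t) \|^2 \right\} dt \right] - \frac{1}{2 \sigma_0^2} \sum_{i=1}^{N} \| z_i(\eta_1) \|^2 - \frac{1}{2 \sigma^2} \sum_{i=1}^{N} \sum_{k=2}^{K} \frac{\| z_i(\eta_k) - z_i(\eta_{k-1}) \|^2}{\eta_k - \eta_{k-1}} + C, \qquad (9)$$

where $C$ is a constant term that does not depend on $(Z, \beta)$ and the integral can be explicitly computed as shown in Appendix B. Since this penalized log-likelihood has a closed form, we implement it and rely on automatic differentiation (Griewank 1989; Baydin et al. 2018) to maximize it numerically, with respect to $(\beta, Z)$, via gradient ascent.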
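As a toy illustration of gradient ascent in this setting, consider the simplified problem where the latent positions are held fixed and only the intercept $\beta$ is optimized. The distance-model log-likelihood then reduces, up to a constant, to $n \beta - e^{\beta} S$, where $n$ is the total number of interactions and $S$ is the sum over dyads of the integrals of $\exp\{-\| z_i(t) - z_j(t) \|^2\}$. The sketch uses a hand-derived gradient instead of automatic differentiation, and the step size is a heuristic chosen for this example; none of this is taken from the CLPM software.

```python
import math

def fit_intercept(n_events, integrated_term, n_steps=200):
    """Gradient ascent on f(beta) = n_events * beta - exp(beta) * integrated_term."""
    beta = 0.0
    # Heuristic step size: the curvature of f at its maximum equals -n_events
    step = 0.5 / n_events
    for _ in range(n_steps):
        gradient = n_events - math.exp(beta) * integrated_term
        beta += step * gradient
    return beta
```

In this simplified case the maximizer is available in closed form, $\beta = \log(n / S)$, which gives a convenient check on the iterations.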
We have implemented the estimation algorithm and visualization tools in a software repository, called CLPM, which is publicly available (CLPM GitHub repository 2021).
Moreover, as can be seen in Eq. (9), the penalized log-likelihood is additive in the number of nodes. Potentially, this remark allows one to speed up the inference of the model parameters by means of stochastic gradient ascent (Bottou 2010). Indeed, let us introduce $\psi_1, \dots, \psi_N$ such that

$$\ell(\beta, Z) = \sum_{i=1}^{N} \psi_i(\beta, Z) + C,$$

where $\ell(\beta, Z)$ denotes the penalized log-likelihood in Eq. (9), and a discrete random variable $\Psi(\beta, Z)$ such that

$$\pi\left( \Psi(\beta, Z) = N \psi_i(\beta, Z) \right) = \frac{1}{N}, \qquad i = 1, \dots, N,$$

where we stress that the above probability is conditional on $Z$ and given the model parameter $\beta$. Then, let us denote with $\nabla$ the gradient operator with respect to $(\beta, Z)$ and with $E_\pi$ the expectation taken with respect to the probability measure $\pi$ introduced above (and hence with $Z$ given). Then, we have the following

Proposition. $E_\pi\left[ \nabla \Psi(\beta, Z) \right] = \nabla \ell(\beta, Z)$.

Proof.

$$E_\pi\left[ \nabla \Psi(\beta, Z) \right] = \sum_{i=1}^{N} \frac{1}{N}\, N\, \nabla \psi_i(\beta, Z) = \sum_{i=1}^{N} \nabla \psi_i(\beta, Z) = \nabla \ell(\beta, Z),$$

where the last equality follows from the additivity of the gradient operator.

Experiments: synthetic data
In this section we illustrate applications of our methodology on artificial data. We propose two types of frameworks: in the first one, we consider dynamic block structures (which involve the presence of communities, hubs, and isolated points). In this case, our aim is to inspect how the network dynamics are captured by CLPM. In the second framework, we generate data using the distance CLPM and we aim at recovering the simulated trajectories for each node.

Dynamic block structures
Simulation study 1. In this first experiment, we use a data generative mechanism that relies on a dynamic blockmodel structure for instantaneous interactions (Corneli et al. 2017). We specifically focus on a special case of a dynamic stochastic blockmodel where we can have community structure, but we cannot have disassortative mixing, i.e. the rate of interactions within a community cannot be smaller than the rate of interactions between communities. In this framework, the dynamic stochastic blockmodel approximately corresponds to a special case of our distance CLPM, whereby the nodes clustered together are located close to each other.
In the generative framework that we consider, the only node-specific information is the cluster label; hence, this structure is not as flexible as the CLPM as regards modeling the nodes' individual behaviours. So, our goal here is to obtain a latent space visualization for these data, and to ensure that CLPM can accurately capture and highlight the presence of communities. An aspect of particular importance is how CLPM reacts to the creation and dissolution of communities over time: for this purpose, our generated data includes changes in the community structure over time.
For this setup, we consider the time interval [0, 40] (for simplicity we use seconds as the unit of time), and divide this into 4 consecutive time segments of 10 seconds each. In each of the 4 time segments, 60 nodes are arranged into different community structures. Thus, any changes in community structure are synchronous for all nodes and they happen at the endpoints of a time segment. The rate of interactions between any two nodes is determined by their group allocations in that specific time segment.
The rate remains constant in each time segment, so that we effectively have a piece-wise homogeneous Poisson process over time, for each dyad.
We denote with $X^{(s)} \in \mathbb{N}^{N \times N}$ a simulated weighted interaction matrix which counts how many interactions occur in the $s$-th time segment for each dyad: iv) in the time segment [30, 40], we are back to the same structure as in i).
Throughout the simulation, node 1 always behaves as a hub, and node 60 is always isolated. This means that node 1 interacts with rate 10 at all times with any other node, whereas node 60 interacts with rate 0.01 at all times with any other node, regardless of any cluster label.
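A piece-wise homogeneous Poisson process of this kind can be simulated dyad by dyad: within each time segment the number of interactions is Poisson with mean equal to the rate times the segment length, and, given that number, the interaction times are uniformly distributed on the segment. The sketch below follows this standard construction; the function name and the layout of the `rates` array are assumptions, and the rate values would have to be supplied by the user.

```python
import numpy as np

def simulate_edge_list(rates, segment_length, seed=0):
    """Simulate an edge list [(time, i, j), ...] with i < j.

    rates: array of shape (S, N, N); rates[s, i, j] is the interaction rate
           of dyad (i, j) during the s-th time segment (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=float)
    n_segments, n_nodes, _ = rates.shape
    events = []
    for s in range(n_segments):
        start = s * segment_length
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                # Number of interactions of the dyad in this segment
                count = rng.poisson(rates[s, i, j] * segment_length)
                # Given the count, the times are uniform on the segment
                times = start + segment_length * rng.random(count)
                events.extend((float(t), i, j) for t in times)
    return sorted(events)
```

Counting the simulated events of each dyad within a segment would give the corresponding weighted interaction matrix.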
In Figure 1 we show a collection of snapshots at some critical time points, for the projection model. The full videos of the results are provided in the code repository. The main observation is that the communities are clearly captured at all times, and they are clearly visually separated. While the two clusters with a stronger community structure are almost aligned to the axes, the non-community in the second time segment, which has a low interaction rate, is instead positioned more centrally and is more dispersed, but still separated from the others. The hub is always located very far from the origin and from other points, since this guarantees a large dot product value with respect to all other nodes. By contrast, the isolated node is always located at the origin of the space.
Figure 2 shows instead the snapshots for the distance model. In this case, the clusters are clearly separated at all times. The cluster with a strong community structure is less dispersed than the clusters with a weaker community structure. The hub is constantly positioned in the centre of the space, so as to minimize the distance from all of the nodes at the same time. The isolated node is instead wandering in the outskirts of the latent social space. The creation and dissolution of communities only happens close to the start or end of each time segment.
Remark. One main difference between the projection model and the distance model is that, in two dimensions, the distance model seems more suited for representing a diverse community structure. The main reason is that the projection model requires nodes of different communities (which have few interactions with each other) to be distributed along the axes, respectively. Indeed, this guarantees that they are close to perpendicular, hence facilitating a strong separation between communities. Now, in dimension two, this clearly can only happen with no more than two communities for the projection model.
By contrast, the distance model does not have this limitation, and it can more easily accommodate a large number of completely separated communities. We use this argument to favour the use of the distance CLPM in two dimensions, in our applications. However, we note that the projection model may be of interest in higher dimensions ($d \gg 2$) for purposes other than visualization (e.g. clustering or sub-space tracking).
Technical details regarding the simulation's parameters, including penalization terms and number of change points, can be consulted on the CLPM code repository.
Simulation study 2. In the second simulation study, we again use a blockmodel structure; however, in this case we approximate a continuous time framework by defining very short time segments, and letting the communities change from one time segment to the next. Since creations and dissolutions of communities would be unlikely in such a short period of time, we keep the community memberships unchanged, and we progressively increase the cohesiveness of the communities. This means that we progressively increase the rates of interactions between any pairs of nodes that belong to the same community, while keeping any other rate constant. The rate of interactions within each community starts at value 1 and increases in a step-wise fashion over 40 segments, up to the value 5. The time interval is [0, 40], and we consider two communities. Halfway through the simulation, a special node moves from one community to the other.
For the projection CLPM, we show the results in Figure 3, whereas Figure 4 shows the results for the distance model. Both approaches clearly capture the reinforcement of the communities over time by aggregating the nodes of each group. The projection model also exhibits nodes getting farther from the centre of the space, since this would give them higher interaction rates overall. As concerns the special node moving from one community to the other, this is well captured in that the node transitions smoothly after approximately 20 seconds, in both models.

Distance Model
Simulation study 3. In this simulation study, we generate data from the latent distance model itself (Section 2.2.2). In this case, our goal can be more ambitious and thus we aim at reconstructing the individual trajectory of each of the nodes, at every point in time, as accurately as possible. To make the reading of the results easier, we assume that the nodes move along some pre-determined trajectories that are easy to visualize.
The N = 20 nodes start on a ring which is centered at the origin of the space, and has radius equal to 1. The nodes are located consecutively and in line along the ring, with equal space in between any two consecutive nodes. Then, they start to move at constant speed towards the centre of the space, which they reach after 5 seconds. After reaching the centre, they perform the same motion backwards, and they are back at their initial positions after 5 more seconds. The trajectories of the nodes make it so that, when the nodes are along the largest ring, their rate of interaction is essentially zero; however, the rate increases as they get closer and closer to the centre of the space.
Figure 5 shows a collection of snapshots for the projection model. The nodes are approximately equally spaced along a line and they progress outwards from the centre of the space. As they get far apart from the centre and from each other, their dot products increase and so do their interaction rates. The projection model, which is not the same model that has generated the data, tends to spread out the nodes on the space, which is ideal and expected from these data. However, this means that some of the nodes almost point in perpendicular directions, which is at odds with the fact that, half-way through the study, all nodes should interact with all others.
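The trajectories described above are simple enough to be written down directly. The following sketch computes the position of every node at a given time and the corresponding distance-model rate for a pair of nodes; the value of the intercept `beta` is left as a free input, since it is not specified here, and the function names are hypothetical.

```python
import numpy as np

def ring_positions(t, n_nodes=20, radius=1.0, half_period=5.0):
    """Positions at time t in [0, 2 * half_period] of nodes moving from a ring
    to the origin and back, at constant speed. Returns shape (n_nodes, 2)."""
    angles = 2.0 * np.pi * np.arange(n_nodes) / n_nodes
    ring = radius * np.column_stack([np.cos(angles), np.sin(angles)])
    # Fraction of the radius remaining: 1 at t = 0, 0 at half_period, 1 at the end
    scale = abs(t - half_period) / half_period
    return scale * ring

def distance_rate(z_i, z_j, beta):
    """Distance-model rate exp(beta - squared Euclidean distance)."""
    return float(np.exp(beta - np.sum((z_i - z_j) ** 2)))
```

Evaluating `distance_rate` along the trajectories shows the rate of every pair increasing as the nodes approach the centre and decreasing again as they return to the ring.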
As concerns the results for the distance model, these are shown in Figure 6. The scale of the latent space is also correctly estimated, since the largest ring has approximately radius 1.
There are some important remarks to make. First, after 5 seconds, i.e. when all nodes are located close to the centre, it is understandable that a rotation or reflection (with respect to the origin of the space) may happen. This is inevitable since the solution can only be recovered up to a rotation/reflection of all the latent trajectories, but also because the first 5 seconds and the last 5 seconds can technically be seen as two independent problems. The collapse to zero can be seen as a reset in terms of orientation of the latent space. That is because the penalization terms only work with two consecutive change points, so, if we view them as identifiability constraints, they would lose their effectiveness when all the nodes collapse to zero for some time. A second fundamental remark is that the estimation procedure can lead to good results only if we observe an appropriate number of interactions. This is a specific trait of latent position models in general, since we can only guess the position of one node accurately when we know to whom it connects (or, in this context, how frequently), as we would tend to locate it close to its neighbors. In our simulated setting, there are few to no interactions when nodes are along the largest ring, so it makes sense that the results seem a bit more noisy in those instants.

Applications
In this section, we illustrate our approach on three real datasets, highlighting how we can characterize the trajectories of individual nodes, the formation and dissolution of communities, and other types of connectivity patterns. The simulation studies indicate that the distance model generally provides a more convenient and appropriate framework to study these aspects of the data. In addition, it is also easier to interpret, so we only show the results for the distance model and refer the reader to the associated code repository, where the complete results can be found.

ACM Hypertext conference dataset
The ACM Hypertext 2009 conference was held over three days in Turin, Italy, from June 29 to July 1. At the conference, 113 attendees wore special badges which recorded an interaction whenever two badges were facing each other at a distance of 1.5 meters or less, for at least 20 seconds. For each of these interactions, a timestamp was recorded as well as the identifiers of the two personal badges. This interaction dataset was first analysed by Isella et al. (2011), and is publicly available from Hypertext 2009 network dataset - KONECT (2017). Similarly to Corneli et al. (2016), we focus our analysis on the first day of the conference. On the first day, the main events that took place included a poster session in the morning (starting from 8 a.m.), a lunch break around 1 p.m., and a cheese and wine reception in the evening between 6 p.m. and 7 p.m. We use our distance CLPM to provide a graphical representation of these data, and to note how the model responds to the various gatherings that happened during the day. We set a change point ($\eta_k$) every fifteen minutes. Figures 8 and 9 provide a number of snapshots highlighting some of the relevant moments of the day. The complete results, shown as a video, can be found in the code repository that accompanies this paper.
We can see that, in the morning, there is a high level of mixing between the attendees.
The visitors tend to merge and split into different communities that change very frequently and seemingly at random. These communities reach a high level of clustering, which signals that the participants of the study are mixing into different groups. This is perfectly in agreement with the idea that the participants are moving from one location to another, as usually happens during poster sessions and parallel talk sessions. The nodes exhibit different types of patterns and behaviours, in that some nodes are central and tend to join many communities, whereas others have lower levels of participation and remain at the outskirts of the space. In the late morning, we see a clear and tight gathering around 12 p.m., whereby almost all nodes move towards the centre of the space. This is emphasized even more at 2 p.m., which corresponds to the lunch break. It is especially interesting that, even though the space becomes more contracted at this time, we can still clearly see a strong clustering structure.
In the afternoon, we go back to the same patterns as in the morning, whereby the participants mix in different groups and move around the space. The wine reception is also clearly captured around 6 p.m., where we see again some level of contraction of the space, which signals a large gathering of the participants.
After this event, the overall rate of interactions diminishes sharply, and as a consequence we see the nodes spreading out in the space.

Reality mining
The reality mining dataset (Eagle and Pentland 2006) is derived from the Reality Commons project, which was run at the Massachusetts Institute of Technology (MIT) from 14 September 2004 to 5 May 2005. The dataset describes proximity interactions in a group of 96 students, collected primarily through Bluetooth devices. An overview of this network dataset is also given by Rastelli (2019).
In the context of this paper, the proximity interactions can reasonably be considered instantaneous, since the study is 9 months long. With our latent space representation, we aim to highlight the patterns of connections of the students during the study, as well as any social communities that arise and how these change over time.
Figure 10 shows a few snapshots of our fitted distance CLPM. The complete results, shown as a video, can be found in the code repository that accompanies this paper.
We observe that, in general, the students are quite separated and few communities arise. This does not necessarily mean that the nodes do not interact, but it is a sign that there are no subgroups of students with an uncommonly high interaction rate. Also, it is important to point out that the latent space is very expanded: coupled with an estimated intercept value of 3.9, this signals that the latent space has a strong effect on modeling the interaction rates and that it can capture well the variability in the data.
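To make the role of the intercept concrete, the sketch below evaluates an interaction rate of the form $\exp(\beta - \|z_i - z_j\|_2^2)$, which is consistent with the exponent used in Appendix B. The exact functional form, the symbol `beta` and the two positions are assumptions made for illustration only; the time unit of the rate is not specified here.

```python
import math

def distance_rate(beta, z_i, z_j):
    """Rate exp(beta - squared Euclidean distance); assumed form of the distance model."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    return math.exp(beta - sq_dist)

beta_mit = 3.9  # estimated intercept reported for the reality mining data

# Two nodes at the same latent position versus two nodes at distance 2 (hypothetical positions).
rate_same_position = distance_rate(beta_mit, (0.0, 0.0), (0.0, 0.0))
rate_two_apart = distance_rate(beta_mit, (0.0, 0.0), (2.0, 0.0))
```

Under this assumed form, moving two nodes from distance 0 to distance 2 divides the rate by $e^{4}$, which is why a widely spread latent space combined with a large intercept indicates that the positions account for much of the variation in the rates.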
Over time, the students tend to mix in different social groups, thus quickly forming and dissolving communities. This could be explained by the interactions that the students have due to college activities or other daily activities. Near the end of the study, a large cluster appears, signaling a large gathering in which the students participated. This may correspond to the period before a deadline, as outlined in Eagle and Pentland (2006).

London bikes
Infrastructure networks provide an excellent example of instantaneous interaction data.
In this section we consider a network of bike hires which is collected and publicly distributed by Transport for London: Cycle hire usage data 2012-2015. We focus on a specific weekend day (Sunday 6 September 2015), and study the patterns of interactions between all bike hire stations in London. The bike hire stations correspond to the nodes of our network, whereas an instantaneous interaction between two nodes at time t simply means that a bike started a journey from one station towards the other, at that time (we consider undirected connections).
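The conversion from journeys to undirected instantaneous interactions can be sketched as follows. The tuple layout, the station identifiers and the decision to discard journeys that start and end at the same station are assumptions made for this example; they do not describe the preprocessing actually used for the results.

```python
def to_undirected_events(journeys):
    """Turn (start_time, origin, destination) journeys into undirected dyadic events.

    Assumption: journeys returning to their origin station carry no dyadic
    information and are dropped.
    """
    events = []
    for start_time, origin, destination in journeys:
        if origin == destination:
            continue
        # Order the pair so that (a, b) and (b, a) map to the same dyad.
        i, j = sorted((origin, destination))
        events.append((start_time, i, j))
    events.sort()  # chronological order
    return events

# Hypothetical journeys: start time in hours, then station identifiers.
toy_journeys = [(9.5, 14, 3), (8.25, 3, 14), (10.0, 7, 7), (8.75, 21, 3)]
toy_events = to_undirected_events(toy_journeys)
```

Sorting the pair before storing it is what makes the connection undirected: a journey from station 14 to station 3 and one from 3 to 14 both contribute to the dyad (3, 14).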
In Figure 11 we show a collection of snapshots at some critical time points during the day, for the distance model. The complete results, shown as a video, can be found in the code repository that accompanies this paper.
Although there are a total of 818 stations that are active in this dataset, we provide a visualization for the 60 most active stations only. However, we emphasize that the results were obtained using the whole dataset. In addition, we highlight with a different color the 3 stations with the highest number of interactions overall. These stations are:
• Belgrove Street, King's Cross, situated next to King's Cross Square (shown in red);
• Finsbury Circus, Liverpool Street, situated next to Liverpool Street station (shown in blue);
• Newgate Street, St. Paul's, situated next to St. Paul's Cathedral (shown in green).
The first aspect that we notice is that the latent space expands during inactive times, and it contracts during busy hours. The contractions and expansions are not homogeneous; rather, they highlight the presence of dense and less dense clusters of stations.
The estimate for the intercept parameter is −5.2, and the dispersion of the points in the latent space is not particularly large. This signals that the latent space characterization is not having a very strong effect on the rate of interactions, and the model does not capture much variability in the rates of interactions. This highlights that these connectivity data follow patterns that cannot be completely explained by the purely geometrical nature of our model. That is, the connections are determined by a variety of factors that cannot be framed into this latent positions context, and the geographical information on bike hiring accounts for only a part of the problem.

Conclusions
We have introduced a new time-continuous version of the well known and widely used latent position model, as an extension which can model instantaneous interactions between entities. We have proposed a new methodology which provides good flexibility while also allowing for an efficient inferential framework. The methodology is implemented in our software CLPM, which accompanies this paper and is publicly available. This provides an essential additional tool for practitioners who are interested in deriving latent space visualizations from observed instantaneous interaction data.
The framework that we propose is highly inspired by the work of Hoff et al. (2002), and by the vast literature that has followed in this direction. Our work combines some crucial theoretical and statistical aspects of latent position modeling with a pragmatic approach to inference and visualization of the results. Crucially, we provide simulation studies and real data applications to demonstrate how our method leads to sensible and accurate results, with low computational demands.
As regards extensions and future work, our research opens up several new directions, to address and potentially change some crucial parts of our procedure. A fundamental challenge is related to the geometric nature of the latent space. In this paper, and in the literature cited here, affine latent spaces are considered, endowed with the standard dot product, which, in turn, induces the Euclidean distance. However, some important works in the literature of the static latent position model consider the latent space to be spherical (McCormick and Zheng 2015) or hyperbolic (Krioukov et al. 2010; Asta and Shalizi 2015). As expected, since latent position models are generative models, the geometry of the latent space has crucial consequences on the properties of the simulated network (Smith et al. 2019). Indeed, recently, Lubold et al. (2020) introduced a method to consistently estimate the manifold type, dimension, and curvature from a class of latent spaces. Addressing these topics in the context of dynamic latent position models is a promising avenue of research that can extend our work. Another challenging aspect of our methodology regards inference: in this paper, we propose an optimization approach to maximize a penalized likelihood. An interesting alternative would be to consider a different approach that could allow one to also quantify uncertainty around the parameter estimates. Finally, in terms of modeling, we use piece-wise continuous trajectories due to their flexibility and easy tractability; however, alternatives to this parametrization may also be considered.

B Log-likelihood for the distance model
We focus on the integral in Eq. (6) and prove that it can be explicitly solved: [...] where the variable change $t = \frac{s - \eta_g}{\eta_{g+1} - \eta_g}$ was performed and the following notations were adopted to simplify the exposition:

$$\Delta_i^g := z_i(\eta_{g+1}) - z_i(\eta_g), \qquad \Delta_j^g := z_j(\eta_{g+1}) - z_j(\eta_g), \qquad z_i^g := z_i(\eta_g), \qquad z_j^g := z_j(\eta_g).$$

By denoting

$$g(t) := -\left\| t\,(\Delta_i^g - \Delta_j^g) + (z_i^g - z_j^g) \right\|_2^2$$

the exponent inside the integral, we can "complete the square" as follows [...] where [...]. By plugging all this into Eq. (12) it follows that [...]

The above proposition allows us to sample (subsets of) nodes uniformly at random, with re-injection, and use each sample (a.k.a. mini-batch) to update the model parameters via stochastic gradient ascent, as shown in Bottou (2010).
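The square-completion argument can be checked numerically. Writing $a = \|\Delta_i^g - \Delta_j^g\|_2^2$, $b = \langle \Delta_i^g - \Delta_j^g,\, z_i^g - z_j^g \rangle$ and $c = \|z_i^g - z_j^g\|_2^2$, elementary algebra gives $g(t) = -a\,(t + b/a)^2 - (c - b^2/a)$ whenever $a > 0$, so the integral of $\exp\{g(t)\}$ over $[0, 1]$ reduces to a difference of error functions. The sketch below implements this closed form and compares it with a midpoint quadrature; the symbols $a$, $b$, $c$, the function names and the test vectors are introduced here for illustration and need not match the notation of the missing displays or the CLPM software.

```python
import math

def unit_segment_integral(delta, z_diff):
    """Closed form of the integral over t in [0, 1] of exp(-||t * delta + z_diff||^2).

    delta  plays the role of Delta_i^g - Delta_j^g,
    z_diff plays the role of z_i^g - z_j^g.
    """
    a = sum(d * d for d in delta)                   # ||delta||^2
    b = sum(d * z for d, z in zip(delta, z_diff))   # <delta, z_diff>
    c = sum(z * z for z in z_diff)                  # ||z_diff||^2
    if a == 0.0:
        # The two nodes keep a constant distance on this segment.
        return math.exp(-c)
    mu = -b / a                                     # centre of the completed square
    root = math.sqrt(a)
    gauss = 0.5 * math.sqrt(math.pi / a) * (
        math.erf(root * (1.0 - mu)) + math.erf(root * mu)
    )
    return math.exp(-(c - b * b / a)) * gauss

def midpoint_check(delta, z_diff, n=20000):
    """Midpoint-rule approximation of the same integral, used only as a check."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += math.exp(-sum((t * d + z) ** 2 for d, z in zip(delta, z_diff)))
    return total / n

closed_form = unit_segment_integral((0.7, -1.2), (0.3, 0.5))
numeric = midpoint_check((0.7, -1.2), (0.3, 0.5))
```

Because the variable change maps $[\eta_g, \eta_{g+1}]$ onto $[0, 1]$, the integral over the original segment is the value above multiplied by $\eta_{g+1} - \eta_g$, and by any constant factor of the rate that does not depend on time.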
where $\mathcal{P}(\cdot)$ indicates the Poisson probability mass function, and $C$ is a latent vector of length $N$ indicating the cluster labels of each of the nodes. Once we know the number of interactions for each dyad and each segment, the timing of these interactions can be sampled from a uniform distribution in the respective time segment. More in detail, the rate parameters are characterized as follows:
i) in the time segment [0, 10[, the expected number of interactions is the same for every pair of nodes: $\theta^{(1)}_{c_i c_j} = 1$, for all $i$ and $j$;
ii) in the time segment [10, 20[, three communities emerge, in particular $\theta$ [...] whereas the rate for any two nodes in different communities is 1;
iii) in the time segment [20, 30[, the first community splits and each half joins a different existing community. The two remaining communities are characterized by $\theta$ [...] Again, any two nodes in different communities interact with rate 1;
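The generative mechanism described above can be sketched as follows: for each time segment and each dyad, a Poisson count is drawn with a mean that depends on the two cluster labels, and the corresponding interaction times are then drawn uniformly within the segment. The within-community value 5.0, the number of nodes, the label vectors and the function name are placeholders chosen for this example, since the actual values are not recoverable from the text.

```python
import numpy as np

def simulate_block_interactions(labels_by_segment, theta_by_segment, segment_edges, seed=0):
    """Simulate (time, i, j) events from a piecewise-constant block structure.

    labels_by_segment[g][i] is the cluster label of node i in segment g, and
    theta_by_segment[g][c, d] is the expected number of interactions in
    segment g between a node in cluster c and a node in cluster d.
    """
    rng = np.random.default_rng(seed)
    events = []
    for g, (labels, theta) in enumerate(zip(labels_by_segment, theta_by_segment)):
        start, end = segment_edges[g], segment_edges[g + 1]
        n = len(labels)
        for i in range(n):
            for j in range(i + 1, n):
                count = rng.poisson(theta[labels[i], labels[j]])
                # Given the count, interaction times are uniform on the segment.
                times = rng.uniform(start, end, size=count)
                events.extend((float(t), i, j) for t in times)
    events.sort()
    return events

# Segment 1: a single group with mean 1 for every dyad.
# Segment 2: three communities; 5.0 within communities is a hypothetical value.
theta_1 = np.array([[1.0]])
theta_2 = np.full((3, 3), 1.0)
np.fill_diagonal(theta_2, 5.0)
sim_events = simulate_block_interactions(
    labels_by_segment=[[0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 2, 2]],
    theta_by_segment=[theta_1, theta_2],
    segment_edges=[0.0, 10.0, 20.0],
)
```

Letting the label vector change from one segment to the next is what allows communities to split or merge over time, as in the third segment of the description.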

Figure 1: Simulation study 1: snapshots for the projection model.

Figure 2: Simulation study 1: snapshots for the distance model.

Figure 3: Simulation study 2: snapshots for the projection model.
, and they highlight that the true trajectories are essentially accurately recovered. The model can capture really well the contraction and expansion of the latent space, and the individual trajectories of the nodes are closely following the theoretical counterparts. The scale of 1

Figure 6: Simulation study 3: snapshots for the distance model.

Figure 7: ACM application: cumulative number of interactions for each quarter hour (first day).

Figure 8: ACM application: snapshots for the distance model (morning hours).

Figure 9: ACM application: snapshots for the distance model (afternoon hours).

Figure 10: MIT application: snapshots for the distance model.

• Belgrove Street, King's Cross, situated next to King's Cross Square (shown in red);
• Finsbury Circus, Liverpool Street, situated next to Liverpool Street station (shown in blue);
• Newgate Street, St. Paul's, situated next to St. Paul's Cathedral (shown in green).