Assessing the stability of egocentric networks over time using the digital participant-aided sociogram tool Network Canvas

Abstract This paper examines the stability of egocentric networks as reported over time using a novel touchscreen-based participant-aided sociogram. Past work has noted the instability of nominated network alters, with a large proportion leaving and reappearing between interview observations. To explain this instability of networks over time, researchers often look to structural embeddedness, namely the notion that alters are connected to other alters within egocentric networks. Recent research has also asked whether the interview situation itself may play a role in conditioning respondents to what might be the appropriate size and shape of a social network, and thereby which alters ought to be nominated or not. We report on change in these networks across three waves and assess whether this change appears to be the result of natural churn in the network or whether changes might be the result of factors in the interview itself, particularly anchoring and motivated underreporting. Our results indicate little change in average network size across waves, particularly for indirect tie nominations. Slight, significant changes were noted between waves one and two particularly among those with the largest networks. Almost no significant differences were observed between waves two and three, either in terms of network size, composition, or density. Data come from three waves of a Chicago-based panel study of young men who have sex with men.


Introduction
This paper examines the stability of egocentric social networks as reported over time using a novel touchscreen-based participant-aided sociogram (PAS) implemented within Network Canvas. The PAS is an extension of earlier "bullseye" or "target" diagram ways of visually arranging social contacts (Bellotti, 2014). It is seen as an alternative to traditional name generator techniques for eliciting social ties, especially when one wishes to record a network that resembles the set of close personal ties (Hogan et al., 2007). Network Canvas, a recent introduction into this field, is a full screen computer program that allows respondents to interact with circles representing alters and lines between the circles representing indirect ties. It has additional features for collecting alter-specific data akin to data collected from traditional name interpreter questions.
Network Canvas is one of a number of recent entrants in computer-supported network data collection, such as OpenEddi (Eddens & Fagan, 2018), VennMaker (Gamper et al., 2012), EgoWeb (McCarty & Govindaramanujam, 2005) and GENSI (Stark & Krosnick, 2017). In many cases, a key design goal of such software is to provide a means to collect data that minimises the burden on both the respondents and those who subsequently manage these data in the lab. Yet, with new features, designs, and conditions it is important to assess the quality of networks produced using such approaches.
To a large degree, the impetus for designing PAS techniques has been to increase the intelligibility and to reduce the burden of the name generator instrument (McCarty & Govindaramanujam, 2005). Researchers have noted that name generators have been considered a burden for respondents, especially when it is important to collect data on ties between alters (McCarty et al., 2007). The conventional approach, the "dyad census", asks about all possible combinations of alters. Unfortunately, a linear increase in the number of alters leads to a quadratic increase in the number of possible dyads (and thus questions to ask), since n alters form n(n-1)/2 pairs. For perspective, asking about all dyads between 4 alters would require 6 questions, but for 20 alters it would require 190 questions.
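The question counts above follow from the number of unordered pairs among n alters. A minimal Python sketch of that arithmetic (the function name `dyad_count` is ours, chosen for illustration):

```python
from math import comb


def dyad_count(n_alters: int) -> int:
    """Number of alter-alter pairs a full dyad census has to ask about."""
    # Unordered pairs: n * (n - 1) / 2
    return comb(n_alters, 2)


for n in (4, 10, 20):
    print(n, dyad_count(n))
```

Doubling the number of alters roughly quadruples the number of questions, which is the burden that visual tie elicitation is meant to reduce.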
The invasive nature of the questions combined with the complexity of the research task means there exist concerns about the quality of the data that comes from name generators. Three concerns stand out: the boundary problem, instrument bias, and panel conditioning. The boundary problem (Laumann et al., 1983) refers to the challenge of communicating the boundary of who should be included or excluded from a network, in this case referring to personal, egocentric networks. Instrument bias refers to the way in which the nature of the data collection task can suggest or nudge individuals into withholding important data or including unnecessary or extraneous details. Panel conditioning refers to changes in responses at a subsequent stage or wave that occur because of the experience in a previous stage or wave of data collection (Silber et al., 2019; Warren & Halpern-Manners, 2012).
These concerns can undermine both the validity and the reliability of network research. Validity is a concern when we believe there exist real observable phenomena that are being excluded or extraneous data that are being included. Validity is based on the premise that there exists a valid answer to a question. The challenge with assessing validity in egocentric networks is that there is often no way to unambiguously and consistently establish membership in a network. Some researchers have sidestepped this by using an exogenous network (rather than the respondent's own network) as a response. For example, Eddens & Fagan (2018) asked respondents to watch clips from a television show to test the validity of visual approaches versus a more conventional dyad census. Admittedly, this was not an egocentric network. Nevertheless, the study reinforces the potential for visual network approaches to have high accuracy scores relative to other approaches.
In contrast to validity, reliability in social network research concerns whether results are consistent between measurements. Whether the respondent consistently nominates too many or too few people relative to the "true" number would be a matter of validity, but the fact that the respondent continues to name just as many people or just as many ties would be a matter of reliability.
In this paper, we focus on the reliability of network measurements across waves of a longitudinal study. We cannot establish whether the alters mentioned are truly members of an egocentric network. Similarly, we cannot directly evaluate the impact of Network Canvas on the validity of alter nomination given that all data were collected using this tool. In this research context, we can nevertheless measure whether respondents give consistent numbers of alters, whether the choice of alters nominated (or not) in subsequent waves conforms to expected structural patterns, and whether there are consistent numbers of ties between the nominated alters wave on wave.
Of the three concerns stated above (boundary problems, instrument bias, and panel conditioning), we are most interested in panel conditioning in this analysis. In an observational study, it is difficult to assess whether individuals have clearly established boundaries and whether those nominated truly belong in the network given the specific prompts. Doing this would require the sort of experimental design of Silber et al. (2019), who withheld the name generator from one wave for half of respondents. In that case, they found little evidence of panel conditioning, but this was using a considerably smaller network akin to the name generator found in the General Social Survey. Furthermore, instrument bias is an important issue and one that we will mention throughout the paper, but unfortunately, given an observational study where the instrument is the same in all three waves, we can do little to assess the specific bias of the instrument. To note, past work with this sample has compared one wave of this sample to a comparable interview task for sexual contacts and found little difference in rates of reporting. Yet, having only one wave of data, there was little opportunity to study panel conditioning, which typically requires multiple waves of data (or multiple repeated sequences within the same instrument; Warren & Halpern-Manners, 2012).
Panel conditioning can occur for a number of reasons. In fact, it can be argued that panel conditioning can be seen as a form of instrument bias when the bias means that prior exposure to the instrument makes a difference to subsequent questions. For example, Eagle & Proeschold-Bell (2015) reported that when respondents were asked to nominate advice networks, the follow-up approach affected reporting rates. In another example, individuals may find the data collection task to be burdensome or overly complex at first observation, and then they may strategically withhold alter nominations at the next observation to get through the task more quickly, a phenomenon previously observed as "motivated underreporting" (Tourangeau et al., 2012). Researchers have speculated that panel conditioning affected responses in the 2004 General Social Survey (Paik & Sanchagrin, 2013). Respondents were first asked to list voluntary associations and then to talk about each one in turn. When the next section came and they were asked to list core discussion networks, many people appeared reluctant to do such a task again. Where the modal category was two friends in 1984, it was zero in 2004 (McPherson et al., 2006).
One challenge of assessing panel conditioning between waves is that we expect networks to vary over time. To that end, test-retest measures are not expected to perfectly correlate. In fact, a perfect correlation between waves might be a concern. A perfect correlation means individuals exclusively selected all the alters from the previous wave, a very implausible outcome given the voluminous research indicating churn in networks over time (discussed below). Thus, if we do not observe some change between waves, then we might not be observing a real egocentric network, but an artefact of the data collection process.
On the other hand, if the entire sample shows change over time (for example, if the average number of alter nominations continues to decrease wave on wave) then we believe this is evidence of panel conditioning. The plausible explanation would be that the instrument is so burdensome that respondents engage in increasing amounts of motivated underreporting.
To explore changes in alter nominations over time, we use data from RADAR, a longitudinal study of sexual behaviors and drug use within a sample of young men who have sex with men (YMSM) based in Chicago. The sample's consistency in age, gender and sexuality permits us to explore test-retest rates under relatively stable and homogeneous conditions. This is not unlike Eagle and Proeschold-Bell's use of repeated name generators among clergy in North Carolina (2015). Unlike that case, however, we are testing a visual interface and assessing tie counts as well as alter counts via Network Canvas.
Below we first review change in egocentric networks relevant for this sample. We then reflect in greater detail on some advances in participant-aided sociograms. We then report empirical data. We first focus on counts of alters and change in alter nominations between three waves of a study. We pay particular attention to the notion of structural embeddedness. That is, if an alter is connected to others in the network in any identifiable way they are embedded in the social structure. We then report on counts of indirect ties and examine change in tie nominations between waves before concluding with observations and limitations.
Before continuing, some notes on terminology: A "respondent" is a person who consents to disclose information about their relations to others. These relations are referred to as "ties" and the people that are tied to the respondent are called "alters". In this paper all ties are undirected. Although we use the term respondent to refer to the person present in the research study, we also use the term "ego" when discussing the respondent with respect to network ties, thus we refer to ego-alter ties and not respondent-alter ties. A respondent could have ties to many people. If the tie is between ego and alter, it is a "direct tie" (since ego has direct experience of this tie) and if it is a reported tie between two alters nominated by ego, this is termed an "indirect tie" (Phillips II et al., 2017).

Considering network change over time
Past work in social networks has examined change over time and found that egocentric networks have considerable stability within a core of personal contacts as well as considerable churn among weaker ties (Perry & Pescosolido, 2012; Bidart & Lavenu, 2005; Suitor & Keeton, 1997; Morgan et al., 1997). Suitor & Keeton (1997), in their study of women's networks in midlife, found that two-thirds of the personal ties persisted when measured a year later. A full quarter to a third of ties persisted ten years later. Similarly, Morgan et al. (1997) examined the networks of recent widows at seven time points in a single year. They noted that some individuals appeared to move in and out of the network while others tended to persist across waves. In a comparable manner to this research, Morgan et al. (1997) used a target diagram modelled after Antonucci's target diagrams (1986). The authors report that roughly a third of network members would be nominated through all three waves, with the remainder churning between waves.
Past research has indicated that networks of young adults are particularly prone to change, as youth are likely to change settings and residences as they transition between jobs and sites of post-secondary education (Wrzus et al., 2013). Others have noted that networks can change considerably over the span of a few years due to issues with mental health and support. For example, Perry & Pescosolido (2012) used the Indianapolis Mental Health Study to demonstrate how some mental health issues seemed to be correlated with overall increases or decreases in network size, despite the persistence of a stable core of ties.
Although a lot of work has looked at change in the number of alter nominations over time, there is less work on the number of ties over time. However, we believe that ties are also of interest, particularly considering the visual nature of Network Canvas as a means to simplify the collection of indirect ties between alters. This leads to the first research question.

Network change as based on social connectedness
Friends change with contexts and across the life course (Bidart & Degenne, 2005; Pahl & Spencer, 2004). Further, due to the challenge in specifying the boundary for what constitutes an egocentric network, even the same person can report wildly different networks depending on the methodology (Bernard et al., 1990). For example, summation methods such as those introduced by Bernard et al. (1990) seek to capture all known alters and provide estimates in the hundreds. On the other hand, the core discussion network name generators used by Burt (1984) in the General Social Survey reported a maximum of six alters. As we ask for nominations based on social support, sexual contact, and drug use, the networks observed here tend to be closer to core discussion networks than full contact networks.
The notion that egocentric networks tend to have a stable core and a turbulent periphery has led authors to discuss the structural embeddedness of ties (Perry & Pescosolido, 2012; Morgan et al., 1997). Ties that are structurally embedded are presumed to be more enduring than ties that are not. Structural embeddedness is not about the type of tie, but its capacity to link people. Thus, if a respondent knows "Alice" professionally, but their friends know Alice outside of work, then Alice is structurally embedded even if the tie between the respondent and Alice is a work tie rather than a social tie. Taking into account multiple alters and alter types is akin to the pooled "total network" of Perry & Pescosolido (2012), who assessed network dynamics of individuals seeking mental health treatment.
Our research context offers us the opportunity to look not only at alters nominated because they represent a social tie to ego, but also because they potentially represent a sexual or drug use tie. Granted, all ties are social in some sense. Yet we believe that ties nominated because they use drugs with ego or have had sex with ego may persist or disappear at a different rate than ties nominated as friends or family members. Having the opportunity to compare ties nominated as social, sexual or drug use ties allows us to further understand what leads to the persistence of a tie (either an ego-alter tie or an observed indirect tie between alters). Broadly speaking, we would expect family and "serious partners" to persist in the network whereas we would expect casual sexual contacts and drug contacts to be less likely to persist in the network, with one proviso: if the respondent reports that two alters in the network have had sexual contact or used drugs together, we believe this represents a form of embeddedness (and personal disclosure) that indicates the alters are structurally embedded. By implication, if ego had sex with an alter but that alter does not know anyone else in the network, the alter is less likely to persist. If such an alter is nominated in a subsequent wave, we believe it is likely that ego is in the process of integrating that alter into the larger egocentric network. To this end, when we talk about structural embeddedness in a network, we will explore any kind of link between alters and not merely social ties. Thus, we arrive at the second research question.
Research Question 2. What elements of social role and social connection (considered together as structural embeddedness) are related to the persistence of alters across waves?

Network change and instrument bias
The use of name generators for small egocentric networks is well established (Wellman, 1993). Yet, the issues with such name generators are also well established. As past researchers have noted, name generators are often tedious and tiring (McCarty et al., 2007). The outcome of a high respondent burden is that it is suspected that respondents underreport alter nominations or half-heartedly complete alter information to get through the task (Paik & Sanchagrin, 2013; Tourangeau et al., 2012). In cases of socially sensitive data, especially sexual contact and drug use, it is important to get full estimates of incidence. This refers both to the reporting of alters as well as the reporting of indirect ties.
Recently, researchers have investigated whether the instruments themselves influence reporting rates (Tubaro et al., 2014; Vehovar et al., 2008). For example, Bidart & Lavenu (2005) noted that differences in question wording can affect reporting rates despite this persistent stable core.
Past work has indicated that individuals can understand a visual sociogram with little instruction, and even identify social roles (Lee & Archambault, 2016; McConnell et al., 2018). Individuals can navigate interactive sociograms and use them to identify important alters (Jeon et al., 2016). Yet, the visual sociogram might also serve as a form of visual "anchoring", giving people a sense of when a network is "too dense" via its look rather than its accuracy. It might also suggest a certain limit by indicating how many nodes fit on a screen without scrolling. Furthermore, repeated administration of a PAS might lead people to have expectations about what constitutes the "right size" based on whether the previous wave led to too few or too many contacts.
When Eagle & Proeschold-Bell (2015) investigated a core discussion name generator for interviewer effects they also discovered plausible evidence for panel conditioning derived from their research design. They deployed a name generator that allowed for more than five names but only asked name interpreter questions for the first five. In subsequent studies, respondents were far more likely to nominate precisely five people, instead of fewer or more names. This is panel conditioning as a form of anchoring.
While anchoring is a form of unintentional panel conditioning, motivated underreporting is a more pernicious form of panel conditioning. With motivated underreporting, it is less that the instrument suggests a certain size or shape of a network and more that the respondent simply does not like the experience of the research instrument (Tourangeau et al., 2012). Thus, the respondent will try to complete a task as quickly as possible and thus will intentionally underreport names to accomplish this. Motivated underreporting in this case is a form of panel conditioning because the respondent has to be already aware that if they nominate more names it will lead to more questions after the name generator.

Research Question 3. To what extent do we observe systematic biases between waves that might be evidence of panel conditioning, either as anchoring or motivated underreporting?
To address this research question we will examine changes in the number of alters or edges over time. If, at the sample level, there is a randomly distributed but consistent decline in the number of alters nominated between waves, this would suggest that the respondents are engaging in motivated underreporting. If, on the other hand, only those with very large networks underreport alter nominations in subsequent waves, this would suggest anchoring is a more plausible explanation, akin to the regression to the mean explanation for such results suggested by Silber et al. (2019).

Methods and analytical framework
Data for this study come from RADAR, a longitudinal cohort study of over 1,000 YMSM from the Chicago metropolitan area. Respondents completed interviews at six-month intervals and provided individual, network, and biological data at each visit. At the point of writing, 814 individuals had completed a third visit (enrolled in the study for approximately 18 months) and were included in the analysis. In order to enrol in the cohort, respondents had to be aged 16 to 29 years and assigned a male sex at birth.
Data collection began in February 2015 and is ongoing. We align by wave rather than date of interview though the difference between the first respondent of wave one and the most recent respondent of wave one was a little over 18 months. For each respondent there was an expectation that they would return at approximately six-month intervals. Those who did not return within roughly seven months were then rescheduled to the next wave. To that end, there were 39 cases that we had to eliminate because they had wave one and wave three data but skipped wave two. Two additional cases were removed because of an absence of wave one data, leaving a total of 773 respondents in the final analytic sample.

Network Canvas and the collection of data
Network Canvas was developed as a tool to simplify and improve the collection of network data, particularly egocentric data. This study employs data collected in RADAR using an early Network Canvas version (named NetCanvas-R). Later versions are broadly similar (see http://networkcanvas.com/ for the latest releases). Importantly, data used in this study could be collected using the current design of Network Canvas with little difference to respondents except subtle aesthetic tweaks. NetCanvas-R (and Network Canvas generally) emphasizes a visual, tactile interface reminiscent of paper-based participant-aided sociograms (see Figure 1). In addition to a name generator, NetCanvas-R has a means for respondents to provide indirect ties in a visual frame. Alters are represented as circles and tapping one after the other creates a tie between two alters.

Longitudinal data
When collecting network data at a subsequent time point, researchers have the option of introducing data collected in previous waves. Introducing names might bias people to include more persons wave on wave as they remember new names in addition to previously mentioned names. On the other hand, withholding names between waves might lead to unnecessary forgetting (Wright & Pescosolido, 2002; Brewer, 2000). The RADAR study opted to include a panel on the left-hand side of the name generator that showed the names mentioned in the immediately preceding wave. The respondent could move a node from the panel of previously nominated alters into the panel of currently nominated alters. The practical justification is that we wanted to link nodes across waves. Having the respondent literally indicate which alters were still relevant enabled us to link the alters from one wave to the next using a system identifier. This is only one possible strategy for combining recognition and recall. By contrast, in the Indianapolis Mental Health Study, respondents were first asked to do free recall and were subsequently allowed to cross-check with previously nominated alters (Wright & Pescosolido, 2002). Although this was not done with Network Canvas in this instance, such a two-stage strategy is possible.
With NetCanvas-R, respondents could only see names from the immediately preceding wave. This is because the list of names could grow over many waves (at time of submission some respondents are on wave six) and become unmanageable. Since the respondent could only see names from the previous wave, anyone mentioned in wave one but not in wave two would have to be created as a new, unlinked node in wave three. This means that any estimates of network size pooled across three waves will possibly be inflated by the number of alters who are the same person in waves one and three but not counted as such. We do not consider these pooled network counts for anything other than descriptive purposes and focus instead on testing changes between waves one and two and waves two and three. The pooled counts will still be presented for illustration.
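The inflation of pooled counts can be illustrated with a toy example. The identifiers and the helper `pooled_size` below are hypothetical and do not reflect the actual NetCanvas-R data format; the sketch only shows why a person who skips a wave is counted twice.

```python
# Hypothetical system identifiers for one ego's alters in each wave.
wave1 = {"a1", "a2", "a3"}
wave2 = {"a1", "a2"}   # a3 is not nominated, so a3 is not shown in wave three
wave3 = {"a1", "n7"}   # n7 is the same person as a3, re-entered as a new node


def pooled_size(*waves):
    """Count distinct identifiers across waves (not distinct people)."""
    return len(set().union(*waves))


print(pooled_size(wave1, wave2, wave3))  # counts a3 and n7 separately
print(len(wave1 & wave2))                # alters linked between adjacent waves
```

Comparisons between adjacent waves are unaffected, because identifiers are always carried forward from the immediately preceding wave.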

Variables used in the analysis
The alter nomination step consists of multiple prompts designed to identify alters known as social contacts, sex contacts, or drug use contacts.
Alters in this study are considered social contacts if they are nominated from any of the name generator prompts. For socially important alters RADAR starts with a single item: "Who are the people you are closest to" with a subheading of "That is, people you saw or talked to regularly and shared your personal thoughts or feelings with during the last 6 months". Then RADAR includes individuals who were nominated through the other alter nomination screens as additional social alters. In that sense, the count of social alters can also be treated as total alter nominations.
We nominate sex contacts in two stages. The first is a sex contact name generator prompt that appears after the closeness name generator prompt: "Is there anyone else that you didn't list that you have had sex with in the past 6 months". For alters who were mentioned in other prompts that might also be sex contacts, we include an additional step on a later screen once all alter prompts have been presented. On this later screen, respondents are asked to "[t]ap on all the people with whom you have had sex in the last six months." Contacts who were previously nominated as sex alters are already highlighted when the respondent reaches this screen. The procedure for identifying drug use alters is the same except in all cases, the words "had sex" are replaced with "used marijuana or other drugs".
For social roles we ask individuals to identify the role of the alter from a list of many potential roles. This identification happens during the initial name generation. Roles include "extended family", "neighbour", "hook-up", etc. These roles have been coarsened to "family", "friends", "casual partner" and "serious partner" where relevant for this analysis.
For indirect ties, we use a series of prompts that mirror the prompts for alters. The indirect ties come from three questions that follow the theme of the name generators. Although all alters were coded as social contacts if they were nominated by ego, we do not consider every indirect tie to be a social tie when reporting. Social indirect ties are only those ties that appeared during the stage with the prompt: "Connect any two people you know who are very close or spent time together in the past six months". To note, we acknowledge that this question has some ambiguity in that it could mean ([very close] or [spent time in the past six months]) or ([very close or spent time together], [in the last six months]). This ambiguity is less the case in waves two and beyond since the interviewer guide stipulates that six months should be taken to mean since the last interview. We encourage future researchers to consider splitting this question into "very close" and "spent time together" as separate questions.
For indirect sex ties, we ask "please connect any two people who have had sex in the past 6 months". For indirect drug ties, we ask "please connect any two people who used marijuana or other drugs together in the past six months".
For the questions on structural embeddedness, we link alters if there has been any tie between them at any wave. This is because structure refers to the broad class of connections between people known to the respondent. Doing drugs is a social activity as is having sex. Thus, if alters are connected via these routes, it could be seen as evidence of being embedded in a social structure. Thus, when we investigate whether a casual sex alter appears from one wave to another, it makes sense to examine not just who the alter had sex with, but who that alter is friends with. More advanced modelling taking into account whether the specific type of tie leads to being mentioned in a subsequent wave is welcome but out of scope for this paper.
As we are discussing a network that is only partially observed at any time point, we are hesitant to include structural metrics that include shortest paths, random walks or any measure that includes triangles across waves. Instead, we keep our metrics relatively simple: We examine unweighted degree (i.e., if any tie exists between alters in multiple waves it is still only counted as a single tie), whether alter is in the giant connected component and whether alter is an isolate. Despite their simplicity, these metrics still indicate alter's structural embeddedness by demonstrating if alter is connected to most of ego's contacts.
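The three metrics can be computed from a pooled, unweighted edge list. The sketch below is illustrative only: the function name and input layout are our assumptions, the "giant connected component" is taken to be the largest component among ego's alters, and every tie endpoint is assumed to appear in the alter list.

```python
from collections import defaultdict


def embeddedness(alters, edges_by_wave):
    """Return {alter: (degree, in_largest_component, is_isolate)} for one ego."""
    # Pool ties across waves and tie types; a set keeps each pair only once.
    pooled = {frozenset(e) for wave in edges_by_wave for e in wave if e[0] != e[1]}
    neighbours = defaultdict(set)
    for pair in pooled:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Connected components via depth-first search over the pooled ties.
    seen, components = set(), []
    for start in alters:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(neighbours[node] - comp)
        seen |= comp
        components.append(comp)
    largest = max(components, key=len) if components else set()

    return {
        a: (len(neighbours[a]), len(largest) > 1 and a in largest, not neighbours[a])
        for a in alters
    }


alters = ["A", "B", "C", "D", "E"]
edges_by_wave = [
    [("A", "B")],              # wave one
    [("B", "A"), ("B", "C")],  # wave two: A-B repeats but is counted once
    [("C", "D")],              # wave three
]
result = embeddedness(alters, edges_by_wave)
print(result)
```

If two components tie for largest, `max` keeps the first one encountered; a real analysis would need an explicit rule for that case.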
When we investigate the differences in the average number of ties nominated between waves, we consider each of the three edge types as separate distributions. This is because each kind of indirect tie was captured on a single separate stage. When investigating the average number of ties, we are interested in whether the screen had any effect. Adding together the ties from different stages will obscure the variations that might be related to the presentation of ties on a single screen. Also, as expected, there are many more social indirect ties reported than indirect sex or drug ties. Thus, variations in social tie distributions could obscure changes in the smaller sex or drug use distributions.

Analytical strategy
To understand the reliability of networks captured via NetCanvas-R, we use a number of statistical tests as well as descriptive reporting where relevant. As is evident from the discussion above, reliability here is tricky. Simply having the respondent move all alters from the previous interview panel into the current interview panel would produce highly reliable results wave on wave. Yet, these results would almost certainly be artificial. We actually want to see less-than-perfect correspondence between waves, as this would reflect the changes in alter nominations we expect to find based on existing research. At the same time, while some networks are likely to get larger between waves and others are likely to get smaller, at the sample level this variation should cancel out unless there is a reason for most of the sample to either add or withhold ties. Detectable changes at the sample level are plausible evidence for motivated underreporting as a form of panel conditioning, whereas changes at the individual level may simply be related to the contingencies of life (alongside cognitive biases such as forgetting; Brewer, 2000). To this end, we examine differences at both the respondent level and the sample level. These are detailed below.
To examine changes in the sample, we first look at the distribution of the average size of an ego network or the average number of indirect ties reported. We use paired sample t-tests to compare the means between waves. We then look to the correlation between the two waves. This is because the average could stay the same, but very different people could have the largest and smallest networks if the sample is not correlated. Since the data are right skewed we opt for the non-parametric Spearman's rho rather than Pearson's r. Following the recommendation of Schroder et al. (2003) we also report an intraclass correlation measure. We use the concordance correlation coefficient (CCC) as a measure of the intraclass correlation (Lin, 1989). CCC is akin to a non-parametric version of the ICC. The advantage of CCC is that it is less sensitive to outliers or skewness. While we report the estimate, we do not, strictly speaking, want to see perfect correlation as that may be evidence of respondents lazily repeating their responses every visit. According to Cicchetti (1994), a value of 0.4 or less is poor and 0.75 or higher is excellent when assessing reliability via correlation measures. Past work using aggregate estimates of sexual contact frequency has found ICCs for reporting as high as .8 (see Schroder et al., 2003, for examples) but this typically refers to measurements taken within a week or questions about overall counts rather than counts based on specific named alters. As all ICC scores are expected to have a high p-value, they are not reported. Instead we report confidence intervals estimated using a jackknife bootstrapping sampler.
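For readers unfamiliar with the CCC, the sketch below shows how Lin's (1989) coefficient can be computed for two waves of counts, together with a plain leave-one-out jackknife interval. It is an illustration under stated assumptions, not the procedure used for the reported estimates: the function names are hypothetical, the interval uses an untransformed normal approximation, and the resampling performed by the agRee package may differ. The counts are made up.

```python
from math import sqrt


def ccc(x, y):
    """Lin's concordance correlation coefficient for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Lin (1989) defines the variances and covariance with a 1/n divisor.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def jackknife_ci(x, y, z=1.96):
    """Approximate 95% interval from leave-one-out CCC estimates.

    Assumption: a simple normal approximation around the full-sample
    estimate, without the Fisher z-transformation some authors prefer.
    """
    x, y = list(x), list(y)
    n = len(x)
    leave_one_out = [
        ccc(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    se = sqrt((n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out))
    estimate = ccc(x, y)
    return estimate - z * se, estimate + z * se


# Made-up alter counts for six respondents in two consecutive waves.
wave_a = [12, 15, 9, 30, 14, 18]
wave_b = [13, 14, 11, 25, 16, 17]
estimate = ccc(wave_a, wave_b)
lower, upper = jackknife_ci(wave_a, wave_b)
```

Unlike Pearson's r, the denominator penalises a shift in means between waves, so two waves that are perfectly correlated but offset by a constant yield a CCC below 1.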
Even if the average network size stays the same, there could still be differences in the shape of the distribution. We use a Kolmogorov-Smirnov test to look at whether the shape of the distributions has changed. Significant p-values in these tests indicate that the shape differs between waves. One plausible reason for a significant difference in our case is if the overall distribution remains similar but outliers are reduced between waves. This could be an example of instrument bias as panel conditioning whereby individuals with especially large or dense networks withhold alter or tie nominations.
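To make the logic of this test concrete, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, that is, the largest gap between the two empirical cumulative distributions. It is illustrative only: the function name and the counts are invented, and the p-value is omitted because it would normally come from a library routine such as `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The gap can only change at observed values, so checking the pooled
    # observations is sufficient.
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


# Made-up counts: the second wave matches the first except that the two
# largest networks have shrunk.
first_wave = [1, 2, 2, 3, 3, 4, 20, 35]
second_wave = [1, 2, 2, 3, 3, 4, 5, 6]
gap = ks_statistic(first_wave, second_wave)
```

In this example the bulk of the two distributions is identical, and the statistic is driven entirely by the reduced outliers, which is the scenario described above.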
The above tests examine differences in the shape and order of the sample. However, we also want to look at differences at the respondent level. For this we rely here primarily on descriptive differences. We understand that it would be preferable to model these differences. Due to the nature of the research design (i.e., many egos, multiplex interdependent ties, alters appearing in each other's network, and the same alters appearing as different nodes in wave 1 and 3 if absent in wave 2), we believe the complexity of fitting models such as those found in Crossley et al. (2015) is out of scope. All statistics were performed in Python with the scipy.stats package except the concordance correlation coefficient, which was calculated using the agRee package in R (Feng, 2018).

Are there as many alters reported wave on wave?
We first look to Research Question 1 by investigating the number of alters nominated between waves. Table 1 shows the average number of alters recalled per ego per wave as well as the average total pooled across all three waves. We show the mean number of alters reported, the median, and the standard deviation to help understand the skew of the data. This is further helped by the distributions shown in Figure 2.
Respondents nominated approximately 15 social alters in each of the waves. Pooling the alters across all three waves, respondents listed a total of 24 people on average. The precise number is slightly lower in wave one and slightly higher in waves two and three. Although respondents nominate slightly more alters overall in waves two and three, they nominate slightly fewer sex contacts and drug use contacts. Paired samples t-tests indicate that the average number of alters differs significantly between wave one and wave two but not between waves two and three. This holds for the total nominated alters as well as sex contacts and drug use contacts. T-tests are shown in Table 2 alongside three other statistical tests.
All the distributions are heavy tailed as can be seen by the high standard deviations, especially for the average number of sex contacts and drug contacts. This suggests that some individuals are especially prone to nominating sex contacts and drug use contacts relative to the average. This is also shown by the numerous outliers for average number of sex contacts and drug contacts in Figure 2. The Kolmogorov-Smirnov tests indicate that there were instances where distributions differ significantly between waves. In particular, the change was significant between wave one and wave two for sex contacts (K_12 = 0.088, p_12 < 0.05) but not for drug contacts. The distribution of average number of social contacts varied slightly between waves one and two, but the difference is not significant at the critical value (α = 0.05). These tests suggest that between waves 1 and 2 there was some variation in who had the most or fewest network members. This is evidence that individuals do not merely carry over their network members between waves. Spearman's rho (r_s) and the concordance correlation coefficient show that in general there is a moderately strong correlation between waves for all alter counts. The correlation was strongest for drug ties, particularly between waves 2 and 3 (r_s = 0.751; CCC = 0.70). It was weakest for sex contacts between waves 1 and 2 (r_s = 0.53, CCC = 0.44).

What contributes to an alter being nominated in multiple waves?
Research Question 2 focuses on factors that would help explain persistence between waves, particularly with respect to role and structural embeddedness. To explore what is associated with an alter being reported in a single wave or multiple waves, we first group alters by the number of waves in which they appear. We do this for social alters, drug use alters and sex alters in Table 3. In this table we separate sex alters into casual sex partners and (self-reported) serious relationship partners as there are obvious reasons why these would churn differently. For all alters, 31% were reported in all three waves. A further 21% were only reported in two waves and finally, 48% of the alters were reported in a single wave only. Whether an alter is a sex or drug use alter appears to influence the likelihood of that alter appearing in multiple waves. In general, those who use drugs with the respondent tend to persist within the network, whereas those who have sex with the respondent, particularly if they are not a serious partner, tend to be more fleeting. Among drug use alters, the modal group are those mentioned in all three waves (34% of all drug use alters). Among casual sex alters, the modal group are those mentioned only in the first wave, then followed by only the third wave (29% and 26% respectively). Only 9% of casual sex contacts appeared in all three waves. As 42% of all serious relationship contacts appeared in all three waves, this reinforced our justification to separate casual and serious sexual contacts. To note, however, it is still the case that serious partners are not always long-term partners: 33% of nominated serious partners appeared in only one wave.
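The grouping step described above can be sketched as follows. The data structure (a mapping from alter identifier to the set of waves in which that alter was nominated) and the function names are assumptions made for illustration; the alters shown are invented.

```python
from collections import Counter


def pattern_counts(appearances):
    """Count alters by the exact set of waves in which they were nominated."""
    return Counter(tuple(sorted(waves)) for waves in appearances.values() if waves)


def shares_by_wave_count(appearances, n_waves=3):
    """Share of alters nominated in exactly 1, 2, ..., n_waves waves."""
    counts = Counter(len(waves) for waves in appearances.values() if waves)
    total = sum(counts.values())
    return {k: counts.get(k, 0) / total for k in range(1, n_waves + 1)}


# Hypothetical alters for one ego; each value lists the waves of nomination.
appearances = {
    "alter_1": {1, 2, 3},
    "alter_2": {1},
    "alter_3": {2, 3},
    "alter_4": {3},
}
patterns = pattern_counts(appearances)
shares = shares_by_wave_count(appearances)
```

Keeping the exact pattern, rather than only the number of waves, is what allows a distinction such as "first wave only" versus "third wave only" to be drawn.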
To better understand whether an alter is likely to persist, we can look to further attributes of alters either in terms of ego-alter role relationships or structural position based on indirect ties. For brevity we collapse the counts for waves one and two and waves two and three into "any two waves" and the counts for only wave one, two, or three into "a single wave". This is shown in Table 4. Recall that this network is not solely a network of social ties, but any link that would suggest alter is aware of or has associated with others in the network.
Results indicate that structurally embedded alters are much more likely to persist in these networks. Isolates represent 22% of total alters, yet they represent 37% of alters reported in a single wave and only 6% of alters reported in three waves. Among alters reported in all three waves, 81% are in the giant connected component. By contrast, only 46% of single-wave alters are in the giant connected component. Following this pattern, alters from a single wave have a considerably smaller mean degree (1.72) than alters who appear in three waves (4.7).
On average, respondents nominated more friends than family. Among the pooled nominations (i.e. across all three waves), on average 11.6 alters were friends and 4.2 were family. Yet, family members were more likely to be mentioned across all three waves than friends.

Are there as many ties reported across waves?
Above we focused on alter nominations. Below we return to the second part of Research Question 1 by investigating indirect tie nominations across waves. Ties in this case refer to indirect ties between alters. Unlike the alter nomination screen, there were no affordances (i.e. visual cues) in place to remind respondents of ties from a previous wave. Even the layout of the alters on the sociogram had to be done anew each visit. This means the average number of ties might vary considerably between visits, as might the presence or absence of any specific tie.
In Table 5, we report on the average number of ties per wave as well as the total number of indirect ties pooled across all three waves. The counts in the sex and drug columns refer to the ties between nominated alters. That is to say, simply because there is a drug use edge between two alters does not imply that ego used drugs with these alters, only that ego reports being aware of these alters using drugs with each other (in the last six months).
The wave on wave numbers show that respondents report a consistent number of ties between waves. In fact, compared to the variation in mean number of nodes, there is considerably less variation in the mean number of ties. From Table 6 we can observe that no paired sample t-test found any significant difference in means. That is to say, the average number of ties was very consistent between waves for all three tie types: social, sex and drug. This is reinforced by the boxplot distributions in Figure 3, which indicate much consistency between waves. We further note that the distributions themselves were not significantly different between waves. That is, there were no instances where one wave had a few outliers and the next did not. This is evinced by the lack of significance in the Kolmogorov-Smirnov tests. The overall distribution in the mean number of indirect ties per ego appears very stable using this instrument. Notably, Spearman's rho and the concordance correlation coefficients are somewhat smaller here than they are for the distribution of alter counts. Thus, people appear to change the number of reported indirect ties between waves more so than they change the number of alters. Yet, despite this, the overall distribution of ties remains very consistent in shape and average size.

Are these the same ties mentioned across waves?
Although most of Research Question 3 will be addressed interpretively in the discussion, we believe a discussion of which ties persist between waves can help us understand whether instrument bias has occurred. To note, with this instrument, there were no visual cues or features that would remind respondents which edges appeared in the previous wave. With such stable numbers of ties across the sample, one might nevertheless assume that this is because people would nominate the same ties between the same alters. This was not the case most of the time, as very few ties appeared in all three waves. Only 10% of social ties, 8% of sex ties and 6% of drug ties appeared in all three waves. Instead, the bulk of ties only show up in a single wave, as shown in Table 7. The fact that almost three quarters of social ties and over three quarters of both indirect sex ties and indirect drug use ties were in a single wave is notable. It is especially interesting given that many alters persisted between waves, meaning that in theory, there were opportunities to link numerous people in the network at more than one time period. For the sex and drug edges it is possible that the respondent simply did not hear about any recent contact between these alters, yet for social edges, it is unclear whether respondents would carry over a social tie between waves if they did not observe the two alters interacting. This is especially of interest for sex ties as their counts are extremely low to begin with. We believe this is worth investigating further in future work.

Discussion
The findings indicate that there was little change in the sample distribution between waves. The significant changes that were observed appear to suggest structural embeddedness as a driving factor for who would be more or less likely to be nominated in future waves. We now reflect on these findings in relation to our research questions.
The first research question directly focused on the issue of network size by asking to what extent it changes between waves. The results indicate that overall there was a great deal of consistency among the sample between waves. That consistency appears through aggregate numbers, such as the consistency of the average number of alters per respondent or the average number of indirect ties per network. Despite the consistency at the sample level, there was still considerable and often significant variation at the respondent level between waves. We believe this within-person variation reinforces the extensive literature on how social networks churn even over short time periods. The consistency across the sample, however, indicates that even though there is variation at the individual level, this variation appears to be related to individuals coming in and out of particular networks, rather than an overall bias to exclude or include people in multiple waves.
The number of social contacts (excluding sex contacts) increased significantly from the first wave to the second, but not from the second to the third wave. This may be a form of panel conditioning insofar as respondents in the second wave were more familiar with what is expected of them in a network interview and may have therefore primed themselves to consider more names. There is also the possibility that having a panel of "previously nominated" names helped individuals recall who was available. We should not make too much of this finding substantively, however, since the difference is less than one alter from wave one to wave two in a group of 14-15 alters.
With respect to sex contacts, we observe a slight but significant decrease in reporting between waves. The boxplot in Figure 2 illustrates that in waves two and three, there were fewer outliers than in the first wave. We offer three possible explanations: conditioning based on the lab setting, conditioning based on the instrument itself, and interviewee issues. With respect to the lab, it is possible that the interview experience itself led respondents to be more cautious with subsequent sex contact. The respondents were interviewed at the Center on Halsted, a community centre for advancing LGBTQ issues, especially health; this, in conjunction with highly trained interviewers, decreased the likelihood of the first explanation. With respect to the instrument, it is noteworthy that respondents at the higher end of the distribution reported fewer contacts in subsequent waves. This may be because they perceive the screen as relatively cramped with too many ties. To consider this in relation to motivated underreporting, sex alters were subject to additional name interpreter questions where most other social ties were not. The drug use alters were also subject to an additional name interpreter but it was nowhere near as involved as the sex contact questions. However, as there was a slight decrease in reporting for both sex and drug alters wave on wave, we believe panel conditioning cannot be ruled out, only considered unlikely. Finally, with respect to interviewee issues, the change may have simply been an artefact of the six-month window. After the first interview, it was likely easier to remember the six-month period by anchoring to the prior interview.
In the case of ties, there were some interesting and potentially vexing findings. Respondents tended to report a very similar number of ties between waves. Neither the mean number of ties nor the shape of the distribution changed significantly between waves. Yet, as noted above, there were no visual features to indicate which ties appeared in the previous wave. This is where things are vexing. Despite the extremely close averages, there appears to be substantial within-person variation across the waves. It seems that, depending on the tie type and wave, only 28-42% of ties that could have appeared in a subsequent wave were actually reported. It is plausible that such a tie could have occurred but the respondent was not aware of it. This might have been the case for indirect sex or drug use ties. However, we are doubtful that over half of all social relationships would dissipate between two consecutive waves, particularly when the next wave was a mere six months later.
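One way to operationalise "ties that could have appeared in a subsequent wave" is sketched below. It rests on an assumption the text does not spell out, namely that a tie is eligible to reappear only if both of its alters were nominated again in the next wave. The function name and the example data are hypothetical.

```python
def retained_share(ties_prev, ties_next, alters_next):
    """Share of eligible ties from one wave that are reported in the next.

    Assumption: a tie is eligible only if both alters are nominated again.
    Ties are undirected, so (a, b) and (b, a) are treated as the same tie.
    """
    alters_next = set(alters_next)
    previous = {frozenset(t) for t in ties_prev}
    following = {frozenset(t) for t in ties_next}
    eligible = {t for t in previous if t <= alters_next}
    if not eligible:
        return None  # nothing could have been carried over
    return len(eligible & following) / len(eligible)


# Hypothetical ego: alter D drops out, so the C-D tie is not eligible.
share = retained_share(
    ties_prev=[("A", "B"), ("B", "C"), ("C", "D")],
    ties_next=[("B", "A"), ("A", "C")],
    alters_next={"A", "B", "C"},
)
```

Here two ties are eligible (A-B and B-C) and one of them is reported again, giving a retained share of 0.5; the new A-C tie does not enter the calculation.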
The second research question focused on the qualities of the alters that were associated with being recalled at multiple waves, particularly structural embeddedness. We explored ego-alter attributes (namely the alter's relationship to ego) as well as alter-alter attributes (namely alter's structural position in an aggregated network of all ties across all waves). We found that nodes persisted across waves if they were nominated as family members or friends. The same can be said for serious partners, as 42% appeared in all three waves and a further 25% appeared in two waves. While casual partners were very likely to be present only in a single wave, some appeared in multiple waves. This highlights how the label "casual partner" refers not to the frequency of the interaction but to its relational meaning. Some people will have long-term casual partners.
Turning to structural embeddedness, we chose three measures of embeddedness: membership in the giant connected component, degree, and isolate status. Being connected to the giant connected component suggests that alter is in some respects a part of ego's social world as alter knows of at least one other person in ego's life but likely knows several directly or indirectly. By contrast, being an isolate means ego and alter have never closed a triad; they may share casual acquaintances, but not mutual close network members. Such an alter is both less likely to be mentioned with others and perhaps less likely to be top of mind since individuals tend to recall people in clusters (Marin, 2004).
Reflecting on panel conditioning and the participant-aided sociogram for the third research question, it appears that individuals needed a little adjustment in order to fully adapt to what is expected of them in this interview. In the case of alters, there were significant changes between the first two waves that were either non-significant or non-existent between the second and third wave. This might be because of the difficulty in assessing whether contact was inside or outside of the six-month window but it might also be because of respondent burden for those nominating over thirty or forty alters. What is promising for us is that there was not much evidence that individuals were motivated to underreport alter nominations after learning that creating an alter will lead to more questions about that alter. There might be some evidence of this for the most extreme cases, but even then results are not clear as some individuals would report even more alters at a subsequent time period. On average, however, the consistency between waves appears to be very robust.

Conclusion
An investigation into the stability of networks across three waves using a novel data capture technique might be best captured by the adage "the more things change, the more they stay the same". We saw considerable churn in the network with respect to alters and especially with indirect ties. Yet, the distribution of network size and density remained remarkably similar.
Much to our initial surprise, later waves appeared to elicit more rather than fewer ties. This was not because people reported the same ties repeatedly, but instead because respondents brought in some alters while excluding others. This churn was most prevalent for casual sex contacts. Drug use contacts, by contrast, remained far more resilient. This suggests notably different social processes for drug contacts and sexual contacts in this population, as well as different potential intervention strategies when considered within the public health context of at-risk young men who have sex with men. One of the advantages of working with a name generator rather than merely count estimates of sexual contact frequency is that we can look beyond mere concordance between waves to whether it refers to the same individuals or different ones, a detail that can be especially important in public health.
Participant-aided sociograms have been lauded for rendering visible an object of inquiry that often only exists in the lab. Respondents tend to enjoy the PAS and to take ownership of it. In doing so it highlights how the lab is not a neutral space of data extraction. Some people do indeed drift out of our lives and others return. This churn is not the same as motivated underreporting. By showing that individuals tend to have consistent network sizes especially at the sample level, we believe that our technique did not, in this context, suffer from notable instrument bias. By contrast, we think that it represents both a reliable and engaging approach to the future collection of longitudinal data.

Limitations and future work
The biggest limitation of this work is external validity. The sample consisted of adolescents and young adults who were assigned male sex at birth. First, this is a group who could plausibly be more enthusiastic about the use of new technology. Second, young people are in an unsettled part of the life course, and their networks tend to reflect this (Wrzus et al., 2013). This is especially the case for YMSM who often experience some network changes as (and if) they come out (McConnell et al., 2018).
This study was also limited by some serious constraints that follow from RADAR's strict confidentiality procedures. RADAR asks individuals to report on their sex and drug use, including condomless sex and HIV status. To this end, we were limited in our ability to link data. Future work under other circumstances may want to link names across waves thereby enabling an even longer-term comparison with more waves. A lack of linking is also related to our choice of analysis techniques. We take each alter in each network to be different. Yet, a study of almost one thousand YMSM in Chicago will undoubtedly have overlap between the networks. In networks with extensive overlap, a stochastic actor-oriented model may recover a lot of structural reasons for tie disappearance that were not easily estimated using traditional approaches (Snijders, 2001).
Finally, we did not compare the networks captured to a dyad census. To note, the list of alters captured at wave one was broadly comparable to the sexual contacts collected in the same interview using a different instrument. Nonetheless, comparing to a dyad census, over time and within person, would provide some important details about which ties are most likely to be forgotten. It would also provide a fairer comparison for panel conditioning with motivated underreporting of names due to respondent burden (Fischer, 2009; Paik & Sanchagrin, 2013). To that extent, our results suggest that the number of ties reported using a participant-aided sociogram could plausibly be higher than a dyad census, if underreporting is related to respondent fatigue. However, especially for large, dense networks, the sociogram might reveal fewer dyads if the respondent found entering all of the ties in a dense cluster to be frustrating. These are empirical questions and left for future researchers.
As we shift from simply asking individuals questions in a guided interview towards ever more structured computer assisted techniques, we will continue to confront these representational issues. In this sense, future work will have to consider not simply question wording but a variety of issues with screen size, mobility, and responsiveness if we wish to ameliorate respondent burden and capture meaningful, reliable networks.