
Role detection in bicycle-sharing networks using multilayer stochastic block models

Published online by Cambridge University Press: 22 April 2022

Jane Carlen
Affiliation:
Department of Statistics, University of California, Los Angeles, USA
Jaume de Dios Pont
Affiliation:
Department of Mathematics, University of California, Los Angeles, USA
Cassidy Mentus
Affiliation:
Department of Mathematics, University of California, Los Angeles, USA
Shyr-Shea Chang
Affiliation:
Department of Mathematics, University of California, Los Angeles, USA
Stephanie Wang
Affiliation:
Department of Computer Science and Engineering, University of California, San Diego, USA
Mason A. Porter*
Affiliation:
Department of Mathematics, University of California, Los Angeles, USA; Santa Fe Institute, USA
*Corresponding author. E-mail: mason@math.ucla.edu

Abstract

In urban systems, there is an interdependency between neighborhood roles and transportation patterns between neighborhoods. In this paper, we classify docking stations in bicycle-sharing networks to gain insight into the human mobility patterns of three major cities in the United States. We propose novel time-dependent stochastic block models, with degree-heterogeneous blocks and either mixed or discrete block membership, which classify nodes based on their time-dependent activity patterns. We apply these models to (1) detect the roles of bicycle-sharing stations and (2) describe the traffic within and between blocks of stations over the course of a day. Our models successfully uncover work blocks, home blocks, and other blocks; they also reveal activity patterns that are specific to each city. Our work gives insights for the design and maintenance of bicycle-sharing systems, and it contributes new methodology for community detection in temporal and multilayer networks with heterogeneous degrees.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. (Top) Total bicycle trips by departing hour for weekdays, weekends, and overall in New York City, San Francisco, and Los Angeles. Hour 0 designates midnight. (Bottom) Trips that arrive (solid curves) and depart (dashed curves) by hour in the six San Francisco stations with the most trips.


Figure 2. In the TDD-SBM networks that we construct, the red dots indicate the mean weights of the edges from block g to block h in time layer t, where $g,h \in \{1,2,3\}$ and $t \in \{0, 1, \ldots, 15\}$. The title of each panel indicates the blocks g and h (e.g., “1 to 2”).


Figure 3. Classification of bicycle-sharing stations in downtown Los Angeles using (left) a two-block TDMM-SBM and (right) a two-block TDD-SBM. The sizes of the nodes take continuous values. In the left panel, we scale the size of each node based on its value of $\sum_gC_{ig}$. In the right panel, we scale the size of each node based on its degree.


Table 1. Evaluation of parameter-estimation methods for networks that we generate using our TDD-SBM and our TDD-SBM without degree-correction. We estimate parameters using both of these models and the PPSBM of Matias et al. (2018). For each experiment, we indicate the number of blocks (K) and the number of nodes (N) in the networks. The “Gen DC” column indicates whether or not we use degree-correction when generating the networks, and the “Fit DC” column indicates whether or not we use degree-correction when we fit the networks. The “Fit Method” column indicates which model we use for fitting. To the right of the vertical line, we show the means of several statistics over 100 instantiations of the networks, with the associated standard deviations in parentheses. “ARI” is the adjusted Rand index between the inferred block assignments and the ground-truth assignments. The quantity MAPE$(\hat{\omega})$ is the mean absolute-percentage error, expressed as a fraction, between the block-connectivity parameters $\hat{\omega}$ and $\omega$. “Gen LLIK” is the unnormalized log-likelihood of a generated network under the data-generating model. “Diff LLIK” is the result of subtracting the unnormalized log-likelihood of a generated network under the data-generating model from its unnormalized log-likelihood under the estimated model.


Table 2. Evaluation of parameter-estimation methods for networks that we generate using our TDMM-SBM. For each experiment, we indicate the number of blocks (K) and the number of nodes (N) in the networks. To the right of the vertical line, we show the means of several statistics over 100 instantiations of the networks, with associated standard deviations in parentheses. “BAE” is the blockwise absolute error of the block-membership parameters. The quantity MAPE$(\hat{\omega})$ is the mean absolute-percentage error, expressed as a fraction, between the block-connectivity parameters $\hat{\omega}$ and $\omega$. We show “–” in this column when the result is extremely large due to values of $\omega_{ght}$ near 0. The quantity MAPE$_p(\hat{\omega})$ is the pairwise MAPE of $\hat{\mathbf{\omega}}$. “Gen LLIK” is the unnormalized log-likelihood of a generated network under the data-generating TDMM-SBM. “Diff LLIK” is the result of subtracting the unnormalized log-likelihood of a generated network under the data-generating model from its unnormalized log-likelihood under the estimated TDMM-SBM. “Diff LLIK Discrete” is the result of subtracting the unnormalized log-likelihood of a generated network under the data-generating model from its unnormalized log-likelihood under the degree-corrected TDD-SBM.
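
As a rough illustration of the MAPE statistic described in the captions of Tables 1 and 2, the following Python sketch computes a mean absolute-percentage error as a fraction over all entries of a block-connectivity array. The function name `mape_fraction`, the `[g, h, t]` array layout, and the numerical values are hypothetical, and the sketch assumes that the estimated block labels are already aligned with the ground-truth labels. The exact averaging that the article uses may differ.

```python
import numpy as np


def mape_fraction(omega_hat, omega):
    """Mean absolute-percentage error, as a fraction, over all array entries.

    Assumes both arrays share the same shape and the same block labeling.
    """
    omega_hat = np.asarray(omega_hat, dtype=float)
    omega = np.asarray(omega, dtype=float)
    if omega_hat.shape != omega.shape:
        raise ValueError("omega_hat and omega must have the same shape")
    return float(np.mean(np.abs(omega_hat - omega) / np.abs(omega)))


# Hypothetical ground-truth parameters for 2 blocks and 2 time layers,
# indexed as omega[g, h, t].
omega = np.array([[[2.0, 4.0], [1.0, 5.0]],
                  [[10.0, 8.0], [4.0, 2.0]]])

# Hypothetical estimates in which every entry is 10% too large.
omega_hat = omega * 1.1
mape_value = mape_fraction(omega_hat, omega)

# A single ground-truth entry near 0 dominates the average, which is
# consistent with the caption's note about extremely large values.
omega_near_zero = omega.copy()
omega_near_zero[0, 0, 0] = 1e-9
mape_blowup = mape_fraction(omega_hat, omega_near_zero)
```

In this toy example, `mape_value` is approximately 0.1, whereas `mape_blowup` is many orders of magnitude larger because of the one near-zero denominator.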


Figure 4. Estimated time-dependent block-connectivity parameters $\hat{\omega}_{ght}$ for (left) the two-block TDMM-SBM and (right) the two-block TDD-SBM for the downtown Los Angeles bicycle-sharing network.


Figure 5. Two-block TDMM-SBM assignments of downtown Los Angeles bicycle-sharing stations overlaid on a simplified LA zoning map. The industrial zones include manufacturing and commercial areas. As in Figure 3, we scale the size of each node based on its value of $\sum_gC_{ig}$.


Figure 6. Classification of bicycle-sharing stations in San Francisco using (left) a two-block TDMM-SBM and (right) a two-block TDD-SBM. The sizes of the nodes take continuous values. In the left panel, we scale the size of each node based on its value of $\sum_gC_{ig}$. In the right panel, we scale the size of each node based on its degree.


Figure 7. Estimated time-dependent block-connectivity parameters $\hat{\omega}_{ght}$ for (left) the two-block TDMM-SBM and (right) the two-block TDD-SBM for the San Francisco bicycle-sharing network.


Figure 8. Estimated blocks from two-block TDD-SBMs for time-aggregated bicycle-sharing networks in (left) downtown Los Angeles and (right) San Francisco.


Figure 9. Estimated blocks from two-block TDD-SBMs without degree-correction for bicycle-sharing networks in (left) downtown Los Angeles and (right) San Francisco. We scale the size of each node based on its degree.


Figure 10. Comparison of the blocks that we infer in the Manhattan subnetwork (i.e., the “Manhattan (home)” block in the right panel of Figure A.4) of the New York City bicycle-sharing network using (left) a five-block TDMM-SBM and (right) a five-block TDD-SBM.


Figure 11. Estimated time-dependent block-connectivity parameters $\hat{\omega}_{ght}$ for (left) the five-block TDMM-SBM and (right) the five-block TDD-SBM for the Manhattan subnetwork of the New York City bicycle-sharing network.


Figure A.1. The first two singular vectors of the in-degree matrix and the out-degree matrix of the downtown Los Angeles bicycle-sharing network.


Figure A.2. The first two singular vectors of the in-degree matrix and the out-degree matrix of the San Francisco bicycle-sharing network.


Figure A.3. The first two singular vectors of the in-degree matrix and the out-degree matrix of the New York City bicycle-sharing network.


Figure A.4. Classification of New York City bicycle-sharing stations using (left) a three-block TDMM-SBM and (right) a three-block TDD-SBM. The sizes of the nodes take continuous values. In the left panel, we scale the area of each node based on the value of $\sum_gC_{ig}$. In the right panel, we scale the area of each node based on its degree.


Figure A.5. Estimated time-dependent block-connectivity parameters $\hat{\omega}_{ght}$ for (left) the three-block TDMM-SBM and (right) the three-block TDD-SBM for the New York City bicycle-sharing network. We use “M” to signify Manhattan and “BK” to signify Brooklyn.


Figure A.6. Three-block TDD-SBM station roles versus the coverage-area zoning map of New York City.


Figure A.7. Values of the Akaike information criterion for the MLE of the TDMM-SBM with 2–10 blocks for the Los Angeles bicycle-sharing network.
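
For reference, the Akaike information criterion in its standard form is given below, written with the $N_p$ notation of Table A.1 for the number of parameters. The symbol $\hat{L}$ for the maximized likelihood is introduced here for illustration, and it is an assumption that the article uses this same normalization.

$$\mathrm{AIC} = 2N_p - 2\ln\hat{L}$$

Under this convention, smaller values indicate a better trade-off between goodness of fit and model complexity.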


Table A.1. Comparison of log-likelihoods and numbers of parameters ($N_p$) in SBMs for the Manhattan network, which has $N = 166$ nodes.