In this chapter, we begin our dive into the fundamentals of network data. We delve deep into the strange world of networks by considering the friendship paradox, the apparently contradictory finding that most people (nodes) have friends (neighbors) who are more popular than themselves. How can this be? Where are all these friends coming from? We introduce network thinking to resolve this paradox. As we will see, it is due to constraints induced by the network structure: pick a node randomly and you are much more likely to land next to a high-degree node than on a high-degree node because high-degree nodes have many neighbors. This is unexpected, almost profoundly so; a local (node-level) view of a network will not accurately reflect the global network structure. This paradox highlights the care we need to take when thinking about networks and network data mathematically and practically.
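The paradox can be checked by hand on a very small network. The following is a minimal sketch, assuming a hypothetical five-node star graph (one hub, four leaves) that is not taken from the chapter; the function names are illustrative only. It compares each node's degree with the average degree of its neighbors.

```python
from statistics import mean

# Hypothetical toy network: node 0 is a hub connected to four leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]

def build_neighbors(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def friendship_paradox_summary(edges):
    """Return (mean degree, mean of neighbors' average degree,
    fraction of nodes whose neighbors are on average more popular)."""
    neighbors = build_neighbors(edges)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    # For every node, the average degree of its neighbors.
    neighbor_avg = {
        node: mean(degree[n] for n in nbrs) for node, nbrs in neighbors.items()
    }
    fraction = mean(1 if neighbor_avg[n] > degree[n] else 0 for n in degree)
    return mean(degree.values()), mean(neighbor_avg.values()), fraction

mean_degree, mean_neighbor_degree, fraction = friendship_paradox_summary(edges)
print(mean_degree, mean_neighbor_degree, fraction)
```

In this toy case the four leaves each have one friend (the hub) with degree 4, so four of the five nodes have friends who are more popular than themselves, even though the average degree is only 1.6.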
Network studies follow an explicit form, from framing questions and gathering data, to processing those data and drawing conclusions. And data processing leads to new questions, leading to new data and so forth. Network studies follow a repeating lifecycle. Yet along the way, many different choices will confront the researcher, who must be mindful of the choices they are making with their data and the choices of tools and techniques they are using to study their data. In this chapter, we describe how studies of networks begin and proceed: the lifecycle of a network study.
A number of properties relating to the inverse z-transform are discussed. The partial fraction expansion (PFE) of a rational z-transform plays a role in finding the inverse transform. It is shown that the inverse z-transform solution is not unique and depends on the region of convergence (ROC). Depending on the ROC, the solution may be causal, anticausal, two-sided, stable, or unstable. The condition for existence of a stable inverse transform is also developed. The interplay between causality, stability, and the ROC is established and illustrated with examples. The case of multiple poles is also considered. The theory and implementation of IIR linear-phase filters is discussed in detail. The connection between z-transform theory and analytic functions in complex variable theory is placed in evidence. Based on this connection, many intriguing examples of z-transform pairs are pointed out. In particular, closed-form expressions for radii of convergence of the z-transform can be obtained from complex variable theory. The case of unrealizable digital filters and their connection to complex variable theory is also discussed.
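The dependence of the inverse transform on the ROC can be illustrated numerically. The sketch below assumes the textbook example X(z) = 1/(1 - a z^{-1}), which is not taken from the chapter itself, and uses truncated sums (the `terms` parameter is an arbitrary choice). The causal sequence a^n for n >= 0 and the anticausal sequence -a^n for n <= -1 both reproduce the same closed form, but each only inside its own region of convergence.

```python
def closed_form(z, a):
    """X(z) = 1 / (1 - a z^{-1})."""
    return 1.0 / (1.0 - a / z)

def causal_sum(z, a, terms=200):
    """Truncated z-transform of x[n] = a^n for n >= 0.
    The series converges only for |z| > |a|."""
    return sum((a / z) ** n for n in range(terms))

def anticausal_sum(z, a, terms=200):
    """Truncated z-transform of x[n] = -a^n for n <= -1.
    With n = -m, each term a^n z^{-n} becomes (z/a)^m;
    the series converges only for |z| < |a|."""
    return -sum((z / a) ** m for m in range(1, terms + 1))

a = 0.5
z_outside = 2.0   # |z| > |a|: inside the ROC of the causal solution
z_inside = 0.25   # |z| < |a|: inside the ROC of the anticausal solution

print(causal_sum(z_outside, a), closed_form(z_outside, a))
print(anticausal_sum(z_inside, a), closed_form(z_inside, a))
```

With |a| < 1 as assumed here, the causal ROC |z| > |a| contains the unit circle, so the causal solution is the stable one; for |a| > 1 the roles would be reversed.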
In this chapter, we introduce visualization techniques for networks, the problems we face, and the solutions we use to make those visualizations as effective as possible. Visualization is an essential tool for exploring network data, revealing patterns that may not be easily inferred from statistics alone. Although network visualization can be done in many ways, the most common approach is through two-dimensional node-link diagrams. Properly laying out nodes and choosing the mapping between network and visual properties is essential to create an effective visualization, which requires iteration and fine-tuning. For dense networks, filtering or aggregating the data may be necessary. Following an iterative, back-and-forth workflow is essential, trying different layout methods and filtering steps to show the network's structure best while keeping the original questions and goals in mind. Visualization is not always the endpoint of a network analysis but can also be a useful step in the middle of an exploratory data analysis pipeline, similar to traditional statistical visualization of non-network data.
This chapter gives a brief overview of sampling based on sparsity. The idea is that a signal which is not bandlimited can sometimes be reconstructed from a sampled version if we have a priori knowledge that the signal is sparse in a certain basis. These results are very different from the results of Shannon and Nyquist, and are sometimes referred to as sub-Nyquist sampling theories. They can be regarded as generalizations of traditional sampling theory, which was based on the bandlimited property. Examples include sampling of finite-duration signals whose DFTs are sparse. Sparse reconstruction methods are closely related to the theory of compressive sensing, which is also briefly introduced. These are major topics that have emerged in the last two decades, so the chapter provides important references for further reading.
This chapter covers data provenance or data lineage, the detailed history of how data was created and manipulated, as well as the process of ensuring the validity of such data by documenting the details of its origins and transformations. Data provenance is a central challenge when working with data. Computing helps but also hinders our ability to maintain records of our work with the data. The best science will result when we adopt strategies to carefully and consistently record and track the origin of data and any changes made along the way. For instance, we want to know where and by whom a dataset was created and what process was used to create it. Then, if there were any changes, such as fixing erroneous entries, we need to have a good record of such changes. With these goals in mind, we discuss best practices for tracking data provenance. While such practices generally take time and effort to implement, making them seem tedious in the short term, over time, your research will become more reliable, and you and your collaborators will be grateful.
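One possible way to keep such a record is sketched below. This is only an illustration under assumptions, not a practice prescribed by the chapter: the record fields, the function names, and the use of SHA-256 content hashes to link a changed dataset to its predecessor are all hypothetical choices.

```python
import hashlib
from datetime import datetime, timezone

def fingerprint(data: bytes) -> str:
    """Content hash that identifies one exact version of a dataset."""
    return hashlib.sha256(data).hexdigest()

def provenance_record(data: bytes, action: str, author: str, parent=None) -> dict:
    """Describe who produced this version, how, and from which earlier version."""
    return {
        "sha256": fingerprint(data),
        "action": action,
        "author": author,
        "parent_sha256": parent,  # None for the original dataset
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical example: an edge list with one erroneous entry that is later fixed.
raw = b"alice,bob\nbob,carol\ncarol,carol\n"
fixed = b"alice,bob\nbob,carol\n"

log = [provenance_record(raw, "collected from survey", "researcher A")]
log.append(
    provenance_record(fixed, "removed self-loop entry", "researcher B",
                      parent=log[0]["sha256"])
)
for entry in log:
    print(entry["action"], entry["sha256"][:12], entry["parent_sha256"])
```

Because each record points to the hash of the version it was derived from, the chain of changes can be followed back to the original data.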
All fields of science benefit from gathering and analyzing network data. This chapter summarizes a small portion of the ways networks are found in research fields thanks to increasing volumes of data and the computing resources needed to work with that data. Epidemiology, dynamical systems, materials science, and many more fields than we can discuss here use networks and network data. We'll encounter many more examples during the rest of this book.
While there are cases where it is straightforward and unambiguous to define a network given data, often a researcher must make choices in how they define the network, and those choices, preceding most of the work on analyzing the network, have outsized consequences for that subsequent analysis. Sitting between gathering the data and studying the network is the upstream task: how to define the network from the underlying or original data. Defining the network precedes all subsequent or downstream tasks, tasks we will focus on in later chapters. Often those tasks are the focus of network scientists who take the network as a given and focus their efforts on methods using those data. Envision the upstream task by asking "what are the nodes?" and "what are the links?", with the network following from those definitions. You will find these questions a useful guiding star as you work, and you can learn new insights by reevaluating their answers from time to time.
This chapter presents mathematical details relating to the Fourier transform (FT), Fourier series, and their inverses. These details were omitted in the preceding chapters in order to enable the reader to focus on the engineering side. The material reviewed in this chapter is fundamental and of lasting value, even though from the engineer’s viewpoint the importance may not manifest in day-to-day applications of Fourier representations. First the chapter discusses the discrete-time case, wherein two types of Fourier transform are distinguished, namely, l1-FT and l2-FT. A similar distinction between L1-FT and L2-FT for the continuous-time case is made next. When such FTs do not exist, it is still possible for a Fourier transform (or inverse) to exist in the sense of the so-called Cauchy principal value or improper Riemann integral, as explained. A detailed discussion on the pointwise convergence of the Fourier series representation is then given, wherein a number of sufficient conditions for such convergence are presented. This involves concepts such as bounded variation, one-sided derivatives, and so on. Detailed discussions of these concepts, along with several illuminating examples, are presented. The discussion is also extended to the case of the Fourier integral.
This chapter introduces recursive difference equations. These equations represent discrete-time LTI systems when the so-called initial conditions are zero. The transfer functions of such LTI systems have a rational form (ratios of polynomials in z). Recursive difference equations offer a computationally efficient way to implement systems whose outputs may depend on an infinite number of past inputs. The recursive property allows the infinite past to be remembered by remembering only a finite number of past outputs. Poles and zeros of rational transfer functions are introduced, and conditions for stability expressed in terms of pole locations. Computational graphs for digital filters, such as the direct-form structure, cascade-form structure, and parallel-form structure, are introduced. The partial fraction expansion (PFE) method for analysis of rational transfer functions is introduced. It is also shown how the coefficients of a rational transfer function can be identified by measuring a finite number of samples of the impulse response. The chapter also shows how the operation of polynomial division can be efficiently implemented in the form of a recursive difference equation.
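As a minimal sketch of the recursion described above, the following implements a difference equation with zero initial conditions. The coefficient convention (a[0] = 1, feedback terms subtracted) and the first-order example y[n] = 0.5 y[n-1] + x[n] are assumptions chosen for illustration, not taken from the chapter.

```python
def recursive_filter(x, b, a):
    """Compute y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k],
    assuming a[0] == 1 and zero initial conditions."""
    y = []
    for n in range(len(x)):
        # Feedforward part: depends on current and past inputs.
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        # Feedback part: depends only on a finite number of past outputs.
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y

# First-order example: H(z) = 1 / (1 - 0.5 z^{-1}), a single pole at z = 0.5.
impulse = [1.0] + [0.0] * 7
h = recursive_filter(impulse, b=[1.0], a=[1.0, -0.5])
print(h)  # impulse response samples 0.5**n for n = 0..7
```

The impulse response 0.5^n never becomes exactly zero, yet each output sample needs only one stored past output, which is the sense in which the recursion remembers the infinite past. The pole at z = 0.5 lies inside the unit circle, so this causal system is stable.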
Networks exhibit many common patterns. What causes them? Why are they present? Are they universal across all networks or only certain kinds of networks? One way to address these questions is with models. In this chapter, we explore in-depth the classic mechanistic models of network science. Random graph models underpin much of our understanding of network phenomena, from the small world path lengths to heterogeneous degree distributions and clustering. Mathematical tools help us understand what mechanisms or minimal ingredients may explain such phenomena, from basic heuristic treatments to combinatorial tools such as generating functions.
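As one illustration of a random graph model, the sketch below samples the G(n, p) model, in which each pair of nodes is linked independently with probability p. Treating this as one of the models the chapter covers is an assumption, and the function names and parameter values are hypothetical. The sampled mean degree can be compared against the expected value p(n - 1).

```python
import random
from itertools import combinations

def sample_gnp(n, p, seed=None):
    """Sample an undirected G(n, p) random graph as a list of edges (u, v), u < v."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def mean_degree(n, edges):
    """Each edge contributes to the degree of two nodes."""
    return 2 * len(edges) / n

n, p = 200, 0.05
edges = sample_gnp(n, p, seed=1)
print("sampled mean degree:", mean_degree(n, edges))
print("expected mean degree:", p * (n - 1))
```

Comparing such a simple model with observed data is one way to ask whether a pattern needs a special mechanism or already arises from randomness alone.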
Network science is a broadly interdisciplinary field, pulling from computer science, mathematics, statistics, and more. The data scientist working with networks thus needs a broad base of knowledge, as network data calls for—and is analyzed with—many computational and mathematical tools. One needs good working knowledge in programming, including data structures and algorithms to effectively analyze networks. In addition to graph theory, probability theory is the foundation for any statistical modeling and data analysis. Linear algebra provides another foundation for network analysis and modeling because matrices are often the most natural way to represent graphs. Although this book assumes that readers are familiar with the basics of these topics, here we review the computational and mathematical concepts and notation that will be used throughout the book. You can use this chapter as a starting point for catching up on the basics, or as reference while delving into the book.
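To make the link between linear algebra and graphs concrete, here is a small sketch that stores a hypothetical four-node undirected graph (a triangle plus one pendant node) as an adjacency matrix using plain Python lists. The example graph and helper names are assumptions for illustration. Row sums give node degrees, and entries of the squared matrix count walks of length two.

```python
def adjacency_matrix(n, edges):
    """Symmetric 0/1 matrix with A[u][v] = 1 when u and v are linked."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1
    return A

def matmul(A, B):
    """Plain matrix product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical graph: triangle 0-1-2 with node 3 attached to node 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A = adjacency_matrix(4, edges)

degrees = [sum(row) for row in A]   # row sums are node degrees
A2 = matmul(A, A)                   # A2[i][j] counts walks of length 2 from i to j

print(degrees)
print(A2[0][3])  # walks of length 2 from node 0 to node 3
```

For an undirected graph without self-loops, the diagonal of the squared matrix equals the degree sequence, since every walk of length two that returns to its start goes out along one link and back.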
As we have seen, network data are necessarily imperfect. Missing and spurious nodes and edges can create uncertainty in what the observed data tell us about the original network. In this chapter, we dive deeper into tools that allow us to quantify such effects and probe more deeply into the nature of an unseen network from our observations. The fundamental challenge of measurement error in network data is capturing the error-producing mechanism accurately and then inferring the unseen network from the (imperfectly) observed data. Computational approaches can give us clues and insights, as can mathematical models. Mathematical models can also build up methods of statistical inference, whether in estimating parameters describing a model of the network or estimating the network's structure itself. But such methods quickly become intractable without taking on some possibly severe assumptions, such as edge independence. Yet, even without addressing the full problem of network inference, in this chapter, we show valuable ways to explore features of the unseen network, such as its size, using the available data.
This chapter gives a brief overview of the book. Notations for signal representation in continuous time and discrete time are introduced. Both one-dimensional and two-dimensional signals are introduced, and simple examples of images are presented. Examples of noise removal and image smoothing (filtering) are demonstrated. The concept of frequency is introduced and its importance as well as its role in signal representation are explained, giving musical notes as examples. The history of signal processing, the role of theory, and the connections to real-life applications are mentioned in an introductory way. The chapter also draws attention to the impact of signal processing in digital communications (e.g., cell-phone communications), gravity wave detection, deep space communications, and so on.