The function of cells is based on complex networks of interacting chemical reactions carefully organized in space and time. These biochemical reaction networks produce observable cellular functions. Network reconstruction is the process of identifying all the reactions that comprise a network. The reconstruction process for metabolic networks has been developed and implemented for a number of organisms. The main features of metabolic network reconstruction are described in this chapter. We briefly review the key properties of metabolic networks and introduce the hierarchical thinking that goes into the interpretation of complex network functions. Further details can be found in authoritative sources.
As discussed at the end of this chapter, a true genome-scale reconstruction of cellular functions necessitates accounting for all cellular networks simultaneously. Such a comprehensive network reconstruction has yet to be established; therefore, in this chapter, we focus on metabolism and address the reconstruction of transcriptional regulatory and signaling networks in the following two chapters.
Basic Features
Intermediary metabolism can be viewed as a chemical “engine” that converts available raw materials into energy as well as the building blocks needed to produce biological structures, maintain cells, and carry out various cellular functions. This chemical engine is highly dynamic, obeys the laws of physics and chemistry, and is thus limited by various physicochemical constraints. It also has an elaborate regulatory structure that allows it to respond to a variety of external perturbations.
In the early 1960s, there was a bifurcation of emphasis in biology. Molecular biology had arrived, providing a growing understanding of DNA, protein, and other chemical components of cells. A science was emerging that had rigor in terms of analytical chemistry and controlled experimentation, and relevance to biochemical and genetic functions of cells and occasionally to their phenotypes. Holistic emphasis in biology, which had primarily been practiced through physiology, faded into the background because it is much more difficult to state hypotheses, do controlled experiments, or execute the scientific process for the behavior of systems and networks in biology. However, as outlined in the introductory chapter, this situation has now changed. We now have technology that allows for the detailed enumeration of biological components, enabling us to study cells and complex biological processes as systems. As a consequence, systems biology has arisen as a new field. This new field does not yet have a well-defined and articulated conceptual basis. In this chapter, we will attempt to collect some of the key issues that represent the conceptual foundations of systems biology. Its content is not intended to be, and cannot be, complete but rather represents an attempt to initiate this process.
Components vs. Systems
Biological components all have a finite turnover time. Most metabolites turn over within a minute in a cell, mRNA molecules typically have 2-hour half-lives in human cells, 3% of the extracellular matrix in cardiac muscle is turned over daily, and so forth.
In the last chapter we discussed the elementary topological properties of the network that the stoichiometric matrix represents. In this chapter we look deeper into the properties of the stoichiometric matrix and how these fundamental topological properties can be used to obtain a more thorough understanding of the reaction network that it represents. This material is perhaps the most mathematical part of this book. It should be readily accessible to readers with formal education in the physical and engineering sciences, while readers with a life science background may find it challenging. The concepts introduced are important to the rest of the chapters in Part II. The stoichiometric matrix is a mathematical mapping operation (recall Figure 6.1). Matrices have certain fundamental properties that describe this mapping operation. These properties are contained in the four fundamental subspaces associated with a matrix. This chapter discusses these subspaces and how we can mathematically define them and interpret their contents in biochemical and biological terms.
Dimensions of the Fundamental Subspaces
The mapping that the stoichiometric matrix represents was illustrated in Figure 6.1 and a preliminary discussion of the associated four subspaces is found in Chapter 6. The stoichiometric matrix is typically rank deficient. The rank r of a matrix denotes the number of linearly independent rows and columns that the matrix contains. Rows are linearly dependent if any one row can be computed as a linear combination of the other rows.
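The rank statement above can be made concrete with a short sketch. The example below is only an illustration: the three-metabolite cyclic network (A → B → C → A) and the helper names `matrix_rank` and `subspace_dimensions` are assumptions introduced here, not taken from the text. It computes the rank r by exact Gaussian elimination and then applies the standard linear-algebra result that, for an m × n matrix, the row and column spaces have dimension r, the null space has dimension n − r, and the left null space has dimension m − r.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Return the rank of a matrix using Gauss-Jordan elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for i in range(len(m)):
            if i != rank and m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def subspace_dimensions(S):
    """Dimensions of the four fundamental subspaces of an m x n matrix S."""
    m, n = len(S), len(S[0])
    r = matrix_rank(S)
    return {
        "rank": r,
        "row_space": r,
        "column_space": r,
        "null_space": n - r,
        "left_null_space": m - r,
    }


# Hypothetical closed cycle: v1: A -> B, v2: B -> C, v3: C -> A.
# Rows are metabolites (A, B, C); columns are reactions (v1, v2, v3).
S = [
    [-1, 0, 1],
    [1, -1, 0],
    [0, 1, -1],
]

dims = subspace_dimensions(S)
print(dims)
```

In this toy network the three rows sum to zero, so any one row is a linear combination of the other two and the matrix is rank deficient: the one-dimensional left null space corresponds to the conserved total A + B + C, and the one-dimensional null space corresponds to flux around the cycle.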
Cellular functions rely on the interactions of their chemical constituents. Various high-throughput experimental methods now allow us to determine the chemical composition of cells on a genome scale. These methods include whole genome sequencing and annotation (genomics), the measurement of the messenger RNA molecules that are synthesized under a given condition (transcriptomics), the measurement of protein abundance, interactions, and functional states (proteomics), measurements of the presence and concentration of metabolites (metabolomics), and measurements of metabolic fluxes (fluxomics). In addition, methods now exist to determine the binding sites of proteins on the DNA (location analysis) and to measure a limited number of fluxes through reactions inside a cell. The physical location of protein products and segments of the DNA can be determined using various fluorescent reporting molecules. All these methods can be used to help reconstruct the biochemical reaction networks that operate in cells. This part of the text discusses the reconstruction of metabolic, regulatory, and signaling networks. Given the rate at which new methods are being developed, it is likely that this part of the text will become dated the fastest. However, whether the methods are new or existing, the result of the reconstruction process is a set of chemical reactions or interactions that comprise these networks. The reader should be mindful of the fact that these are not separate and independent networks. In fact, they interact with one another.
The expression of the gene complement of a genome is carefully regulated. Only a fraction of the genes in a genome are expressed under a given condition or in a particular cell type. There is a complex transcriptional regulatory network that controls which genes are expressed in response to various environmental and developmental signals. Extensive effort is being devoted to the elucidation of the components of transcriptional regulatory networks and the links between them. The reconstruction methods that are being developed are based on both legacy and high-throughput data types, and notable progress is being made with a few specific cases. Although the comprehensive details are not yet available for any one transcriptional regulatory network, some of their fundamental principles have been elucidated and a conceptual framework for their hierarchical decomposition has been developed.
Basic Properties
The chemical conversions taking place in metabolic networks relate to the dismemberment and assembly of small molecules through a series of chemical transformations. In contrast, transcriptional regulatory networks involve the association and interaction of large molecules. They rely primarily on protein–protein interactions and DNA–protein interactions, although metabolites do participate directly in some of these transformations. The chemistry underlying these interactions is currently only partially understood, but much progress is being made. Even so, these networks are not as well assembled and characterized as metabolic networks.
The functional states of reconstructed networks are directly related to cellular phenotypes. With reconstructed networks represented in the form of S, we can use mathematics to compute their candidate functional states. If one adopts the informatics point of view of S and its annotated information as a biochemically, genetically, and genomically (BIGG) structured database, then these in silico methods are viewed as query tools. Whether viewed from an informatic or mathematical standpoint, the result of applying in silico analysis methods is the study of network properties, sometimes called emergent properties. These properties represent functionalities of the whole network and are hard to decipher from a list of its individual components. In some sense, these properties are a reflection of the hierarchical nature of living systems. A variety of methods have been developed to examine the properties of genome-scale networks. The third part of this text summarizes the in silico methods that have been developed and deployed to date. The development and application of such methods is the focus of a growing number of researchers worldwide, and we can thus anticipate that there will be much progress in this field over the coming years.
In this chapter, we discuss the importance of the concept of dependence in social network data. Dependence is usually treated as a technical statistical issue, but in the case of social networks, the types of dependencies that might be expected in the data reflect underlying social processes that generate network structures. Consequently, we argue that possible dependencies need to be thought about explicitly when modeling social networks. We present a hierarchy of increasingly complex dependence assumptions and show how to represent these in terms of dependence graphs. We show how dependence graphs are used in exponential random graph (p*) models for social networks. The most commonly used dependence assumption for p* models is that of Markov random graphs, but we summarize new developments that introduce higher-order dependence structures, Markov assumptions constrained within social settings, and dependencies involving individual-level attributes. We conclude by conceptualizing our general approach in terms of social space, with different types of dependence structures construed as forms of abstract proximity between elements of that social space.
Social Phenomena, Networks, and Dependence
An event or process is a social phenomenon precisely because behaviors by the individuals involved are interrelated. The form of interrelation may be particularly complex. As Solomon Asch argued half a century ago, social phenomena have the reflexive quality of being psychologically represented in each of the participating individuals.
This chapter considers study design and data collection methods for social network studies, emphasizing methodological research and applications that have appeared since an earlier review (Marsden 1990). It concentrates on methods and instruments for measuring social relationships linking actors or objects. Many analytical techniques discussed in other chapters identify patterns and regularities that measure structural properties of networks (such as centralization or global density), and/or relational properties of particular objects/actors within them (such as centrality or local density). The focus here is on acquiring the elementary data elements themselves.
Beginning with common designs for studying social networks, the chapter then covers methods for setting network boundaries. A discussion of data collection techniques follows. Survey and questionnaire methods receive primary attention: they are widely used, and much methodological research has focused on them. More recent work emphasizes methods for measuring egocentric networks and variations in network perceptions; questions of informant accuracy or competence in reporting on networks remain highly salient. The chapter closes with a brief discussion of network data from informants, archives, and observations, and issues in obtaining them.
Network Study Designs
The broad majority of social network studies use either “whole-network” or “egocentric” designs. Whole-network studies examine sets of interrelated objects or actors that are regarded for analytical purposes as bounded social collectives, although in practice network boundaries are often permeable and/or ambiguous.
By
Mark Huisman, Heymans Institute/DPMG, University of Groningen,
Marijtje A. J. van Duijn, ICS/Statistics & Measurement Theory, University of Groningen
This chapter reviews software for the analysis of social networks. Both commercial and freely available packages are considered. Based on the software page on the INSNA website (http://www.insna.org/INSNA/soft inf.html), and using the main topics in the book on network analysis by Wasserman and Faust (1994), which we regard as the standard text, we selected twenty-seven software packages: twenty-three stand-alone programs, listed in Table 13.1.1, and four utility toolkits given in Table 13.1.2.
Software merely aimed at visualization of networks was not admitted to the list because this is the topic of Chapter 12 of this book (Freeman 2004). We do review a few programs with strong visualization properties. Some were originally developed for network visualization, and now contain analysis procedures (e.g., NetDraw; Borgatti 2002). Other programs were specifically developed to integrate network analysis and visualization (e.g., NetMiner, Cyram 2004, and visone; Brandes and Wagner 2003). Two other programs for network visualization are worth mentioning here because some of the reviewed software packages have export functions to these graph drawing programs, or they are freely distributed together with the social analysis software: KrackPlot (Krackhardt, Blythe, and McGrath 1994) and Mage (Richardson 2001).
The age of the software was not a criterion for selection, although the release dates of the last versions of the majority of the reviewed software were within the last one or two years.
Tables 13.1.1 and 13.1.2 describe the main objective or characteristic of each program.
Centrality is one of the most important and widely used conceptual tools for analyzing social networks. Nearly all empirical studies try to identify the most important actors within the network. In this chapter, we discuss three extensions of the basic concept of centrality. The first extension generalizes the concept from that of a property of a single actor to that of a group of actors within the network. This extension makes it possible to evaluate the relative centrality of different teams or departments within an organization, or to assess whether a particular ethnic minority in a society is more integrated than another. The second extension applies the concept of centrality to two-mode data in which the data consist of a correspondence between two kinds of nodes, such as individuals and the events in which they participate. In the past, researchers have dealt with such data by converting them to standard network data (with considerable loss of information); the objective of the extension discussed here is to apply the concept of centrality directly to the two-mode data. The third extension uses the centrality concept to examine the core-periphery structure of a network.
It is well known that a wide variety of specific centrality measures have been proposed in the literature, dating back at least to the 1950s with the work of Katz (1953). Freeman (1979) imposed order on some of this work in a seminal paper that sorted centrality measures into three basic categories (degree, closeness, and betweenness) and presented canonical measures for each category.
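As a rough illustration of two of these categories, the sketch below computes degree and closeness centrality for a small undirected, connected graph. It is a minimal example under stated assumptions rather than a reproduction of any particular paper's procedure: the normalizations (degree divided by n − 1, and n − 1 divided by the sum of shortest-path distances) are the commonly used forms, the star-shaped example graph and the function names are hypothetical, and betweenness is omitted to keep the example short.

```python
from collections import deque


def degree_centrality(adj):
    """Normalized degree: number of neighbors divided by (n - 1)."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def closeness_centrality(adj):
    """Normalized closeness: (n - 1) divided by the sum of shortest-path distances.

    Assumes an undirected, connected graph; a node that cannot reach
    every other node is assigned 0.0.
    """
    n = len(adj)
    result = {}
    for source in adj:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return result


# Hypothetical star network: one hub tied to three peripheral actors.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}

degree = degree_centrality(star)
closeness = closeness_centrality(star)
print(degree)
print(closeness)
```

In the star network the hub attains the maximum value on both measures, while each peripheral actor scores lower because it reaches the other peripheral actors only through the hub.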
We begin with a graph (or a directed graph), a single set of nodes N, and a set of lines or arcs ℒ. It is common to use this mathematical concept to represent a social network. We use the notation of Wasserman and Faust (1994), especially Chapters 13 and 15. There are extensions of these ideas to a wide range of social networks, including multiple relations, affiliation relations, valued relations, and social influence and selection situations (in which information on attributes of the nodes is available). Later chapters in this volume discuss such generalizations.
The model p* was first discussed by Frank and Strauss (1986), who termed it a distribution for a Markov random graph. Further developments, especially commentary on estimation of distribution parameters, were given by Strauss and Ikeda (1990). Wasserman and Pattison (1996) further elaborated this family of models, showing how a Markov parametric assumption provides just one of many possible sets of parameters. This family, with its variety and extensions, was named p*, a label by which it has come to be known. The parameters reflect structural concerns, which are assumed to govern the probabilistic nature of the underlying social and/or behavioral process.
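For orientation, the general form usually given for this family in the literature can be written as follows. The symbols are assumptions introduced here, since this excerpt does not define them: X is the random network, x an observed realization, z(x) a vector of network statistics (for example, counts of edges, stars, and triangles under a Markov assumption), θ the corresponding parameter vector, and κ(θ) a normalizing constant that makes the probabilities sum to one.

```latex
\Pr(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{x'} \exp\{\theta' z(x')\}
```

Different dependence assumptions correspond to different choices of the statistics in z(x), which is the sense in which a Markov assumption provides just one of many possible sets of parameters.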
The development of p* presented here is different from that found in Wasserman and Pattison (1996) and Anderson, Wasserman, and Crouch (1999), but similar to the presentation in Pattison and Wasserman (1999).