Since its emergence as a field of importance in the 1970s, digital signal processing (DSP) has grown in exponential lockstep with advances in digital hardware. Today's digital age requires that undergraduate students master material that was, until recently, taught primarily at the graduate level. Many DSP textbooks remain rooted in this graduate-level foundation and cover an exhaustive (and exhausting!) number of topics. This book provides an alternative. Rather than cover the broadest range of topics possible, we instead emphasize a narrower set of core digital signal processing concepts. Rather than rely solely on mathematics, derivations, and proofs, we instead balance necessary mathematics with a physical appreciation of subjects through heuristic reasoning, careful examples, metaphors, analogies, and creative explanations. Throughout, our underlying goal is to make digital signal processing as accessible as possible and to foster an intuitive understanding of the material.
Practical DSP requires hybrid systems that include both discrete-time and continuous-time components. Thus, it is somewhat curious that most DSP textbooks focus almost exclusively on discrete-time signals and systems. This book takes a more holistic approach and begins with a review of continuous-time signals and systems, frequency response, and filtering. This material, while likely familiar to most readers, sets the stage for sampling and reconstruction, digital filtering, and other aspects of complete digital signal processing systems. The synergistic combination of continuous-time and discrete-time perspectives leads to a deeper and more complete understanding of digital signal processing than is possible with a purely discrete-time viewpoint. A strong foundation of continuous-time concepts naturally leads to a stronger understanding of discrete-time concepts.
In February 2012, Kobe Bryant, the American basketball star, joined Chinese microblogging site Sina Weibo. Within a few hours, more than 100,000 followers joined his page, anxiously waiting for his first microblogging post on the site. The media considered the tremendous number of followers Kobe Bryant received as an indication of his popularity in China. In this case, the number of followers measured Bryant's popularity among Chinese social media users. In social media, we often face similar tasks in which measuring different structural properties of a social media network can help us better understand individuals embedded in it. Corresponding measures need to be designed for these tasks. This chapter discusses measures for social media networks.
When mining social media, a graph representation is often used. This graph shows friendships or user interactions in a social media network. Given this graph, some of the questions we aim to answer are as follows:
• Who are the central figures (influential individuals) in the network?
• What interaction patterns are common among friends?
• Who are the like-minded users and how can we find these similar individuals?
To answer these and similar questions, one first needs to define measures for quantifying centrality, level of interactions, and similarity, among other qualities. These measures take as input a graph representation of a social interaction, such as friendships (adjacency matrix), from which the measure value is computed.
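As one illustration of a measure computed from an adjacency matrix, the sketch below calculates degree centrality, among the simplest centrality measures. The choice of measure, the function name `degree_centrality`, and the example graph are assumptions made for illustration and are not taken from the chapter; the graph is assumed to be undirected and unweighted with no self-loops.

```python
def degree_centrality(adjacency):
    """Return each node's degree divided by the maximum possible degree (n - 1).

    `adjacency` is a square list of lists with 0/1 entries, assumed symmetric
    (an undirected friendship graph) with zeros on the diagonal.
    """
    n = len(adjacency)
    if n < 2:
        raise ValueError("need at least two nodes")
    degrees = [sum(row) for row in adjacency]  # row sum = number of friends
    return [d / (n - 1) for d in degrees]


# Hypothetical 4-user network with friendships 0-1, 0-2, 0-3, and 1-2.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = degree_centrality(A)
most_central = max(range(len(scores)), key=scores.__getitem__)
```

In this toy network, user 0 is connected to every other user and therefore receives the highest score.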
We live in an age of big data. With hundreds of millions of people spending countless hours on social media to share, communicate, connect, interact, and create user-generated data at an unprecedented rate, social media has become a unique source of big data. This novel source of rich data provides unparalleled opportunities and great potential for research and development. Unfortunately, more data does not necessarily yield more value; only more of the right (or relevant) data enables us to glean gems. Social media data differs from the traditional data we are familiar with in data mining. Thus, new computational methods are needed to mine the data. Social media data is noisy, free-format, of varying length, and multimedia. Furthermore, social relations among the entities, or social networks, form an inseparable part of social media data; hence, it is important that social theories and research methods be employed with statistical and data mining methods. It is therefore a propitious time for social media mining.
Social media mining is a rapidly growing new field. It is an interdisciplinary field at the crossroads of disparate disciplines deeply rooted in computer science and the social sciences. There is an active community and a large body of literature about social media. The fast-growing interest in social media data and the intensifying need to harness it require research and the development of tools for finding insights in big social media data. This book is one of the intellectual efforts to answer the novel challenges of social media. It is designed to enable students, researchers, and practitioners to acquire fundamental concepts and algorithms for social media mining.
With the rise of social media, the web has become a vibrant and lively realm in which billions of individuals all around the globe interact, share, post, and conduct numerous daily activities. Information is collected, curated, and published by citizen journalists and simultaneously shared or consumed by thousands of individuals, who give spontaneous feedback. Social media enables us to be connected and interact with each other anywhere and anytime – allowing us to observe human behavior at an unprecedented scale through a new lens. This social media lens provides us with golden opportunities to understand individuals at scale and to mine human behavioral patterns that would otherwise be impossible to observe. As a byproduct, by understanding individuals better, we can design better computing systems tailored to individuals' needs that will serve them and society better. This new social media world has no geographical boundaries and incessantly churns out oceans of data. As a result, we are facing an exacerbated problem of big data – “drowning in data, but thirsty for knowledge.” Can data mining come to the rescue?
Unfortunately, social media data is significantly different from the traditional data that we are familiar with in data mining. Apart from enormous size, the mainly user-generated data is noisy and unstructured, with abundant social relations such as friendships and followers-followees. This new type of data mandates new computational data analysis approaches that can combine social theories with statistical and data mining methods. The pressing demand for new techniques ushers in and entails a new interdisciplinary field – social media mining.
Mountains of raw data are generated daily by individuals on social media. Around 6 billion photos are uploaded monthly to Facebook, the blogosphere doubles every five months, 72 hours of video are uploaded every minute to YouTube, and there are more than 400 million daily tweets on Twitter. With this unprecedented rate of content generation, individuals are easily overwhelmed with data and find it difficult to discover content that is relevant to their interests. To overcome these challenges, we need tools that can analyze these massive unprocessed sources of data (i.e., raw data) and extract useful patterns from them. Examples of useful patterns in social media are those that describe online purchasing habits or individuals' website visit duration. Data mining provides the necessary tools for discovering patterns in data. This chapter outlines the general process for analyzing social media data and ways to use data mining algorithms in this process to extract actionable patterns from raw data.
The process of extracting useful patterns from raw data is known as knowledge discovery in databases (KDD). It is illustrated in Figure 5.1. The KDD process takes raw data as input and provides statistically significant patterns found in the data (i.e., knowledge) as output. From the raw data, a subset is selected for processing and is denoted as target data. Target data is preprocessed to make it ready for analysis using data mining algorithms. Data mining is then performed on the preprocessed (and transformed) data to extract interesting patterns. The patterns are evaluated to ensure their validity and soundness and interpreted to provide insights into the data.
In this chapter, we introduce the discrete Fourier transform (DFT), which may be viewed as an economy class DTFT and is applicable when x[n] is of finite length (or made finite length by windowing). The DFT is one of the most important tools for digital signal processing, especially when we implement it using the efficient fast Fourier transform (FFT) algorithm, discussed in Sec. 9.7. The development of the FFT algorithm in the mid-sixties gave a huge impetus to the area of DSP. The DFT, using the FFT algorithm, is truly the workhorse of modern digital signal processing, and it is nearly impossible to exaggerate its importance. A solid understanding of the DFT is a must for anyone aspiring to work in the digital signal processing field. Not only does the DFT provide a frequency-domain representation of DT signals, it is also useful for numerous other tasks such as FIR filtering, spectral analysis, and solving partial differential equations.
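To make the transform concrete, here is a minimal sketch of the DFT and its inverse evaluated directly from the standard definitions, X[k] = Σ x[n] e^(−j2πkn/N) and x[n] = (1/N) Σ X[k] e^(j2πkn/N). The function names `dft` and `idft` are hypothetical, and the book's own notation and scaling conventions may differ. This direct evaluation costs on the order of N² operations, which is the cost the FFT reduces to roughly N log N.

```python
import cmath


def dft(x):
    """Direct N-point DFT of a finite-length sequence x (list of numbers)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def idft(X):
    """Direct N-point inverse DFT; recovers x[n] from X[k]."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


impulse_spectrum = dft([1, 0, 0, 0])    # a unit impulse has a flat spectrum
constant_spectrum = dft([1, 1, 1, 1])   # a constant has energy only at k = 0
round_trip = idft(dft([1, 2, 3, 4]))    # inverse transform recovers the input
```

Because the arithmetic is floating point, results match the ideal values only to within rounding error, so comparisons should use a small tolerance.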
Computation of the Direct and Inverse DTFT
As we saw in Ch. 6, frequency analysis of discrete-time signals involves determination of the discrete-time Fourier transform (DTFT) and its inverse (IDTFT). The DTFT analysis equation of Eq. (6.1) yields the frequency spectrum X(Ω) from the time-domain signal x[n], and the synthesis equation of Eq. (6.2) reverses the process and constructs x[n] from X(Ω). There are, however, two difficulties in the implementation of these equations on a digital processor or computer.
Social forces connect individuals in different ways. When individuals get connected, one can observe distinguishable patterns in their connectivity networks. One such pattern is assortativity, also known as social similarity. In networks with assortativity, similar nodes are connected to one another more often than dissimilar nodes. For instance, in social networks, a high similarity between friends is observed. This similarity is exhibited by similar behavior, similar interests, similar activities, and shared attributes such as language, among others. In other words, friendship networks are assortative. Investigating assortativity patterns that individuals exhibit on social media helps one better understand user interactions. Assortativity is the most commonly observed pattern among linked individuals. This chapter discusses assortativity along with principal factors that result in assortative networks.
Many social forces induce assortative networks. Three common forces are influence, homophily, and confounding. Influence is the process by which an individual (the influential) affects another individual such that the influenced individual becomes more similar to the influential figure. Homophily is observed in already similar individuals. It is realized when similar individuals become friends due to their high similarity. Confounding is the environment's effect on making individuals similar. For instance, individuals who live in Russia speak Russian fluently because of the environment and are therefore similar in language. The confounding force is an external factor that is independent of inter-individual interactions and is therefore not discussed further.
Individuals in social media make a variety of decisions on a daily basis. These decisions are about buying a product, purchasing a service, adding a friend, and renting a movie, among others. The individual often faces many options to choose from. These diverse options, the pursuit of optimality, and the limited knowledge that each individual has create a desire for external help. At times, we resort to search engines for recommendations; however, the results in search engines are rarely tailored to our particular tastes and are query-dependent, independent of the individuals who search for them.
Applications and algorithms are developed to help individuals decide easily, rapidly, and more accurately. These algorithms are tailored to individuals' tastes such that customized recommendations are available for them. These algorithms are called recommendation algorithms or recommender systems.
Recommender systems are commonly used for product recommendation. Their goal is to recommend products that would be interesting to individuals. Formally, a recommendation algorithm takes a set of users U and a set of items I and learns a function f such that
f : U × I → R (9.1)
In other words, the algorithm learns a function that assigns a real value to each user-item pair (u, i), where this value indicates how interested user u is in item i. This value denotes the rating given by user u to item i. The recommendation algorithm is not limited to item recommendation and can be generalized to recommending people and material, such as ads or content.
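The sketch below shows one simple way such a function f could be obtained from observed ratings. It uses a baseline predictor (global mean plus a per-user and a per-item offset), which is an assumption chosen for brevity and not necessarily a method the chapter presents. The name `fit_baseline` and the example users, items, and ratings are hypothetical.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Learn f(u, i) = global mean + user offset + item offset.

    `ratings` maps (user, item) pairs to observed real-valued ratings.
    Returns a function f that assigns a real value to any (user, item) pair.
    """
    mu = sum(ratings.values()) / len(ratings)

    user_devs = defaultdict(list)
    item_devs = defaultdict(list)
    for (u, i), r in ratings.items():
        user_devs[u].append(r - mu)  # how far this rating is from the mean
        item_devs[i].append(r - mu)

    user_offset = {u: sum(d) / len(d) for u, d in user_devs.items()}
    item_offset = {i: sum(d) / len(d) for i, d in item_devs.items()}

    def f(u, i):
        # Unknown users or items fall back to an offset of zero.
        return mu + user_offset.get(u, 0.0) + item_offset.get(i, 0.0)

    return f


observed = {("ann", "book"): 5, ("ann", "film"): 3, ("bob", "book"): 4}
f = fit_baseline(observed)
predicted = f("bob", "film")  # a pair with no observed rating
```

Items can then be recommended to a user by ranking the unrated items by their predicted value f(u, i).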
In May 2011, Facebook had 721 million users, represented by a graph of 721 million nodes. A Facebook user at the time had an average of 190 friends; that is, taken together, all Facebook users had a total of about 68.5 billion friendships (i.e., edges). What are the principal underlying processes that help initiate these friendships? More importantly, how can these seemingly independent friendships form this complex friendship network?
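The edge total follows from the fact that, in an undirected graph, each friendship is counted once in the degree of each of its two endpoints, so the number of edges equals the number of nodes times the average degree, divided by two. A short check of the arithmetic, where the function name `edge_count` is hypothetical:

```python
def edge_count(num_nodes, average_degree):
    """Edges in an undirected graph: the degree sum counts every edge twice."""
    return num_nodes * average_degree / 2


edges = edge_count(721_000_000, 190)
print(f"{edges / 1e9:.1f} billion friendships")  # prints "68.5 billion friendships"
```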
In social media, many social networks contain millions of nodes and billions of edges. These complex networks have billions of friendships, and the reasons most of them exist are obscure. Humbled by the complexity of these networks and the difficulty of independently analyzing each one of these friendships, we can design models that generate, on a smaller scale, graphs similar to real-world networks. On the assumption that these models simulate properties observed in real-world networks well, the analysis of real-world networks boils down to cost-efficient measurement of different properties of simulated networks. In addition, these models
• allow for a better understanding of phenomena observed in real-world networks by providing concrete mathematical explanations and
• allow for controlled experiments on synthetic networks when real-world networks are not available.
We discuss three principal network models in this chapter: the random graph model, the small-world model, and the preferential attachment model. These models are designed to accurately model properties observed in real-world networks. Before we delve into the details of these models, we discuss their properties.
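As a concrete example of a generative network model, the sketch below builds a random graph in which every possible edge between n nodes is included independently with probability p. This is the common G(n, p) formulation; treating it as the chapter's random graph model is an assumption, and the function name `random_graph` and its `seed` parameter are hypothetical.

```python
import itertools
import random


def random_graph(n, p, seed=None):
    """Return the edge list of an undirected G(n, p) random graph.

    Each of the n * (n - 1) / 2 possible edges is kept independently
    with probability p. `seed` makes the result reproducible.
    """
    rng = random.Random(seed)
    return [
        (u, v)
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < p
    ]


n, p = 100, 0.05
edges = random_graph(n, p, seed=1)
expected_edges = p * n * (n - 1) / 2  # mean edge count over many samples
average_degree = 2 * len(edges) / n   # each edge contributes to two degrees
```

Properties such as average degree can be measured on graphs generated this way and compared against the same properties of a real-world network.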