In the last two decades, the scientific community has witnessed a surge in activity, interesting results, and notable progress in our conceptual understanding of computing and information based on the laws of quantum theory. One of the significant aspects of these developments has been an integration of several fields of inquiry that not long ago appeared to be evolving, more or less, along narrow disciplinary paths without any major overlap with each other. In the resulting body of work, investigators have revealed a deeper connection among the ideas and techniques of (apparently) disparate fields. As is evident from the title of this volume, logic, mathematics, physics, computer science and information theory are intricately involved in this fascinating story. The inquisitive reader might focus, perhaps, on the marriage of the most unlikely and intriguing fields of quantum theory and logic and ask: Why quantum logic?
Many deem “logic” a panacea for faulty intuition. It is often associated with the rules of correct thinking and decision-making, but not necessarily with its most sublime role as a deep intellectual subject underlying the validity of mathematical structures and worthy of investigation and discovery in its own right. Indeed, within the realm of the classical theories of nature, one may encounter situations that defy comprehension, should one hold to the intuition developed through experiencing familiar macroscopic scenarios in our routine impressions of natural phenomena.
One such example is the statement within the special theory of relativity that the speed of light is the same in all inertial frames. It certainly defies common intuition about the observed velocities of familiar objects in relative motion, and one might be tempted to dismiss it as contrary to observation. However, for objects moving close to the speed of light, far outside the range of velocities we are normally accustomed to, logical deductions based on the postulates of special relativity lead to correct predictions of experimental observations.
There exists an undeniable interconnection between the deepest theories of nature and mathematical reasoning, famously described by Eugene Wigner as the unreasonable effectiveness of mathematics in the natural sciences.
Cellular automata are discrete dynamical systems based on local, synchronous and parallel updates of symbols written on an infinite array of cells. Such systems were conceived in the early 1950s by John von Neumann and Stanislaw Ulam in the context of machine self-reproduction, while one-dimensional variants were studied independently in symbolic dynamics as block maps between sequences of symbols. Around 1970, John Conway introduced the Game of Life, a particularly attractive two-dimensional cellular automaton that became widely known, in particular after being popularised by Martin Gardner in Scientific American. In physics and other natural sciences, cellular automata are used as models for various phenomena. Cellular automata are the simplest imaginable devices that operate under the nature-inspired constraints of massive parallelism, locality of interactions and uniformity in time and space. They can also exhibit the physically relevant properties of time reversibility and conservation laws if the local update rule is chosen appropriately.
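To make these constraints concrete, here is a minimal illustrative sketch (not tied to any particular automaton studied in this chapter): a one-dimensional elementary cellular automaton in which every cell is updated simultaneously from its own state and the states of its two nearest neighbours. The rule number and the finite cyclic array are arbitrary choices made for the example.

```python
# A minimal sketch of a one-dimensional cellular automaton on a finite cyclic array.
# The local rule looks only at a cell and its two nearest neighbours (locality),
# and every cell is updated simultaneously (synchronous, parallel update).

def elementary_step(cells, rule_number=110):
    """Apply one synchronous update of an elementary CA (Wolfram rule numbering)."""
    rule = [(rule_number >> i) & 1 for i in range(8)]  # rule table for the 8 neighbourhoods
    n = len(cells)
    return [rule[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
            for i in range(n)]

# Example: evolve a single seed cell for a few steps.
row = [0] * 20
row[10] = 1
for _ in range(5):
    print(''.join('#' if c else '.' for c in row))
    row = elementary_step(row)
```

True cellular automata act on an infinite array; the cyclic array above is only a convenience for running the sketch.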
Today cellular automata are studied from a number of perspectives in physics, mathematics and computer science. The simple and elegant concept also makes them objects of study in their own right. An advanced mathematical theory has been developed that uses tools of computability theory, discrete dynamical systems and ergodic theory.
Closely related notions in symbolic dynamics are subshifts (of finite type). These are sets of infinite arrays of symbols defined by forbidding the appearance anywhere in the array of some (finitely many) local patterns. In comparison to cellular automata, the dynamic local update function has been replaced by a static local matching relation. Two-dimensional subshifts of finite type are conveniently represented as tiling spaces by Wang tiles. There are significant differences between the theories of one- and two-dimensional subshifts. One can embed computations in tilings, which leads to the appearance of undecidability in decision problems that are trivially decidable in the one-dimensional setting. Most notably, the undecidability of the tiling problem, which asks whether a given two-dimensional subshift of finite type is non-empty, implies deep differences between the theories of one- and two-dimensional cellular automata.
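As a small sketch of the static local matching relation (illustrative only; the tile set and names are hypothetical), a Wang tile can be recorded as a 4-tuple of edge colours, and a finite rectangular patch of tiles is locally valid exactly when every pair of abutting edges carries the same colour.

```python
# A minimal sketch of the local matching relation for Wang tiles.
# Each tile is a tuple (north, east, south, west) of edge colours; two tiles may
# sit next to each other only if the shared edge carries the same colour.

def is_valid_patch(patch):
    """Check that a 2D list of Wang tiles satisfies all horizontal/vertical matches."""
    rows, cols = len(patch), len(patch[0])
    for r in range(rows):
        for c in range(cols):
            north, east, south, west = patch[r][c]
            if c + 1 < cols and east != patch[r][c + 1][3]:   # east edge vs. neighbour's west
                return False
            if r + 1 < rows and south != patch[r + 1][c][0]:  # south edge vs. neighbour's north
                return False
    return True

# Two copies of a tile whose east and west colours agree tile a 1x2 patch.
t = ('a', 'x', 'a', 'x')
print(is_valid_patch([[t, t]]))  # True
```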
In this chapter we present classical results about cellular automata, discuss algorithmic questions concerning tiling spaces and relate these questions to decision problems about cellular automata, observing some fundamental differences between the one- and two-dimensional cases.
Abstract. We put forward a new take on the logic of quantum mechanics, following Schrödinger's point of view that it is composition which makes quantum theory what it is, rather than its particular propositional structure due to the existence of superpositions, as proposed by Birkhoff and von Neumann. This gives rise to an intrinsically quantitative kind of logic, which truly deserves the name ‘logic’ in that it also models meaning in natural language, the latter being the origin of logic, that it supports automation, the most prominent practical use of logic, and that it supports probabilistic inference.
The physics and the logic of quantum-ish logic. In 1932 John von Neumann formalized Quantum Mechanics in his book “Mathematische Grundlagen der Quantenmechanik”. This was effectively the official birth of the quantum mechanical formalism, which, now some 75 years later, has remained essentially the same. Quantum theory underpins so many things in our daily lives, including the chemical industry, energy production and information technology, which arguably makes it the most technologically successful theory of physics ever.
However, in 1935, merely three years after the birth of his brainchild, von Neumann wrote in a letter to American mathematician Garrett Birkhoff: “I would like to make a confession which may seem immoral: I do not believe absolutely in Hilbert space no more.” (sic)—for more details see [73].
Soon thereafter they published a paper entitled “The Logic of Quantum Mechanics” [13]. Their ‘quantum logic’ was cast in order-theoretic terms, very much in the spirit of the then reigning algebraic view of logic, with the distributive law being replaced with a weaker (ortho)modular law.
This resulted in a research community of quantum logicians [68, 71, 47, 30]. However, despite von Neumann's reputation and the large body of research that has been produced in the area, one finds hardly a trace of this activity in the mainstream physics, mathematics, or logic literature. Hence, some 75 years later, one may be tempted to conclude that this activity was a failure.
What went wrong?
The mathematics of it. Let us consider the raison d'être for the Hilbert space formalism.
Sequences of natural numbers have always been the subject of all kinds of investigations. Among them is the famous Fibonacci sequence, related to the well-known Golden Ratio. This sequence is the paradigmatic example of a sequence satisfying a linear recursion. The latter sequences are of fundamental importance in algebra, combinatorics, and number theory, as well as in automata theory, since they count languages associated with finite automata.
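For concreteness, the standard example just mentioned can be written out: the Fibonacci numbers satisfy the linear recursion

```latex
F_{n+1} = F_n + F_{n-1}, \qquad F_1 = F_2 = 1,
\qquad
F_n = \frac{\varphi^{n} - \psi^{n}}{\sqrt{5}}, \qquad
\varphi = \frac{1+\sqrt{5}}{2}, \quad \psi = \frac{1-\sqrt{5}}{2},
```

where the closed form (Binet's formula) exhibits the Golden Ratio φ as a root of the characteristic polynomial x² − x − 1.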
The first aim of the present chapter is to prove a classification theorem: it characterises a class of sequences associated with certain quivers (directed graphs). In general these sequences satisfy non-linear recursions, and it turns out that they also satisfy a linear recursion exactly when the underlying undirected graph is of Dynkin, or extended Dynkin, type. This is perhaps the first example of a classification theorem using Dynkin diagrams in the realm of integer sequences.
Note that, for the sake of simplicity, we have dealt here with diagrams and quivers whose underlying graphs are simple graphs; the corresponding Dynkin diagrams are sometimes called simply laced. The whole theory may be carried out with more general diagrams, that is, using Cartan matrices; see Assem et al. (2010).
The sequences we consider are called friezes: they were introduced by Philippe Caldero, and they satisfy non-linear recursions associated with quivers. Such a recursion is a particular case of a mutation, an operation introduced by Fomin and Zelevinsky in their theory of cluster algebras. The recursion alone does not show that these sequences are integer-valued. This must be proved separately, and for this we follow Fomin and Zelevinsky's Laurent phenomenon (Fomin and Zelevinsky, 2002a,b), for which a proof is given.
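As a hedged illustration of this point (a standard example of the Laurent phenomenon, not necessarily one of the frieze recursions treated in this chapter), the non-linear recurrence x_{n+1} = (x_n² + 1)/x_{n−1} started at x_1 = x_2 = 1 produces only integers, something the recurrence itself does not reveal:

```python
from fractions import Fraction

# Illustrative example (not the chapter's exact frieze recursion): the non-linear
# recurrence x_{n+1} = (x_n^2 + 1) / x_{n-1} with x_1 = x_2 = 1 produces only
# integers, an instance of the Laurent phenomenon; the recursion gives no hint of this.
x = [Fraction(1), Fraction(1)]
for _ in range(8):
    x.append((x[-1] ** 2 + 1) / x[-2])

print([int(v) for v in x])                  # 1, 1, 2, 5, 13, 34, 89, 233, ...
print(all(v.denominator == 1 for v in x))   # True: every term is an integer
```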
We shall explain all details about Dynkin and extended Dynkin diagrams. These diagrams have been used to classify several mathematical objects, starting with the Cartan–Killing classification of simple Lie algebras. There is a simple combinatorial characterisation of these diagrams, due to Vinberg (1971) for Dynkin diagrams and to Berman et al. (1971/72) for extended Dynkin diagrams: they are characterised by the existence of a certain function on the graph, which must be subadditive or additive. See Section 10.8 for more detail.
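To make the criterion concrete in the simply laced case (a sketch of the standard formulation; see the references above and Section 10.8 for the precise statements), a function f from the vertices of a graph to the positive integers is called additive, respectively subadditive, if at every vertex x

```latex
2 f(x) = \sum_{y \,\text{adjacent to}\, x} f(y)
\qquad\text{respectively}\qquad
2 f(x) \ \ge\ \sum_{y \,\text{adjacent to}\, x} f(y).
```

For example, on the extended Dynkin diagram Ã_n, which is a cycle, the constant function f ≡ 1 is additive, since every vertex has exactly two neighbours.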
In this chapter we discuss (multidimensional) shifts of finite type, or SFTs for short. These are the sets X(ℰ) ⊆ Σ^{ℤ^d} of d-dimensional configurations that arise by forbidding the occurrence of a finite list ℰ of finite patterns. This local, combinatorial and perhaps naive definition leads to surprisingly complex behaviour, and therein lies some of the interest in these objects. SFTs also arise independently in a variety of contexts, including language theory, statistical physics, dynamical systems theory and the theory of cellular automata. We shall touch upon some of these connections later on.
The answer to the question ‘what do SFTs look like’ turns out to be profoundly different depending on whether one works in dimension one or higher. In dimension one, the configurations in an SFT X(ℰ) can be described constructively: there is a certain finite directed graph G, derived explicitly from ℰ, such that X(ℰ) is (isomorphic to) the set of bi-infinite vertex paths in G. Most of the elementary questions, and many of the hard ones, can then be answered by combinatorial and algebraic analysis of G and its adjacency matrix. For a brief account of these matters, see Section 9.2.5.
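As a small illustrative computation in the one-dimensional case (the ‘golden mean shift’ is a standard example, but the code and names below are ours, not the chapter's): forbidding the single word 11 over Σ = {0, 1} yields an SFT whose configurations are the bi-infinite walks on a two-vertex graph, and whose topological entropy is the logarithm of the spectral radius of the adjacency matrix.

```python
import numpy as np

# Sketch for the one-dimensional case: the golden mean shift forbids the word "11".
# Its configurations are bi-infinite walks on a 2-vertex graph with adjacency matrix A
# (symbol 0 may be followed by 0 or 1, symbol 1 only by 0), and its topological
# entropy equals the log of the spectral radius of A, here the log of the Golden Ratio.
A = np.array([[1, 1],
              [1, 0]])
spectral_radius = max(abs(np.linalg.eigvals(A)))
print(spectral_radius)            # ~1.618..., the Golden Ratio
print(np.log(spectral_radius))    # ~0.4812, the topological entropy
```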
In dimension d ≥ 2, which is our main focus here, things are entirely different. First and foremost, given ℰ, one cannot generally give a constructive description of the elements of X(ℰ). In fact, Berger showed in 1966 that it is impossible even to decide, given ℰ, whether X(ℰ) = ∅. Consequently, nearly any other property or quantity associated with X(ℰ) is undecidable or uncomputable.
What this tells us is that, in general, we cannot hope to know how a particular SFT behaves. But one can still hope to classify the types of behaviour that can occur in the class of SFTs, and, quite surprisingly, it has emerged in recent years that for many properties of SFTs such a classification exists. The development of this circle of ideas began with the characterisation of the possible rates of ‘word-growth’ (topological entropy) of SFTs, and was soon followed by a characterisation of the degrees of computability (in the sense of Medvedev and Muchnik) of SFTs, and of the languages that arise by restricting configurations of SFTs to lower-dimensional lattices.
Open Innovation (OI) supports companies in systematically collaborating with external partners, offering various advantages. However, companies still face several challenges when applying OI, e.g., identifying relevant OI partners, collaboration methods, and project risks. Often, insufficient planning is the reason for subsequent deficits in OI projects. It is therefore important to analyze the relevant context factors (the ‘situation’) that affect and constrain OI. To date, neither a general approach for analyzing (open) innovation situations nor guidelines for developing one exist. Usually, researchers develop their own situation analyses, conducting extensive literature reviews and encountering similar challenges along the way. This publication sets the basis for successfully planning OI projects. It focuses on developing an analysis approach for OI situations and supports other researchers in developing their own analysis approaches. The resultant objectives of the publication are to: (1) provide a list of potential situation analysis criteria; (2) provide a guideline for developing a situation analysis; (3) provide initial indications of relevant OI-specific situation criteria. The criteria were derived from the literature and qualitatively evaluated by three industry partners to assess their usability. Although this work is exploratory, and the results are not automatically generalizable, it is an important contribution toward ensuring the success of OI and toward analyzing enablers of, and barriers to, knowledge transfer from academia to industry.
Abstract. With a view to quantum foundations, we define the concepts of an empirical model (a probabilistic model describing measurements and outcomes), a hidden-variable model (an empirical model augmented by unobserved variables), and various properties of hidden-variable models, for the case of infinite measurement spaces and finite outcome spaces. Thus, our framework is general enough to include, for example, quantum experiments that involve spin measurements at arbitrary relative angles. Within this framework, we use the concept of the fiber product of measures to prove general versions of two determinization results about hidden-variable models. Specifically, we prove that: (i) every empirical model can be realized by a deterministic hidden-variable model; (ii) for every hidden-variable model satisfying locality and λ-independence, there is a realization-equivalent hidden-variable model satisfying determinism and λ-independence.
Introduction. Hidden variables are extra variables added to the model of an experiment to explain correlations in the outcomes. Here is a simple example. Alice's and Bob's computers have been prepared with the same password. We know that the password is either p2s4w6r8 or 1a3s5o7d, but we do not know which it is. If Alice now types in p2s4w6r8 and this unlocks her computer, we immediately know what will happen when Bob types in one or other of the two passwords. The two outcomes—when Alice types a password and Bob types a password—are perfectly correlated. Clearly, it would be wrong to conclude that, when Alice types a password on her machine, this somehow causes Bob's machine to acquire the same password. The correlation is purely informational: It is our state of knowledge that changes, not Bob's computer. Formally, we can consider an r.v. (random variable) X for Alice's password, an r.v. Y for Bob's password, and an extra r.v. Z. The r.v. Z takes the value z1 or z2 according as the two machines were prepared with the first or the second password. Then, even though X and Y will be perfectly correlated, they will also be independent (trivially so), conditional on the value of Z. In this sense, the extra r.v. Z explains the correlation.
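The informal argument can be checked mechanically. The following small simulation (variable names and sample size are our own, purely illustrative) draws the hidden variable Z uniformly, sets X = Y = Z, and confirms that X and Y are perfectly correlated overall yet constant, and hence trivially independent, once Z is fixed.

```python
import random

# The hidden variable Z is the password both machines were prepared with.
# X and Y (Alice's and Bob's passwords) are determined by Z, so they are perfectly
# correlated; conditional on Z, each takes a single value, so they are trivially
# independent given Z.
random.seed(0)
passwords = ['p2s4w6r8', '1a3s5o7d']
samples = [(z, z, z) for z in (random.choice(passwords) for _ in range(10000))]

agree = sum(x == y for _, x, y in samples) / len(samples)
print(agree)  # 1.0: X and Y always agree (perfect correlation)

for z0 in passwords:
    xs = {x for z, x, _ in samples if z == z0}
    ys = {y for z, _, y in samples if z == z0}
    print(z0, xs, ys)  # conditional on Z, X and Y are each constant, hence independent
```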
In Chapter 3, we provided a theoretical overview of the explore-exploit problem and its importance in scoring items for recommender systems, in particular, the connection to the classical multiarmed bandit (MAB) problem. We discussed the Bayesian and the minimax approaches to the MAB problem, including some popular heuristics that are used in practice. However, several additional nuances that arise in recommender systems violate assumptions made in the MAB problem. These include a dynamic item pool, nonstationary CTRs, and delayed feedback. This chapter develops new solutions that work well in practice.
For many recommender systems, it is appropriate to score items based on some positive action rate, such as click probability (CTR). Such an approach maximizes the total number of actions on recommended items. A simple approach that is often used in practice is to recommend the top-k items with the highest CTR. We shall refer to this as the most-popular recommendation approach, where popularity is measured through item-specific CTR. Although conceptually simple, the most-popular approach is technically nontrivial because the CTRs of the items have to be estimated. It also serves as a good baseline for applications where nonpersonalized recommendation is an acceptable solution. Hence, we begin in this chapter by developing explore-exploit solutions for the most-popular recommendation problem.
In Section 6.1, we introduce an example application and show the characteristics of most-popular recommendation in this real-life application. We then mathematically define the explore-exploit problem for most-popular recommendation in Section 6.2 and develop a Bayesian solution from first principles in Section 6.3. A number of popular non-Bayesian solutions are reviewed in Section 6.4. Through a series of extensive experiments in Section 6.5, we show that, when the system can be properly modeled using a Bayesian framework, the Bayesian solution performs significantly better than other solutions. Finally, in Section 6.6, we discuss how to address the data sparsity challenge when the set of candidate items is large.
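Purely as an illustrative stand-in for the solutions discussed in this chapter (a standard heuristic, not the specific Bayesian scheme of Section 6.3; all names and parameters below are hypothetical), the following sketch applies Beta-Binomial Thompson sampling to most-popular recommendation: each item's CTR gets a Beta posterior, and on each visit the item with the largest sampled CTR is shown.

```python
import random

# Illustrative explore-exploit heuristic for most-popular recommendation:
# Thompson sampling with a Beta(1, 1) prior on each item's CTR. Clicks and views
# update the posterior; on each user visit we show the item whose sampled CTR is largest.
class ThompsonSampler:
    def __init__(self, item_ids):
        self.stats = {i: [1.0, 1.0] for i in item_ids}  # [alpha, beta] per item

    def select(self):
        return max(self.stats, key=lambda i: random.betavariate(*self.stats[i]))

    def update(self, item_id, clicked):
        self.stats[item_id][0] += clicked        # successes (clicks)
        self.stats[item_id][1] += 1 - clicked    # failures (views without click)

# Toy usage with hypothetical true CTRs.
true_ctr = {'a': 0.04, 'b': 0.06, 'c': 0.02}
sampler = ThompsonSampler(true_ctr)
for _ in range(5000):
    item = sampler.select()
    sampler.update(item, int(random.random() < true_ctr[item]))

# Posterior-mean ranking after exploration; most likely identifies item 'b'.
print(max(sampler.stats, key=lambda i: sampler.stats[i][0] / sum(sampler.stats[i])))
```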
Example Application: Yahoo! Today Module
In Section 5.1.1, we introduced feature modules that are commonly shown on the home pages of web portals. The Today Module on the Yahoo! home page (see Figure 5.1 for a snapshot) is a typical example. The goal for this module is to recommend items (mostly news stories of different types) to maximize user engagement on the home page, usually measured by the total number of clicks.
After developing statistical methods for recommender systems, it is important to evaluate their performance in different application settings. Broadly speaking, there are two kinds of evaluation, depending on whether a recommendation algorithm, or more precisely, the model used in the algorithm, has been deployed to serve users:
1. Predeployment offline evaluation: A new model must show strong signs of performance improvement over existing baselines before being deployed to serve real users. To ascertain the potential of a new model before testing it on real user visits, we compute various performance measures on retrospective (historical) data. We refer to this as offline evaluation. To perform such offline evaluation, we need to log data that record past user-item interactions in the system. Model comparison is performed by computing various offline metrics based on such data.
2. Postdeployment online evaluation: Once the model performance looks promising based on offline metrics, we test it on a small fraction of real user visits. We refer to this as online evaluation. To perform online evaluation, it is typical to run randomized experiments online. A randomized experiment, also referred to as an A/B test or a bucket test in web applications, compares a new method to an existing baseline. It is conducted by assigning two random user or visit populations to the treatment bucket and the control bucket, respectively. The treatment bucket is typically smaller than the control because it serves users according to the new recommendation model that is being tested, whereas the control bucket serves users using the status quo. After running such a bucket test for a certain time period, we gauge model performance by comparing metrics that are computed using data collected from the corresponding buckets.
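The comparison at the end of such a bucket test is typically a standard statistical one. As an illustrative sketch (the counts are hypothetical and the choice of test is ours, not necessarily the book's), a two-proportion z-test compares the treatment-bucket CTR against the control-bucket CTR:

```python
from math import sqrt, erf

# Illustrative two-proportion z-test for an A/B (bucket) test: compare the CTR of
# the treatment bucket against the control bucket using hypothetical counts.
def ctr_ztest(clicks_t, views_t, clicks_c, views_c):
    p_t, p_c = clicks_t / views_t, clicks_c / views_c
    p_pool = (clicks_t + clicks_c) / (views_t + views_c)
    se = sqrt(p_pool * (1 - p_pool) * (1 / views_t + 1 / views_c))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided p-value
    return p_t - p_c, z, p_value

lift, z, p = ctr_ztest(clicks_t=1200, views_t=20000, clicks_c=5500, views_c=100000)
print(f"CTR lift = {lift:.4f}, z = {z:.2f}, p = {p:.4f}")
```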
In this chapter, we describe several ways to measure the performance of recommendation models and discuss their strengths and weaknesses. We start in Section 4.1 with traditional offline evaluation metrics that measure out-of-sample predictive accuracy on retrospective ratings data. Our use of the term rating is generic and refers to both explicit ratings like star ratings on movies and implicit ratings (also called responses) like clicks on recommended items (we use ratings and responses interchangeably). In Section 4.2, we discuss online evaluation methods, describing both performance metrics and how to set up online bucket tests in a proper way.
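As a small illustrative sketch of such offline metrics (the data and function names are hypothetical, and the specific metrics of Section 4.1 may differ), the snippet below computes out-of-sample RMSE on held-out numeric ratings and precision-at-k on held-out click responses.

```python
from math import sqrt

# Two common offline metrics on retrospective data (illustrative only):
# RMSE for explicit numeric ratings, precision@k for implicit click responses.
def rmse(predicted, actual):
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_k(scored_items, clicked_items, k):
    top_k = [item for item, _ in sorted(scored_items, key=lambda t: -t[1])[:k]]
    return sum(item in clicked_items for item in top_k) / k

print(rmse([3.8, 2.1, 4.5], [4, 2, 5]))                                       # held-out star ratings
print(precision_at_k([('a', 0.9), ('b', 0.4), ('c', 0.7)], {'a', 'c'}, k=2))  # 1.0
```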
The approaches that we discussed in previous chapters typically recommend items to optimize for a single objective, usually clicks on recommended items. However, a click is only the starting point of a user's journey, and subsequent downstream utilities, such as time spent on the website after the click and revenue generated through displaying advertisements on web pages, are also important. Here clicks, time spent, and revenue are three different objectives for which a website may want to optimize. When different objectives have strong positive correlation, maximizing one would automatically maximize others. In reality, this is not a typical scenario. For example, the advertisements on real estate articles tend to generate higher revenue than those on entertainment articles, but users usually click more and spend more time on entertainment articles. In situations like this, it is not feasible to reach optimality for all objectives. The goal, instead, is to find a good trade-off among competing objectives. For example, we maximize revenue such that clicks and time spent are still within 95 percent of the achievable number of clicks and 90 percent of the achievable time spent relative to some normative serving scheme (e.g., CTR maximization). How to set the constraints (the 95 percent and 90 percent thresholds) depends on the business goals and strategies of the website and is application specific. This chapter provides a methodology based on multiobjective programming to optimize a recommender system for a given set of predetermined constraints. Such a formulation can easily incorporate various application-driven requirements.
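Written out, the trade-off just described is a constrained optimization problem of roughly the following form (a sketch in hypothetical notation, where Clicks* and TimeSpent* denote the values achievable under the normative CTR-maximizing serving scheme):

```latex
\max_{\text{serving plan}} \ \text{Revenue}
\quad \text{subject to} \quad
\text{Clicks} \ \ge\ 0.95 \cdot \text{Clicks}^{*},
\qquad
\text{TimeSpent} \ \ge\ 0.90 \cdot \text{TimeSpent}^{*}.
```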
We present two approaches to multiobjective optimization. Section 11.2 describes a segmented approach (originally published in Agarwal et al., 2011a), where users are classified into user segments in a way similar to segmented most-popular recommendation (discussed in Sections 3.3.2 and 6.5.3) and the optimization is at the segment level. Although useful, this segmented approach assumes that users are partitioned into a few coarse, nonoverlapping segments. In many applications, a user is characterized by a high-dimensional feature vector (thousands of dimensions with a very large number of possible combinations), and the segmented approach fails to provide recommendations at such granular resolutions because decisions are made at coarse user-segment resolutions. Section 11.3 describes a personalized approach (originally published in Agarwal et al., 2012) that improves the segmented approach by optimizing the objectives at the individual-user level.
Recommender systems have to select items for users to optimize for one or more objectives. We introduced several possible objectives in Chapter 1, reviewed classical methods in Chapter 2, described the explore-exploit trade-off and the key ideas to reduce dimensionality of the problem in Chapter 3, and discussed how to evaluate recommendation models in Chapter 4. In this and the five subsequent chapters of the book, we discuss various statistical methods used in some commonly encountered scenarios. In particular, we focus on problem settings where the main objective is to maximize some positive user response to the recommended items. In many application scenarios, clicks on items are the primary response. To maximize clicks, we have to recommend items with high click-through rates (CTRs). Thus, CTR estimation is our main focus. Although we use click and CTR as our primary objectives, other types of positive response (e.g., share, like) can be handled in a similar way. We defer the discussion of multiobjective optimization to Chapter 11.
The choice of statistical methods for a recommendation problem depends on the application. In this chapter, we provide a high-level overview of techniques to be introduced in the next four chapters. We start with an introduction to a variety of different problem settings in Section 5.1 and then describe an example system architecture in Section 5.2 to illustrate how web recommender systems work in practice, along with the role of statistical methods in such systems.
Problem Settings
A recommender system is usually implemented as a module on a web page. In this section, we introduce some common recommendation modules, provide details of the application settings, and conclude with a discussion of commonly used statistical methods for these settings.
Common Recommendation Modules
We classify websites into the following four categories: general portals, personal portals, domain-specific sites, and social network sites. Table 5.1 provides a summary.
General portals are websites that provide a wide range of different content. The home pages of content networks like Yahoo!, MSN, and AOL are examples of general portals.
Personal portals are websites that allow users to customize their home pages with desired content. For example, users of My Yahoo! customize their home pages by selecting content feeds from different sources or publishers and arranging the feeds on the pages according to their preferences.