QUEUES FEATURE IN our daily lives like never before. From the checkout counter of the community grocery store to customer support over the phone, queues are theatres of great social and engineering drama. Entire business operations of many leading companies are geared towards providing a hassle-free customer experience – the timely and effective resolution of client queries about services. For a multiplex cinema operator, the concern may instead be effective traffic management and resource optimization during ticket sales. Sometimes a queue may not involve humans at all, as when a database query sent to a computer server is routed through a job queue. How a queue evolves in time and how services are offered over epochs determine how profitably a business runs and how efficiently a server executes its tasks, with substantial technological and economic impact. No wonder stakeholders have invested heavily in upgrading and upscaling hardware and software infrastructure to re-engineer queues towards greater system efficiency and profitability. The mathematical theory of queues is crafted out of models that investigate and replicate the stochastic behavior of such engineering systems. This is the subject of our study in this chapter.
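To make the idea of a queueing model concrete, here is a minimal simulation sketch, assuming the classical M/M/1 setting (Poisson arrivals, exponentially distributed service times, a single first-come first-served server); the function name and parameter values are illustrative only, not drawn from this chapter.

```python
import random

def simulate_mm1(arrival_rate, service_rate, n_customers, seed=0):
    """Estimate the mean waiting time in an M/M/1 queue by simulation."""
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current customer
    server_free_at = 0.0   # time at which the server next becomes idle
    total_wait = 0.0
    for _ in range(n_customers):
        arrival += rng.expovariate(arrival_rate)   # exponential interarrival time
        start = max(arrival, server_free_at)       # wait if the server is busy
        total_wait += start - arrival
        server_free_at = start + rng.expovariate(service_rate)
    return total_wait / n_customers

# With utilization rho = 0.8, queueing theory predicts a mean wait of
# rho / (mu - lambda) = 4.0 time units; the simulation should agree closely.
print(simulate_mm1(arrival_rate=0.8, service_rate=1.0, n_customers=100_000))
```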
STATISTICAL EXPERIMENTS ENABLE us to make inferences from data about parameters that characterize a population. Generally speaking, inferences are of two types: deductive and inductive. Deductive inference pertains to conclusions drawn from a set of premises (propositions) and their synthesis, and it has a definitive character. For example: all men are mortal (first proposition); Socrates is a man (second proposition); hence, Socrates is mortal (deductive conclusion). Inductive inference, on the other hand, has a probabilistic character. One conducts an experiment and collects data; based on these data, conclusions are drawn that may apply beyond the contours of the particular experiment performed. This generalization of conclusions from a particular experiment constitutes the framework of inductive reasoning. For example, suppose we measure the heights of a small group of people drawn from a certain population. If, in this small sample, the average height of the men exceeds the average height of the women, we infer that the men of this population are generally taller than the women.
The formal practice of inductive reasoning dates back to the thesis of Gottfried Wilhelm Leibniz (see Figure 5.1), who was the first to propose that probability is a relation between hypothesis and evidence (data). His thesis rested on three conceptual pillars: chance (probability), possibilities (realizable random events), and ideas (generalization of inferences by induction). We encountered the first two concepts in earlier chapters of this textbook. In this chapter, we delve into the third, discussing methods to draw conclusions from the data of statistical experiments based on the principles of inductive reasoning.
Our lived experiences are punctuated by events that are sometimes the result of our purposeful intentions and at other times outcomes of pure chance. Even at an abstract level, it is a very human endeavor to deduce meaning from seemingly random observations, an exercise whose primary objective is to uncover a causal structure in observed phenomena. In fact, the intellectual pursuit that differentiates us from other beings can be understood through our inner urge to discover the very purpose of our existence and the conditions that make it possible. This eternal play between chance episodes and purposeful volition manifests in diverse situations that I have labored to recreate through computer simulations of realistic events. The play has a dual role: first, it binds together the flow of our varied experiences; second, it offers us a perspective from which to assimilate our understanding of the events happening around us that affect us. To appreciate this play of chance and purpose, it is essential that students and readers have a conceptual grounding in probability, statistics, and stochastic processes. Therefore, several playful computer simulations and projects are interlaced with theoretical foundations and numerical examples, both solved problems and exercises. In this way, the presentation in this book remains true to its spirit of inviting thoughtful readers into the various aspects of this area of study.
Historical remark
The advent of a rigorous framework for studying probability and statistics dates back to the eighth century AD and is documented in the works of Al-Khalil, an Arab philologist. This branch of mathematics continues to develop, with major contributions from the Soviet mathematician Andrey N. Kolmogorov, who laid its modern measure-theoretic foundations in the twentieth century.
DISTRIBUTIONS ARE GENERALIZATIONS of mathematical functions from a purely technical standpoint. But perhaps it is most pertinent to begin with a more utilitarian question: why should we study distributions, and specifically, probability distributions? One motivation stems from a practical limitation of experimental measurement underlined by the uncertainty principle postulated by Werner Heisenberg (see Figure 2.1). The very fabric of reality and the structure of the scientific laws that govern our ability to understand physical phenomena demand a probabilistic (statistical) approach. Our inability to make infinite-precision measurements necessitates averaging over many measurements taken under similar conditions, as a more reliable strategy for assigning experimental values to unknowns with reasonable accuracy.
The advent of the internet and sensor technology has enabled humankind to collect, store, and share data in bulk. In turn, access to a variety of data has amplified a different kind of problem: devising an appropriate strategy to derive meaning from data. Indeed, extracting information from data has acquired the highest priority among tasks performed by engineers and scientists alike. State-of-the-art machine learning algorithms are used to process and analyze data in order to leverage maximum gains in developing new technology and creating new bodies of knowledge.
Further, the data-rich tech-universe has inherent complexity in addition to vastness in terms of numbers. This complexity arises because the data is often embedded in a higher-dimensional space. For example, the data acquired by a camera hosted on a robot is in the form of multiple grayscale images (frames); each data-frame consists of a sequence of numbers representing the gray intensity of each pixel. If each image has a resolution of 100 × 100 pixels, then this image data is embedded in a 10000-dimensional space. Additionally, if the camera records 100 frames per second for one minute, then we have 6000 data points in a 10000-dimensional space. This is just an illustrative example of how a high-dimensional large data set may be generated. Quite evidently, the information is not spread evenly across all 10000 dimensions. One of the most important techniques we will learn in this chapter allows us to extract a lower-dimensional representation of the data set that retains sufficient information for the robot to navigate and perform its tasks.
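The standard technique for this kind of dimension reduction is principal component analysis (PCA); whether this chapter uses PCA specifically is an assumption here. As a minimal sketch, the random data below is a placeholder for real camera frames, scaled down from the 6000 × 10000 example so it runs quickly:

```python
import numpy as np

# Scaled-down stand-in for the robot-camera data described above:
# 600 frames, each flattened to a 1000-dimensional pixel vector.
rng = np.random.default_rng(0)
frames = rng.normal(size=(600, 1_000))

# PCA via the singular value decomposition: center the data, factorize,
# and keep the k directions of largest variance.
k = 20
centered = frames - frames.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ Vt[:k].T        # 600 x 20 lower-dimensional representation

# Fraction of the total variance retained by the first k components.
print((S[:k] ** 2).sum() / (S ** 2).sum())
```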
MARKOV CHAINS WERE first formulated as a stochastic model by the Russian mathematician Andrei Andreevich Markov. Markov spent most of his professional career at St. Petersburg University and the Imperial Academy of Science, where he specialized in number theory, mathematical analysis, and probability theory. His work on Markov chains utilized finite square matrices (stochastic matrices) to show that two classical results of probability theory, the weak law of large numbers and the central limit theorem, extend to sums of dependent random variables. Markov chains have wide scientific and engineering applications in statistical mechanics, financial engineering, weather modeling, artificial intelligence, and so on. In this chapter, we will look at a few applications as we build up the concepts of Markov chains. We will also implement a Markov-chain technique to solve a simple and practical engineering problem related to aircraft control and automation.
3.1 Chapter objectives
The chapter objectives are listed as follows.
1. Students will learn the definition and applications of Markov processes.
2. Students will learn the definition of the stochastic matrix (also known as the probability transition matrix) and perform simple matrix calculations to compute conditional probabilities; a short sketch of such calculations follows this list.
3. Students will learn to solve engineering and scientific problems based on discrete-time Markov chains (DTMCs) using multi-step transition probabilities.
4. Students will learn to compute return times and hitting times to Markov states.
5. Students will learn to classify different Markov states.
6. Students will learn to use the techniques of DTMCs introduced in this chapter to solve a complex engineering problem related to flight control operations.
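As a preview of objectives 2 and 3, here is a minimal numpy sketch of these matrix calculations. The three-state weather chain is a hypothetical example, not one drawn from this chapter.

```python
import numpy as np

# A toy 3-state weather chain: 0 = sunny, 1 = cloudy, 2 = rainy.
# Row i of P holds the conditional probabilities of the next state given
# state i; each row sums to 1, the defining property of a stochastic matrix.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.5, 0.3]])

# Multi-step transition probabilities are matrix powers:
# the (i, j) entry of P^n is P(X_n = j | X_0 = i).
P4 = np.linalg.matrix_power(P, 4)
print("P(rainy in 4 steps | sunny today) =", P4[0, 2])

# For a regular chain, the rows of P^n converge to the stationary
# distribution pi, the solution of pi P = pi (an eigenvector of P
# transposed with eigenvalue 1, normalized to sum to 1).
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()
print("stationary distribution:", pi)
```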
Combining simultaneous equations with latent variables and measurement models results in general latent variable SEMs, the subject of Chapter 6. That chapter covers model specification, implied moments, identification, estimation, outliers and influential cases, model fit, and respecification in such models. Chapter 6 also explores higher-order factor analysis, longitudinal models, and Bayesian estimation.
This chapter presents the matrix deviation inequality, a uniform deviation bound for random matrices over general sets. Applications include two-sided bounds for random matrices, refined estimates for random projections, covariance estimation in low dimensions, and an extension of the Johnson–Lindenstrauss lemma to infinite sets. We prove two geometric results: the M* bound, which shows how random slicing shrinks high-dimensional sets, and the escape theorem, which shows how slicing can completely miss them. These tools are applied to a fundamental data science task – learning structured high-dimensional linear models. We extend the matrix deviation inequality to arbitrary norms and use it to strengthen the Chevet inequality and derive the Dvoretzky–Milman theorem, which states that random low-dimensional projections of high-dimensional sets appear nearly round. Exercises cover matrix and process-level deviation bounds, high-dimensional estimation techniques such as the Lasso for sparse regression, the Garnaev–Gluskin theorem on random slicing of the cross-polytope, and general-norm extensions of the Johnson–Lindenstrauss lemma.
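Of the results listed above, the Johnson–Lindenstrauss lemma is the easiest to see in action. The sketch below uses arbitrary illustrative sizes (and scipy's pdist purely for convenience); it projects points through a scaled Gaussian random matrix and checks that pairwise distances are nearly preserved:

```python
import numpy as np
from scipy.spatial.distance import pdist

# Project n points from dimension d down to k dimensions with a random
# Gaussian map; Johnson-Lindenstrauss says distances are preserved up to
# a factor 1 +/- eps once k is on the order of log(n) / eps^2.
rng = np.random.default_rng(0)
n, d, k = 200, 5_000, 500

X = rng.normal(size=(n, d))
A = rng.normal(size=(k, d)) / np.sqrt(k)   # scaling makes A nearly an isometry
Y = X @ A.T

# Ratio of each pairwise distance after projection to before:
ratios = pdist(Y) / pdist(X)
print("distance distortion range:", ratios.min(), ratios.max())
```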
Chapter 7 covers models with categorical endogenous variables. It examines the consequences of treating such variables as continuous and how to modify SEMs to take account of categorical variables. It begins with single equation regression-like models for binary, ordinal, and count variables and builds to multiequation models. It includes a polychoric correlation approach, models with exogenous observed variables, the treatment of missing values, and alternative modeling approaches for categorical variables.
This chapter introduces structural equation models (SEMs). It defines SEMs and outlines their history. It also discusses several widespread misunderstandings about SEMs and presents their strengths and weaknesses. Finally, the chapter provides an outline of the remaining book chapters.