In today's high-volume world, almost no websites, compute centers, or call centers consist of just a single server. Instead, a “server farm” is used. A server farm is a collection of servers that work together to handle incoming requests. Each request might be routed to a different server, so that the servers “share” the incoming load. From a practical perspective, server farms are often preferable to a single “super-fast” server because of their low cost (many slow servers are cheaper than a single fast one) and their flexibility (it is easy to increase/decrease capacity as needed by adding/removing servers). These practical features have made server farms ubiquitous.
In this chapter, we study server farms where there is a single queue of requests and where each server, when free, takes the next request off the queue to work on. Specifically, there are no queues at the individual servers. We defer discussion of models with queues at the individual servers to the exercises and later chapters.
The two systems we consider in this chapter are the M/M/k system and the M/M/k/k system. In both, the first “M” indicates that we have memoryless interarrival times, and the second “M” indicates memoryless service times. The third field denotes that k servers share a common pool of arriving jobs. For the M/M/k system, there is no capacity constraint, and this common pool takes the form of an unbounded FCFS queue, as shown later in Figure 14.3, where each server, when free, grabs the job at the head of the queue to work on.
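The M/M/k behavior described above can be sketched in a short simulation. This is a minimal illustration, not part of the chapter's analysis: the rates λ = 3.2, μ = 1, the value k = 4, and the function name `simulate_mmk` are all choices made here for demonstration. The key modeling point, that there is one shared FCFS queue and each job goes to whichever server frees up first, is captured by keeping a min-heap of server free times.

```python
import heapq
import random

def simulate_mmk(lam, mu, k, n_jobs, seed=0):
    """Simulate an M/M/k queue: Poisson arrivals (rate lam),
    Exponential service times (rate mu), and k servers drawing
    jobs from a single unbounded FCFS queue."""
    rng = random.Random(seed)
    # Min-heap of the times at which each of the k servers next becomes free.
    free_at = [0.0] * k
    heapq.heapify(free_at)
    t = 0.0           # arrival time of the current job
    total_wait = 0.0  # accumulated time spent waiting in the queue
    for _ in range(n_jobs):
        t += rng.expovariate(lam)          # memoryless interarrival time
        service = rng.expovariate(mu)      # memoryless service time
        earliest = heapq.heappop(free_at)  # server that frees up first
        start = max(t, earliest)           # job waits only if all k are busy
        total_wait += start - t
        heapq.heappush(free_at, start + service)
    return total_wait / n_jobs  # mean time in queue (excluding service)

# Illustrative parameters (assumed, not from the text):
# k = 4 servers, lam = 3.2, mu = 1, so per-server load is lam/(k*mu) = 0.8.
mean_wait = simulate_mmk(lam=3.2, mu=1.0, k=4, n_jobs=200_000)
```

Because every server pulls from the head of the same queue, jobs begin service in arrival order; contrast this with farms that maintain a separate queue at each server, deferred to the exercises and later chapters.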