In today's high-volume world, almost no websites, compute centers, or call centers consist of just a single server. Instead, a “server farm” is used. A server farm is a collection of servers that work together to handle incoming requests. Each request might be routed to a different server, so that the servers “share” the incoming load. From a practical perspective, server farms are often preferable to a single “super-fast” server because of their low cost (many slow servers are cheaper than a single fast one) and their flexibility (it is easy to increase/decrease capacity as needed by adding/removing servers). These practical features have made server farms ubiquitous.
In this chapter, we study server farms where there is a single queue of requests and where each server, when free, takes the next request off the queue to work on. Specifically, there are no queues at the individual servers. We defer discussion of models with queues at the individual servers to the exercises and later chapters.
The two systems we consider in this chapter are the M/M/k system and the M/M/k/k system. In both, the first “M” indicates that we have memoryless interarrival times, and the second “M” indicates memoryless service times. The third field denotes that k servers share a common pool of arriving jobs. For the M/M/k system, there is no capacity constraint, and this common pool takes the form of an unbounded FCFS queue, as shown later in Figure 14.3, where each server, when free, grabs the job at the head of the queue to work on.
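The M/M/k behavior described above can be sketched in a short simulation. This is a minimal illustration, not part of the chapter's analysis: the rates λ = 3.2, μ = 1, the value k = 4, and the function name `simulate_mmk` are all choices made here for demonstration. The key modeling point, that there is one shared FCFS queue and each job goes to whichever server frees up first, is captured by keeping a min-heap of server free times.

```python
import heapq
import random

def simulate_mmk(lam, mu, k, n_jobs, seed=0):
    """Simulate an M/M/k queue: Poisson arrivals (rate lam),
    Exponential service times (rate mu), and k servers drawing
    jobs from a single unbounded FCFS queue."""
    rng = random.Random(seed)
    # Min-heap of the times at which each of the k servers next becomes free.
    free_at = [0.0] * k
    heapq.heapify(free_at)
    t = 0.0           # arrival time of the current job
    total_wait = 0.0  # accumulated time spent waiting in the queue
    for _ in range(n_jobs):
        t += rng.expovariate(lam)          # memoryless interarrival time
        service = rng.expovariate(mu)      # memoryless service time
        earliest = heapq.heappop(free_at)  # server that frees up first
        start = max(t, earliest)           # job waits only if all k are busy
        total_wait += start - t
        heapq.heappush(free_at, start + service)
    return total_wait / n_jobs  # mean time in queue (excluding service)

# Illustrative parameters (assumed, not from the text):
# k = 4 servers, lam = 3.2, mu = 1, so per-server load is lam/(k*mu) = 0.8.
mean_wait = simulate_mmk(lam=3.2, mu=1.0, k=4, n_jobs=200_000)
```

Because every server pulls from the head of the same queue, jobs begin service in arrival order; contrast this with farms that maintain a separate queue at each server, deferred to the exercises and later chapters.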