PERFORMANCE OF NON-COOPERATIVE ROUTING OVER PARALLEL NON-OBSERVABLE QUEUES

Autonomic computing is emerging as a significant new approach to the design of computer services. Its goal is the development of services that are able to manage themselves with minimal direct human intervention, and, in particular, are able to sense their environment and to tune themselves to meet end-user needs. However, the impact on performance of the interaction between multiple uncoordinated self-optimizing services is not yet well understood. We present some recent results on a non-cooperative load-balancing game which help to better understand the result of this interaction. In this game, users generate jobs of different services, and the jobs have to be processed on one of the servers of a computing platform. Each service has its own dispatcher which probabilistically routes jobs to servers so as to minimize the mean processing cost of its own jobs. We first investigate the impact of heterogeneity in the amount of incoming traffic routed by dispatchers and present a result stating that, for a fixed amount of total incoming traffic, the worst-case overall performance occurs when each dispatcher routes the same amount of traffic. Using this result we then study the so-called Price of Anarchy (PoA), an oft-used worst-case measure of the inefficiency of non-cooperative decentralized architectures. We give explicit bounds on the PoA for cost functions representing the mean delay of jobs when the service discipline is PS or SRPT. These bounds indicate that significant performance degradations can result from the selfish behavior of self-optimizing services. In practice, though, the worst-case scenario may occur rarely, if at all. Some recent results suggest that for the game under consideration the PoA is an overly pessimistic measure that does not reflect the performance obtained in most instances of the problem.


INTRODUCTION
Even small degradations in the performance and availability of modern computer services can have a considerable business impact. These services usually require continuous operation over time, always maintaining the response time below an acceptable threshold, in order to avoid damage to brand reputation, lost revenue and reduced productivity. Yet, modern computer services have reached a level of complexity where the human effort required to get the systems up and running is becoming prohibitively expensive. Autonomic computing has been proposed by IBM [20,30] as an approach for reducing the cost of operating complex computer systems, while at the same time improving their performance and availability. Inspired by the autonomic nervous system of the human body, this approach aims at enabling computer systems to manage themselves with minimal direct human intervention.
In particular, autonomic systems are self-optimizing systems that are able to sense their environment and to tune themselves to meet end-user needs, for example, by dispatching incoming jobs to the best available resources in order to maintain and improve the requested quality of service (QoS) in response to dynamically changing workloads. An interesting example is the approach for smart workload allocation to cloud servers presented in [52,53]. Inspired by the Cognitive Packet Network [13,23] adaptive routing protocol for packet networks, this paper investigates adaptive allocation algorithms that make measurement-based fast online decisions to address QoS. Measurement data are collected by a controller which uses a Random Neural Network (RNN) based [24] decision scheme, or a greedy scheme called "sensible routing" [22] that probabilistically allocates successive tasks to hosts based on a real-time estimate of the one that can give the best QoS. It is shown that when the hosts have significantly different performance characteristics, the autonomic approach comes out clearly better with respect to static allocation schemes (e.g., Round Robin scheme) because of its use of on-line measurement data and of on-line adaptation.
Although the potential of Autonomic Computing to improve the performance and availability of individual computer services is not in doubt, we lack the hindsight necessary to understand the outcome of the interaction of many self-optimizing services. If multiple self-optimizing services share the resources of a computing platform, each one seeking to minimize the mean processing time of its own jobs without any coordination with the others, does this not lead to an anarchic situation in which everyone loses? In other words, is there a significant price to pay in terms of performance for the lack of coordination of autonomic services? The present paper addresses these issues by reviewing some recent results from Game Theory which help to better understand the overall performance resulting from the interaction of uncoordinated self-optimizing services.
Game Theory is a systematic framework to model, analyze, and solve decentralized design problems involving multiple autonomous agents that interact strategically in a rational and intelligent way [4,27,42]. In particular, non-cooperative Game Theory is used to study and understand decentralized algorithms in which the autonomous agents behave "selfishly", that is, each agent makes decisions so as to optimize its own performance, without coordination with the other agents. In the past decade, Game Theory has found applications in areas as diverse as load-balancing in server farms [2,5,9,10,15,18,28], power control and spectrum allocation in wireless networks [14,25,26,35,38,39,43,44,51], congestion control in the Internet [1,21,37,49,55], and decentralized routing in communication networks [4,16,29,31,36,45].
In the present article, we consider a non-cooperative routing game which was originally introduced in the seminal paper of Orda, Rom and Shimkin [45]. In this game, users generate jobs of different services, and the jobs have to be processed on one of the servers of a computing platform. Each service has its own dispatcher which probabilistically routes jobs to servers so as to minimize the mean processing cost of its own jobs. In the following, since there is no central authority for dispatching jobs to servers, this routing scheme will be referred to as a decentralized or non-cooperative load-balancing scheme (we shall use the terms load-balancing and routing interchangeably). In this load-balancing scheme, the optimal routing strategy of a dispatcher depends on the strategies of the others, and the dispatchers are therefore the players of a non-cooperative routing game. We can distinguish two different settings depending on the number of dispatchers. If the number of dispatchers is finite, the game is said to be "atomic" and the standard solution concept is the Nash Equilibrium, that is, a routing strategy from which unilateral deviation does not help any dispatcher in improving the performance perceived by the jobs it routes. When the number of dispatchers grows to infinity (every arriving job is handled by its own dispatcher, which takes its own routing decision), the game is referred to as a "non-atomic" game and the corresponding equilibrium is given by the notion of Wardrop Equilibrium. In this case, the equilibrium point is characterized by the fact that the performance in every (used) server is the same. In the present article we are mostly interested in the "atomic" setting, and we refer to [2,15,28,54] for some related works in the "non-atomic" setting.
The main issue we address in this paper is that of the performance of non-cooperative load-balancing schemes. We first show that there always exists a unique Nash Equilibrium, that is, a routing strategy from which no dispatcher has any incentive to deviate. We then present a result on the worst-case traffic conditions for non-cooperative routing, which states that the worst Nash Equilibrium occurs when the amount of traffic that every dispatcher routes is exactly the same. One immediate consequence is that the routing game under consideration belongs to a particular class of games known as Potential Games [41], which implies in particular that equilibrium performance can be computed as the solution of a standard convex optimization problem. We then compare the performance of the globally optimal routing strategy with that given by the Nash Equilibrium, or in other words, the performance when there is only one dispatcher which routes all the traffic so as to optimize the performance of all jobs, and the performance when there are several dispatchers, each one seeking to optimize the performance of its own jobs. In order to do so, we first look at the Price of Anarchy (PoA) which was introduced by Koutsoupias and Papadimitriou [34]. The PoA is a worst-case measure of the inefficiency of a non-cooperative scheme. It is defined as the ratio between the performance obtained by the worst Nash Equilibrium and the global optimal solution. We present explicit bounds on the PoA for cost functions representing the mean delay of jobs when the service discipline is PS or SRPT. These bounds indicate that as the number of dispatchers increases, the loss of efficiency may grow unboundedly, implying that the "selfish" behavior of uncoordinated self-optimizing services can lead to significant performance degradations. In practice, though, the worst-case scenario may occur rarely, if at all. 
We review some recent results suggesting that for the game under consideration the PoA is an overly pessimistic measure that does not reflect the performance obtained in most instances of the problem.
The rest of the paper is organized as follows. In Section 2, we present the non-cooperative load-balancing game under consideration. We then study the worst-case traffic conditions for non-cooperative routing in Section 3. In Section 4, we explain how bounds on the PoA are derived for cost functions representing the mean delay of jobs when the service discipline is PS or SRPT. Section 5 is devoted to the analysis of a new measure, called the inefficiency, for the comparison of the non-cooperative and centralized load-balancing schemes. Finally, some conclusions are drawn in Section 6.

NON-COOPERATIVE LOAD-BALANCING GAME
We consider $K$ computer services sharing the computing resources of $S$ servers. Users generate jobs of each service, which have to be processed on one of the servers. We assume that each service has its own job dispatcher which selects the best available server on which to process each job so as to optimize the performance of its own individual jobs. In the following, we let $\mathcal{C} = \{1, \ldots, K\}$ be the set of dispatchers and $\mathcal{S} = \{1, \ldots, S\}$ be the set of servers (see Figure 1).

Figure 1.
Non-cooperative load balancing: each dispatcher controls a portion of the total traffic intensity and probabilistically routes its jobs so as to minimize their own mean processing cost.
Jobs arrive to the system according to independent Poisson processes and those received by dispatcher $i$ are said to be jobs of class $i$. They have class-independent generally distributed service times. In the following, we let $\lambda_i$ be the traffic intensity of class $i$, and $\bar\lambda = \sum_{i \in \mathcal{C}} \lambda_i$ be the total traffic intensity.
Server $j \in \mathcal{S}$ has capacity $r_j$, and a holding cost $c_j$ per job is incurred for each job sent to this server. It will be assumed throughout the paper that the total capacity of the system $\bar r = \sum_{n \in \mathcal{S}} r_n$ is such that $\bar\lambda < \bar r$, which is the necessary and sufficient condition to guarantee the stability of the system.
Dispatcher $i$ uses a Bernoulli routing policy, that is, it probabilistically routes arrivals to servers so as to optimize the performance of its own jobs (routing policies where a dispatcher has the memory of its previous routing decisions have also been considered in [6,7,8]). This model follows from the assumption that routing decisions are made without observing the queue lengths and that the dispatcher reacts to periodic performance measurements obtained from each server with the goal of minimizing the processing cost of its own jobs. Let $x_i = (x_{i,j})_{j \in \mathcal{S}}$ denote the routing strategy of dispatcher $i$, with $x_{i,j}$ being the amount of traffic it sends towards server $j$. Let $\mathcal{X}_i = \{x_i \in \mathbb{R}_+^S : \sum_{j \in \mathcal{S}} x_{i,j} = \lambda_i\}$ denote the set of feasible routing strategies for dispatcher $i$. Given a vector $x \in \mathcal{X} = \prod_{i \in \mathcal{C}} \mathcal{X}_i$, we let $y_j = \sum_{i \in \mathcal{C}} x_{i,j}$ be the total flow on server $j$ and $\rho_j = y_j / r_j$ be the utilization rate of that server. We assume that the mean response time of server $j$ depends only on its utilization rate and has the form $\phi(\rho_j)/r_j$, where $\phi$ is a strictly increasing function of the load $\rho_j$ (see below for some examples). Note that, from Little's law, $\frac{x_{i,j}}{r_j} \phi(\rho_j)$ is the mean number of class-$i$ jobs on server $j$, so that $c_j \frac{x_{i,j}}{r_j} \phi(\rho_j)$ represents the mean cost to be paid by dispatcher $i$ for sending jobs to server $j$ at rate $x_{i,j}$.
Dispatcher $i$ seeks to minimize its total cost $T_i(x)$ for processing jobs, which is assumed to be the sum of the costs incurred on all the servers. This optimization problem can be formulated as follows:
$$\min_{x_i \in \mathcal{X}_i} \; T_i(x) = \sum_{j \in \mathcal{S}} c_j \frac{x_{i,j}}{r_j} \phi(\rho_j).$$
In particular, when the holding cost is the same in every server, every dispatcher independently seeks to minimize the mean response time of its own jobs. Since there is no coordination between the dispatchers, which are in competition for the capacities of the servers, we shall refer to this routing scheme as a non-cooperative routing scheme. Note that the cost incurred by class $i$ on server $j$ depends not only on its amount of flow $x_{i,j}$ on that server, but also on the total amount of flow $y_j$ through that server, which determines the server performance. Hence, the optimal routing strategy of dispatcher $i$ depends on the routing strategies of other dispatchers, which means that the dispatchers are involved in a non-cooperative routing game. A Nash equilibrium of this routing game is a point $x \in \mathcal{X}$ from which no class finds it beneficial to deviate unilaterally, that is, $x \in \mathcal{X}$ is a Nash equilibrium point (NEP) if and only if
$$T_i(x) = \min_{z \in \mathcal{X}_i} T_i(x_1, \ldots, x_{i-1}, z, x_{i+1}, \ldots, x_K), \quad \forall i \in \mathcal{C}.$$
The existence of a unique NEP can be established under some assumptions on the function $\phi$ (see Theorem 2.1 in [45]). More precisely, we shall assume that $\phi : [0, 1) \to [1, \infty)$ is a continuously differentiable, strictly increasing and convex function whose second derivative $\phi''$ exists, and such that $\phi(0) = 1$ and $\lim_{\rho \to 1^-} \phi(\rho) = +\infty$. Typical examples are the M/G/1/PS delay function $\phi(\rho) = 1/(1-\rho)$ and the M/G/1/FCFS delay function $\phi(\rho) = 1 + \frac{\rho (1 + c_b^2)}{2 (1-\rho)}$ when all job classes have the same squared coefficient of variation $c_b^2$ of job sizes. Another interesting example is the delay function of the M/Pareto/1/SRPT queue in heavy traffic, which is given by $\phi(\rho) = 1/(1-\rho)^m$, where $m$ depends on the shape parameter of the Pareto distribution [15]. The main issue addressed in the present paper is that of the performance guarantees that can be obtained for non-cooperative routing schemes.
To this end, we compare the performance at the Nash equilibrium with that of a globally optimal routing strategy. An optimal strategy is that of a centralized routing scheme, with a single dispatcher controlling all the traffic (i.e., $K = 1$ and $\lambda_1 = \bar\lambda$) and routing jobs so as to optimize the performance of all jobs. An optimal routing strategy is given by $x^*_{i,j} = \frac{\lambda_i}{\bar\lambda} y^*_j$, where the $y^*_j$ are the optimal solution of the following optimization problem:
$$\min_{y} \; \sum_{j \in \mathcal{S}} c_j \frac{y_j}{r_j} \phi\!\left(\frac{y_j}{r_j}\right) \quad \text{subject to} \quad \sum_{j \in \mathcal{S}} y_j = \bar\lambda, \quad y_j \ge 0, \; j \in \mathcal{S}.$$
The quantity $D(x) = \sum_{j \in \mathcal{S}} c_j \rho_j \phi(\rho_j)$ represents the mean processing cost of all jobs in the routing strategy $x \in \mathcal{X}$, and is known as the social cost of this routing strategy. In the following, the optimal value $D(x^*)$ of the social cost will be denoted by $D_1(\bar\lambda, r, c)$ in order to make explicit its dependence on the total traffic intensity $\bar\lambda$ and on the vectors $r = (r_j)_{j \in \mathcal{S}}$ and $c = (c_j)_{j \in \mathcal{S}}$ of server capacities and costs. When there are $K > 1$ dispatchers, the social cost $D_K(\lambda, r, c) = D(x)$ at the NEP $x$ corresponds to the sum of individual player costs, that is, $D_K(\lambda, r, c) = \sum_{i \in \mathcal{C}} T_i(x)$, and it depends in addition on the traffic vector $\lambda$, that is, on the precise amount of traffic controlled by each dispatcher. In order to better understand the performance degradation resulting from the selfish behavior of the dispatchers, we shall compare the social costs of the decentralized and centralized routing schemes. We start by studying the worst-case traffic conditions for non-cooperative routing.
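As an illustration of the centralized problem, the following minimal sketch computes the optimal loads for PS servers, assuming $\phi(\rho) = 1/(1-\rho)$, so that the social cost is $\sum_j c_j y_j/(r_j - y_j)$. The function name `centralized_ps` and the bisection on the Lagrange multiplier are choices made here for illustration; they are not taken from the cited works.

```python
import math

def centralized_ps(lam, r, c, iters=200):
    """Optimal loads y_j and social cost for PS servers (phi(rho) = 1/(1-rho)).

    Stationarity of sum_j c_j*y_j/(r_j - y_j) gives c_j*r_j/(r_j - y_j)**2 = mu
    on every used server, hence y_j(mu) = max(0, r_j - sqrt(c_j*r_j/mu)).
    The multiplier mu is found by bisection so that the loads sum to lam.
    """
    assert 0.0 < lam < sum(r), "stability requires total traffic below total capacity"

    def loads(mu):
        return [max(0.0, rj - math.sqrt(cj * rj / mu)) for rj, cj in zip(r, c)]

    lo = min(cj / rj for rj, cj in zip(r, c))  # at this multiplier no server is used
    hi = 2.0 * lo
    while sum(loads(hi)) < lam:                # grow until the total load reaches lam
        hi *= 2.0
    for _ in range(iters):                     # bisection on the multiplier
        mid = 0.5 * (lo + hi)
        if sum(loads(mid)) < lam:
            lo = mid
        else:
            hi = mid
    y = loads(hi)
    cost = sum(cj * yj / (rj - yj) for rj, cj, yj in zip(r, c, y))
    return y, cost

# Hypothetical example: one fast server (capacity 4) and one slow server (capacity 1).
y_low, cost_low = centralized_ps(1.0, [4.0, 1.0], [1.0, 1.0])
y_high, cost_high = centralized_ps(3.0, [4.0, 1.0], [1.0, 1.0])
print(y_low, cost_low)
print(y_high, cost_high)
```

In this example the slow server is left idle at low traffic and is only used once the marginal social cost of the fast server exceeds that of an empty slow server.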
Remark 1: Apart from the performance of non-cooperative routing schemes, another important issue is related to the convergence to the Nash equilibrium. Do uncoordinated routing agents converge to a Nash equilibrium, and, if so, how long do they need? This issue is not addressed in this paper, but interested readers may refer to, for example, [3,12,17,40,45].

WORST-CASE TRAFFIC CONDITIONS
As noted in Section 2, the performance of the non-cooperative routing scheme depends on the precise amount of traffic controlled by each of the K dispatchers. In general, the evaluation of the social cost at the Nash equilibrium for an arbitrary traffic vector λ is difficult. It turns out that this evaluation becomes simpler under the worst-case traffic conditions. In this section, we present a result describing the traffic conditions under which the worst-case performance is obtained and discuss some implications of this result.
We start by comparing the optimality conditions for the decentralized and centralized settings. Let $\lambda$ be a traffic vector and consider the associated NEP $x$. Denote by $y$ and $\rho$ the vectors of offered traffics and utilization rates at that NEP, respectively. In the non-cooperative routing scheme, each and every dispatcher $i$ minimizes its private cost $T_i(x)$. According to the Karush-Kuhn-Tucker (KKT) optimality conditions, this implies that at the NEP $x$ each player $i$ sends a positive amount of traffic only to those servers having a minimal marginal private cost for that player. Formally, it means that there exist multipliers $\mu_1, \mu_2, \ldots, \mu_K$ such that
$$\frac{\partial T_i}{\partial x_{i,j}}(x) = \frac{c_j}{r_j} \left[ \phi(\rho_j) + \frac{x_{i,j}}{r_j} \phi'(\rho_j) \right] \ge \mu_i, \quad \forall i \in \mathcal{C}, \ \forall j \in \mathcal{S},$$
with equality if $x_{i,j} > 0$. This is in contrast to the optimality conditions for the centralized routing scheme, which state that at an optimal routing solution $x^*$ only those servers having a minimal marginal social cost receive a positive amount of traffic. This implies that there exists a multiplier $\mu^*$ such that at point $x^*$
$$\frac{\partial D}{\partial y_j}(x^*) = \frac{c_j}{r_j} \left[ \phi(\rho^*_j) + \rho^*_j \phi'(\rho^*_j) \right] \ge \mu^*, \quad \forall j \in \mathcal{S},$$
with equality if $y^*_j > 0$. The latter condition is not necessarily satisfied at the NEP. In particular, it is proven in [11] that the lower the cost per unit capacity of a server, the greater is its marginal social cost at the NEP, that is, for any two servers $m$ and $n$,
$$\frac{c_n}{r_n} \le \frac{c_m}{r_m} \;\Longrightarrow\; \frac{\partial D}{\partial y_n}(x) \ge \frac{\partial D}{\partial y_m}(x), \qquad (3)$$
with strict inequality if $y_n > y_m$. Hence, it is possible that at the NEP two servers with different marginal social costs receive a positive amount of flow, which contradicts the optimality conditions of the social cost. Property (3) of the NEP has been used in [11] to prove that the worst-case performance is obtained when all dispatchers control the same amount of traffic. This result is illustrated in Figure 2 in the case of $K = 2$ classes, where we plot the ratio $D_K(\lambda, r, c)/D_1(\bar\lambda, r, c)$ as a function of the amount of traffic $\lambda_2$ controlled by dispatcher 2. When $\lambda_2$ is $0$ or $\bar\lambda$, the ratio of social costs is 1, implying that the non-cooperative routing scheme is optimal. The worst performance of this routing scheme is achieved for the symmetric game, that is, when $\lambda_1 = \lambda_2 = \bar\lambda/2$.
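The effect of traffic heterogeneity can be explored numerically. The sketch below is an illustration under stated assumptions, not the method of [11]: it assumes PS servers ($\phi(\rho) = 1/(1-\rho)$) and approximates the NEP by round-robin best-response iteration, whose convergence is not guaranteed in general. The helper names `waterfill`, `nash_social_cost` and `cost_ratio` are hypothetical. With PS servers, the private cost of a dispatcher facing residual capacities $a_j = r_j - (\text{flow of the others})$ is $\sum_j c_j x_j/(a_j - x_j)$, so each best response is a water-filling problem.

```python
import math

def waterfill(lam, a, c, iters=100):
    """Best response: minimize sum_j c_j*x_j/(a_j - x_j) s.t. sum_j x_j = lam, x_j >= 0."""
    if lam <= 0.0:
        return [0.0] * len(a)

    def flows(mu):
        # Stationarity c_j*a_j/(a_j - x_j)**2 = mu gives x_j = a_j - sqrt(c_j*a_j/mu).
        return [max(0.0, aj - math.sqrt(cj * aj / mu)) for aj, cj in zip(a, c)]

    lo = min(cj / aj for aj, cj in zip(a, c))  # no server is used at this multiplier
    hi = 2.0 * lo
    while sum(flows(hi)) < lam:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(flows(mid)) < lam:
            lo = mid
        else:
            hi = mid
    return flows(hi)

def nash_social_cost(lams, r, c, sweeps=200):
    """Social cost after repeated best responses (an approximation of the NEP)."""
    K, S = len(lams), len(r)
    total_capacity = sum(r)
    # Start from a feasible split proportional to the server capacities.
    x = [[lam * rj / total_capacity for rj in r] for lam in lams]
    for _ in range(sweeps):
        for i in range(K):
            others = [sum(x[k][j] for k in range(K) if k != i) for j in range(S)]
            residual = [rj - zj for rj, zj in zip(r, others)]
            x[i] = waterfill(lams[i], residual, c)
    y = [sum(x[i][j] for i in range(K)) for j in range(S)]
    return sum(cj * yj / (rj - yj) for rj, cj, yj in zip(r, c, y))

def cost_ratio(lams, r, c):
    """Ratio of the decentralized social cost to the single-dispatcher optimum."""
    return nash_social_cost(lams, r, c) / nash_social_cost([sum(lams)], r, c)

# Hypothetical example: two servers, total traffic 2.2 split between two dispatchers.
r, c = [4.0, 1.0], [1.0, 1.0]
for lam2 in (0.0, 0.4, 1.1):
    print(lam2, round(cost_ratio([2.2 - lam2, lam2], r, c), 5))
```

In this small instance the ratio equals 1 when one dispatcher controls all the traffic and is largest for the symmetric split, in line with the statement above.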

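Theorem 1 ([11]): Let $\lambda^{=} = \frac{\bar\lambda}{K} e$ denote the symmetric traffic vector, in which every dispatcher controls the same amount of traffic $\bar\lambda/K$. Then $D_K(\lambda, r, c) \le D_K(\lambda^{=}, r, c)$ for every traffic vector $\lambda$ such that $\sum_{i \in \mathcal{C}} \lambda_i = \bar\lambda$.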
The key idea of the proof of Theorem 1 is to show that, starting from an arbitrary traffic vector $\lambda$, the symmetric traffic vector $\lambda^{=}$ (in which every dispatcher controls $\bar\lambda/K$) can be reached in a finite number of steps $R$ by a sequence $\{\lambda^n\}_{n \ge 0}$ such that $\lambda^0 = \lambda$ and $D_K(\lambda^{n+1}) \ge D_K(\lambda^n)$. Such a sequence is obtained by considering a certain transformation $\lambda \to \hat\lambda$ of the traffic vector, which amounts to transferring traffic from the most loaded dispatchers to the least loaded ones, thus reducing the heterogeneity of the traffic vector. It is shown in [11] that the load on the most attractive servers (those with the smallest cost per unit capacity) cannot decrease under this transformation. Since, according to (3), those servers are precisely those with the greatest marginal social cost, the convexity of the social cost $D_K(\cdot)$ implies that it cannot be reduced under the transformation, so that $D_K(\lambda, r, c) = D_K(\lambda^0, r, c) \le D_K(\lambda^1, r, c) \le \cdots \le D_K(\lambda^R, r, c) = D_K(\lambda^{=}, r, c)$.
An important consequence of Theorem 1 is that, for the worst-case analysis of noncooperative routing, we can restrict ourselves to the symmetric game. It is well known that in this case the non-cooperative routing game is a potential game [41] (see e.g. Theorem 4.1 in [16]). In other words, although each and every dispatcher independently optimizes its own cost function, they collectively solve a standard convex optimization problem. This is formally stated in Proposition 1.

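Proposition 1 ([11]): If the vector $\rho$ is a global optimum of the following convex optimization problem
$$\min_{\rho} \; \sum_{j \in \mathcal{S}} c_j \left[ \frac{1}{K} \rho_j \phi(\rho_j) + \left(1 - \frac{1}{K}\right) \int_0^{\rho_j} \phi(z) \, dz \right] \quad \text{subject to} \quad \sum_{j \in \mathcal{S}} r_j \rho_j = \bar\lambda, \quad 0 \le \rho_j < 1, \; j \in \mathcal{S},$$
then the routing strategy $x$ such that $x_{i,j} = \frac{r_j \rho_j}{K}$, $\forall i \in \mathcal{C}$, $\forall j \in \mathcal{S}$, is the NEP of the symmetric game.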
Note that when K = 1, the above problem reduces to the global optimization problem solved by the centralized scheme. When K → ∞, the equivalent problem states the common function that is jointly optimized by an infinite number of players and is characteristic of the Wardrop equilibrium. As we shall see in the following, the fact that the worst-case analysis of the non-cooperative routing scheme reduces to the analysis of a convex optimization problem considerably simplifies the comparison with the centralized routing scheme.
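The reduction of the symmetric game to a convex problem lends itself to a simple numerical treatment. The sketch below assumes that the stationarity condition of the equivalent problem reads $\frac{c_j}{r_j}\left[\phi(\rho_j) + \frac{\rho_j}{K}\phi'(\rho_j)\right] = \mu$ on every used server, which is the symmetric form of the KKT conditions of the dispatchers; it solves this condition by nested bisection. The function names and the two-server example are assumptions made for illustration.

```python
def symmetric_nep(lam, r, c, K, phi, dphi, iters=100):
    """Utilization vector rho at the NEP of the symmetric game with K dispatchers.

    phi and dphi are the delay function and its derivative on [0, 1).
    With K = 1 this returns the centralized optimum.
    """
    S = len(r)
    top = 1.0 - 1e-12  # stay strictly below rho = 1, where phi blows up

    def marginal(j, rho):
        return c[j] / r[j] * (phi(rho) + rho / K * dphi(rho))

    def rho_of(j, mu):
        # Utilization of server j for a given multiplier (0 if the server is unused).
        if marginal(j, 0.0) >= mu:
            return 0.0
        lo, hi = 0.0, top
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if marginal(j, mid) < mu:
                lo = mid
            else:
                hi = mid
        return lo

    def load(mu):
        return sum(r[j] * rho_of(j, mu) for j in range(S))

    mu_lo = min(marginal(j, 0.0) for j in range(S))
    mu_hi = 2.0 * mu_lo
    while load(mu_hi) < lam:
        mu_hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (mu_lo + mu_hi)
        if load(mid) < lam:
            mu_lo = mid
        else:
            mu_hi = mid
    return [rho_of(j, mu_hi) for j in range(S)]

def social_cost(rho, c, phi):
    """Social cost D = sum_j c_j * rho_j * phi(rho_j)."""
    return sum(cj * rj * phi(rj) for cj, rj in zip(c, rho))

# Hypothetical example with PS servers: phi(rho) = 1/(1-rho).
phi = lambda rho: 1.0 / (1.0 - rho)
dphi = lambda rho: 1.0 / (1.0 - rho) ** 2
r, c, lam = [4.0, 1.0], [1.0, 1.0], 2.2
rho_opt = symmetric_nep(lam, r, c, 1, phi, dphi)
rho_ne = symmetric_nep(lam, r, c, 2, phi, dphi)
print(rho_opt, rho_ne)
print(social_cost(rho_ne, c, phi) / social_cost(rho_opt, c, phi))
```

In this example the two symmetric dispatchers leave the slow server idle while the centralized scheme already uses it, which is the mechanism behind the loss of efficiency.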

WORST-CASE PERFORMANCE ANALYSIS
In this section, we compare the performance of the global optimum with that given by the Nash equilibrium, or in other words, the performance when there is only one dispatcher which routes all the traffic, and the performance when there are several dispatchers, each one seeking to optimize its own performance. In order to do so we look at the PoA, which was introduced by Koutsoupias and Papadimitriou [34]. The PoA is an oft-used measure of the inefficiency of a decentralized scheme, which for our model is defined as
$$PoA(K, S) = \sup_{\lambda, r, c} \frac{D_K(\lambda, r, c)}{D_1(\bar\lambda, r, c)}.$$
Note that the PoA lies in the interval $[1, \infty)$. We have seen in Theorem 1 that the worst-case performance is obtained when all dispatchers control the same amount of traffic. Therefore,
$$PoA(K, S) = \sup_{\bar\lambda, r, c} \frac{D_K(\lambda^{=}, r, c)}{D_1(\bar\lambda, r, c)},$$
where $\lambda^{=}$ is the symmetric traffic vector in which every dispatcher controls $\bar\lambda/K$.
Several recent works have shown that non-cooperative load-balancing can be very inefficient in the presence of non-linear delay functions; see, for example, [9,10,15,28]. The PoA has been analyzed both in the so-called non-atomic scenario where every arriving job can select the server in which it will be served, and in the atomic scenario considered in this paper where each player controls a non-negligible amount of flow. For the non-atomic scenario, Haviv and Roughgarden have shown in [28] that the PoA corresponds to the number of servers, implying that, in a server farm with $S$ servers, the mean response time of jobs can be as high as $S$ times the optimal one! For the atomic scenario, we present below bounds on the PoA obtained in the case of M/G/1/PS queues [9], and in the case of M/Pareto/1/SRPT queues [11], which prove that the PoA can grow unboundedly with the number of dispatchers $K$. The fact that the Nash equilibrium can be very inefficient has paved the way to a large body of research on mechanism design, which aims at architecting Nash equilibria so that they are efficient with respect to the centralized setting [32,33,46].

Bounds on the PoA for M/G/1/PS queues
Let us first assume that the servers on which jobs are processed are modeled as M/G/1/PS queues, in which case we have $\phi(\rho) = 1/(1-\rho)$. Note that since we have assumed Poisson arrivals of jobs at the dispatchers and Bernoulli routing, the assumption of Poisson arrivals at the servers is consistent. The following lower and upper bounds on the PoA have been established in [9].
Theorem 2 ([9]): For a system with two or more servers, $PoA(K, S) \ge \frac{K}{2\sqrt{K} - 1}$, and the PoA is bounded from above by a quantity of the same order $\sqrt{K}$.
This result states that the PoA is of the order of $\sqrt{K}$ independently of the number of servers. In other words, in the worst-case scenario, the performance degradation with $K$ self-optimizing services can be of order $\sqrt{K}$. The proof of the upper bound is based on the observation that for $\phi(\rho) = 1/(1-\rho)$ the potential function of Proposition 1 takes a simple form, from which an explicit solution of the symmetric game can be obtained. This explicit solution is used in [9] to establish that:
• The distributed scheme with $K$ dispatchers uses only a subset of the servers used by the centralized scheme. In other words, if $\mathcal{S}^*(K)$ denotes the set of servers used at the NEP of the symmetric game with $K$ dispatchers, we have $\mathcal{S}^*(K) \subset \mathcal{S}^*(1)$.
• The offered traffic $y_j(K)$ in equilibrium in the $K$-player symmetric game satisfies an explicit relation that holds for every $j \in \mathcal{S}^*(1)$ (see [9]).
The upper bound on the PoA is then obtained by combining these two properties to bound $D_K(\lambda^{=}, r, c)$ in terms of $D_1(\bar\lambda, r, c)$.
The lower bound is established in [9] by exhibiting an example of a symmetric game for which $D_K(\lambda^{=}, r, c) \ge \frac{K}{2\sqrt{K} - 1} D_1(\bar\lambda, r, c)$. The procedure for obtaining this example is as follows: using the KKT conditions, choose the parameters $\bar\lambda$, $r$ and $c$ such that for $K > 1$ only the least costly server is used, whereas for $K = 1$ more than one server is used. For such a symmetric game, by splitting the traffic over several servers, the centralized setting does much better than the decentralized one.
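To get a feel for how quickly the lower-bound factor $K/(2\sqrt{K} - 1)$ grows, the short sketch below tabulates it next to $\sqrt{K}/2$, which it exceeds for every $K \ge 1$ because $2\sqrt{K} - 1 < 2\sqrt{K}$. The function name `poa_lower_bound` is chosen here for illustration.

```python
import math

def poa_lower_bound(K):
    """Lower-bound factor K / (2*sqrt(K) - 1) for PS servers with K dispatchers."""
    return K / (2.0 * math.sqrt(K) - 1.0)

# Tabulate the factor next to sqrt(K)/2 for a few numbers of dispatchers.
for K in (1, 2, 4, 16, 100, 10000):
    print(K, round(poa_lower_bound(K), 3), round(math.sqrt(K) / 2.0, 3))
```

The factor equals 1 for a single dispatcher, as it should, and behaves like $\sqrt{K}/2$ for large $K$.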

Bounds on the PoA for M/Pareto/1/SRPT queues
Let us now assume that job sizes follow a class-independent power law probability distribution, and that the servers use the shortest-remaining-processing-time (SRPT) scheduling discipline [48]. This scheduling discipline, which preemptively runs the job with the shortest remaining processing requirement, is known to be optimal with respect to mean response time [47,50]. Although no simple closed-form formula is known for the mean response time of M/Pareto/1/SRPT queues, it is shown in [15] that in heavy-traffic conditions the mean delay has the form $\phi(\rho) = 1/(1-\rho)^m$, where $m$ depends on the shape parameter of the Pareto distribution.
One of the main difficulties in the analysis of the PoA in this case is that, in contrast to the case of M/G/1/PS queues, a closed-form solution of the optimization problem stated in Proposition 1 cannot be obtained. This turns out to be a major obstacle for the derivation of an upper bound on the PoA. Nevertheless, a lower bound can be obtained by following the same procedure as that used for M/G/1/PS queues.
Note that this lower bound on the PoA is independent of $r$ and $c$. It proves that, for M/Pareto/1/SRPT queues, the PoA is at least of order $K^{m/(m+1)}$ as $K$ grows. Hence, as the number of dispatchers $K \to \infty$, the performance degradation with respect to the centralized setting tends to infinity. Although we do not have an upper bound for the PoA as in the M/G/1/PS case, it is conjectured in [11] that the lower bounds constructed using the above procedure give the right order of the PoA, just as was proved for the case of M/G/1/PS delay functions.

IS THE POA THE RIGHT MEASURE OF INEFFICIENCY?
In Section 4, the lower bounds on the PoA have been established by computing the system parameters in such a way that in the decentralized setting only the least costly server is used, whereas more than one server is used in the centralized setting. Similarly, in the nonatomic scenario considered in [28], the worst-case architecture has one server whose capacity is much larger (tending to infinity) compared to that of the other servers. It is doubtful that such asymmetries will occur in data-centers where processors are more than likely to have similar characteristics. This suggests that the worst-case analysis of the inefficiency of selfish routing is overly pessimistic and that high PoAs are obtained in pathological instances that hardly occur in practice.
In [19], the authors adopt this point of view. Assuming that the holding cost is the same in every server (that is, $c = e$), which is equivalent to assuming that each dispatcher seeks to minimize the mean processing time of its own jobs, they propose a new measure, called the inefficiency, to compare the non-cooperative routing scheme with $K$ dispatchers and $S$ servers and the centralized one. This new measure is defined as follows:
$$I_K^S(r) = \sup_{\lambda \,:\, \bar\lambda < \bar r} \frac{D_K(\lambda, r)}{D_1(\bar\lambda, r)}.$$
The rationale for this definition is that in practice the system administrator controls neither the total incoming traffic nor how it is split between the dispatchers, whereas the number of servers and their capacities are fixed. Therefore, it makes sense to consider a fixed data-center architecture under the worst traffic conditions for the inefficiency of selfish routing (provided the system is stable). As is true of the PoA, the inefficiency can take values between 1 and $\infty$. A higher value of inefficiency indicates a worse performance of selfish routing compared to centralized routing. As opposed to the PoA, the inefficiency depends on the parameters of the architecture (the server speeds and the number of servers in our case). By calculating the worst possible inefficiency, one retrieves the PoA, that is, $PoA(K, S) = \sup_r I_K^S(r)$. It follows from Theorem 1 that
$$I_K^S(r) = \sup_{\bar\lambda < \bar r} \frac{D_K\!\left(\frac{\bar\lambda}{K} e, r\right)}{D_1(\bar\lambda, r)},$$
so that, as for the PoA, the inefficiency depends only on the total traffic intensity and not on the individual traffic flows to each of the dispatchers. The setting considered in [19] is similar to the one introduced in Section 2, but the authors restrict themselves to the case of two classes of servers, which are modeled as M/G/1/PS queues: there are $S_1$ "fast" servers of capacity $r_1$, and $S_2 = S - S_1$ "slow" servers, each one of capacity $r_2 < r_1$. Let $\beta = r_1/r_2 \ge 1$ be the ratio of server capacities, $\alpha = S_1/S_2 > 0$ be the ratio of the numbers of servers of each type, and define
$$\bar\lambda^{OPT} = S_1 r_1 \left(1 - \frac{1}{\sqrt{\beta}}\right) \quad \text{and} \quad \bar\lambda^{NE} = S_1 r_1 \left(1 - \frac{2}{\sqrt{(K-1)^2 + 4K\beta} - (K-1)}\right).$$
The following lemma gives the conditions on $\bar\lambda$ under which the centralized setting and the decentralized one use only the fast class of servers, or both classes, and describes the evolution of the ratio $D_K\!\left(\frac{\bar\lambda}{K} e, r\right)/D_1(\bar\lambda, r)$.

Lemma 1 ( [19]):
1. If $\bar\lambda \le \bar\lambda^{OPT}$, both settings use only the "fast" servers, and the ratio of social costs is equal to 1.
2. If $\bar\lambda^{OPT} \le \bar\lambda \le \bar\lambda^{NE}$, the decentralized setting uses only the "fast" servers, while the centralized one uses all servers, and the ratio of social costs is strictly increasing.
3. If $\bar\lambda^{NE} < \bar\lambda < \bar r$, both settings use all servers, and the ratio of social costs is strictly decreasing.
This behavior of the ratio of social costs is illustrated for K = 2 and K = 5 in Figure 3 in the case of a server farm with S 1 = 100 fast servers of capacity r 1 = 100, and S 2 = 300 slow servers of capacity r 2 = 10.
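The two traffic thresholds that separate the three regimes of Lemma 1 can be evaluated for this example. The sketch below does not quote the expressions of [19]; it derives the thresholds from the KKT conditions under the assumptions of PS servers and equal holding costs. The centralized scheme starts using the slow servers when the fast-server utilization $\rho$ satisfies $\beta(1-\rho)^2 = 1$, and the symmetric game with $K$ dispatchers when $\beta(1-\rho)^2 = 1 - (1 - 1/K)\rho$. The function name `traffic_thresholds` is hypothetical.

```python
import math

def traffic_thresholds(S1, r1, r2, K):
    """Total traffic at which each scheme starts using the slow servers.

    Returns (lam_opt, lam_ne) for S1 fast PS servers of capacity r1 and slow
    servers of capacity r2 < r1, with K symmetric dispatchers.
    """
    beta = r1 / r2
    # Centralized: marginal social cost of a fast server equals 1/r2.
    rho_opt = 1.0 - 1.0 / math.sqrt(beta)
    # Symmetric NEP: smaller root of beta*(1-rho)^2 = 1 - (1 - 1/K)*rho.
    rho_ne = 1.0 - 2.0 / (math.sqrt((K - 1) ** 2 + 4 * K * beta) - (K - 1))
    return S1 * r1 * rho_opt, S1 * r1 * rho_ne

# Server farm of the example: 100 fast servers (capacity 100), slow servers of capacity 10.
for K in (2, 5):
    lam_opt, lam_ne = traffic_thresholds(100, 100.0, 10.0, K)
    print(K, round(lam_opt, 1), round(lam_ne, 1))
```

Between the two thresholds the dispatchers still crowd the fast servers although the centralized scheme already spreads part of the load over the slow ones, which is where the ratio of social costs increases.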
Since the ratio $D_K\!\left(\frac{\bar\lambda}{K} e, r\right)/D_1(\bar\lambda, r)$ is a continuous function of $\bar\lambda$ over the interval $[0, \bar r)$, a direct consequence of Lemma 1 is the following theorem.

Theorem 3 ( [19]):
The inefficiency is worst when the total arriving traffic intensity equals $\bar\lambda^{NE}$, namely,
$$I_K^S(r) = \frac{D_K\!\left(\frac{\bar\lambda^{NE}}{K} e, r\right)}{D_1(\bar\lambda^{NE}, r)}.$$
Theorem 3 fully characterizes the worst-case traffic conditions for a server farm with two classes of servers. It states that the worst inefficiency of the decentralized setting is achieved when (a) each dispatcher controls the same amount of traffic and (b) the total traffic intensity is such that the decentralized setting only starts using the slow servers.
The analysis of the symmetric game obtained for $\bar\lambda = \bar\lambda^{NE}$ can be done with Proposition 1. This yields an explicit expression for the inefficiency of selfish routing for data-centers with two classes of PS servers [19], in terms of $K$ and of the ratios $\beta = r_1/r_2 \ge 1$ and $\alpha = S_1/S_2 > 0$. Note that the inefficiency $I_K^S(r)$ does not depend on the total number of servers $S$, but only on the ratio of server capacities and on the ratio of the numbers of servers of each type. In Figure 4, we plot the inefficiency $I_K^S(r)$ of the non-cooperative routing scheme with $K = 5$ dispatchers and $S = 1000$ servers as the parameters $\alpha$ and $\beta$ change from $\frac{1}{S-1}$ to 2 and from 1 to 1,000, respectively. It can be observed that even for unbalanced scenarios ($\alpha$ small and $\beta$ large), the inefficiency is always fairly close to 1, indicating that, even in the worst-case traffic conditions, the gap between the NEP and the optimal routing solution is not significant.
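The inefficiency for two classes of PS servers can also be evaluated numerically. The following sketch is a derivation under stated assumptions rather than the closed form of [19]. It takes the worst-case traffic to be the point where the symmetric NEP just starts using the slow servers, at which the NEP cost is $S_1 \rho/(1-\rho)$ on the fast servers only. For the centralized cost it uses the water-filling value $(\sum_j \sqrt{r_j})^2/(\bar r - \bar\lambda) - S$, which holds when all servers are used. Since the ratio depends only on $\alpha$ and $\beta$, the code normalizes $S_2 = 1$ and $r_2 = 1$. The function name is hypothetical.

```python
import math

def inefficiency_two_classes(alpha, beta, K):
    """Ratio D_K / D_1 at the traffic where the symmetric NEP starts using slow servers.

    alpha = S1/S2 > 0, beta = r1/r2 >= 1, K = number of dispatchers (PS servers,
    equal holding costs). Normalization: S2 = 1 and r2 = 1, so S1 = alpha, r1 = beta.
    """
    if beta == 1.0:
        return 1.0  # identical servers: both schemes balance the load evenly
    # Fast-server utilization at the threshold of the symmetric game.
    rho = 1.0 - 2.0 / (math.sqrt((K - 1) ** 2 + 4 * K * beta) - (K - 1))
    lam = alpha * beta * rho                  # total traffic, all on fast servers
    d_nash = alpha * rho / (1.0 - rho)        # NEP social cost
    root_sum = alpha * math.sqrt(beta) + 1.0  # sum of sqrt(r_j) over all servers
    d_opt = root_sum ** 2 / (alpha * beta + 1.0 - lam) - (alpha + 1.0)
    return d_nash / d_opt

# Hypothetical example: three slow servers per fast server, fast servers ten times faster.
for K in (1, 2, 5):
    print(K, round(inefficiency_two_classes(1.0 / 3.0, 10.0, K), 4))
```

With a single dispatcher the ratio is 1, since the two schemes coincide, and the function can be swept over a grid of $\alpha$ and $\beta$ values to explore how the ratio varies with the architecture.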

CONCLUSION
Self-optimizing services are able to use on-line measurements to dispatch incoming jobs to the best available computing resources in order to maintain and improve the QoS in response to dynamically changing workloads. The interaction between uncoordinated dispatchers can however result in significant performance degradations. We have seen that the worst-case traffic conditions occur when all dispatchers control the same amount of traffic. We have also presented explicit bounds on the PoA for cost functions representing the mean delay of jobs when the service discipline is PS or SRPT. These bounds indicate that as the number of dispatchers increases, the loss of efficiency may grow unboundedly, implying that the "selfish" behavior of uncoordinated self-optimizing services can lead to significant performance degradations. It is nevertheless important to keep in mind that these bounds are obtained for worst-case scenarios, which do not necessarily occur in practice. In datacenters where processors are more than likely to have similar characteristics, no significant performance degradation is observed with respect to a globally optimal routing strategy. However, more work is needed to confirm or contradict this observation.