M/M/c queue

Source: Wikipedia, the free encyclopedia.

In

Poisson process, there are c servers, and job service times are exponentially distributed.[3] It is a generalisation of the M/M/1 queue which considers only a single server. The model with infinitely many servers is the M/M/∞ queue
.

Model definition

An M/M/c queue is a stochastic process whose

state space
is the set {0, 1, 2, 3, ...} where the value corresponds to the number of customers in the system, including any currently in service.

The model can be described as a

transition rate matrix

on the state space {0, 1, 2, 3, ...}. The model is a type of birth–death process. We write ρ = λ/(c μ) for the server utilization and require ρ < 1 for the queue to be stable. ρ represents the average proportion of time which each of the servers is occupied (assuming jobs finding more than one vacant server choose their servers randomly).

The

state space
diagram for this chain is as below.

Stationary analysis

Number of customers in the system

If the traffic intensity is greater than one then the queue will grow without bound but if server utilization then the system has a stationary distribution with probability mass function[4][5]

where πk is the probability that the system contains k customers.

The probability that an arriving customer is forced to join the queue (all servers are occupied) is given by

which is referred to as

Erlang's C formula and is often denoted C(c, λ/μ) or E2,c(λ/μ).[4] The average number of customers in the system (in service and in the queue) is given by[6]

Busy period of server

The busy period of the M/M/c queue can either refer to:

Write[8][9] Tk = min( t: k jobs in the system at time 0+ and k − 1 jobs in the system at time t) and ηk(s) for the Laplace–Stieltjes transform of the distribution of Tk. Then[8]

  1. For k > c, Tk has the same distribution as Tc.
  2. For k = c,
  3. For k < c,

Response time

The response time is the total amount of time a customer spends in both the queue and in service. The average response time is the same for all work conserving service disciplines and is[6]

Customers in first-come, first-served discipline

The customer either experiences an immediate exponential service, or must wait for k customers to be served before their own service, thus experiencing an Erlang distribution with shape parameter k + 1.[10]

Customers in processor sharing discipline

In a processor sharing queue the service capacity of the queue is split equally between the jobs in the queue. In the M/M/c queue this means that when there are c or fewer jobs in the system, each job is serviced at rate μ. However, when there are more than c jobs in the system the service rate of each job decreases and is where n is the number of jobs in the system. This means that arrivals after a job of interest can impact the service time of the job of interest. The Laplace–Stieltjes transform of the response time distribution has been shown to be a solution to a Volterra integral equation from which moments can be computed.[11] An approximation has been offered for the response time distribution.[12][13]

Finite capacity

In an M/M/c/K queue only K customers can queue at any one time (including those in service[4]). Any further arrivals to the queue are considered "lost". We assume that K ≥ c. The model has transition rate matrix

on the state space {0, 1, 2, ..., c, ..., K}. In the case where c = K, the M/M/c/c queue is also known as the Erlang–B model.[1]: 495 

Transient analysis

See Takács for a transient solution[14] and Stadje for busy period results.[15]

Stationary analysis

Stationary probabilities are given by[16]

The average number of customers in the system is [16]

and the average time in the system for a customer is [16]

The average time in the queue for a customer is [16]

The average number of customers in the queue can be obtained by using the effective arrival rate. The effective arrival rate is calculated by [16]

Thus we can obtain the average number of customers in the queue by [16]

An implementation of the above calculations in Python can be found.[17]

Heavy-traffic limits

Writing X(t) for the number of customers in the system at time t, it can be shown that under three different conditions the process

converges to a diffusion process.[1]: 490 

  1. Fix μ and c, increase λ and scale by n = 1/(1 − ρ)2.
  2. Fix μ and ρ, increase λ and c, and scale by n = c.
  3. Fix as a constant β where

and increase λ and c using the scale n = c or n = 1/(1 − ρ)2. This case is called the

Halfin–Whitt regime.[18]

See also

References

  1. ^ .
  2. ^ Harrison, Peter; Patel, Naresh M. (1992). Performance Modelling of Communication Networks and Computer Architectures. Addison–Wesley. p. 173.
  3. JSTOR 2236285
    .
  4. ^ .
  5. .
  6. ^ .
  7. .
  8. ^ .
  9. .
  10. ^ Iversen, Villy B. (June 20, 2001). "ITU/ITC Teletraffic Engineering Handbook" (PDF). Retrieved August 7, 2012.
  11. .
  12. .
  13. .
  14. ^ Takács, L. (1962). Introduction to the Theory of Queues. London: Oxford University Press. pp. 12–21.
  15. .
  16. ^ .
  17. ^ "Basic Calculator for Queueing Theory". GitHub.
  18. JSTOR 170115
    .