Allerton 2002: Script for Maximum Size Matching and Input Queued Switches


(Slide: Title: Maximum Size Matching and Input Queued Switches)


(Note: This first paragraph is similar to the Sigcomm talk on single buffered routers, as the motivation is in a sense similar. Everything else is very different though!)


We usually characterize a router by its capacity. The capacity is the sum of the rates on its line cards. But this full capacity is usually not available to us. In reality there are a number of constraints imposed by a given router architecture such as head of line blocking, output blocking and other structural limitations, which prevents us from having access to all of this capacity. So the available throughput is generally lower. It is desirable to achieve 100% throughput since it allows a network operator to efficiently utilize the effective link bandwidth.


(Slide: Definition 100% throughput)


For the purpose of this talk, we shall say that a switch gives 100% throughput if the expected size of the queues is finite for any admissible load. By an admissible load we mean that no input or output is oversubscribed.


 (Slide: A characteristic Switch)


Most high speed packet switches in the core of the Internet today use a crossbar switching fabric with queues at the inputs, one for each output called virtual output queues. We will assume that time is slotted and that a fixed size cell arrives in every time slot. Such switches are constrained to perform a matching in every time slot, i.e. they can transfer at most one cell from every input and send one cell to every output in a time slot.


The scheduling algorithm in an IQ switch can be modeled as a problem of bi-partite graph matching on a graph with N input and N output vertices. And the throughput attainable from a switch depends on the efficiency of the scheduling algorithm.


(Slide: Maximum Size Matching)


An obvious algorithm which comes to mind is the maximum size matching (MSM). MSM chooses the matching which can connect the largest number of inputs to outputs. It maximizes the size of the matching and thus maximizes the instantaneous throughput. However contrary to intuition MSM does not give 100% throughput.


(Slide: Contents)


While there are a number of algorithms which have been proposed and studied to achieve 100% throughput, namely in the class of maximum weight matching, we are interested in the more simpler to implement class of maximum size matching algorithms. Unfortunately not much is known about this class of algorithms. In this talk I shall attempt to study the class of MSM algorithms. In order to prove some properties of algorithms in this class, I shall use a mechanism called non pre-emptive or batch scheduling which attempts to aggregate cells in batches and schedule these cells. Further we shall look at a couple of theorems of algorithms in this class.


(Slide: Example of 2x2)

The following example shows that MSM does not give 100% throughput even in a 2x2 switch. The input arrival rates are as shown. (Read from slide).


The MSM would prefer the cross matching as compared to the straight one.  It has been shown both via analysis as well as simulations that the size of the VOQ (1,1) grows without bound.


Note that there are many different MSM algorithms possible. Later in this talk we will see two examples of MSM algorithms. When we say that MSM does not give 100% throughput what it means is that there is at least one MSM algorithm, which does not have this property.


(Slide Motivation)


Read from slide.


(Slide: Questions)


It is interesting to further understand the class of MSM? For example


   - Is there a subclass of MSMs, which are stable?

        - Note that it is known that at least one MSM called LPF [ref] is stable.


  - Is MSM stable under uniform load?


-         Simulation seems to suggest that MSM is stable under uniform load. Though it appears that the proof of the above statement might be simple, it seems that the solution is hard to find.


I would like to state that I do not have a general answer to the above problem as yet. However, in this part of the talk, I shall present proofs for both the above questions, under a slightly different framework, called 'batch scheduling'


(Slide: non pre-emptive scheduling … 1)


The main idea behind non –pre-emptive scheduling is to aggregate cells in batches and to schedule them together. Aggregating cells helps us to improve the size of the matching and hence increase throughput. Non pre-emptive scheduling is called frame scheduling in literature. Here the size of the frame is fixed. The only difference in batch scheduling is that we allow the size of the frame to increase or decrease based on the arrival traffic.



Read from slide.


(Slide: Batch Scheduling.. 2)


(Set of two slides for animation)

[Diagram: Batch Scheduling]


Let me first describe the framework of batch scheduling. The input queues in the switch are split into two different priority levels. All arriving cells are classified into one of two strict priority levels, level one and level two as shown here.  An arriving cell is queued in priority level two if there are any cells (in any VOQ of the switch) in priority level one. Since, level one cells have strict priority over level two cells, at any given time, only the cells in level one will be considered for scheduling. If at some time, all the cells in priority level one are scheduled, and all VOQs in priority level one become empty, then all the cells in priority level two are immediately “poured” into the priority level one queues.


The set of cells which are transferred en mass from level two to level one, is referred to as a batch. Any newly arriving cells will now be stored in the now empty priority level two queues and they will not be available to the switch scheduling algorithm till the present batch is completely scheduled.


Takeaway - "In Batch scheduling, cells of a later batch cannot affect cells from the present batch preventing pre-emption".


Note that every batch will consist of a number of cells. As before we can represent a batch using a weighted bipartite request graphs, where the number of cells in any VOQ could represent the weights the edges.


(Slide: Definitions)




     The degree of any vertex "v" in a batch "k" is defined as the number of cells destined from an input "v" (or destined to an output v), in batch k.


Maximum Degree:

    This is the maximum degree amongst all input and output ports.


(Slide: Maximum Size Matching)


Explain why MSM does not give 100% throughput.


(Slide: Critical Maximum Size Matching)


  [Diagram: CMSM schedule, Hajek’s Example]


I shall now consider a subclass of MSM called critical maximum size matching, which was first introduced by Hajek and Weller.


Critical Maximum Size matching: A critical maximum size-matching (CMSM) algorithm is a MSM, which can schedule a batch of maximum degree M in M timeslots.  Note that all MSMs do not have this property as seen by the previous example.


(Slide: Previous Results)


Read from slide.


(Slide: Arrival Traffic)


(Slide: Contents)


(Slide – CMSM with uniform traffic)


  Theorem: CMSM is stable under batch scheduling, if the input traffic is admissible and Bernoulli i.i.d. Uniform.


-[Informal structure of the proof]

 - In what follows, I shall appeal to the evolution of the batches. Consider the process T_k, which denotes the time it takes to schedule batch k.

- This means for batch (k+1), the cells are buffered for exactly T_k time.

- Since the rates to each input and to each output are uniform, we would expect no more than lamnda. T_k cells which arrive at each input or output in batch T_k+1,

-         That means that batch k+1, takes less time to schedule that batch "k".

-         That means that the size of the batch is bounded in mean.

-         This means that the delay faced by a packet would be bounded


Note that this set of informal arguments would be complete if we could say something definite about this inequality. If the traffic had deterministic constraints imposed on it, we are done. However since we have relaxed the constraints on the input traffic, we need to sweat a bit more.


(Slide: Formal Arguments: Basics)


  (Read from slide)


-         We will now formalize these arguments. Note that we shall try and show the following inequality.


(Slide: Formal Arguments.. 1 - Chernoff Bound)


Let X_1 be a bernoulli trial (coin toss), which has a probability "p" of success.

Let X be generated as the sum of T bernoulli trials. Then the expected value of the random variable X is simply \mu= pT. The Chernoff Bound says that: the probability of X exceeding \delta times the mean is less than <..


Bounding the degree of each vertex:

 - Consider d_{v,k}.Note that the expected value of d_v,k for all inputs and outputs is al.

-         We can write using the Chernoff bound ....

-         Thus we can bound the maximum degree of each vertex]


(Slide: Formal Arguments 2: Write down the choice of \delta, \epsilon. Etc)


(Slide: Formal Arguments 3: Convex Combination)


[Diagram: Show convex combination of the two numbers]


(Slide: Formal Arguments – 4: Choosing Gamma)


End of slide: There is always a value of T_k  which satisfies the inequality for Q




(Slide: Formal Arguments 5: )


 State Lyapunov Function......

 State B_k is finite.

 State queue length is finite.

 State 100% throughput..


(Slide: Formal Arguments 6:)


Read from slide.  Mention Michael Neely and Bruce Hayek for pointing out that there was an error in an original argument.


(Slide: Contents)


(Slide: CMSM for non uniform traffic)


State- no proof and read from slide.


Now motivate the next part of the talk.

We ask the question as to why does this not hold for a standard  MSM?

Note that similar to the previous case we we get D_k+1 < (1-e)B_k

However,  note that we are not assured to serving batch "k+1" in D_{k+1} time time.

Hence B_k+1 > D_k+1.

So we cant relate B-k+1 and B-k.


(Slide: Contents)


(Slide: Uniform Graph)


- If for example we had a perfect graph, then it would in fact take exactly the max degree.

- Then it gets cleared in exactly the same time.

- This is nice. This tells us something.

- Recall the question that we asked??

- Perhaps if traffic was uniform, maybe MSM with batch scheduling will be stable.


(Slide: MSM for uniform load)


Theorem: (Read from slide)



(Slide: Conclusion)


We have extended some previous work on the CMSM by analyzing it with the more traditional stochastic traffic models. Also we looked at the class of MSM’s itself and showed that they can give 100% throughput if the arrival traffic is Bernoulli i.i.d. Uniform. An obvious open problem is to understand how the MSM and CMSM perform under continuous scheduling.


Our results are essentially theoretical in nature and we hope that they will lead us to a better understanding of the MSM algorithm