GBN Presentation:
-----------------

1) Good evening, everybody. My topic for today is "Techniques for Fast Packet Buffers", which is joint work by my advisor Nick McKeown and myself, Sundar Iyer, at Stanford University.

2) The goal is to analyze techniques for building very high-speed packet buffers. Here is one example: an OC768-speed packet buffer, where packets arrive at 40 Gb/s. We know that packet switches and routers store about one RTT worth of data, which works out to 10 Gb of data. Let's assume that the minimum packet size on the Internet is about 64 bytes, and see what happens when these 64-byte packets arrive. [Animate] So we see that we need to build memories which are both large in size and accessed at very high rates. [Talk about SRAM and DRAM] [Animate] [Pause] I am going to talk about one specific, perhaps obvious, memory architecture, which consists of both a DRAM and an SRAM. We believe these techniques may already exist, in some proprietary form, in many Internet switches and routers.

3) Here is a line card, on which packets arrive at rate "R" and depart at rate "R". The packet buffer itself consists of packets classified into Q queues, each maintained in first-in-first-out order. We see here an arbiter, which makes requests for different packets or cells; we will assume that all packets have been split into fixed-size cells. We hope to keep a small, fixed-size SRAM on the line card, which is accessed at the line rate, and the remaining data in DRAM in the background. The SRAM has to be replenished periodically by reading from the DRAM. [Animate] Since it takes longer to read packets from the DRAM than the rate at which they arrive and depart on the ingress and egress, we will assume that there are "b" DRAMs, and that one can read from all "b" DRAMs in parallel.
One can see that b is the ratio of the DRAM's random access time to the time between consecutive cell accesses at the line rate. [Continue Animation] We will assume a simplistic DRAM model: there is a single address bus, and we read from the "b" DRAMs in parallel, one cell from each physical DRAM. Note that for each queue, the head of the queue is in the SRAM, so that when the arbiter makes a request the cell is always in the SRAM; the body of the queue is stored in the DRAM. The architecture on the ingress is the mirror image: cells arrive and are stored in the SRAM, and when the SRAM fills, cells are written to the DRAM. Since the ingress and egress architectures are similar, we will describe and analyze only the egress.

4) There are a couple of different questions we can ask. This talk is about the latter: since networking switches and routers can tolerate latency, we are more interested in the latter question.

5) A critical question is: what memory management algorithm should be used? Specifically, what algorithm minimizes the size of the SRAM while maintaining bounded latency? I will not prove the result in this talk, but will directly state an algorithm called ECQF, which minimizes the size of the SRAM being used. We will assume that the SRAM is a dynamically shared memory. Explain the slide. For point 2, say that since the request sequence is of course not known a priori, we will simply buffer the arbiter's requests, thus creating an initial latency.

6) Explain. [Animate] State that Q = 4 and b = 4, and that the different colors are for different queues.

7) State results. The first is what we described. We also state the result for the other question, the impatient arbiter, i.e. what happens if we cannot tolerate latency.

8) Read out the implementation numbers. Say that we need 10 Gb of DRAM once again.
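The replenishment idea on slides 5 and 6 can be sketched as a toy simulation. Everything below is my own simplification, not the talk's actual construction: a static per-queue head cache instead of the dynamically shared SRAM, instantaneous DRAM transfers of b cells, and an infinitely backlogged DRAM; the function and variable names are mine.

```python
def simulate_ecqf(requests, Q, b, head_cache):
    """Toy cell-slot model of an SRAM head cache replenished from DRAM.

    requests   : the known future sequence of queue indices the arbiter
                 reads, one cell per time slot.
    Every b slots there is one DRAM opportunity, delivering b cells of a
    single queue. ECQF-style rule: replenish the queue whose head cache
    would underflow earliest under the known request sequence.
    Returns the number of slots on which the arbiter found an empty cache.
    """
    cache = [head_cache] * Q          # cells in each queue's head cache
    underflows = 0
    for t, q in enumerate(requests):
        if t % b == 0:                # one DRAM access every b slots
            remaining = cache[:]      # project caches into the future
            critical = None
            for fq in requests[t:]:
                remaining[fq] -= 1
                if remaining[fq] < 0: # this queue runs dry first
                    critical = fq
                    break
            if critical is not None:
                cache[critical] += b  # DRAM delivers b cells at once
        if cache[q] == 0:
            underflows += 1           # arbiter would have to wait
        else:
            cache[q] -= 1
    return underflows

# With Q = 4 and b = 4 as on slide 6, a round-robin request pattern
# never underflows a head cache of b cells per queue:
reqs = [i % 4 for i in range(64)]
print(simulate_ecqf(reqs, Q=4, b=4, head_cache=4))  # → 0
```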
Comments:

1) One does not always face the latency if the departure traffic pattern is known in advance.

2) Multiple address buses; bank conflicts.
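The back-of-envelope numbers behind the slides (one RTT of buffering on slide 2, and the b = 4 used on slide 6) can be checked in a few lines. The 0.25 s RTT is the value implied by "10 Gb at 40 Gb/s"; the 51.2 ns DRAM random-access time is an illustrative figure of my own, not from a datasheet.

```python
LINE_RATE = 40e9               # OC768 line rate, bits per second
RTT = 0.25                     # round-trip time implied by slide 2, s

buffer_bits = LINE_RATE * RTT  # one RTT worth of data
# 40 Gb/s * 0.25 s = 10 Gb of buffering, as stated on slide 2

CELL = 64 * 8                  # minimum-size 64-byte packet, in bits
cell_time = CELL / LINE_RATE   # 12.8 ns between back-to-back cells

T_RA = 51.2e-9                 # illustrative DRAM random-access time, s
b = T_RA / cell_time           # DRAMs read in parallel to keep up
# b = 51.2 ns / 12.8 ns = 4, matching the b = 4 used on slide 6
```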