Abstract

Problem: High-speed networks, including the Internet backbone, suffer from a well-known problem: packets arrive at high-speed routers much faster than commodity memory can support. On a 10 Gb/s link, packets can arrive every 32 ns (this figure is derived below), while memory can only be accessed once every ~50 ns. By 1997, this was identified as a fundamental problem on the horizon. As link rates increase (usually at the rate of Moore's Law), the performance gap only widens. The problem is hard because packets can arrive in any order and require unpredictable operations on many data structures in memory. And so, as in many other computing systems, router performance is limited by the available memory technology. If we are unable to bridge this performance gap, then:

  1. We cannot create Internet routers that reliably support links >10 Gb/s.
  2. Routers cannot support the needs of real-time applications such as voice, video conferencing, multimedia, gaming, etc., that require guaranteed performance.
  3. Hackers or viruses can easily exploit the memory performance loopholes in a router and bring down the Internet.
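For concreteness, the 32 ns figure quoted above corresponds to minimum-length packets (assumed here to be 40-byte TCP/IP packets) arriving back-to-back on a 10 Gb/s link:

\[
  t_{\text{arrival}} \;=\; \frac{40 \times 8 \text{ bits}}{10\ \text{Gb/s}} \;=\; \frac{320 \text{ bits}}{10^{10} \text{ bits/s}} \;=\; 32\ \text{ns}.
\]

Since each such packet typically requires at least one memory write and one later read, a single memory with a ~50 ns random access time cannot keep up on its own.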

Contributions: This thesis lays down a theoretical foundation for solving the memory performance problem in high-speed routers. It brings several high-speed router architectures under a common umbrella, and introduces a general principle called “constraint sets” to analyze them. We derive fourteen fundamental (rather than ephemeral) solutions to the memory performance problem. These can be classified into two types: (1) load-balancing algorithms that distribute load over slower memories and guarantee, with no exceptions whatsoever, that the memory is available when data needs to be accessed, and (2) caching algorithms that guarantee that data is available in the cache 100% of the time. Such robust guarantees are surprising, but their validity is proven analytically.
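To give the first class of algorithms a concrete flavor, the sketch below spreads packet writes over several slow memory banks, skipping any bank still busy from a recent access. It is a deliberately simplified illustration, not the constraint-set algorithms of this thesis: the bank count, the per-access busy window, and the bank-selection rule are all assumptions made purely for this example.

    #include <stdio.h>

    #define NUM_BANKS  8   /* assumed number of parallel slow memory banks  */
    #define BUSY_SLOTS 2   /* assumed slots each bank stays busy per access */

    static int busy_until[NUM_BANKS];  /* first slot at which each bank is free */

    /* Pick a bank that is free at time `slot` and reserve it.
     * Returns the bank index, or -1 if every bank is still busy
     * (the situation the thesis's load-balancing guarantees exclude). */
    int pick_bank(int slot)
    {
        for (int b = 0; b < NUM_BANKS; b++) {
            if (busy_until[b] <= slot) {
                busy_until[b] = slot + BUSY_SLOTS;  /* reserve for this access */
                return b;
            }
        }
        return -1;  /* no bank available in this slot */
    }

    int main(void)
    {
        /* One arriving packet per slot; print which bank absorbs each write. */
        for (int slot = 0; slot < 10; slot++)
            printf("slot %2d -> bank %d\n", slot, pick_bank(slot));
        return 0;
    }

The interesting part in the thesis is not the selection loop itself but the analytical guarantee, absent from this toy example, that a free bank always exists when data must be written or read.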

Results and Current Usage: Our results are practical: at the time of writing, more than 6 million instances of our techniques (across more than 25 distinct products) will be made available annually. It is estimated that up to ~80% of all high-speed Ethernet switches and enterprise routers in the Internet will use these techniques. Our techniques are currently being designed into the next generation of 100 Gb/s router line cards, and are also planned for deployment in Internet core routers.

Primary Consequences: The primary consequences of our results are that:

  1. Routers are no longer dependent on memory speeds to achieve high performance.
  2. Routers can better provide strict performance guarantees for critical future applications (e.g., remote surgery, supercomputing, distributed orchestras).
  3. The router data-path applications for which we provide solutions are safe from malicious memory performance attacks, both now and, provably, at any time in the future.

Secondary Consequences: We have modified the techniques in this thesis to solve the memory performance problems of other router applications, including VOQ buffering, storage, page allocation, and virtual memory management. The techniques have also helped increase router memory reliability, simplify memory redundancy, and enable hot-swappable recovery from memory failures. They have helped reduce worst-case memory power (by ~25-50%) and automatically reduce average-case memory and I/O power (which can result in dramatic power reductions in networks that typically run at low utilization). They have enabled the use of complementary memory serialization technologies, reduced pin counts on packet processing ASICs, approximately halved the physical area needed to build a router line card, and made routers more affordable (e.g., by reducing memory cost by ~50% and significantly reducing ASIC and board costs). In summary, they have led to considerable engineering, economic, and environmental benefits.

Applicability and Caveats: Our techniques exploit the fundamental nature of memory access, so their applicability is not limited to networking. However, they are not a panacea. As routers become faster and more complex, we will need to cater to the memory performance needs of an ever-increasing number of router applications. This concern has given rise to a new area of research in memory-aware algorithmic design.