Interop Presentation.

Slide 1: Hi, a very good morning everybody. My name is Sundar Iyer and I am representing Switchon Networks. My talk today is "Co-Processors and the Role of Specialized Hardware." We shall begin by looking at a few network applications and their system requirements. I will then present a study of the performance of these applications on legacy architectures. Finally, I shall make a case for the use of network co-processors based on these performance metrics.

Slide 2: Here is a typical network system, or box. Shown in this figure is a backplane, which contains a switch fabric, with many line cards hanging off the bus. Let's take a closer look at one of these. A line card usually consists of a line interface, fabric interface chips, memory, and at least one processing element called a network processor. The network processor has the job of receiving a packet, processing it, and sending it on its way to the switch fabric. It has to perform this task at very high speed in order to keep up with the data rate. For example, on a 2.5 Gbit/s interface, the network processor must process a 64-byte packet in about 200 ns. The question we ask is, "Can network processors keep pace with this?" In fact, I would like to ask a more precise question: "Can one do a suite of networking applications, such as (1) routing or forwarding, (2) NAT, (3) enterprise functions such as QoS and RMON, (4) load balancing, and (5) URL switching, at these rates?"

Slide 3: In order to answer that question, we did a study of some typical applications. The analysis was done on a processor with 800 MIPS of processing power, and we took the worst-case time taken by the processor for each application. Let's have a look at the throughput figures.
1. We start with forwarding. This looks fine, as it keeps up with OC-48 rates.
2. For address translation, the throughput falls considerably.
3. If we look at a typical enterprise router, which does, say, firewalling, QoS, routing, and a couple of RMON functions, we see that the throughput is lower than a gigabit.
4. And here comes the surprising part. When content-aware applications, such as load balancing and URL switching, come into play, the throughput is minimal.
Notice that we started off trying to achieve wire-speed rates, which for this example is a little more than 2.5 Gbit/s. What is the reason behind this phenomenon? It turns out that each of these applications shares a common feature, called content processing, and it is this which turns out to be the bottleneck. The goal of a co-processor is to raise the bar on these applications and eliminate this bottleneck.

Slide 4: At this stage I would like to reiterate that we shall be looking at enabling each of these applications at "wire speed." Our goal is to offload this bottlenecked task of content processing to dedicated co-processors, as shown here. These are two examples of co-processors which sit alongside the network processor.

Slide 5: For today's talk I shall concentrate on one co-processing task: content processing. So what is content processing? I shall illustrate with an example. Let's assume that we receive a packet on an interface. In order to understand what is to be done with the packet, it is necessary to identify what the packet contains. This involves searching for and extracting data, and then classifying that data to make a decision.
1. We can begin by authenticating the source MAC address by doing a layer 2 lookup.
2. A layer 3 lookup is done to identify the subnet from which the packet arrives. We identify that the packet arrives from the marketing network.
3. A layer 4 classification informs us that our VP of Marketing is accessing a web server outside. Now comes the tougher part, which can involve looking into the entire remainder of the packet.
4.
A content lookup tells us that the external server is Yahoo.
5. A further peek into the packet identifies that an audio file is being requested.
6. Finally, a thorough lookup lets us know that our VP craves the classic, American Pie.
7. We note that a certain external factor may influence our decision; in this case, the packet is being sent at 7 pm. We allow the packet based on the policies which are configured in the box. The moral of the story: "The boss is always right!"

Slide 6: So why do the applications shown in the previous slide perform progressively worse without assistance? Here's why. Every application that requires content processing is configured with specific policies. The mean and lean applications usually have simple policies configured. As we move towards data-intensive applications, the policies involved become progressively more complex. Specifically, applications such as load balancing and URL switching require both a large number of rules and highly complex rules.

Slide 7: I would like to touch upon the requirements of a content processor.
1. Programmability: the content processor should support a wide range of rule formats. This involves:
   - Dimensions: "How much can you look at?"
   - Number of policies: "How many can you look at?"
   - Different operations: "What can you do while looking at the packet?"
   - Priority: "What will you return to me?"
2. Speed: "How fast can you do that?"
3. Dynamic update: "How fast can I reconfigure that?" In many solutions, new policies need to be added and deleted on the fly, and hence update speeds should be fast.
4. Minimal CPU bandwidth: "Do I need to keep nagging you?"
5. Rule scalability: "Will you be around tomorrow?" A content processor should be scalable to next-generation speeds, i.e. OC-192 and OC-768.
6. Glueless interface and easy software integration: finally, ease of integration into a specific hardware architecture, together with easy software integration, helps in wide acceptability.
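The requirements above can be made concrete with a small software sketch. The rule format, field names, and matching predicates below are hypothetical, chosen only to illustrate dimensions (which fields a rule inspects), priority, and dynamic update; a real content processor would evaluate all rules in parallel hardware rather than in a sequential loop.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical rule format: each rule inspects a few packet fields
# ("dimensions") and carries a priority so that, when several rules
# match, the processor returns the highest-priority action.
@dataclass
class Rule:
    name: str
    priority: int                      # higher priority wins
    action: str                        # e.g. "allow", "deny"
    match: Dict[str, Callable[[bytes], bool]] = field(default_factory=dict)

    def matches(self, packet: Dict[str, bytes]) -> bool:
        # A rule matches only if every configured dimension matches.
        return all(name in packet and pred(packet[name])
                   for name, pred in self.match.items())

class ContentProcessor:
    def __init__(self) -> None:
        self.rules: List[Rule] = []    # the configured policy table

    def add_rule(self, rule: Rule) -> None:     # dynamic update: add
        self.rules.append(rule)

    def remove_rule(self, name: str) -> None:   # dynamic update: delete
        self.rules = [r for r in self.rules if r.name != name]

    def classify(self, packet: Dict[str, bytes]) -> str:
        # Return the action of the highest-priority matching rule.
        best = max((r for r in self.rules if r.matches(packet)),
                   key=lambda r: r.priority, default=None)
        return best.action if best else "default"
```

For example, a policy blocking audio downloads would be a high-priority rule whose predicate peeks into the URL field, while a broad "allow marketing subnet" rule sits at a lower priority.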
Slide 8: Finally, I would like to conclude with a simple architecture diagram of an NP/co-processor solution. It suffices to say that co-processors fill a niche in the network processor space and form an integral part of any vendor solution.
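As a closing back-of-the-envelope check, the wire-speed budget quoted earlier (a 64-byte packet on a 2.5 Gbit/s interface, processed by an 800 MIPS processor) works out as follows; the input figures come from the talk, and the arithmetic is only a sketch:

```python
LINK_RATE_BPS = 2.5e9   # 2.5 Gbit/s interface (Slide 2)
PACKET_BITS = 64 * 8    # minimum-size 64-byte packet
MIPS = 800e6            # processing power assumed in the study (Slide 3)

# Time on the wire for one minimum-size packet: this is the entire
# per-packet processing budget if the box is to run at wire speed.
budget_ns = PACKET_BITS / LINK_RATE_BPS * 1e9

# Instruction budget per packet at 800 MIPS.
instructions_per_packet = budget_ns * 1e-9 * MIPS

print(f"per-packet budget: {budget_ns:.1f} ns")                   # 204.8 ns
print(f"instructions per packet: {instructions_per_packet:.0f}")  # 164
```

Roughly 160 instructions per minimum-size packet leaves very little room for content-aware work such as URL parsing, which is why throughput collapses on these applications without a dedicated co-processor.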