A key component of Chapters 4 and 5 is the monitoring of flows and the study of their characteristics. This monitoring has been done both off-line and on-line. The off-line profiling of user flows has been used, first, to check whether the proposed solutions make sense in the face of current Internet workloads and, second, to feed the model in Section 5.4. The on-line monitoring of active flows, in contrast, has been used to control the circuit-switched backbone in real time. Next, I will describe two approaches that can be used to study user flows.
RFC 2722 provides a general framework for describing network traffic flows and presents an architecture for traffic flow measurement and reporting. The purpose of such a flow-measurement system is to understand network usage and performance, which is in general done off-line, rather than to control the network in real time. In particular, such a flow-measurement system can be used for network planning, performance and QoS estimation, and per-user billing.
There are two related tools that use sampling of packets to study flows. Cisco offers a feature in its routers called NetFlow that logs in memory one packet out of every N packet arrivals and later dumps the log to permanent storage. There are numerous commercial and open-source programs that analyze, off-line, the traces sampled by NetFlow. Duffield et al. have proposed sampling flows with a frequency that is the inverse of the flow size to decrease the number of samples without introducing measurement errors.
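The 1-in-N packet sampling described above can be sketched in a few lines. This is only an illustration under stated assumptions, not NetFlow's actual implementation: the function names, the `(flow_id, size_bytes)` packet representation, and the synthetic trace are all hypothetical, and the per-flow byte estimate simply scales each sampled packet by N.

```python
from collections import Counter

def sample_one_in_n(packets, n):
    """Keep the n-th, 2n-th, ... packet of an iterable of (flow_id, size_bytes)."""
    return [pkt for i, pkt in enumerate(packets, start=1) if i % n == 0]

def estimate_flow_bytes(sampled, n):
    """Estimate per-flow byte totals by scaling every sampled packet by n."""
    estimates = Counter()
    for flow_id, size in sampled:
        estimates[flow_id] += size * n
    return estimates

# Hypothetical trace: flow "A" sends 90 packets and flow "B" sends 10,
# all of 1000 bytes (true totals: 90000 and 10000 bytes).
packets = [("B" if i % 10 == 3 else "A", 1000) for i in range(100)]

sampled = sample_one_in_n(packets, n=4)
estimates = estimate_flow_bytes(sampled, n=4)
```

On this trace the scaled estimates add up to the true aggregate, but the split between the two flows is off, which illustrates the per-flow error that plain packet sampling can introduce and that size-aware schemes try to reduce.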
Estan and Varghese propose two methods that sample large flows (those that take a non-negligible amount of the link capacity) more precisely. One method samples packets at fixed arrival intervals, and it creates a filter for the flow of each sampled packet. All subsequent packets are checked against the installed filters. Large flows are more likely to have a filter in place when their packets arrive, and so they are more likely to be matched and sampled. The other method hashes each arriving packet using multiple hash functions, each of which selects one counter. Each selected counter is incremented by the packet size. A packet belonging to a large flow finds that the values of all its counters are large, whereas short flows most likely have at least one counter with a small value. These two methods use less memory than Cisco's NetFlow, and they accurately sample large flows, but they ignore many small flows.
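The two methods can be illustrated with a minimal sketch. It follows the description above rather than the authors' published algorithms in every detail: the function names, the SHA-256-based bucket selection, the parameter values, and the synthetic trace are assumptions made for the example.

```python
import hashlib

def sample_and_hold(packets, interval):
    """Every `interval`-th packet installs a filter (a byte counter) for its
    flow; once a filter exists, all later packets of that flow are counted."""
    filters = {}
    for i, (flow_id, size) in enumerate(packets, start=1):
        if flow_id in filters:
            filters[flow_id] += size      # packet matches an existing filter
        elif i % interval == 0:
            filters[flow_id] = size       # sampled packet creates a filter
    return filters

def _bucket(flow_id, stage, width):
    """Hypothetical per-stage hash: one counter index per (stage, flow)."""
    digest = hashlib.sha256(f"{stage}:{flow_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % width

def multistage_filter(packets, stages, width, threshold):
    """Report flows whose counters reach `threshold` bytes in every stage."""
    counters = [[0] * width for _ in range(stages)]
    large = set()
    for flow_id, size in packets:
        buckets = [_bucket(flow_id, s, width) for s in range(stages)]
        for s, b in enumerate(buckets):
            counters[s][b] += size        # add the packet size in each stage
        if all(counters[s][b] >= threshold for s, b in enumerate(buckets)):
            large.add(flow_id)
    return large

# Hypothetical trace: one large flow (500 packets) interleaved with
# 100 single-packet flows; every packet is 1000 bytes.
packets = []
for k in range(100):
    packets += [("big", 1000)] * 5
    packets.append((f"small-{k}", 1000))

held = sample_and_hold(packets, interval=50)
large = multistage_filter(packets, stages=3, width=64, threshold=100_000)
```

In this sketch the large flow gets a filter early and is then counted almost in full, while only a handful of the small flows ever receive a filter. The multistage filter never misses a flow that exceeds the threshold, because all of that flow's counters include at least its own bytes; small flows are reported only if they collide with heavy counters in every stage.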
As with the method listed in Section 4.3.3, the two methods described above require the observation of every single packet on the link. The difference between the two approaches is that Estan and Varghese's methods use fewer filters by focusing on big flows, whereas the method of Section 4.3.3 uses many more filters because it measures how many flows are currently active, whether they are large or small. This latter information is then used to calculate the total flow capacity and thus to size the circuit in the core properly in real time. Small flows typically take less than 20% of the aggregate rate, so Estan and Varghese's two methods can provide a rough estimate of the envelope of the total flow bandwidth with less state. However, as mentioned in Section 4.3.3, the amount of state related to all active flows (big or small) is not a big problem.