Monday, May 30, 2011

More about 3560 QoS

When it comes time to talk about queue-sets on the 3560, I find a LOT more research is needed than I expected. Check out the following blog from INE.com to better understand how QoS configuration changed between the 3550 and 3560. It has an excellent section which explains how queue-sets are just a buffer space partitioning scheme for the switch.

http://blog.ine.com/2008/03/03/bridging-the-gap-between-3550-and-3560-qos-part-i/

That URL plus some research on the Cisco website for 3560 QoS configuration helped me understand how queue-sets work.

http://www.cisco.com/en/US/docs/switches/lan/catalyst3560/software/release/12.1_19_ea1/configuration/guide/swqos.html#wp1163863

The upshot is that by having two queue-sets available, you can provision two different QoS egress architectures on a single 3560. This will let you customize thresholds and buffer allocation for two groups of ports. However, an interesting side note is that DSCP and CoS mapping to queues remains global for all ports. This means the queue-sets are being utilized only to address different port speeds on the switch.


Thursday, May 26, 2011

More on LAN Switching and Congestion Avoidance

The 3560 is the model I've got in my lab to test things on. It's helped me understand things like the method in use for congestion avoidance, weighted tail drop (WTD). Enabled with QoS, WTD will create 3 thresholds per queue for tail drop. The CoS value is used to help set the thresholds: you associate one or more CoS values with the threshold for drops. In this case, usually CoS 6 and/or 7 is set to the last (highest) threshold, which performs tail drops once the queue reaches 100% full.

WTD is highly granular because it can be configured separately for each of the 6 queues on a 3560 (there are two ingress and four egress queues). The complexity is increased because you have to decide whether to trust CoS (or DSCP) values on received traffic, or to remark the traffic. Let's look at what to do if you trust markings received.

To assign specific CoS values to a queue and threshold, use the command:
mls qos srr-queue input cos-map queue QUEUE_ID threshold THRESHOLD_ID cos1 . . . cos8
If you are trusting DSCP values, the command used to associate them with a queue and threshold is:
mls qos srr-queue input dscp-map queue QUEUE_ID threshold THRESHOLD_ID dscp1 . . . dscp8
The command to associate tail drop percentages with thresholds is:
mls qos srr-queue input threshold QUEUE_ID THRESHOLD-PERCENTAGE1 THRESHOLD-PERCENTAGE2
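
To make the threshold idea concrete, here is a small Python sketch of how I picture WTD deciding whether to accept a frame. It is a simplified model, not the switch's actual algorithm: the function name, the 100-frame queue size, and the CoS-to-threshold assignments are all made up for illustration.

```python
def wtd_accepts(depth, capacity, threshold_pct, cos_to_threshold, cos):
    """Return True if a frame with this CoS is queued, False if tail-dropped.

    depth            -- frames currently in the queue
    capacity         -- total frames the queue can hold (hypothetical value)
    threshold_pct    -- threshold ID -> percent of capacity
    cos_to_threshold -- CoS value -> threshold ID
    """
    limit = capacity * threshold_pct[cos_to_threshold[cos]] / 100
    # The frame is accepted only if it still fits under its own threshold.
    return depth + 1 <= limit


# Illustrative values only: thresholds 1 and 2 are the configurable ones,
# threshold 3 stands for the queue-full (100%) level.
THRESHOLDS = {1: 40, 2: 60, 3: 100}
COS_MAP = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3}

for cos in (0, 3, 6):
    print(cos, wtd_accepts(50, 100, THRESHOLDS, COS_MAP, cos))
```

With the queue half full in this model, CoS 0 is already being dropped while CoS 3 and CoS 6 still get in.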

Cisco 3560 Egress Queuing
The 3560 has 4 egress queues per interface. Just like with ingress queues, you can configure DSCP or COS mappings to each, set up weights, and configure WTD drop thresholds. If you configure a priority queue, it MUST be queue 1. A major difference is that while ingress commands were executed at the global level, egress commands are run at the interface level.

On an odd note, the 3560 apparently is architected to slow down egress traffic. The book claims this allows implementation of subrate speed for Metro Ethernet as well as to prevent "some types of DoS attacks." I can find no information on the latter.

Slightly complicating queuing is the fact that 3560s assign an internal DSCP value to a frame. This is determined when the forwarding decision is made. Once the internal DSCP is assigned and an exit interface determined, two things occur:

  1. Internal DSCP value is compared to a global DSCP-to-COS map to determine the COS value of the frame
  2. The COS-to-queue map (a global setting on the 3560, not a per-interface one) indicates which queue the frame will be placed in
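
The two lookups above can be sketched in Python. The DSCP-to-CoS step uses the usual default of one CoS value per block of eight DSCP values. The CoS-to-queue table is what I believe the 3560 egress default to be, so treat it as an assumption and check it against show mls qos maps on your own switch. The function names are mine.

```python
def dscp_to_cos(dscp):
    """Default-style DSCP-to-CoS map: each block of 8 DSCP values shares a CoS."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp // 8


# Assumed egress CoS-to-queue table (verify on a real switch).
COS_TO_QUEUE = {0: 2, 1: 2, 2: 3, 3: 3, 4: 4, 5: 1, 6: 4, 7: 4}


def egress_queue(internal_dscp):
    """Return (cos, queue) for a frame carrying this internal DSCP."""
    cos = dscp_to_cos(internal_dscp)
    return cos, COS_TO_QUEUE[cos]


print(egress_queue(46))  # EF traffic
print(egress_queue(0))   # best effort
```

In this model EF (DSCP 46) becomes CoS 5 and lands in queue 1, the only queue that can be the priority queue.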
Let's next discuss the scheduler, which handles packets after they are queued. Confusion can arise because the 3560 has two different methods for scheduling that use the same acronym of SRR: they are shared round robin and shaped round robin. Both address the issue of queue starvation when a priority queue exists, but the shaped version rate-limits queues so that they will not exceed the configured bandwidth allowance.

The text offers two examples to understand how these schedulers work. In the first one, all queues hold some amount of frames. Both shaped and shared scheduling will service queues based on the weighting configured. The commands for weighting egress queues are:

srr-queue bandwidth share weight1 weight2 weight3 weight4
srr-queue bandwidth shape weight1 weight2 weight3 weight4

Per the IOS command reference for 12.2.25SEE, SRR shaping default weight is 25 for queue1 and zero for the other three. The other three also operate in shared mode by default. However, the queue setup is different for SRR sharing: each queue is assigned a weight of 25, or one quarter of the bandwidth. It's worth noting here that as per the IOS command reference, shaped queues with a zero weight configured will IGNORE the weighting assigned with the command srr-queue bandwidth shape and instead will use the values configured with the command srr-queue bandwidth share. So by default, when shaping queues you also are sharing them and will want to configure both commands to complete your QoS config on the 3560. *sigh* I hate extra typing!

Moving on, let's consider how the schedulers operate when not all queues contain traffic. If only one queue contained traffic and that queue had a weight of 25, then in shared scheduling this queue would utilize all the bandwidth for its traffic. However, if shaped scheduling were in use, the scheduler would delay sending packets even if no other queues had traffic to send. One catch: per the IOS command reference, a shaped weight is an inverse ratio (1/weight), so a shaped weight of 25 limits the queue to 1/25 of the bandwidth (4%), not 25%.
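
The two scenarios so far can be modeled in a few lines of Python. This is only a sketch under some assumptions of mine: every active queue always has traffic waiting, a shaped queue is held to line rate divided by its weight (the inverse-ratio rule from the IOS command reference), and queues with a shaped weight of zero split whatever is left according to their shared weights. The function and variable names are made up.

```python
def srr_egress_rates(line_rate, shape, share, active):
    """Estimate per-queue egress rates for the four queues of one port.

    line_rate -- port speed in any unit (e.g. Mbps)
    shape     -- four shaped weights; 0 means "use shared mode"
    share     -- four shared weights
    active    -- four booleans, True if the queue has frames waiting
    """
    rates = [0.0] * 4
    remaining = float(line_rate)

    # Shaped queues are capped at line_rate / weight, even if others are idle.
    for q in range(4):
        if active[q] and shape[q] > 0:
            rates[q] = line_rate / shape[q]
            remaining -= rates[q]

    # Active shared queues divide the rest in proportion to their weights.
    shared = [q for q in range(4) if active[q] and shape[q] == 0]
    total = sum(share[q] for q in shared)
    for q in shared:
        rates[q] = remaining * share[q] / total

    return rates


DEFAULT_SHAPE = [25, 0, 0, 0]
DEFAULT_SHARE = [25, 25, 25, 25]

# All queues busy on a 100 Mbps port.
print(srr_egress_rates(100, DEFAULT_SHAPE, DEFAULT_SHARE, [True] * 4))
# Only queue 1 busy: shaped vs. shared.
print(srr_egress_rates(100, DEFAULT_SHAPE, DEFAULT_SHARE, [True, False, False, False]))
print(srr_egress_rates(100, [0, 0, 0, 0], DEFAULT_SHARE, [True, False, False, False]))
```

The last two calls show the difference that matters: the shaped queue stays at its cap on an otherwise idle port, while the shared queue takes the whole link.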

Let's now talk about having a priority queue (queue 1 is the only choice possible, remember). If all queues had traffic EXCEPT queue1, and then queue 1 has frames arrive, the scheduler will finish sending its current packet and service queue1 to the configured bandwidth limit (with the default shaped weight of 25 that is 1/25 of the bandwidth, or 4%, since shaped weights are inverse ratios). Excess frames will be queued rather than discarded in this scenario.

Now let's imagine that queue1 has packets queued up like crazy and the other queues are empty. Here the scheduler behaviors will be different. In shared mode queue1 will be allowed to transmit at full line rate. In shaped mode, queue1 will be serviced only up to its configured share of bandwidth (1/25, or 4%, with the default shaped weight of 25, because shaped weights are inverse ratios). The main takeaway from these examples is that shaped SRR will never allow the priority queue to exceed its configured bandwidth, even if no other queues have traffic to send.

Next time we'll discuss egress queue-sets and more of the architecture used in sending out traffic from 3560s.


    Tuesday, May 17, 2011

    LAN switching: Congestion avoidance and management

    The text discusses 3560 switch queues, specifically how this model has ingress and egress queues. There are two ingress queues, but only one of them can be configured as a priority queue, and these queues use a method called Weighted Tail Drop during periods of congestion.


    Shared Round Robin (SRR) is used in the 3560 to schedule packets from the ingress queues to the backplane fabric. You can specify the guaranteed amount of traffic each queue will have (weighting), but neither queue is limited to only that amount. This way, if one queue is empty the second ingress queue can utilize all the bandwidth available for sending traffic. Finally, you should know that SRR weighting functions more like a percentage specification than as a bandwidth amount (i.e., it is relative and not absolute).

    Behaviors you'll want to keep in mind when you configure ingress queuing include:

    1) COS mappings are used by default; COS 5 traffic gets its own queue. DSCP mapping is available.
    2) You'll need to specify if the second ingress queue is a priority queue
    3) Ensure the default WTD thresholds are appropriate for your traffic; at 100% full traffic is dropped by default
    4) WTD can have up to three different thresholds where traffic is dropped in increasing increments

    Priority Ingress Queuing
    The relevant command is mls qos srr-queue input priority-queue QUEUE-ID bandwidth WEIGHT. In this syntax, the weight value is a percentage of link bandwidth. The next command used is mls qos srr-queue input buffers PERCENTAGE1 PERCENTAGE2. The default for the latter command has 90% of buffers allocated to queue 1 and 10% to queue 2, so you'll want to watch out for this if your priority queued traffic needs more buffers.

    Unfortunately, when I tried to Google search the 3560 or 3750 buffer sizes I had no luck determining either how large the buffers are, or how best to check for buffer size problems! The best info available was to run a show interface command and check the input section for buffer overflow packets being reported. That's far from an exact science.

    You'll also need to configure the SRR scheduler for both queues: mls qos srr-queue input bandwidth WEIGHT1 WEIGHT2. Although the word bandwidth appears in this command, it's not an actual BW value but a weight number being specified. Confusing, eh? Default values are 4 and 4 for weight1 and weight2, which divides scheduling evenly between queues. Although I've been Googling, I cannot locate solid info on what happens to the second queue once you configure a priority queue and its weight value. I'll test this in my lab tomorrow and update the blog.

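    EDIT: So the results are in! Configuring the bandwidth weight with mls qos srr-queue input priority-queue sets a separate priority value; it does not overwrite the SRR weights set using the command mls qos srr-queue input bandwidth. As the Cisco configuration guide describes it, the priority queue is serviced first up to that percentage, and SRR then shares the remaining bandwidth between both queues according to those weights. To see these command options being changed, you'd check the output of the command sh mls qos input-queue. Sample output is below: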

    Rack1SW1(config)#mls qos srr-queue input priority-queue 2 bandwidth 30

    Rack1SW1#sh mls qos input-queue
    Queue     :       1       2
    ----------------------------------------------
    buffers   :      90      10
    bandwidth :       4       4
    priority  :       0      30
    threshold1:     100     100
    threshold2:     100     100

    Rack1SW1(config)#mls qos srr inp ban 2 6

    Rack1SW1#sh mls qos input-queue
    Queue     :       1       2
    ----------------------------------------------
    buffers   :      90      10
    bandwidth :       2       6
    priority  :       0      30
    threshold1:     100     100
    threshold2:     100     100
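
To put numbers on the output above, here is a small Python sketch of how the priority value and the SRR weights combine, following the Cisco configuration guide's description: the priority queue is guaranteed its percentage first, and the rest is shared between both queues by weight. The function name is mine, and the result is an idealized share assuming both queues always have traffic.

```python
def ingress_shares(priority_queue, priority_pct, weights):
    """Return {queue: percent of bandwidth} for the two ingress queues.

    priority_queue -- 1 or 2, or None if no priority queue is configured
    priority_pct   -- value from 'mls qos srr-queue input priority-queue'
    weights        -- (weight1, weight2) from 'mls qos srr-queue input bandwidth'
    """
    if priority_queue is None:
        priority_pct = 0
    remaining = 100 - priority_pct
    total = sum(weights)
    # Both queues, including the priority queue, share what is left over.
    shares = {q: remaining * w / total for q, w in zip((1, 2), weights)}
    if priority_queue is not None:
        shares[priority_queue] += priority_pct
    return shares


# Values from the second sample output: priority 30 on queue 2, weights 2 and 6.
print(ingress_shares(2, 30, (2, 6)))
```

With these settings queue 2 ends up with 82.5% (30 guaranteed plus three quarters of the remaining 70) and queue 1 with 17.5%.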

    Wednesday, May 4, 2011

    Queuing Strategies and Tools continued

    The Magic of WRED Weighting
    WRED looks at IP precedence or DSCP values to prioritize packets when the time comes to discard. You configure this via a traffic profile, which has a minimum threshold, a maximum threshold, and an MPD (mark probability denominator) value. These profiles provide granular control over discarding; for example, IPP 0 traffic can be discarded at a lower threshold than other traffic.

    As with many IOS features, WRED has default profile settings assigned to DSCP values. Here's a chart of them:

    DSCP    Min. Threshold    Max. Threshold    MPD
    AFx1    33                40                10
    AFx2    28                40                10
    AFx3    24                40                10
    EF      37                40                10
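
Here is a short Python sketch of how a profile like the ones in the chart turns into a drop probability. It assumes the usual WRED behavior: no drops below the minimum threshold, a linear ramp up to 1/MPD at the maximum threshold, and full drop above it. The function name is my own.

```python
def wred_drop_probability(avg_depth, min_th, max_th, mpd):
    """Return the chance (0.0 to 1.0) that WRED drops an arriving packet."""
    if avg_depth < min_th:
        return 0.0                      # below the profile: never drop
    if avg_depth > max_th:
        return 1.0                      # above max threshold: drop everything
    # Linear ramp from 0 at min_th to 1/mpd at max_th.
    return (avg_depth - min_th) / (max_th - min_th) / mpd


# Default profiles from the chart: (min threshold, max threshold, MPD).
PROFILES = {"AFx1": (33, 40, 10), "AFx2": (28, 40, 10),
            "AFx3": (24, 40, 10), "EF": (37, 40, 10)}

for name, profile in PROFILES.items():
    print(name, round(wred_drop_probability(35, *profile), 4))
```

At an average depth of 35, AFx3 packets are already being dropped more often than AFx1 packets, and EF packets are not being dropped at all.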

    Since WRED uses the queue depth as a trigger for dropping traffic, you have to configure it for each queue. There are limits on which queuing mechanisms support WRED, though. They are physical interfaces with FIFO queuing, ATM VCs, and non-LLQ classes in a CBWFQ policy map. N.B.: when you configure WRED on a physical interface, all other queuing mechanisms are disabled and you'll have only one FIFO queue!

    random-detect is the command to enable WRED. The default is to check IPP, so if you want to use DSCP you'll have to use the syntax random-detect dscp-based. If you want to change the default WRED values, the syntax for IPP or DSCP commands is as follows:


    random-detect precedence IPP_value min-threshold max-threshold [mark-probability-denominator]
    random-detect dscp DSCP_value min-threshold max-threshold [mark-probability-denominator]

    One last odd note from the CCIE study book pertains to changing the rolling average of queue depth. If you feel the need to adjust the calculation, you can use the command random-detect exponential-weighting-constant EXPONENT. Changing this exponent value from default will affect drops because WRED doesn't base its decisions on the current queue length; WRED uses the average queue length to decide whether to drop packets. A smaller exponent gives the current queue depth more weight, so the average reacts to bursts faster; a larger exponent smooths the average out and makes WRED slower to react.
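
The effect of the exponent is easier to see with numbers. This Python sketch uses the averaging formula from the Cisco documentation, new average = old average * (1 - 2^-n) + current depth * 2^-n, where n is the exponential weighting constant (9 by default). The function names and the sample queue depth are mine.

```python
def wred_average(old_avg, current_depth, exponent=9):
    """One update of WRED's rolling average queue depth."""
    weight = 2 ** -exponent
    return old_avg * (1 - weight) + current_depth * weight


def average_after(samples, current_depth, exponent):
    """Average after the queue has sat at current_depth for a number of samples."""
    avg = 0.0
    for _ in range(samples):
        avg = wred_average(avg, current_depth, exponent)
    return avg


# A queue that jumps from empty to 40 packets and stays there.
for n in (9, 5, 1):
    print(n, round(average_after(20, 40, n), 2))
```

After 20 samples the default exponent has barely moved the average, while an exponent of 1 has it sitting almost exactly at the real queue depth.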

    Modified Deficit Round Robin (MDRR)
    This was implemented only in the Cisco 12000 series because they didn't support CBWFQ or LLQ. It's expected that you'd only need to know the concepts of how MDRR works, so my discussion will be brief. MDRR configures 7 queues (0-6) which are serviced in round robin method. One additional queue is a priority queue.

    If there are no packets in the PQ, each queue gets serviced in turn (once/cycle). If there are packets in the PQ, it will service them in one of two ways depending on configuration. Strict priority mode always services the PQ but will starve other traffic. Usually alternate mode is used for this reason. Alternate mode services the PQ in between each regular queue (PQ, queue1, PQ, queue2, etc.). You may still get jitter with this method but no starvation will occur.

    MDRR also has two concepts you'll want to know: quantum value and deficit. These relate to MDRR scheduling. The quantum value is a number (in bytes) of data that will be transmitted from each queue. The deficit concept comes into play when a router pulls a packet from a queue that exceeds the quantum value. Since more bytes were sent than the quantum value "allows," the router tracks the extra data sent and subtracts that number from the quantum value when it next services that queue. Over time, this guarantees that each queue will get its configured percentage of the bandwidth.
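
Quantum and deficit are easier to follow in code. Below is a minimal Python sketch of the round-robin part only (no priority queue), assuming the common deficit-counter form of the algorithm: each visit adds the quantum to the queue's counter, packets are sent while the counter is positive, and any overdraft carries into the next cycle. The queue names, packet sizes, and quantum values are made up.

```python
from collections import deque


def mdrr_cycle(queues, quantum, deficit):
    """Service every queue once; return a list of (queue, packet_size) sent."""
    sent = []
    for name, packets in queues.items():
        if not packets:
            deficit[name] = 0           # an empty queue keeps no credit or debt
            continue
        deficit[name] += quantum[name]  # this cycle's allowance, minus past overdraft
        while packets and deficit[name] > 0:
            size = packets.popleft()
            deficit[name] -= size       # may go negative: that is the deficit
            sent.append((name, size))
    return sent


queues = {"q0": deque([1500, 1500]), "q1": deque([500, 500, 500])}
quantum = {"q0": 1000, "q1": 1000}
deficit = {"q0": 0, "q1": 0}

print(mdrr_cycle(queues, quantum, deficit), deficit)
print(mdrr_cycle(queues, quantum, deficit), deficit)
```

In the first cycle q0 sends a 1500-byte packet against a 1000-byte quantum, so it starts the second cycle 500 bytes in debt.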



    Monday, May 2, 2011

    Queuing Strategies and Tools

    As I'd mentioned at the beginning of my notes, queuing impacts four major characteristics of QoS. Those are bandwidth, delay, jitter, and packet loss. The default FIFO queue that IOS creates in memory to handle packets affects drop, delay, and jitter. This is because of its size: once full, tail drop of packets will happen. You can prevent this by lengthening the queue, but that has impact on delay and usually jitter. And of course, if traffic rates exceed the interface bandwidth, your chances of a drop are higher regardless of queue length.
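
A quick back-of-the-envelope calculation shows why a longer queue costs delay. This Python sketch works out how long the last packet in a full FIFO queue waits before it is sent. The queue lengths, packet size, and link speed are hypothetical numbers I picked to keep the arithmetic easy.

```python
def worst_case_queue_delay(queue_len_packets, packet_bytes, link_bps):
    """Seconds the last packet in a full FIFO queue waits for those ahead of it."""
    bits_ahead = queue_len_packets * packet_bytes * 8
    return bits_ahead / link_bps


# 1500-byte packets on a 1 Mbps link, for two queue lengths.
for length in (40, 80):
    print(length, worst_case_queue_delay(length, 1500, 1_000_000), "seconds")
```

Doubling the queue length halves the odds of a tail drop during a short burst, but it also doubles the worst-case wait.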

    The Cisco QoS Exam guide offers better insight into the queuing process than does the CCIE book. It displays a sample with two queues to better explain the process. Here's a diagram I created which is similar to the one in the book.

    This diagram shows the major questions a chosen queuing method will answer. Note that while question 1 appears to be classification/marking, this is actually part of the queuing method's decision process. In other words, it is a queuing choice based on the markings which have already been applied by marking/classification. Don't get confused!

    Each queue will normally be FIFO, unless you set class-default to use WFQ. This was mentioned before as an option. The interface subcommand is fair-queue.

    After queuing occurs, queue scheduling will be done. This is the pattern, or algorithm, used to service queues. This can take several forms, such as always servicing a particular queue first, or a bandwidth cap may be observed, etc.


    Weighted Random Early Detection (WRED)
    This mechanism monitors the usage of a queue and drops progressively more traffic as the queue fills. This is done in hopes that TCP is the main protocol in use; if it is, senders that see a drop will slow down and retransmit, so throughput is managed before the queue overflows. If not (UDP, for example), the drops won't slow the senders down and you will have a problem.

    There are settings WRED uses you'll want to be familiar with. Among them are:

    Average queue depth: this is compared to thresholds you set and then actions are taken (drop/remark traffic)
    Mark probability denominator: used to specify how much traffic is being dropped; 1/X, where X is the mark probability denominator. This calculation sets the drop percentage at the max threshold.