You can combine Traffic Server with Linux iproute2 to shape traffic between the proxy and origin or between the proxy and client. Alternatively you can use BSD ALTQ or a separate device like a capable router or network switch. Traffic Server marks the traffic and iproute2 etc. does the actual traffic shaping. The two can be on the same machine or on separate devices. This is sometimes called bandwidth management, traffic shaping, or QoS.

Here are some ways Traffic Server can communicate with iproute2 etc.

  • Packet or connection mark (on the same machine only)
  • Type of Service (ToS) or Differentiated Services (DiffServ) Field (IP packet)
  • Priority Code Point (PCP) Field (Ethernet frame)

There's more than one way to set some of these marks and some ways don't work in all scenarios:

  • Configuration variables
  • API functions
  • header_rewrite operators

Configuration Variables

mark_in and tos_in mark traffic destined for the client (the packets that make up a client response), and mark_out and tos_out mark traffic destined for the origin (the packets that make up an origin request). Sometimes you can also mark traffic sent *from* the origin (the packets that make up an origin response) with the Netfilter CONNMARK target. mark_in and mark_out set the packet mark; tos_in and tos_out set the ToS/DiffServ Field. The full names are proxy.config.net.sock_packet_mark_in, proxy.config.net.sock_packet_tos_in, proxy.config.net.sock_packet_mark_out, and proxy.config.net.sock_packet_tos_out. Configuration variables for the connection mark and PCP Field haven't been implemented.

You can set each variable globally in records.config. There was a bug (TS-2995) where mark_in and tos_in had no effect if proxy.config.accept_threads wasn't zero; it was fixed in version 5.1.0.

mark_out and tos_out are also overridable, but mark_in and tos_in are not. You can override them per transaction with the conf_remap plugin, the set-config header_rewrite operator, or TSHttpTxnConfigIntSet(). *However*, they are only consulted when the socket is created, which happens before SEND_REQUEST_HDR_HOOK, so you must set them in READ_REQUEST_HDR_HOOK or at some other earlier time. See Connection::open(). It might be possible to update an existing socket when you change the configuration variables by implementing a callback.

API Functions

TSHttpTxnClientPacketMarkSet() and TSHttpTxnClientPacketTosSet() mark traffic destined for the client, and TSHttpTxnServerPacketMarkSet() and TSHttpTxnServerPacketTosSet() mark traffic destined for the origin. Sometimes you can mark traffic sent *from* the origin with the Netfilter CONNMARK target. The ...PacketMarkSet() functions set the packet mark and the ...PacketTosSet() functions set the ToS/DiffServ Field. API functions for the connection mark and PCP Field haven't been implemented, but you can call TSHttpSsnClientFdGet() or TSHttpTxnClientFdGet() to get the client socket and call setsockopt() to set the option yourself.

Unlike the configuration variables, the API functions work whether or not the socket has already been created. They immediately update an existing socket, and the options are also applied when a new socket is eventually created.

header_rewrite Operators

The set-conn-dscp operator sets the ToS/DiffServ Field on traffic sent to the client (the packets that make up a client response). Operators to mark traffic destined for the origin haven't been implemented but you can override the mark_out and tos_out configuration variables with the set-config operator. Operators for the packet mark, connection mark, and PCP Field haven't been implemented.

The set-conn-dscp operator immediately updates an existing socket. (The client socket will already have been created because a transaction doesn't exist without one.)

Packet or Connection Mark

These marks work only when Traffic Server and iproute2 etc. are on the same machine; they aren't communicated to a separate device. However, a separate device can copy marks from the ToS/DiffServ or PCP fields to the connection mark to mark traffic sent *from* the origin. The socket option is setsockopt(sockfd, SOL_SOCKET, SO_MARK, &optval, optlen).
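
As a rough illustration of the socket option above, here is a minimal Python sketch that tries to set the packet mark on a freshly created socket. It is not Traffic Server code. The helper name try_set_packet_mark is made up for this example, and it assumes a Linux host where setting SO_MARK normally requires the CAP_NET_ADMIN capability, so an unprivileged run is expected to be refused.

```python
import socket

# SO_MARK is Linux-specific; on other platforms the constant is missing and
# this sketch simply reports failure instead of guessing a value.
SO_MARK = getattr(socket, "SO_MARK", None)


def try_set_packet_mark(sock, mark):
    """Try to set the packet mark on sock; return True on success, else False."""
    if SO_MARK is None:
        return False
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
        return True
    except OSError:
        # Typically a permission error when the process lacks CAP_NET_ADMIN.
        return False


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("packet mark set:", try_set_packet_mark(s, 1))
```

The mark never leaves the machine, which is why a rule such as `-m mark --mark 1` can only match it on the host where the socket lives.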

ToS/DiffServ Field

This is an 8-bit field in IPv4 and IPv6 packets, but the 2 least significant bits are now reserved for Explicit Congestion Notification (ECN). You can't set them! The values you can set are 0x00, 0x04, 0x08, ... 0xfc. If you set the field to 0xff the effective value will be 0xfc. Furthermore:

  • The values XXX000XX have special meaning for backwards compatibility with the IP Precedence Field (see RFC 2474 section 4.2).
  • The values XXXXX0XX are reserved for standards action (see RFC 2474 section 6).
  • The values XXXX01XX are initially available for experimental or local use, but future standards should preferentially claim them if other values are exhausted.

Only the values XXXX11XX (0x0c, 0x1c, 0x2c, ... 0xfc) are reserved for experimental or local use.
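
The bit patterns above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration; the function names are invented here. It follows the three codepoint pools of RFC 2474 section 6 (xxxxx0, xxxx11, and xxxx01 in the 6-bit DSCP) and treats the two low bits of the byte as ECN.

```python
ECN_MASK = 0x03  # the two least significant bits of the ToS/DiffServ byte


def effective_tos(value):
    """Return the byte with the ECN bits cleared, e.g. 0xff becomes 0xfc."""
    return value & 0xFF & ~ECN_MASK


def dscp_pool(tos):
    """Return the RFC 2474 section 6 pool (1, 2 or 3) for a ToS/DiffServ byte."""
    dscp = (tos & 0xFF) >> 2          # drop the ECN bits, keep the 6-bit DSCP
    if dscp & 0b01 == 0:
        return 1                      # xxxxx0: standards action
    if dscp & 0b11 == 0b11:
        return 2                      # xxxx11: experimental or local use
    return 3                          # xxxx01: experimental or local use for now


def is_class_selector(tos):
    """True for XXX000XX, the IP Precedence compatible codepoints."""
    return ((tos & 0xFF) >> 2) & 0b000111 == 0


# Every settable byte value that falls in the experimental/local-use pool
LOCAL_USE_VALUES = [tos for tos in range(0x00, 0x100, 4) if dscp_pool(tos) == 2]

if __name__ == "__main__":
    print([hex(v) for v in LOCAL_USE_VALUES])
```

The list holds 16 values, which is why the examples on this page pick 0x0c and 0x1c for their own priorities.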

These marks work even if a separate device does the actual traffic shaping. The ToS Field was originally specified in RFC 791. Both it and the IPv6 Traffic Class Field were superseded by the DiffServ Field specified in RFC 2474. The IPv4 socket option is setsockopt(sockfd, IPPROTO_IP, IP_TOS, &optval, optlen). The IPv6 socket option is setsockopt(sockfd, IPPROTO_IPV6, IPV6_TCLASS, &optval, optlen).
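
To see the two socket options side by side, here is a small Python sketch that sets the field on a socket and reads it back. It is a stand-in for what a plugin would do with the file descriptor, not Traffic Server code; the helper name set_tos is hypothetical, and the IPv6 branch assumes a platform whose socket module exports IPV6_TCLASS.

```python
import socket


def set_tos(sock, tos):
    """Set the ToS/DiffServ byte on sock and return the value the kernel reports."""
    if sock.family == socket.AF_INET6:
        level, option = socket.IPPROTO_IPV6, socket.IPV6_TCLASS
    else:
        level, option = socket.IPPROTO_IP, socket.IP_TOS
    sock.setsockopt(level, option, tos)
    return sock.getsockopt(level, option)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(hex(set_tos(s, 0x0C)))
```

Reading the option back is a cheap way to confirm what the kernel actually stored, for example when checking how it treated the ECN bits.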

PCP Field

This is a 3-bit field in the Ethernet frame. The field is specified in IEEE 802.1Q and the values are the 8 priority levels specified in IEEE P802.1p.
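
For orientation, the PCP occupies the top 3 bits of the 16-bit Tag Control Information (TCI) word in the 802.1Q tag, followed by a 1-bit drop eligible indicator and the 12-bit VLAN ID. A minimal Python sketch of that layout, with function names chosen only for this example:

```python
def pack_tci(pcp, dei, vid):
    """Build the 16-bit 802.1Q TCI from PCP (0-7), DEI (0-1) and VLAN ID (0-4095)."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    return (pcp << 13) | (dei << 12) | vid


def unpack_tci(tci):
    """Split a 16-bit TCI into (pcp, dei, vid)."""
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF


if __name__ == "__main__":
    tci = pack_tci(pcp=4, dei=0, vid=2)
    print(hex(tci), unpack_tci(tci))
```

Because the field lives in the VLAN tag, it is only carried on tagged frames.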

These marks work even if a separate device in the same broadcast domain does the actual traffic shaping. The socket option is setsockopt(sockfd, SOL_SOCKET, SO_PRIORITY, &optval, optlen).

Nothing related to the PCP Field has been added to Traffic Server but you can call TSHttpSsnClientFdGet() or TSHttpTxnClientFdGet() to get the client socket and call setsockopt() to set the option yourself. There's a feature request (TS-3037) to implement the PCP Field in Traffic Server.

Here's an example of a plugin that can set the PCP Field on traffic sent to the client (the packets that make up a client response) based on the request target. You could adapt it to instead mark traffic sent to the origin (the packets that make up an origin request) or to support additional criteria like the user agent.

Use the tsxs utility to compile it:


  $ tsxs -o priority.so priority.cc

Add lines like the following to remap.config to configure it:

remap.config

  map http://example.com http://example.com @plugin=priority.so @pparam=4
  regex_map http://.*\.example\.com http://$0 @plugin=priority.so @pparam=4

Traffic Sent *From* the Origin

You can copy marks from the ToS/DiffServ or PCP fields to the connection mark with the Netfilter CONNMARK target, to mark traffic sent *from* the origin, even if Traffic Server is on a separate device. In the following example, if you set the ToS/DiffServ Field (0x0c, 0x1c, etc.) on traffic sent to the origin, the rules add both the origin request and the origin response to the same iproute2 class (2:1, 2:3, etc.):


  iptables -t mangle -A POSTROUTING -m tos --tos 0x0c -j CONNMARK --set-mark 1
  iptables -t mangle -A POSTROUTING -m connmark --mark 1 -j CLASSIFY --set-class 2:1
  iptables -t mangle -A POSTROUTING -m tos --tos 0x1c -j CONNMARK --set-mark 2
  iptables -t mangle -A POSTROUTING -m connmark --mark 2 -j CLASSIFY --set-class 2:3
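
The reason the response ends up in the same class is that the mark is stored on the connection rather than on the packet. The following Python sketch models just the four rules above to make that visible. It is a toy model, not Netfilter: it assumes the tos match compares the whole byte exactly, and it represents a tracked connection as a plain dict.

```python
# (ToS value to match, connection mark to set, class to assign)
RULES = [
    (0x0C, 1, "2:1"),
    (0x1C, 2, "2:3"),
]


def classify(packet_tos, conn):
    """Run one packet through the rule pairs; return its class or None."""
    tc_class = None
    for tos, mark, rule_class in RULES:
        if packet_tos == tos:
            conn["mark"] = mark       # -m tos --tos ... -j CONNMARK --set-mark
        if conn.get("mark") == mark:
            tc_class = rule_class     # -m connmark --mark ... -j CLASSIFY
    return tc_class


if __name__ == "__main__":
    conn = {}
    print(classify(0x0C, conn))  # origin request carrying the ToS value
    print(classify(0x00, conn))  # origin response on the same connection
```

A packet on a connection that never carried one of the ToS values gets no class from these rules and falls through to whatever default the qdisc applies.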

Zero Penalty Hit

Zero Penalty Hit is a Squid feature that marks traffic sent to the client based on the cache lookup status (hit, miss, etc.). The use case is managing upstream bandwidth without limiting access to content that's already cached. Here's an example of a plugin that does the same thing, though I suspect it's better to manage the bandwidth between the proxy and origin directly. You can mark traffic sent *from* the origin (the packets that make up an origin response) with the Netfilter CONNMARK target.

Use the tsxs utility to compile the plugin:


  $ tsxs -o tos.so tos.cc

If the content was already cached (cache hit) then the plugin sets the ToS/DiffServ Field on the client response to 0x0c. Edit the source code to change the value or implement a configuration variable to set it. You could adapt it to mark other cache lookup statuses (see TSHttpTxnCacheLookupStatusGet()) or instead set the packet mark, connection mark, or PCP Field.

There's a feature request (TS-2597) to implement this in Traffic Server.

Transparency

When shaping traffic between the origin and proxy, iproute2 etc. normally can't tell which client a request originated from; they know only that it was sent by the proxy. You can, however, configure Traffic Server to make the client or origin connections transparent. If the client connection is transparent then the source of client responses and destination of client requests is the address of the origin. If the origin connection is transparent then the source of origin requests and destination of origin responses is the address of the client. You can enable the two features independently. See Transparent Proxying.
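
The address rules above are easy to mix up, so here is a small Python sketch that spells them out. The function and parameter names are invented for this example, and the addresses come from the documentation ranges; responses simply swap source and destination.

```python
def wire_addresses(client, proxy, origin,
                   client_transparent=False, origin_transparent=False):
    """Return (source, destination) as seen on the wire for each request leg."""
    # Client leg: a transparent client connection keeps the origin address
    # as the destination instead of the proxy address.
    client_request = (client, origin if client_transparent else proxy)
    # Origin leg: a transparent origin connection keeps the client address
    # as the source instead of the proxy address.
    origin_request = (client if origin_transparent else proxy, origin)
    return {"client_request": client_request, "origin_request": origin_request}


if __name__ == "__main__":
    for flags in [(False, False), (False, True), (True, True)]:
        print(flags, wire_addresses("192.0.2.10", "198.51.100.1",
                                    "203.0.113.5", *flags))
```

With the origin connection transparent, a shaper sitting between the proxy and the origin sees the client address on every origin request, which is what per-client scheduling needs.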

Transparency is handy for communicating more detail to iproute2 etc. For example, here's how to divide upstream bandwidth equally among all clients with iproute2 and SFQ:

records.config

  # The source of origin requests and destination of origin responses is
  # the address of the client
  CONFIG proxy.config.http.server_ports STRING 8080:tr-out


  # Remember if traffic originated from our internet connection
  iptables -t mangle -A PREROUTING -i eth0.2 -j MARK --set-mark 1/1

  ifconfig ifb0 up

  # A qdisc is required before we can add a filter
  insmod sch_prio
  tc qdisc add dev br-lan root handle 1 prio

  # Shape only traffic originating from our internet connection
  # (packet mark 1/1)
  insmod cls_u32
  insmod act_mirred
  tc filter add dev br-lan parent 1: protocol ip pref 1 u32 match mark 1 1 flowid 1:1 action mirred egress redirect dev ifb0

  # Don't shape traffic (reorder/delay/drop) while there's available
  # capacity.  Unfortunately available capacity must be manually
  # configured and fine-tuned.  The following assumes isolated
  # up/downstream capacity (full-duplex).
  insmod sch_tbf
  tc qdisc add dev eth0.2 root handle 1 tbf rate .5mbit burst 5k latency 70ms
  tc qdisc add dev ifb0 root handle 1 tbf rate 2.5mbit burst 5k latency 70ms

  # Schedule an equal amount of traffic for each client
  insmod sch_sfq
  tc qdisc add dev eth0.2 parent 1: handle 2 sfq
  tc qdisc add dev ifb0 parent 1: handle 2 sfq

  # Divide downstream traffic into clients by destination IP address.
  # Divide upstream traffic into clients by *Netfilter connection
  # tracking* source IP address (after NAT all upstream traffic shares the
  # same source IP address).
  insmod cls_flow
  tc filter add dev eth0.2 parent 2: pref 1 handle 1 flow hash keys nfct-src divisor 1024
  tc filter add dev ifb0 parent 2: protocol ip pref 1 handle 1 flow hash keys dst divisor 1024
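
The two flow filters above sort packets into 1024 buckets per direction, keyed by the client address, and SFQ then serves the buckets in turn. As a rough model of the keying only, here is a Python sketch. The hash is an arbitrary stand-in (CRC32), not the kernel's, and the packet dicts and field names are hypothetical.

```python
import ipaddress
import zlib

DIVISOR = 1024  # same as "divisor 1024" in the filters above


def bucket_for(ip, divisor=DIVISOR):
    """Map an IP address to a bucket number (illustrative hash only)."""
    return zlib.crc32(ipaddress.ip_address(ip).packed) % divisor


def client_key(packet, direction):
    """Pick the address that identifies the client for a packet."""
    if direction == "downstream":
        return packet["dst"]       # "keys dst": the client is the destination
    # "keys nfct-src": after NAT the packet's own source is the router, so
    # use the original source remembered by connection tracking.
    return packet["nfct_src"]


if __name__ == "__main__":
    up = {"src": "203.0.113.1", "dst": "198.51.100.7", "nfct_src": "192.168.1.20"}
    down = {"src": "198.51.100.7", "dst": "192.168.1.20"}
    print(bucket_for(client_key(up, "upstream")),
          bucket_for(client_key(down, "downstream")))
```

Two clients can still collide in one bucket and then share a slot, so the division is only approximately equal.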

Background

Doing the actual traffic shaping with standard tools means you can combine all of their existing features with Traffic Server, you don't need to know and maintain another system specifically for Traffic Server, and you can shape the aggregate of proxy and non-proxy traffic. For example you can limit the sum of all traffic except access to Wikipedia and Khan Academy.

Something you can't yet do with iproute2 is specify a bandwidth for each connection or each client (without enumerating all of the clients in advance). MikroTik RouterOS offers this feature under the name PCQ.

If you need to distinguish one transaction from another you're required to use configuration variables for origin traffic but header_rewrite operators or API functions for client traffic. I think this asymmetry unnecessarily burdens the administrator with implementation details. It would be more consistent and more user friendly to just implement configuration variables for each socket option and make them work in all scenarios (origin and client traffic, records.config, the conf_remap plugin, the set-config header_rewrite operator, and TSHttpTxnConfigIntSet()). It might be possible to update an existing socket when you change the configuration variables by implementing a callback.

I wonder how the current implementation works with persistent connections? For example I suspect if you set a socket option per transaction and then reuse the connection to make another origin request or if the client reuses it to make another request, the option doesn't get reset?

TS-1090 and commit b77838991531d6cb402618c3d690b83e95b92d63 originally added the packet mark and ToS/DiffServ Field configuration variables and API functions. TS-3002 added the set-conn-dscp header_rewrite operator.

Example

Here's a full example of how to shape traffic between the proxy and origin based on the request target. It assigns websites one of three priorities. After the upstream bandwidth is scheduled by priority, each client gets an equal portion of each priority.

Traffic Server and iproute2 can be on different devices. Traffic Server communicates the priority to iproute2 in the ToS/DiffServ Field, which is set by overriding the tos_out configuration variable with the conf_remap plugin. Traffic sent *from* the origin is shaped by copying the priority from the ToS/DiffServ Field to the connection mark. The origin connection is transparent so iproute2 can tell which client the traffic belongs to. The client connection is transparent so Traffic Server can get the address of the origin in the rare case that a client neglects to send a Host header (HTTP/1.0 doesn't require it). You could alternatively configure the client to use the proxy explicitly.

Be careful of ICMP redirects; they can sometimes cause clients to route non-web traffic to the proxy.

records.config

  # The source of client responses and destination of client requests is
  # the address of the origin.  The source of origin requests and
  # destination of origin responses is the address of the client.
  CONFIG proxy.config.http.server_ports STRING 8080:tr-full

remap.config

  # Give high priority to Wikipedia, low priority to YouTube
  map http://wikipedia.org http://wikipedia.org @plugin=conf_remap.so @pparam=proxy.config.net.sock_packet_tos_out=0x0c
  regex_map http://.*\.wikipedia\.org http://$0 @plugin=conf_remap.so @pparam=proxy.config.net.sock_packet_tos_out=0x0c
  map http://youtube.com http://youtube.com @plugin=conf_remap.so @pparam=proxy.config.net.sock_packet_tos_out=0x1c
  regex_map http://.*\.youtube\.com http://$0 @plugin=conf_remap.so @pparam=proxy.config.net.sock_packet_tos_out=0x1c


  # Remember if traffic originated from our internet connection
  iptables -t mangle -A PREROUTING -i eth0.2 -j MARK --set-mark 1/1

  # Route web traffic to the proxy server except traffic already
  # originating from it.  Matching web traffic by port number isn't
  # perfect but it's good enough.  This is the MAC address of the proxy
  # server.  Because it's configured to make origin connections
  # transparent this is the only way to match traffic already originating
  # from it:
  # http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.general/45405
  iptables -t mangle -A PREROUTING -m mac --mac-source 00:22:15:d2:1e:61 -j RETURN
  iptables -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 2/2
  iptables -t mangle -A PREROUTING -i eth0.2 -p tcp --sport 80 -j MARK --set-mark 2/2

  # Web traffic is medium priority by default but the proxy server further
  # breaks down some high/low priority traffic.  It communicates this by
  # setting the ToS/DiffServ Field (it uses the pool of codepoints reserved
  # for experimental or local use, 0x0c/0x0c).  Mark the connection to
  # remember the priority and apply the same classification to response
  # traffic (on which the ToS/DiffServ Field is not set).
  iptables -t mangle -A POSTROUTING -m tos --tos 0x0c -j CONNMARK --set-mark 1
  iptables -t mangle -A POSTROUTING -m connmark --mark 1 -j CLASSIFY --set-class 2:1
  iptables -t mangle -A POSTROUTING -m tos --tos 0x1c -j CONNMARK --set-mark 2
  iptables -t mangle -A POSTROUTING -m connmark --mark 2 -j CLASSIFY --set-class 2:3

  # Route web traffic to the proxy server
  ip route add table 1 via 192.168.1.2
  ip rule add fwmark 2/2 table 1

  ifconfig ifb0 up

  # A qdisc is required before we can add a filter
  insmod sch_prio
  tc qdisc add dev br-lan root handle 1 prio

  # Shape only traffic originating from our internet connection
  # (packet mark 1/1)
  insmod cls_u32
  insmod act_mirred
  tc filter add dev br-lan parent 1: protocol ip pref 1 u32 match mark 1 1 flowid 1:1 action mirred egress redirect dev ifb0

  # Don't shape traffic (reorder/delay/drop) while there's available
  # capacity.  Unfortunately available capacity must be manually
  # configured and fine-tuned.  The following assumes isolated
  # up/downstream capacity (full-duplex).
  insmod sch_tbf
  tc qdisc add dev eth0.2 root handle 1 tbf rate .5mbit burst 5k latency 70ms
  tc qdisc add dev ifb0 root handle 1 tbf rate 2.5mbit burst 5k latency 70ms

  # Schedule traffic according to three priorities
  tc qdisc add dev eth0.2 parent 1: handle 2 prio
  tc qdisc add dev ifb0 parent 1: handle 2 prio

  # For each priority schedule an equal amount of traffic for each client
  insmod sch_sfq
  tc qdisc add dev eth0.2 parent 2:1 handle 3 sfq
  tc qdisc add dev ifb0 parent 2:1 handle 3 sfq
  tc qdisc add dev eth0.2 parent 2:2 handle 4 sfq
  tc qdisc add dev ifb0 parent 2:2 handle 4 sfq
  tc qdisc add dev eth0.2 parent 2:3 handle 5 sfq
  tc qdisc add dev ifb0 parent 2:3 handle 5 sfq

  # Divide downstream traffic into clients by destination IP address.
  # Divide upstream traffic into clients by *Netfilter connection
  # tracking* source IP address (after NAT all upstream traffic shares the
  # same source IP address).
  insmod cls_flow
  tc filter add dev eth0.2 parent 3: pref 1 handle 1 flow hash keys nfct-src divisor 1024
  tc filter add dev ifb0 parent 3: protocol ip pref 1 handle 1 flow hash keys dst divisor 1024
  tc filter add dev eth0.2 parent 4: pref 1 handle 1 flow hash keys nfct-src divisor 1024
  tc filter add dev ifb0 parent 4: protocol ip pref 1 handle 1 flow hash keys dst divisor 1024
  tc filter add dev eth0.2 parent 5: pref 1 handle 1 flow hash keys nfct-src divisor 1024
  tc filter add dev ifb0 parent 5: protocol ip pref 1 handle 1 flow hash keys dst divisor 1024
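
The tbf lines in the script above give a latency instead of an explicit queue limit. As far as I understand tc, it derives the limit in bytes as rate × latency + burst, with mbit meaning 1,000,000 bits per second and a size suffix of k meaning 1024 bytes. Treat the following Python arithmetic as a sketch under those assumptions, not as a statement of what tc does on every version.

```python
def tbf_queue_limit(rate_bit_per_s, burst_bytes, latency_ms):
    """Approximate TBF queue limit in bytes for a given rate, burst and latency."""
    # bits/s * ms / 8000 = bytes that can drain during the latency window
    return rate_bit_per_s * latency_ms // 8000 + burst_bytes


BURST = 5 * 1024  # "burst 5k"

# eth0.2: "rate .5mbit burst 5k latency 70ms" (upstream)
UPSTREAM_LIMIT = tbf_queue_limit(500_000, BURST, 70)
# ifb0: "rate 2.5mbit burst 5k latency 70ms" (downstream)
DOWNSTREAM_LIMIT = tbf_queue_limit(2_500_000, BURST, 70)

if __name__ == "__main__":
    print(UPSTREAM_LIMIT, DOWNSTREAM_LIMIT)
```

Under these assumptions the upstream queue holds roughly 9.5 kB and the downstream queue roughly 27 kB, which gives a feel for how little is buffered before the inner qdiscs start deciding what to send next.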

Resources

The best source of iproute2 documentation is the Linux Advanced Routing and Traffic Control project.

The OpenISP Bandwidth Management chapter is more up-to-date and succinct.

How To Accelerate Your Internet covers bandwidth management and optimization in general, from developing policy to case studies. Chapter 6: Performance Tuning briefly touches on traffic shaping and iproute2.
