Marcelo Araujo
Copyright © 2008 Marcelo Araujo
This research is based on technologies that already exist in the market. All papers and documents used in this study are available on the Internet and are referenced in this document. If you want to use this document as a reference in a project, please do not forget to give credit.
Introduction
This document aims to understand and identify the correct implementation of the ToS and PRECEDENCE fields. Today there are several mechanisms and specifications for working with the IP header, and with the ToS field in particular. My research on this topic tries to reconcile the knowledge spread across the relevant RFCs.
References
- RFC 0791 - Internet Protocol.
- RFC 1122 - Requirements for Internet Hosts - Communication Layers.
- RFC 1349 - Type of Service in the Internet Protocol Suite.
- DRAFT XIAO - TCP Processing of the IP Precedence Field.
Comparing the implementations
My first question is: which implementation is correct?
Although the situation seems disorganized, there are rules to follow and practices to adopt, but many devices do not implement them correctly.
For example:
RFC 0791 uses the first three bits (0-2) for the PRECEDENCE field and the next three bits (3-5) to request the desired quality of service. The remaining bits, 6 and 7, are reserved for future use.
```
Bits 0-2: Precedence.
Bit    3: 0 = Normal Delay,       1 = Low Delay.
Bit    4: 0 = Normal Throughput,  1 = High Throughput.
Bit    5: 0 = Normal Reliability, 1 = High Reliability.
Bits 6-7: Reserved for Future Use.

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 |     |     |     |     |     |
|   PRECEDENCE    |  D  |  T  |  R  |  0  |  0  |
|                 |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
```
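To make the RFC 0791 layout concrete, here is a minimal C sketch that splits a ToS octet into its fields. The struct and function names (`tos791`, `tos791_decode`) are hypothetical and are not part of IPFW; the only assumption is that bit 0 in the diagram is the most significant bit of the octet, as in the RFC.

```c
#include <stdint.h>

/* Hypothetical helper type: the RFC 0791 view of the ToS octet. */
struct tos791 {
    unsigned precedence;        /* bits 0-2, value 0..7 */
    unsigned low_delay;         /* bit 3 (D) */
    unsigned high_throughput;   /* bit 4 (T) */
    unsigned high_reliability;  /* bit 5 (R) */
};

/* Split a ToS octet into the RFC 0791 fields (bit 0 = most significant). */
static struct tos791
tos791_decode(uint8_t tos)
{
    struct tos791 t;

    t.precedence = (tos >> 5) & 0x7;
    t.low_delay = (tos >> 4) & 0x1;
    t.high_throughput = (tos >> 3) & 0x1;
    t.high_reliability = (tos >> 2) & 0x1;
    return t;
}
```

For example, `tos791_decode(0xb0)` yields precedence 5 (CRITIC/ECP) with the low-delay bit set and the other two flags clear.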
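However, RFC 1349 updates this layout: the D, T and R bits become a four-bit TOS field, and one of the formerly reserved bits (bit 6) is now used to request the minimum monetary cost of transmitting a datagram within the network. Bit 7 remains unused and must be zero (MBZ).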
```
1000 -- minimize delay
0100 -- maximize throughput
0010 -- maximize reliability
0001 -- minimize monetary cost
0000 -- normal service

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 |                       |     |
|   PRECEDENCE    |          TOS          | MBZ |
|                 |                       |     |
+-----+-----+-----+-----+-----+-----+-----+-----+
```
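Under RFC 1349 the four TOS bits are read as one value rather than as independent flags. A minimal sketch, assuming a hypothetical helper `tos1349_name` (not an IPFW function), that extracts bits 3-6 of the octet and names the five values listed above:

```c
#include <stdint.h>

#define TOS1349_MASK 0x1e   /* bits 3-6 of the ToS octet */

/* Name the RFC 1349 TOS value carried in a ToS octet. */
static const char *
tos1349_name(uint8_t tos)
{
    switch ((tos & TOS1349_MASK) >> 1) {
    case 0x8: return "minimize delay";          /* 1000 */
    case 0x4: return "maximize throughput";     /* 0100 */
    case 0x2: return "maximize reliability";    /* 0010 */
    case 0x1: return "minimize monetary cost";  /* 0001 */
    case 0x0: return "normal service";          /* 0000 */
    default:  return "other";   /* not one of the five values above */
    }
}
```

With this helper, `tos1349_name(0x10)` returns "minimize delay" and `tos1349_name(0x02)` returns "minimize monetary cost"; these octet values are the ones netinet/ip.h defines as `IPTOS_LOWDELAY` and `IPTOS_MINCOST`.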
I investigated how IPFW (IP Firewall) handles this field and the reserved bit, and which RFC it implements.
static struct _s_x f_iptos[] = {
    { "lowdelay", IPTOS_LOWDELAY},       /* 1000 -- minimize delay */
    { "throughput", IPTOS_THROUGHPUT},   /* 0100 -- maximize throughput */
    { "reliability", IPTOS_RELIABILITY}, /* 0010 -- maximize reliability */
    { "mincost", IPTOS_MINCOST},         /* 0001 -- minimize monetary cost */
    { "congestion", IPTOS_CE},
    { "ecntransport", IPTOS_ECT},
    { "ip tos option", 0},
    { NULL, 0 }
};
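The table above pairs each rule keyword with a ToS value and ends with a NULL entry. The following is a minimal sketch of how such a table can be searched in both directions. The names `name_val`, `tos_names`, `lookup_value` and `lookup_name` are hypothetical stand-ins, not the actual ipfw helpers; the numeric values are the ones netinet/ip.h assigns to the four `IPTOS_*` service constants.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a keyword/value table entry. */
struct name_val {
    const char *name;
    int value;
};

/* ToS service values as defined in netinet/ip.h; NULL entry terminates. */
static const struct name_val tos_names[] = {
    { "lowdelay",    0x10 },
    { "throughput",  0x08 },
    { "reliability", 0x04 },
    { "mincost",     0x02 },
    { NULL, 0 }
};

/* Return the value for a keyword, or -1 if the keyword is unknown. */
static int
lookup_value(const struct name_val *t, const char *name)
{
    for (; t->name != NULL; t++)
        if (strcmp(t->name, name) == 0)
            return t->value;
    return -1;
}

/* Return the keyword for a value, or NULL if no entry matches. */
static const char *
lookup_name(const struct name_val *t, int value)
{
    for (; t->name != NULL; t++)
        if (t->value == value)
            return t->name;
    return NULL;
}
```

The first direction is what a rule parser needs (keyword to value), the second is what a rule printer needs (value back to keyword).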
The ToS field also carries the PRECEDENCE bits, but IPFW does not implement the precedence handling described in RFC 0791.
Precedence field:
111 - Network Control
110 - Internetwork Control
101 - CRITIC/ECP
100 - Flash Override
011 - Flash
010 - Immediate
001 - Priority
000 - Routine
There is a short implementation in ip_fw2.c, but it does not support all the combinations proposed by RFC 0791.
case O_IPPRECEDENCE:
    match = (is_ipv4 &&
        (cmd->arg1 == (ip->ip_tos & 0xe0)) );
    break;
I then looked up the PRECEDENCE mask 0xe0 in netinet/ip.h.
/*
 * Definitions for IP precedence (also in ip_tos) (hopefully unused).
 */
#define IPTOS_PREC_NETCONTROL       0xe0
#define IPTOS_PREC_INTERNETCONTROL  0xc0
#define IPTOS_PREC_CRITIC_ECP       0xa0
#define IPTOS_PREC_FLASHOVERRIDE    0x80
#define IPTOS_PREC_FLASH            0x60
#define IPTOS_PREC_IMMEDIATE        0x40
#define IPTOS_PREC_PRIORITY         0x20
#define IPTOS_PREC_ROUTINE          0x00
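The hexadecimal constants above are simply the three-bit precedence codes shifted into the top of the ToS octet. A minimal sketch of that relationship follows; the helper names (`prec_to_tos`, `tos_to_prec`, `tos_with_prec`) and the `PREC_MASK` macro are hypothetical and do not come from the FreeBSD sources.

```c
#include <stdint.h>

#define PREC_MASK 0xe0u   /* top three bits of the ToS octet */

/* Three-bit precedence code (0..7) -> its value inside the ToS octet. */
static uint8_t
prec_to_tos(unsigned code)
{
    return (uint8_t)((code & 0x7u) << 5);
}

/* ToS octet -> three-bit precedence code (0..7). */
static unsigned
tos_to_prec(uint8_t tos)
{
    return (tos & PREC_MASK) >> 5;
}

/* Replace only the precedence bits and keep the remaining five bits. */
static uint8_t
tos_with_prec(uint8_t tos, unsigned code)
{
    return (uint8_t)((tos & ~PREC_MASK) | prec_to_tos(code));
}
```

For instance, `prec_to_tos(2)` gives 0x40, the value of `IPTOS_PREC_IMMEDIATE`, and `tos_to_prec(0xc0)` gives 6, the code for Internetwork Control.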
The PRECEDENCE field is rarely used. The mask 0xe0 covers all three precedence bits (111), which is also the value named NETCONTROL. RFC 0791 (section 3.1; see also RFC 1122, section 3.2.1.6) singles out only two precedence values: 0xe0 (Network Control), intended for use within a network only, and 0xc0 (Internetwork Control), intended for gateways only. For every other precedence value, each network is responsible for controlling access to, and use of, that precedence.
The patch
The following patch implements support for the IP PRECEDENCE field.
```
Index: sbin/ipfw/ipfw2.c
RCS file: /home/ncvs/src/sbin/ipfw/ipfw2.c,v
retrieving revision 1.117
diff -u -r1.117 ipfw2.c
--- sbin/ipfw/ipfw2.c   24 Feb 2008 15:37:45 -0000      1.117
+++ sbin/ipfw/ipfw2.c   26 Feb 2008 14:37:20 -0000
@@ -177,6 +177,18 @@
     { NULL, 0 }
 };
 
+static struct _s_x f_iptospre[] = {
+    { "netcontrol", IPTOSPRE_NETCONTROL},
+    { "intercontrol", IPTOSPRE_INTERCONTROL},
+    { "criticecp", IPTOSPRE_CRITICECP},
+    { "flashover", IPTOSPRE_FLASHOVER},
+    { "flash", IPTOSPRE_FLASH},
+    { "immediate", IPTOSPRE_IMMEDIATE},
+    { "priority", IPTOSPRE_PRIORITY},
+    { "routine", IPTOSPRE_ROUTINE},
+    { NULL, 0}
+};
+
 static struct _s_x f_iptos[] = {
     { "lowdelay", IPTOS_LOWDELAY},
     { "throughput", IPTOS_THROUGHPUT},
@@ -282,6 +294,7 @@
     TOK_IPLEN,
     TOK_IPID,
     TOK_IPPRECEDENCE,
+    TOK_IPTOSPRE,
     TOK_IPTOS,
     TOK_IPTTL,
     TOK_IPVER,
@@ -317,6 +330,7 @@
     TOK_GRED,
     TOK_DROPTAIL,
     TOK_PROTO,
+    TOK_SETIPTOSPRE,
     TOK_WEIGHT,
     TOK_IP,
     TOK_IF,
@@ -411,6 +425,7 @@
     { "unreach6", TOK_UNREACH6 },
     { "unreach", TOK_UNREACH },
     { "check-state", TOK_CHECKSTATE },
+    { "iptospre", TOK_SETIPTOSPRE },
     { "//", TOK_COMMENT },
     { "nat", TOK_NAT },
     { NULL, 0 } /* terminator */
@@ -449,6 +464,7 @@
     { "ipid", TOK_IPID },
     { "ipprecedence", TOK_IPPRECEDENCE },
     { "iptos", TOK_IPTOS },
+    { "iptospre", TOK_IPTOSPRE },
     { "ipttl", TOK_IPTTL },
     { "ipversion", TOK_IPVER },
     { "ipver", TOK_IPVER },
@@ -1599,6 +1615,10 @@
         }
         break;
 
+    case O_SETIPTOSPRE:
+        printf("iptospre %s", match_value(f_iptospre, cmd->arg1));
+        break;
+
     case O_LOG: /* O_LOG is printed last */
         logptr = (ipfw_insn_log *)cmd;
         break;
@@ -1910,6 +1930,10 @@
                 printf(" established");
                 break;
 
+            case O_IPTOSPRE:
+                printf(" iptospre %s", match_value(f_iptospre, cmd->arg1));
+                break;
+
             case O_TCPDATALEN:
                 if (F_LEN(cmd) == 1)
                     printf(" tcpdatalen %u", cmd->arg1 );
@@ -2712,7 +2736,7 @@
 "RULE-BODY: check-state [PARAMS] | ACTION [PARAMS] ADDR [OPTION_LIST]\n"
 "ACTION: check-state | allow | count | deny | unreach{,6} CODE |\n"
 " skipto N | {divert|tee} PORT | forward ADDR |\n"
-" pipe N | queue N | nat N\n"
+" pipe N | queue N | iptospre CODE | nat N\n"
 "PARAMS: [log [logamount LOGLIMIT]] [altq QUEUE_NAME]\n"
 "ADDR: [ MAC dst src ether_type ] \n"
 " [ ip from IPADDR [ PORT ] to IPADDR [ PORTLIST ] ]\n"
@@ -2725,6 +2749,7 @@
 "OPTION: bridged | diverted | diverted-loopback | diverted-output |\n"
 " {dst-ip|src-ip} IPADDR | {dst-ip6|src-ip6|dst-ipv6|src-ipv6} IP6ADDR |\n"
 " {dst-port|src-port} LIST |\n"
+" iptospre CODE | {dst-ip|src-ip} IPADDR |\n"
 " estab | frag | {gid|uid} N | icmptypes LIST | in | out | ipid LIST |\n"
 " iplen LIST | ipoptions SPEC | ipprecedence | ipsec | iptos SPEC |\n"
 " ipttl LIST | ipversion VER | keep-state | layer2 | limit ... |\n"
@@ -4848,6 +4873,12 @@
         action->opcode = O_COUNT;
         break;
 
+    case TOK_SETIPTOSPRE:
+        NEED1("need iptospre arg\n");
+        fill_flags(action, O_SETIPTOSPRE, f_iptospre, *av);
+        ac--; av++;
+        break;
+
     case TOK_NAT:
         action->opcode = O_NAT;
         action->len = F_INSN_SIZE(ipfw_insn_nat);
@@ -5334,6 +5365,12 @@
             ac--; av++;
             break;
 
+        case TOK_IPTOSPRE:
+            NEED1("missing argument for iptospre");
+            fill_flags(cmd, O_IPTOSPRE, f_iptospre, *av);
+            ac--; av++;
+            break;
+
         case TOK_IPTOS:
             NEED1("missing argument for iptos");
             fill_flags(cmd, O_IPTOS, f_iptos, *av);
Index: sys/netinet/ip_fw.h
RCS file: /home/ncvs/src/sys/netinet/ip_fw.h,v
retrieving revision 1.111
diff -u -r1.111 ip_fw.h
--- sys/netinet/ip_fw.h 25 Jan 2008 14:38:27 -0000      1.111
+++ sys/netinet/ip_fw.h 26 Feb 2008 14:37:21 -0000
@@ -161,6 +161,9 @@
     O_TAG,              /* arg1=tag number */
     O_TAGGED,           /* arg1=tag number */
 
+    O_SETIPTOSPRE,      /* Add ToS PRECEDENCE support. */
+    O_IPTOSPRE,         /* Add ToS PRECEDENCE support. */
+
     O_LAST_OPCODE       /* not an opcode! */
 };
 
@@ -510,6 +513,16 @@
 #define IP_FW_IPOPT_RR 0x04
 #define IP_FW_IPOPT_TS 0x08
 
+/* Definitions for IP ToS PRECEDENCE. */
+#define IPTOSPRE_NETCONTROL   224 /* bin = 111 dec = 224 hex = 0xe0 */
+#define IPTOSPRE_INTERCONTROL 192 /* bin = 110 dec = 192 hex = 0xc0 */
+#define IPTOSPRE_CRITICECP    160 /* bin = 101 dec = 160 hex = 0xa0 */
+#define IPTOSPRE_FLASHOVER    128 /* bin = 100 dec = 128 hex = 0x80 */
+#define IPTOSPRE_FLASH         96 /* bin = 011 dec = 96 hex = 0x60 */
+#define IPTOSPRE_IMMEDIATE     64 /* bin = 010 dec = 64 hex = 0x40 */
+#define IPTOSPRE_PRIORITY      32 /* bin = 001 dec = 32 hex = 0x20 */
+#define IPTOSPRE_ROUTINE        0 /* bin = 000 dec = 0 hex = 0x00 */
+
 /*
  * Definitions for TCP option names.
  */
@@ -626,5 +639,22 @@
 extern ip_fw_chk_t *ip_fw_chk_ptr;
 #define IPFW_LOADED (ip_fw_chk_ptr != NULL)
 
+/* Some novel@ code. */
+#define ADJUST_CHECKSUM(acc, cksum) \
+    do { \
+        acc += cksum; \
+        if (acc < 0) { \
+            acc = -acc; \
+            acc = (acc >> 16) + (acc & 0xffff); \
+            acc += acc >> 16; \
+            cksum = (u_short) ~acc; \
+        } else { \
+            acc = (acc >> 16) + (acc & 0xffff); \
+            acc += acc >> 16; \
+            cksum = (u_short) acc; \
+        } \
+    } while (0)
+/* Some novel@ code. */
+
 #endif /* _KERNEL */
 #endif /* _IPFW2_H */
Index: sys/netinet/ip_fw2.c
RCS file: /home/ncvs/src/sys/netinet/ip_fw2.c,v
retrieving revision 1.181
diff -u -r1.181 ip_fw2.c
--- sys/netinet/ip_fw2.c        24 Feb 2008 15:37:45 -0000      1.181
+++ sys/netinet/ip_fw2.c        26 Feb 2008 14:37:25 -0000
@@ -177,6 +177,21 @@
 
 extern int ipfw_chg_hook(SYSCTL_HANDLER_ARGS);
 
+/* some @novel code. */
+static __inline int
+twowords(void *p) {
+    uint8_t *c = p;
+#if BYTE_ORDER == LITTLE_ENDIAN
+    uint16_t s1 = ((uint16_t)c[1] << 8) + (uint16_t)c[0];
+    uint16_t s2 = ((uint16_t)c[3] << 8) + (uint16_t)c[2];
+#else
+    uint16_t s1 = ((uint16_t)c[0] << 8) + (uint16_t)c[1];
+    uint16_t s2 = ((uint16_t)c[2] << 8) + (uint16_t)c[3];
+#endif
+    return (s1 + s2);
+}
+/* some @novel code. */
+
 #ifdef SYSCTL_NODE
 SYSCTL_NODE(_net_inet_ip, OID_AUTO, fw, CTLFLAG_RW, 0, "Firewall");
 SYSCTL_PROC(_net_inet_ip_fw, OID_AUTO, enable,
@@ -2700,6 +2715,7 @@
     for (; f; f = f->next) {
         ipfw_insn *cmd;
         uint32_t tablearg = 0;
+        int accumulate; /* Novel@ code. */
         int l, cmdlen, skip_or; /* skip rest of OR block */
 
 again:
@@ -3006,6 +3022,11 @@
                     flags_match(cmd, ip->ip_tos));
                 break;
 
+            case O_IPTOSPRE:
+                match = (is_ipv4 &&
+                    flags_match(cmd, ip->ip_tos));
+                break;
+
             case O_TCPDATALEN:
                 if (proto == IPPROTO_TCP && offset == 0) {
                     struct tcphdr *tcp;
@@ -3322,6 +3343,18 @@
                 match = 1;
                 break;
 
+            /* Insert within IP ToS PRECEDENCE field. */
+            case O_SETIPTOSPRE:
+                accumulate = twowords(&ip->ip_tos);
+                ip->ip_tos= cmd->arg1;
+                accumulate -= twowords(&ip->ip_tos);
+                ADJUST_CHECKSUM(accumulate, ip->ip_sum);
+                f->pcnt++; /* update stats */
+                f->bcnt += pktlen;
+                f->timestamp = time_second;
+                goto next_rule;
+            /* Insert within IP ToS PRECEDENCE field. */
+
             case O_PROBE_STATE:
             case O_CHECK_STATE:
                 /*
@@ -4119,6 +4152,7 @@
         case O_FRAG:
         case O_DIVERTED:
         case O_IPOPT:
+        case O_IPTOSPRE:
         case O_IPTOS:
         case O_IPPRECEDENCE:
         case O_IPVER:
@@ -4142,6 +4176,10 @@
                 goto bad_size;
             break;
 
+        case O_SETIPTOSPRE:
+            have_action = 1;
+            break;
+
         case O_UID:
         case O_GID:
         case O_JAIL:
```
PR Number: kern/121122
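The kernel part of the patch has to fix the IP header checksum after rewriting the ToS octet, which it does with `twowords` and `ADJUST_CHECKSUM`. The sketch below illustrates the same idea with a different, self-contained formulation: the incremental update from RFC 1624, HC' = ~(~HC + ~m + m'), where m is the 16-bit header word that contains the ToS octet. The function names and the raw byte-array view of the header are assumptions made for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Full one's-complement checksum over a header of len bytes (len even).
 * The checksum field (bytes 10-11) must be zero when this is called. */
static uint16_t
ip_cksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)(hdr[i] << 8 | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Set the ToS octet (byte 1) and patch the stored checksum in place,
 * without re-reading the rest of the header (RFC 1624, equation 3). */
static void
set_tos_incremental(uint8_t *hdr, uint8_t new_tos)
{
    uint16_t old_word = (uint16_t)(hdr[0] << 8 | hdr[1]);
    uint16_t old_sum = (uint16_t)(hdr[10] << 8 | hdr[11]);
    uint16_t new_word, result;
    uint32_t sum;

    hdr[1] = new_tos;
    new_word = (uint16_t)(hdr[0] << 8 | hdr[1]);

    sum = (uint16_t)~old_sum;       /* ~HC */
    sum += (uint16_t)~old_word;     /* ~m  */
    sum += new_word;                /* m'  */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    result = (uint16_t)~sum;

    hdr[10] = (uint8_t)(result >> 8);
    hdr[11] = (uint8_t)(result & 0xff);
}
```

Starting from the header `45 00 00 73 00 00 40 00 40 11 b8 61 c0 a8 00 01 c0 a8 00 c7`, calling `set_tos_incremental(hdr, 0x40)` changes the stored checksum to `b8 21`, the same value a full recomputation produces.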
How to use it
To set the precedence on matching packets, use the new action as follows:
ipfw add 10 iptospre immediate ip from 192.168.0.0/24 to any
To match packets that already carry a given precedence, use the new option:
ipfw add 11 count ip from 192.168.0.0/24 to any iptospre immediate
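To exercise the matching rule, a test program can mark its own traffic. This is a minimal sketch that uses the standard `IP_TOS` socket option; the helper name `udp_socket_with_tos` and the macro `PREC_IMMEDIATE` are hypothetical, and some systems may restrict or rewrite certain ToS values, so treat it as a starting point rather than a guaranteed recipe.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

#define PREC_IMMEDIATE 0x40   /* same value as IPTOS_PREC_IMMEDIATE */

/* Open a UDP socket whose outgoing packets carry the given ToS octet.
 * Returns the descriptor, or -1 on failure. */
static int
udp_socket_with_tos(int tos)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s == -1)
        return -1;
    if (setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == -1) {
        close(s);
        return -1;
    }
    return s;
}
```

Datagrams sent on a socket returned by `udp_socket_with_tos(PREC_IMMEDIATE)` should then be counted by a rule such as rule 11 above, provided the source address falls in the matched network.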