[190553] in North American Network Operators' Group
Re: packet loss question
daemon@ATHENA.MIT.EDU (Mel Beckman)
Fri Jul 8 10:02:26 2016
X-Original-To: nanog@nanog.org
From: Mel Beckman <mel@beckman.org>
To: Phillip Lynn <phillip.lynn@netwolves.com>
Date: Fri, 8 Jul 2016 14:02:19 +0000
In-Reply-To: <577FA41A.8050303@netwolves.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces@nanog.org
Phillip,
Quite often slow Web page loading and email transport -- termed an application-layer problem because basic transport seems unaffected -- is due to DNS problems, particularly reverse DNS for the IP addresses originating your Web queries. If you have non-existent or intermittent IN-ADDR entries for those IPs, remote Web servers with older configurations can time out: some, for example, do a DNS lookup in order to log each HTTP request and block until it completes. Use "nslookup x.x.x.x" command line queries (nslookup is on Windows, Mac and UNIX/Linux) to see if you can resolve the public IP addresses your users' queries originate from. You can find those addresses by visiting http://whatismyip.com from a problem desktop.
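If you have more than a couple of public addresses to check, the nslookup step can be scripted. What follows is only a minimal Python sketch: the function names are made up for illustration, and 192.0.2.10 is a documentation address standing in for your own public IPs.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the reverse-lookup name (in-addr.arpa or ip6.arpa) for an address."""
    return ipaddress.ip_address(ip).reverse_pointer


def check_reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None if the reverse lookup fails."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


# Example use (placeholder address; substitute your own public IPs):
#   for ip in ["192.0.2.10"]:
#       print(ip, ptr_name(ip), check_reverse_dns(ip) or "NO PTR RECORD")
```

Run it several times over a few minutes; an intermittent IN-ADDR problem will show up as a lookup that only sometimes returns a name.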
A second common cause of app-specific throughput problems, particularly where email is involved, is failed MTU discovery. The standard Internet MTU is 1500 bytes, but sometimes a router misconfiguration or change in encapsulation type along the path through your ISP lowers that to, say, 1492 or 1486 bytes (PPPoE, for example, takes 8 bytes of overhead, leaving 1492). The result is that whenever your web or email client sends a full-MTU-size packet, the packet is dropped, and if path MTU discovery is broken the sender never learns to send smaller ones, resulting in connection impairment. Most HTTP and email packets are not max-MTU in size, so you get very uneven performance that looks like network congestion.
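To see why only some traffic suffers, here is a small arithmetic sketch in Python. It assumes plain IPv4 and TCP headers with no options, the don't-fragment bit set, and the ICMP "fragmentation needed" replies never reaching the sender; the function names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu):
    """Largest TCP payload that fits in one IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - TCP_HEADER


def blackholed(packet_sizes, path_mtu):
    """Return the packet sizes that exceed the narrowest link and get dropped."""
    return [size for size in packet_sizes if size > path_mtu]


# A host that believes the MTU is 1500 sends segments of up to 1460 bytes.
# If a link on the path only carries 1492, just the full-size packets vanish:
#   blackholed([576, 1200, 1492, 1500], path_mtu=1492)
```

Short requests and small messages get through untouched, while anything that fills a packet (a large attachment, a big page) stalls.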
You can force the MTU to a lower number at your border to test this. You typically do this at your firewall; it's a setting on the WAN interface config. Temporarily lower that value dramatically to something like 1440 and see if your problem goes away. If it does, you may need to permanently reduce MTU, so you should try other values, stepping down from 1500 -- 1492, 1486, 1478, etc. -- until you find the largest one that works. I commonly see this when a customer switches ISPs from DSL to Cable. Cable providers are fond of stealing 8 or 16 bytes for their CMT headers in a way that breaks MTU discovery.
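The search for the largest working value can also be narrowed down from a desktop before you touch the firewall. The Python sketch below is one way to do it, under stated assumptions: ping_fits is a hypothetical probe that relies on Linux iputils ping (-M do sets don't-fragment, -s is the ICMP payload size), and the search assumes that if one size gets through, every smaller size does too.

```python
import subprocess

ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def ping_fits(host, mtu):
    """Hypothetical probe: one don't-fragment ping sized to fill `mtu` bytes.

    Assumes Linux iputils ping; other platforms use different flags.
    """
    payload = mtu - ICMP_OVERHEAD
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_mtu(probe, low=576, high=1500):
    """Binary-search the largest MTU in [low, high] for which probe(mtu) is True.

    Returns None if even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example use (placeholder host name):
#   largest_working_mtu(lambda mtu: ping_fits("www.example.com", mtu))
```

A single lost ping will be misread as "too big", so repeat the run, and remember that some hosts simply don't answer ICMP at all.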
A third frequent application-layer throughput debilitator is IPv6 misconfiguration. If you support IPv6 for your end users, they may be getting directed to IPv6 web or mail servers (which clients generally prefer when DNS offers both address types) but thwarted by IPv6 transport issues, which could be as simple as routing or MTU, or as complex as an invisible 6-over-4 NAT somewhere (such as at your upstream ISP). These problems generally require an IPv6-competent network engineer to resolve, but you can test by disabling IPv6 on your network (which also requires an IPv6-competent network engineer :)
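Short of disabling IPv6, you can get a first hint by comparing the two address families from a problem desktop. This is a rough Python sketch, not a diagnosis tool; the function names are invented for illustration and the host you pass in is up to you.

```python
import socket
import time


def addresses_by_family(host, port, resolve=socket.getaddrinfo):
    """Group the addresses a name resolves to into IPv4 and IPv6 lists."""
    found = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in resolve(
        host, port, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            found["IPv4"].append(sockaddr[0])
        elif family == socket.AF_INET6:
            found["IPv6"].append(sockaddr[0])
    return found


def time_connect(address, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


# Example use (placeholder host name):
#   for family, addrs in addresses_by_family("www.example.com", 80).items():
#       for addr in addrs:
#           print(family, addr, time_connect(addr, 80))
```

If the IPv6 addresses time out or connect far more slowly than the IPv4 ones, that points at the IPv6 path. A successful connect only proves basic reachability, though; the handshake packets are small, so an IPv6 MTU problem can still be hiding behind it.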
I'm always amazed at how often these three causes are at the root of performance problems. So it's worth investigating each.
-mel beckman
> On Jul 8, 2016, at 6:02 AM, Phillip Lynn <phillip.lynn@netwolves.com> wrote:
>
>> On 07/07/2016 03:52 PM, Ken Chase wrote:
>> No offence, but i swear that mtr should come with a license to use it. I get more
>> questions from people accusing us of network issues with mtr in hand...
>>
>> You shouldn't care that there's 80% packet loss in the middle of your route, unless
>> you have actual traffic to lag-101.ear3.miami2.level3.net. I suspect you don't.
>> (If you did, you'd have mtr'd to it directly of course.)
>>
>> As for your second trace, the sudden jump from 0% on 2nd last hop to 100% last
>> hop packetloss seems like firewalling to me. (long discussion about the
>> probabilities of getting 5 0%pl hops in a row and 100% on an unfirewalled
>> endpoint elided. TL;DR: use more packets in your test -i 0.1 -c 100 thanks).
>>
>> If you have 0% packetloss to your target endpoint, is there an issue here?
>> What caused you to mtr? 0% pl is pretty good. You could play quake 1.0
>> through that pl and ping time. The +20ms ATL<>CHI jump in the route you'd have
>> to take up with einstein/bill nye/$deity.
>>
>> For the 2nd trace, the 1st hop is your latency issue (plus the big jump from
>> miami<>ashburn, again the limit is c.)
>>
>> ICMP is allowed to be dropped by intervening routers. Someone will quote an RFC
>> at us shortly.
>>
>> Mtr without a return route is not that useful in figuring out packetloss
>> because pl requires the packet make it there and back. Pl could be anywhere on
>> the return route, which is probably not symmetrical. The internet stopped
>> being symmetrical about 20+ years ago (if it ever even loosely was), so get a friend
>> to send you an mtr to your ip from the farside.
>>
>> (I remember a project long ago, some cgi-bin (yeah that long ago) that was
>> basically a full-path forward+reverse traceroute you could hit on a selected
>> server at the provider. Rather handy. not sure if it's still a thing, or what it was
>> called.)
>>
>> /kc
>>
>>
>> On Thu, Jul 07, 2016 at 03:17:40PM -0400, Phillip Lynn said:
>> >Hi all,
>> >
>> >    I am writing because I do not understand what is happening. I ran mtr
>> >against our email server and www.teco.com and below are the results. I am
>> >not a network engineer so I am at a loss. I think what I am seeing is
>> >maybe a hand off issue, between Frontier and Level3Miami2. If I am correct
>> >then what can I do?
>> >
>> > My system is running Centos 6.5 Linux.
>> >
>> >Thanks,
>> >
>> >Phillip
>> >
>> >
>> >
>> >(! 1011)-> sudo mtr -r netwolves.securence.com
>> >HOST: xxxxx@netwolves.com          Loss%   Snt   Last   Avg  Best  Wrst StDev
>> >  1. 172.24.109.1                   0.0%    10    0.6   0.6   0.6   0.7   0.0
>> >  2. lo0-100.TAMPFL-VFTTP-322.gni   0.0%    10    3.2   2.0   1.0   4.3   1.2
>> >  3. 172.99.44.214                  0.0%    10    4.0   4.9   2.3   6.9   1.5
>> >  4. ae8---0.scr02.mias.fl.fronti   0.0%    10    9.3   9.1   7.5   9.8   1.0
>> >  5. ae1---0.cbr01.mias.fl.fronti   0.0%    10    8.9   9.1   7.6   9.7   0.7
>> >  6. lag-101.ear3.Miami2.Level3.n  80.0%    10    9.0   8.9   8.8   9.0   0.1
>> >  7. 10ge9-14.core1.mia1.he.net     0.0%    10   14.3  13.0   7.6  18.1   4.3
>> >  8. 10ge1-1.core1.atl1.he.net      0.0%    10   25.6  33.2  22.4  99.7  23.6
>> >  9. 10ge10-4.core1.chi1.he.net     0.0%    10   45.6  51.8  45.5  82.7  12.5
>> > 10. 100ge14-2.core1.msp1.he.net    0.0%    10   53.6  63.9  53.6 125.2  21.8
>> > 11. t4-2-usi-cr02-mpls-usinterne   0.0%    10   53.2  73.1  53.2 225.6  54.0
>> > 12. v102.usi-cr04-mtka.usinterne   0.0%    10   53.2  53.9  53.2  55.3   0.6
>> > 13. netwolves.securence.com        0.0%    10   53.4  53.9  53.4  55.4   0.7
>> >
>> >(! 1014)-> sudo mtr -r www.teco.com
>> >HOST: xxxxx@netwolves.com          Loss%   Snt   Last   Avg  Best  Wrst StDev
>> >  1. 172.24.109.1                   0.0%    10    0.6   0.6   0.6   0.7   0.0
>> >  2. lo0-100.TAMPFL-VFTTP-322.gni   0.0%    10  104.8  81.4   1.1 113.2  43.2
>> >  3. 172.99.47.198                  0.0%    10  115.0  77.8   2.9 115.0  40.2
>> >  4. ae7---0.scr01.mias.fl.fronti   0.0%    10  111.1  80.2   8.5 113.5  41.3
>> >  5. ae0---0.cbr01.mias.fl.fronti   0.0%    10  105.9  82.2   7.6 115.4  33.8
>> >  6. lag-101.ear3.Miami2.Level3.n  70.0%    10  116.1  80.2   8.5 116.1  62.0
>> >  7. NTT-level3-80G.Miami.Level3.   0.0%    10  110.0  81.5   9.0 120.3  41.9
>> >  8. ae-3.r20.miamfl02.us.bb.gin.   0.0%    10  119.8  84.0  10.0 119.8  38.5
>> >  9. ae-4.r23.asbnva02.us.bb.gin.  10.0%    10  137.4 107.6  30.1 142.7  45.7
>> > 10. ae-2.r05.asbnva02.us.bb.gin.   0.0%    10  135.0 109.9  36.6 140.0  39.1
>> > 11. xe-0-9-0-8.r05.asbnva02.us.c   0.0%    10  147.5 125.6  49.4 165.5  41.1
>> > 12. 24.52.112.21                   0.0%    10  158.6 124.0  49.6 161.3  41.5
>> > 13. 24.52.112.42                   0.0%    10  151.0 127.7  52.2 159.0  41.2
>> > 14. ???                           100.0    10    0.0   0.0   0.0   0.0   0.0
>> >
>> >--
>> >Phillip Lynn
>> >Software Engineer III
>> >NetWolves
>> >Phone:813-579-3214
>> >Fax:813-882-0209
>> >Email: phillip.lynn@netwolves.com
>> >www.netwolves.com
>> >
>>
>
> None taken,
>
> We are having issues with our email and loading some web pages. I used mtr to try and find if there is a possible connection issue. I just need to understand what is happening, and be able to explain the output showing the 80% packet loss. We are not pointing fingers, just looking to understand the issue better.
>
> Thanks
>
> --
> Phillip Lynn
> Software Engineer III
> NetWolves
> Phone:813-579-3214
> Fax:813-882-0209
> Email: phillip.lynn@netwolves.com
> www.netwolves.com
>