Traceroute in Azure
Is it possible to traceroute in Azure?
Despite a lot of negative statements, it is totally possible. Take a look:
IPv4:
IPv6:
All you need is to meet two conditions:
- Add a Network Security Group rule allowing inbound ICMP to this VM from any source.
- Configure the VM with an explicit instance-level public IP.
Here's the explanation for these conditions:
1. ICMP
All the NSG rules in Azure, explicit or implicit, are stateful. This includes ICMP.
As you may know, traceroute works by sending IP packets with a very short TTL (starting with 1). Each router on the path decrements the TTL by one, and the router that decrements the TTL to 0 must drop the packet and should send back an ICMP "TTL expired in transit" message.
That ICMP message returns to the VM from the router, not from the host we are sending probes to. But a stateful NSG rule only expects responses from the end host, so replies coming from intermediate routers are dropped.
Opening ICMP inbound from all hosts allows the ICMP "expired in transit" packets to reach the original VM, so tracert works properly.
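The interaction between the TTL mechanism and a stateful filter can be sketched as a small simulation. This is a toy model, not Azure's actual NSG logic: the `StatefulFilter` class, the `simulate_traceroute` function, and the documentation-range IP addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulFilter:
    """Toy stateful inbound filter (an assumption, not Azure's real NSG code)."""
    allow_icmp_inbound_from_any: bool = False
    tracked_destinations: set = field(default_factory=set)

    def record_outbound(self, destination: str) -> None:
        # Outbound traffic creates flow state keyed on the destination address.
        self.tracked_destinations.add(destination)

    def permits_inbound(self, source: str) -> bool:
        # Return traffic is accepted only from a tracked destination,
        # unless an explicit inbound ICMP rule allows any source.
        return self.allow_icmp_inbound_from_any or source in self.tracked_destinations


def simulate_traceroute(path, nsg):
    """Send one probe per TTL along `path` (router IPs first, destination last)."""
    destination = path[-1]
    results = []
    for ttl in range(1, len(path) + 1):
        nsg.record_outbound(destination)
        # The hop that decrements the TTL to 0 answers from its own address.
        responder = path[ttl - 1]
        results.append(responder if nsg.permits_inbound(responder) else "*")
    return results


# Hypothetical path: three routers, then the destination host.
path = ["10.0.0.1", "192.0.2.1", "198.51.100.7", "203.0.113.10"]

print(simulate_traceroute(path, StatefulFilter()))
print(simulate_traceroute(path, StatefulFilter(allow_icmp_inbound_from_any=True)))
```

In this model, the first call shows only the final host, because it is the only address the filter has state for. The second call shows every hop, which mirrors the effect of the inbound ICMP rule.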
2. Instance-level IP
A VM in Azure can be behind a load balancer (implicit or explicit), have an instance-level public IP, or have no public IP at all.
A public load balancer is stateful and can forward only TCP and UDP traffic initiated from the VM or towards the VM.
A VM that has no public IP or LB configured (and does not use a UDR to redirect traffic to an NVA) but does have internet access is behind an implicit load balancer, and has the same limitations as a VM with a load balancer.
The only way to receive ICMP on the VM is to use an instance-level public IP (an IP attached directly to the VM) and not to use UDRs (traffic should go from the VM directly to the internet).
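The two conditions, plus the UDR caveat, can be written down as a simple checklist. This is only a sketch of the reasoning above: `VmNetworkConfig` and its field names are hypothetical and do not correspond to Azure API properties.

```python
from dataclasses import dataclass


@dataclass
class VmNetworkConfig:
    # Field names are illustrative, not Azure API properties.
    has_instance_level_public_ip: bool
    routes_via_udr: bool
    nsg_allows_icmp_inbound_from_any: bool


def blockers_for_traceroute(cfg: VmNetworkConfig) -> list:
    """Return the reasons hop replies would not reach the VM (empty list means OK)."""
    blockers = []
    if not cfg.has_instance_level_public_ip:
        blockers.append("no instance-level public IP (load balancer forwards only TCP/UDP)")
    if cfg.routes_via_udr:
        blockers.append("UDR redirects traffic instead of sending it directly to the internet")
    if not cfg.nsg_allows_icmp_inbound_from_any:
        blockers.append("NSG does not allow inbound ICMP from any source")
    return blockers


# A VM with outbound internet access but no public IP and no ICMP rule.
default_vm = VmNetworkConfig(False, False, False)
for reason in blockers_for_traceroute(default_vm):
    print("blocked:", reason)
```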
Additional notes
1. Linux VMs
I could not make traceroute work on Linux. The same Azure VM configuration (IL-PIP + ICMP inbound) shows the "expired in transit" packets in tcpdump, but the traceroute tool does not display them.
MTR shows everything correctly, so when I have to check a traceroute from Linux, I use MTR.
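For scripted checks, an MTR report run can be assembled like this. It is a minimal sketch: the helper name `build_mtr_command`, the target `example.com`, and the cycle count are assumptions, while `--report`, `--report-cycles`, and `--no-dns` are standard mtr options.

```python
import shlex


def build_mtr_command(target: str, cycles: int = 10) -> list:
    """Build an argv list for a non-interactive mtr report (hypothetical helper)."""
    return [
        "mtr",
        "--report",                     # print a summary instead of the live UI
        "--report-cycles", str(cycles), # number of probes sent to each hop
        "--no-dns",                     # show IP addresses, skip reverse lookups
        target,
    ]


# Print the command line; the list can be passed to subprocess.run() to execute it.
print(shlex.join(build_mtr_command("example.com")))
```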
2. Tracert -w 200
When Windows tracert goes through an unresponsive hop, it waits 4 seconds before drawing each * (timeout), which adds up to 12 seconds per hop for the three probes (see here).
The -w 200 option shortens the timeout to 200 ms, or 600 ms per non-responding hop. This value is almost always enough, but you may increase it if necessary.
This option significantly speeds up traceroutes.
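The arithmetic behind those numbers is simply timeout × probes per hop × silent hops. A small sketch, assuming tracert's three probes per hop; the function name and the ten-hop example are illustrative only:

```python
def worst_case_wait_seconds(timeout_ms: int, silent_hops: int = 1, probes_per_hop: int = 3) -> float:
    """Time spent waiting on hops that never answer."""
    return timeout_ms * probes_per_hop * silent_hops / 1000


# One silent hop: default 4000 ms timeout vs. -w 200.
print(worst_case_wait_seconds(4000))  # 12.0
print(worst_case_wait_seconds(200))   # 0.6

# Ten silent hops in a row.
print(worst_case_wait_seconds(4000, silent_hops=10))  # 120.0
print(worst_case_wait_seconds(200, silent_hops=10))   # 6.0
```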
How do you use it?
Mostly for troubleshooting connectivity/latency from Azure towards non-Azure destinations.
As traffic between large autonomous systems is mostly asymmetric (there's no guarantee that traffic enters and leaves an AS over the same router or set of routers), in some scenarios it is necessary to understand
Question:
Based on traceroutes provided, can you guess which region my Azure VM is deployed in?