DNS and why it should be monitored

DNS is often treated as a set-it-and-forget-it service, and many administrators do not realize the impact it has on their environment. In most environments I have worked in, nobody seemed to care how DNS was architected or what was really happening with it. Yet DNS is often the cause of outages and of many other problems, seen and unseen. So here is a list of the things I think are important to watch for:

  1. Quantity of DNS queries
  2. Types of Queries
  3. DNS Errors
  4. DNS timeouts
  5. DNS Names being queried
  6. TTL in the Response
  7. Multiple answers in the DNS response
  8. DNS Geo load balancing

So why should I even care about any of the above? Well, let me go through each of them and list some possible problems.

  1. Quantity of DNS Queries
    1. Overloading a DNS server
      1. Can cause timeouts
    2. Hiding traffic through DNS tunneling as it leaves your environment
    3. Slowing down applications because the TTL is too low and thousands of queries are being made for an IP that remains static. Lookups can also time out because of the overwhelming number of queries caused by the low TTL.
  2. Types of Queries
    1. DNS Tunneling
    2. Misconfigured DNS
    3. An easy way to see how some applications work (e.g. auto-configuration lookups)
  3. DNS Errors
    1. Errors are normal. NXDOMAIN is normal. But why do I have so many, and what is my baseline?
    2. Look for important names among the failures.
    3. Errors may be caused by someone using a host name and not an FQDN. Using the FQDN could speed the process up by milliseconds.
    4. Do I have multiple DNS suffixes in my search order?
      1. Does my DNS query have to go through 5 DNS suffixes to get a final answer?
      2. Is this good or bad? (In my opinion it is bad, since it slows things down by a ms or 2.)
  4. DNS Timeouts
    1. When a client has to wait for a DNS timeout, the application is waiting too.
    2. Usually this means the client will try again against the secondary DNS server.
    3. Long enough timeouts can mean the application itself times out with an arbitrary network error.
  5. DNS Names being queried
    1. Odd names can be an indicator of a virus.
    2. Are your applications using an FQDN or a host name?
    3. Names can be an indicator of Bitcoin mining.
    4. There is a lot more, but it is always a good idea to see what is being queried and ask yourself why.
  6. TTL in the response
    1. This is important because many people think DNS Geo load balancing is awesome.
      1. However, when you have 10,000 clients that query that service often and the time to live is 5 min or even 0, it can quickly overload a DNS server.
      2. When companies set up DNS Geo load balancing and set the record's TTL to 5 min or 0, it is possible for an application to be talking to 2 totally different IPs under some circumstances.
      3. So for those weird times when your app talks to the vendor or customer just fine, drops off for a bit, then comes back: check DNS and see whether the app was hitting a different IP address during that time frame.
        1. I see this often. You do an NSLOOKUP for x.domain.com, it returns IP xxx.xxx.xxx.xxx, you test, and all is good. But the app team says that 10 min ago it was not. Then they say, "Oh, it is working now, what did you do?" So you go about your business. They call back 5 days later and say, "Do that thing again, we are broken," and you start the whole process over. What you do not realize is that for 10 min x.domain.com was resolving to yyy.yyy.yyy.yyy.
    2. Seeing the TTL also helps you spot misconfigurations.
    3. For static IPs, you can make the TTL longer if you see the same request often.
    4. It helps you see when changes are made.
  7. Multiple IPs in the DNS Response
    1. I have seen responses with as many as 7-10 IPs in the same response, with different TTLs.
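
The "it was broken 10 minutes ago" scenario under item 6 can be checked after the fact if you keep a record of the answers your clients actually received. Here is a minimal Python sketch of that idea, assuming you can export each observed response as a timestamp, a name, the set of IPs in the answer, and the lowest TTL among them. The `Observation` class, the `find_answer_changes` function, and the 300-second threshold are my own illustrative names and values, not part of any product, and the addresses are documentation-range placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observed DNS response (hypothetical export format)."""
    timestamp: float       # seconds since some reference point
    name: str              # queried name, e.g. "x.domain.com"
    addresses: frozenset   # IPs returned in the answer section
    min_ttl: int           # smallest TTL among the answer records, in seconds


def find_answer_changes(observations, low_ttl=300):
    """Return (changes, low_ttl_names).

    changes is a list of (timestamp, name, old_addresses, new_addresses),
    one entry each time a name's answer set differs from the previous one.
    low_ttl_names is the set of names seen with a TTL at or below low_ttl.
    """
    last_seen = {}
    changes = []
    low_ttl_names = set()
    for obs in sorted(observations, key=lambda o: o.timestamp):
        # Normalize case and the optional trailing dot so names compare equal.
        name = obs.name.lower().rstrip(".")
        if obs.min_ttl <= low_ttl:
            low_ttl_names.add(name)
        previous = last_seen.get(name)
        if previous is not None and previous != obs.addresses:
            changes.append((obs.timestamp, name, previous, obs.addresses))
        last_seen[name] = obs.addresses
    return changes, low_ttl_names


# Made-up sample: the name briefly resolves elsewhere, then comes back.
sample = [
    Observation(0, "x.domain.com", frozenset({"192.0.2.10"}), 300),
    Observation(600, "x.domain.com", frozenset({"198.51.100.20"}), 300),
    Observation(1200, "x.domain.com", frozenset({"192.0.2.10"}), 300),
]
for when, name, old, new in find_answer_changes(sample)[0]:
    print(f"t={when}: {name} changed from {sorted(old)} to {sorted(new)}")
```

Comparing whole answer sets rather than single IPs means a response that legitimately carries several addresses (item 7) is only reported when the set itself changes. Lining up the change timestamps against the window the app team complained about is the whole point.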

This is just a list of some of the reasons I would want to monitor my DNS. Multiple times I have found DNS to be the cause of performance issues as well as intermittent errors, but it is very hard to tell when this is happening. DNS logging does not always help, since searching through all the logs can be overwhelming. With ExtraHop, however, I have been able to visualize this type of data quickly and easily.
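
If all you have is raw logs, rolling them up into a handful of counters is one way to get a baseline for several of the items above (query volume, query types, errors, and names). The following is a rough Python sketch, not how ExtraHop does it. It assumes each log line has already been parsed into a dict with `client`, `qname`, `qtype`, and `rcode` keys, which is a hypothetical schema of my own, and `summarize_dns` is likewise just an illustrative name.

```python
from collections import Counter


def summarize_dns(records, search_suffixes=()):
    """Roll parsed DNS log records up into baseline counters.

    Each record is assumed to be a dict with the keys "client", "qname",
    "qtype", and "rcode" (a hypothetical schema for this sketch).
    search_suffixes lists the DNS suffixes configured on your clients, so
    NXDOMAIN answers produced by walking the suffix search order can be counted.
    """
    suffixes = [s.lower().strip(".") for s in search_suffixes]
    qtypes, rcodes, per_client = Counter(), Counter(), Counter()
    nx_names, suffix_walks = Counter(), Counter()

    for rec in records:
        qname = rec["qname"].lower().rstrip(".")
        qtypes[rec["qtype"]] += 1
        rcodes[rec["rcode"]] += 1
        per_client[rec["client"]] += 1
        if rec["rcode"] == "NXDOMAIN":
            nx_names[qname] += 1
            # A failed name ending in a search suffix suggests a short host
            # name was tried against that suffix and did not exist there.
            for suffix in suffixes:
                if qname.endswith("." + suffix):
                    suffix_walks[suffix] += 1
                    break

    total = sum(rcodes.values())
    return {
        "total": total,
        "qtypes": qtypes,
        "rcodes": rcodes,
        "nxdomain_ratio": rcodes["NXDOMAIN"] / total if total else 0.0,
        "top_nxdomain": nx_names.most_common(10),
        "per_client": per_client,
        "suffix_walks": suffix_walks,
    }
```

Run it over the same window every day and the numbers become your baseline: a jump in TXT queries from one client, or in the NXDOMAIN ratio, is then something you can actually see and ask why.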

Below is a sampling of the graphs and metrics in a basic dashboard from ExtraHop.

 

 

 


About Mitch Roberson

Having worked as a consultant at multiple VARs as well as at Microsoft, Mitch has seen a multitude of environments and has worked with network, systems, and security teams. This has allowed him to broaden his knowledge in many areas of IT. This broad experience has driven him to an almost fanatical desire for visibility into his environments so he can understand what is happening within them. He is still responsible for day-to-day operations of Active Directory, Exchange, and much more, but his passion is to learn how applications communicate so he can decrease mean time to resolution.