17:51 - Sunday, 20 April 2014

How To Measure IOwait Per Device?

I have a server which exports home directories over NFS. They live on a software RAID1 (/dev/sdb and /dev/sdc), while the OS is on /dev/sda. I noticed that %iowait, as reported by top and sar, is relatively high compared to the rest of the servers: it ranges between 5-10%, while on the other servers (which are more loaded than this one) it stays at 0-1%. The so-called user experience drops when %iowait rises above 12%; then we experience latency.

I don’t have any drive errors in the logs.
I would like to avoid playing with the drives using the trial-and-error method.

How I can find out which device (/dev/sda, /dev/sdb or /dev/sdc) is the bottleneck?

Thanks!

Edit: I use Ubuntu 9.10 and already have iostat installed. I am not interested in NFS-related issues, but rather in how to find which device slows down the system. The NFS server is not loaded; it has 32 threads available. Here is the result of:

grep th /proc/net/rpc/nfsd
th 32 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

Edit2: Here is part of iostat -x 1 output (I hope I’m not violating some rules here):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.21    0.00    0.12    4.09    0.00   50.58

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00   21.00    0.00   368.00     0.00    17.52     0.17    8.10   6.67  14.00
sdb               0.00     6.00    0.00    6.00     0.00    96.00    16.00     0.00    0.00   0.00   0.00
sdc               0.00     6.00    0.00    6.00     0.00    96.00    16.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   21.00    0.00   368.00     0.00    17.52     0.17    8.10   6.67  14.00
dm-2              0.00     0.00    0.00   12.00     0.00    96.00     8.00     0.00    0.00   0.00   0.00
drbd2             0.00     0.00    0.00   12.00     0.00    96.00     8.00     5.23   99.17  65.83  79.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.53    0.00    0.24    6.56    0.00   47.68

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     1.00   23.00    2.00   424.00    24.00    17.92     0.23    9.20   8.80  22.00
sdb               0.00    32.00    0.00   10.00     0.00   336.00    33.60     0.01    1.00   1.00   1.00
sdc               0.00    32.00    0.00   10.00     0.00   336.00    33.60     0.01    1.00   1.00   1.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   23.00    0.00   424.00     0.00    18.43     0.20    8.70   8.70  20.00
dm-2              0.00     0.00    0.00   44.00     0.00   352.00     8.00     0.30    6.82   0.45   2.00
drbd2             0.00     0.00    0.00   44.00     0.00   352.00     8.00    12.72   80.68  22.73 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          44.11    0.00    1.19   10.46    0.00   44.23

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   637.00   19.00   16.00   432.00  5208.00   161.14     0.34    9.71   6.29  22.00
sdb               0.00    31.00    0.00   13.00     0.00   352.00    27.08     0.00    0.00   0.00   0.00
sdc               0.00    31.00    0.00   13.00     0.00   352.00    27.08     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   20.00  651.00   456.00  5208.00     8.44    13.14   19.58   0.33  22.00
dm-2              0.00     0.00    0.00   42.00     0.00   336.00     8.00     0.01    0.24   0.24   1.00
drbd2             0.00     0.00    0.00   42.00     0.00   336.00     8.00     4.73   73.57  18.57  78.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          46.80    0.00    0.12    1.81    0.00   51.27

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00   16.00    0.00   240.00     0.00    15.00     0.14    8.75   8.12  13.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

What are the most relevant columns to look at? Which values are considered unhealthy? I suppose await and %util are the ones I am looking for. In my opinion dm-1 is the bottleneck (it holds the DRBD resource metadata).
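For quick triage of output like the above, a small awk filter can pick out devices whose await or %util columns stand out. This is only a sketch: the thresholds (50 ms await, 70 %util) are arbitrary illustrations, not recommended limits, and the column positions assume this older 12-column iostat -x layout.

```shell
# Sketch: flag iostat -x rows whose await (col 10) or %util (col 12)
# exceed arbitrary thresholds. The $1 != "Device:" guard skips the
# header row, which also has 12 fields.
flag_busy() {
  awk '$1 != "Device:" && NF == 12 && ($10 > 50 || $12 > 70) \
       {printf "%s await=%s util=%s\n", $1, $10, $12}'
}

# Fed with two sample rows from the output above, only drbd2 is flagged:
flag_busy <<'EOF'
drbd2             0.00     0.00    0.00   44.00     0.00   352.00     8.00    12.72   80.68  22.73 100.00
sda               0.00     1.00   23.00    2.00   424.00    24.00    17.92     0.23    9.20   8.80  22.00
EOF
```

In practice you would pipe `iostat -x 1 | flag_busy` and watch which device names keep appearing.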

Double thanks!

Edit3: Here is what my setup is:

sda = OS, no RAID. Devices dm-0 and dm-1 live on it; the latter is the metadata device for the DRBD resource (see below). Both dm-0 and dm-1 are LVM volumes;
drbd2 = dm-2 = sdb + sdc -> this is the RAID1 device, which serves the user home directories over NFS. I don’t think this one is the bottleneck. No LVM volume here.

iostat -x 1?

I am told I must expand this answer further, but as yet I don’t know what to add. You don’t say which distro you’re using, so I can’t point you to a method for installing iostat if you don’t already have it. But I think it’s what you’re asking for.

Edit: glad to see some iostat output! At the moment, the sd[bc] devices have near-identical figures, which they should in RAID-1, and neither is saturated; nor is sda. drbd2, however, is; what is this used for, and how might it affect server performance as a whole?

Edit 2: I don’t really know what to suggest. You admit that drbd2 “serves the user home directories over NFS” and you say that you have an NFS server latency problem. You produce iostat output that pretty convincingly says that drbd2 is the bottlenecked device. You then say that “In my opinion dm-1 is the bottleneck” and “I don’t think [drbd2] is the bottleneck”. It’s not clear to me what evidence you have that contradicts the hypothesis that drbd2 is the bottleneck, but it would be nice to see it.

Is this a heavily used NFS server? A good way to find out whether NFS is the bottleneck is to check how the NFS processes are running and whether any are stuck waiting.

grep th /proc/net/rpc/nfsd

th 128 239329954 363444.325 111999.649 51847.080 12906.574 38391.554 25029.724 24115.236 24502.647 0.000 520794.933

The first number is the number of threads available for servicing requests, and the second number is the number of times that all threads have been needed. The remaining 10 numbers are a histogram showing how many seconds a certain fraction of the threads has been busy, starting with less than 10% of the threads and ending with more than 90% of them. If the last few numbers have accumulated a significant amount of time, your server probably needs more threads.
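The field layout described above can be checked mechanically. Here is a small sketch (the bucket math follows the explanation above; the helper name is mine) that reports how much of the recorded busy time falls in the top two histogram buckets, i.e. when 80% or more of the threads were in use:

```shell
# Sketch: parse an nfsd "th" line. Fields: th <threads> <all-busy count>
# then 10 bucket values in seconds. $12 and $13 are the top two buckets
# (>= 80% of threads busy).
th_top_share() {
  awk '/^th/ {
    total = 0
    for (i = 4; i <= 13; i++) total += $i
    top = $12 + $13
    if (total > 0) printf "%.1f%% of busy time in top buckets\n", 100 * top / total
    else print "no busy time recorded"
  }'
}

# With the sample line quoted above:
th_top_share <<'EOF'
th 128 239329954 363444.325 111999.649 51847.080 12906.574 38391.554 25029.724 24115.236 24502.647 0.000 520794.933
EOF
# → 44.4% of busy time in top buckets
```

For that sample, nearly half the busy time was spent with most threads occupied, which is exactly the "needs more threads" signal described above. On a live server you would run `th_top_share < /proc/net/rpc/nfsd`.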

Increase the number of threads used by the server to 16 by setting RPCNFSDCOUNT=16 in /etc/rc.d/init.d/nfs.
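The edit itself can be scripted. Note that the file holding RPCNFSDCOUNT varies by distro (/etc/rc.d/init.d/nfs here; Debian/Ubuntu keep it in /etc/default/nfs-kernel-server), so this sketch operates on a temporary stand-in file rather than the live init script:

```shell
# Sketch: raise the nfsd thread count. Shown against a temporary copy;
# substitute the real config path for your distro and restart the NFS
# service afterwards for the change to take effect.
cfg=$(mktemp)
echo 'RPCNFSDCOUNT=8' > "$cfg"             # stand-in for the real file
sed 's/^RPCNFSDCOUNT=.*/RPCNFSDCOUNT=16/' "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"
cat "$cfg"                                 # → RPCNFSDCOUNT=16
rm -f "$cfg"
```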

You can read more at http://billharlan.com/pub/papers/NFS_for_clusters.html under the “Server Threads” heading.

Both your /dev/sdb and /dev/sdc have very close “await” numbers. /dev/sda has somewhat higher ones, but how can it affect your RAID performance when it is not part of the array? By the way, you do use LVM for mirroring, don’t you?

Reading iostat will help you narrow down which drive(s) are having I/O issues, but I have found that tracking down the application causing the I/O is far more helpful in actually improving the situation. For that, iotop is awesome:

http://guichaz.free.fr/iotop/
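If you can’t install iotop on an older distro like 9.10, the same per-process counters it reads are available directly in /proc. This is only a rough fallback sketch (cumulative totals, not the per-second rates iotop shows; root is needed to read other users’ entries):

```shell
# Sketch: list the five processes with the largest cumulative
# write_bytes, read from /proc/<pid>/io (iotop's data source).
top_writers() {
  for p in /proc/[0-9]*; do
    [ -r "$p/io" ] || continue
    awk -v pid="${p#/proc/}" '/^write_bytes/ {print $2, pid}' "$p/io"
  done | sort -rn | head -5
}
top_writers
```

Running it twice a few seconds apart and comparing the numbers gives a crude picture of who is writing right now.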
