How to read Resource Allocation Performance History (Nz Admin tool)

Group -> Name of the GRA group
Allocated -> Minimum RSG set for the GRA group
Maximum -> Maximum RSG set for the GRA group
Allowed -> Resources that the GRA group can get, based on actual appliance usage
Actual -> Resources that the GRA group is actually using
Short Jobs – Running -> Short-running jobs (generally those that take less than 2 seconds)
Long Jobs – Running -> Long-running jobs (generally those that take more than 2 seconds)
Short Jobs – Waiting -> Number of short jobs that went into a waiting state because of the GRA maximum for the group
Long Jobs – Waiting -> Number of long jobs that went into a waiting state because of the GRA maximum for the group
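
As a rough illustration of how these columns relate, here is a minimal Python sketch that models one row of the history. The class name, field names, and sample values are hypothetical (the Nz Admin tool does not export this structure). The only rule it encodes comes from the definitions above: a non-zero waiting count means jobs were held back by the group's GRA maximum.

```python
from dataclasses import dataclass


@dataclass
class GraSample:
    """One row of the Resource Allocation Performance History (hypothetical model)."""
    group: str          # Group: name of the GRA group
    allocated: float    # Allocated: minimum RSG set for the group
    maximum: float      # Maximum: maximum RSG set for the group
    allowed: float      # Allowed: what the group can get given appliance usage
    actual: float       # Actual: what the group is really using
    short_running: int  # Short Jobs - Running
    long_running: int   # Long Jobs - Running
    short_waiting: int  # Short Jobs - Waiting
    long_waiting: int   # Long Jobs - Waiting

    def waiting_jobs(self) -> int:
        """Total jobs that went into a waiting state."""
        return self.short_waiting + self.long_waiting

    def hit_gra_max(self) -> bool:
        """True if any job waited because of the group's GRA maximum."""
        return self.waiting_jobs() > 0


# Illustrative values only, not taken from a real appliance.
sample = GraSample("ANALYSTS", 20.0, 40.0, 35.0, 34.5, 3, 2, 0, 4)
print(sample.group, sample.waiting_jobs(), sample.hit_gra_max())
```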

Mustang – How to troubleshoot bad SPU

Here is the procedure to diagnose and find a bad SPU in a Mustang box that may be generating errors for some queries.

1) Note the hwid reported in the error

2) cd to /nz/kit/log/postgres and grep for the string “ERROR\:” in pg.log. If you see errors that look like:

2014-09-10 09:31:33.443100 EDT [27619] ERROR: 23 : spu 10.0.32.3 disk error DISK_SATA_RX_ERROR at 12343123

This indicates that these disk errors are causing queries to fail.
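
If pg.log is large, the same search can be done with a small script. The following is a sketch only: the regular expression is inferred from the single sample line above, so other error formats may not match, and the function name is made up for illustration.

```python
import re

# Pattern inferred from the one sample line shown above (an assumption):
# "... ERROR: 23 : spu <ip> disk error <NAME> at <number>"
DISK_ERROR_RE = re.compile(
    r"ERROR:.*?spu (?P<spu_ip>\d+(?:\.\d+){3}) disk error (?P<error>\w+)"
)


def find_disk_errors(lines):
    """Return (spu_ip, error_name) for every line that looks like an SPU disk error."""
    hits = []
    for line in lines:
        match = DISK_ERROR_RE.search(line)
        if match:
            hits.append((match.group("spu_ip"), match.group("error")))
    return hits


sample_log = [
    "2014-09-10 09:31:33.443100 EDT [27619] ERROR: 23 : spu 10.0.32.3 "
    "disk error DISK_SATA_RX_ERROR at 12343123",
    "2014-09-10 09:31:34.000000 EDT [27619] DEBUG: unrelated line",
]
print(find_disk_errors(sample_log))
```

To run it against the real log, pass an open file instead of the sample list, e.g. find_disk_errors(open("/nz/kit/log/postgres/pg.log")).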

3) Confirm that the error is related to the SPU by using the hwid from the alert:

nz@host1:/nz/kit/log/postgres->nzinventory | grep 1962
SPU 1962 Active Online 41 4 10.0.41.4 372.61 GB

Now we know that this SPU will continue to cause queries to fail.
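
A plain grep for 1962 would also match any line that merely contains those digits (in a size or an address, for example). The sketch below matches the hwid exactly instead. It assumes, based only on the sample line above, that the hwid is the second whitespace-separated field and the IP address is the seventh. The function name is hypothetical.

```python
def find_by_hwid(inventory_text, hwid):
    """Return the fields of the inventory line whose second column equals hwid, else None."""
    for line in inventory_text.splitlines():
        fields = line.split()
        # Assumption: column 2 holds the hwid, as in the sample output above.
        if len(fields) >= 2 and fields[1] == str(hwid):
            return fields
    return None


# Second line is the sample from above; the first is made up to show exact matching.
inventory = """SPU 1961 Active Online 41 3 10.0.41.3 1962.00 GB
SPU 1962 Active Online 41 4 10.0.41.4 372.61 GB"""

fields = find_by_hwid(inventory, 1962)
print(fields[0], fields[1], fields[6])
```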

4) Issue the command:

nz@host1:/export/home/nz-> nzsystem pause

The default timeout is 5 minutes (300 seconds), so the command waits up to five minutes for in-flight queries to complete. If they don't complete in this time, NPS kills the active queries and then pauses.

To override the default timeout of 300 seconds, use the -timeout flag, e.g.:

nz@host1:/export/home/nz-> nzsystem pause -timeout 600

5) Now that the system has paused, fail over the bad SPU:

nz@host1:/export/home/nz->nzspu failover -id 1962

6) Resume NPS:

nz@host1:/export/home/nz->nzsystem resume

7) Monitor the regen with the command:

nz@host1:/export/home/nz->watch nzinventory -type regenTasks

NOTE: Regen is much faster on Mustang systems than on TwinFins. On Mustang, the regen goes through a synchronization process to apply the transactions for that SPU that were submitted while the SPU was being regenerated. The busier the system is, the longer the synchronization process will take.
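
Steps 4 to 6 above can be strung together in a small wrapper. This is a sketch under stated assumptions, not a supported tool: the function names are hypothetical, it uses only the commands and flags shown in this post, and it does not handle any confirmation prompt the commands may raise. It defaults to a dry run that only prints the commands.

```python
import subprocess


def failover_plan(hwid, timeout=None):
    """Build the pause / failover / resume command sequence from steps 4 to 6."""
    pause = ["nzsystem", "pause"]
    if timeout is not None:
        # Override the default 300-second wait for in-flight queries.
        pause += ["-timeout", str(timeout)]
    return [
        pause,
        ["nzspu", "failover", "-id", str(hwid)],
        ["nzsystem", "resume"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence if a command fails.
            subprocess.run(argv, check=True)


plan = failover_plan(1962, timeout=600)
run_plan(plan)  # dry run: prints the three commands without executing them
```

Review the printed commands first, then call run_plan(plan, dry_run=False) as the nz user on the host. Because the sequence stops at the first failure, a failed failover leaves the system paused, so check the state before resuming by hand.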