ESXi Nexenta 4, round robin, iops=1, no Hardware Accelerated Locking


Nexenta 4 (CE) on ESXi (5/6) sort of fails when you have Hardware Accelerated Locking enabled. You will see a ton of errors about this in your vmkernel log once you activate your iSCSI.

To get it all going again, here is a quick snippet.

esxcli system settings advanced set -i 0 -o /VMFS3/HardwareAcceleratedLocking

esxcfg-rescan vmhba32

for i in `esxcfg-scsidevs -c |awk '{print $1}' | grep naa.600`; do esxcli storage nmp device set -d $i --psp VMW_PSP_RR;done

for i in `esxcfg-scsidevs -c |awk '{print $1}' | grep naa.600`; do esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$i; done

The first line disables HW accelerated locking, i.e. back to basics. Then we rescan vmhba32 (the software iSCSI adapter), switch all disks matching naa.600 to the VMW_PSP_RR path selection policy, and set the IOPS limit to 1 for optimal path distribution.
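The device-selection part of those loops just keeps the first column of `esxcfg-scsidevs -c` output and filters it for naa.600 identifiers. A quick sketch with made-up sample output (the device IDs are hypothetical) shows what the loops would iterate over:

```shell
# Two made-up lines in the shape of `esxcfg-scsidevs -c` output;
# only the naa.600* SAN device should survive the filter.
sample='naa.600144f0a1b2c3d4  Direct-Access  /vmfs/devices/disks/naa.600144f0a1b2c3d4
mpx.vmhba32:C0:T0:L0  CD-ROM  /vmfs/devices/cdrom/mpx.vmhba32:C0:T0:L0'

echo "$sample" | awk '{print $1}' | grep naa.600
```

Each surviving identifier is then fed to the two esxcli commands as $i.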

C’est ça.

Set Round Robin and IOPS for Nexenta iSCSI LUNs


Have a ton of iSCSI volumes across a ton of hosts and not feeling like clicking all those checkboxes to change paths? Your solution would be:

esxcli nmp device list | grep naa | grep NEXENTA | awk -F '[()]' '{print $(NF-1)}' | while read disk; do esxcli nmp device setpolicy -d $disk --psp=VMW_PSP_RR; done

Oh, and while you’re at it set the iops chunk size to 1, speeds things up a bit 🙂
esxcli nmp device list | grep naa | grep NEXENTA | awk -F '[()]' '{print $(NF-1)}' | while read disk; do esxcli nmp roundrobin setconfig --device $disk --iops 1 --type iops ; done
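For the curious: the awk -F '[()]' '{print $(NF-1)}' part splits each matching display-name line on parentheses and keeps the second-to-last field, which is the naa identifier. (Note that this plain esxcli nmp namespace is the older ESXi 4.x form; the first post above uses the newer esxcli storage nmp form.) A sketch on a made-up line, with a hypothetical device ID:

```shell
# A made-up `esxcli nmp device list` display-name line; splitting on
# '(' and ')' leaves the naa ID as the second-to-last field.
line='Device Display Name: NEXENTA iSCSI Disk (naa.600144f0deadbeef)'
echo "$line" | awk -F '[()]' '{print $(NF-1)}'
```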

Graph ZFS details with Cacti (Nexenta)


Today, as promised a long time ago, and therefore seriously overdue, a short writeup on how to graph ZFS details from Nexenta on Cacti.

On the Nexenta host

Firstly, add some extends to your snmpd.conf (here is some background on how to do this). To do this, log in (ssh) as admin to your Nexenta box, then obtain root privileges (su).

Now start the NMC and run the command setup network service snmp-agent edit-settings. This allows you to edit the snmpd.conf file on a Nexenta host. Do not edit these files directly in /etc/snmp, because it will not work that way.

Now add your extends to the snmpd.conf file; I googled a bit to find some commands that return ZFS details. The snmp-agent edit-settings command will open the snmpd.conf file in vi, so remember: a appends, :x saves and quits, dd deletes a line.

We added the following extends to our Nexenta server:

extend .1.3.6.1.4.1.2021.88 zpool_name /bin/bash -c "zpool list -H -o name"
extend .1.3.6.1.4.1.2021.88 zpool_snap /bin/bash -c "zpool list -Ho name|for zpool in `xargs`;do zfs get -rHp -o value usedbysnapshots $zpool|awk -F: '{sum+=$1} END{print sum}';done"
extend .1.3.6.1.4.1.2021.88 zpool_used /bin/bash -c "zpool list -Ho name|xargs zfs get -Hp -o value used"
extend .1.3.6.1.4.1.2021.88 zpool_data_used /bin/bash -c "zpool list -Ho name|for zpool in `xargs`;do snap=`zfs get -rHp -o value usedbysnapshots $zpool|awk -F: '{sum+=$1} END{print sum}'`;pool=`zfs get -Hp -o value used $zpool`; echo $pool $snap|awk '{print (\$1-\$2);}';done"
extend .1.3.6.1.4.1.2021.88 zpool_available /bin/bash -c "zpool list -Ho name|xargs zfs get -Hp -o value available"
extend .1.3.6.1.4.1.2021.88 zpool_capacity /bin/bash -c "zpool list -H -o capacity"
extend .1.3.6.1.4.1.2021.85 arc_meta_max /bin/bash -c "echo ::arc | mdb -k| grep arc_meta_max|tr -cd '[:digit:]'"
extend .1.3.6.1.4.1.2021.85 arc_meta_used /bin/bash -c "echo ::arc | mdb -k| grep arc_meta_used|tr -cd '[:digit:]'"
extend .1.3.6.1.4.1.2021.85 arc_size /bin/bash -c "echo ::arc | mdb -k| grep -w size|tr -cd '[:digit:]'"
extend .1.3.6.1.4.1.2021.85 arc_meta_limit /bin/bash -c "echo ::arc | mdb -k| grep arc_meta_limit|tr -cd '[:digit:]'"
extend .1.3.6.1.4.1.2021.85 arc_meta_c_max /bin/bash -c "echo ::arc | mdb -k| grep c_max|tr -cd '[:digit:]'"
extend .1.3.6.1.4.1.2021.89 arc_hits /bin/bash -c "kstat -p ::arcstats:hits| cut -s -f 2"
extend .1.3.6.1.4.1.2021.89 arc_misses /bin/bash -c "kstat -p ::arcstats:misses| cut -s -f 2"
extend .1.3.6.1.4.1.2021.89 arc_l2_hits /bin/bash -c "kstat -p ::arcstats:l2_hits| cut -s -f 2"
extend .1.3.6.1.4.1.2021.89 arc_l2_misses /bin/bash -c "kstat -p ::arcstats:l2_misses| cut -s -f 2"
extend .1.3.6.1.4.1.2021.90 vopstats_zfs_nread /bin/bash -c "kstat -p ::vopstats_zfs:nread | cut -s -f 2"
extend .1.3.6.1.4.1.2021.90 vopstats_zfs_nwrite /bin/bash -c "kstat -p ::vopstats_zfs:nwrite | cut -s -f 2"
extend .1.3.6.1.4.1.2021.90 vopstats_zfs_read_bytes /bin/bash -c "kstat -p ::vopstats_zfs:read_bytes | cut -s -f 2"
extend .1.3.6.1.4.1.2021.90 vopstats_zfs_write_bytes /bin/bash -c "kstat -p ::vopstats_zfs:write_bytes | cut -s -f 2"

I know that with extends you normally do not have to provide an OID, but I like to provide one anyway so I know where to look for the resulting SNMP OIDs.
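Two small building blocks in those extends are worth a note. kstat -p prints tab-separated statistic/value lines, so cut -s -f 2 keeps just the value; and the zpool_data_used extend simply subtracts the snapshot bytes from the total used bytes with awk. Both shown here on made-up sample values:

```shell
# kstat -p output is "module:instance:name:statistic<TAB>value";
# `cut -s -f 2` (tab-delimited by default) keeps only the value.
printf 'zfs:0:arcstats:hits\t123456789\n' | cut -s -f 2

# zpool_data_used: total used minus usedbysnapshots, here with
# made-up byte counts (1000 used, 200 of it in snapshots).
echo 1000 200 | awk '{print ($1-$2);}'
```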

After adding these, save the file and answer yes when asked to reload the configuration. Now check the configuration with:

setup network service snmp-agent confcheck

and then restart the snmpd with:

setup network service snmp-agent restart

Your work on the Nexenta host is now done. You can check the settings with an snmpwalk command to see if it actually works, something like (host and community string are placeholders for your own):

snmpwalk -v2c -c public <nexenta-host> .1.3.6.1.4.1.2021.88

On CACTI

I assume you have an SNMP-enabled device set up in Cacti pointing to your Nexenta server; if not, this would be a good time to do so. SNMP v2c works for me.

Now import the XML graph templates at the end of this post into your Cacti server (I got these from this forum, but had to modify them quite a bit to get the data from the correct SNMP OIDs).

Now add these templates to your device and create the graphs. If you are lucky you will get some pretty pictures with useful information; especially the one on the L2ARC cache turned out to be quite valuable to us.

Good luck, and if you have any questions, post them.

 

Cheers


No more Nexenta CPU overload


We have all seen it: NexentaStor eating away at the CPU in an ESXi environment. It is not actually consuming the cycles for real work, but since Illumos reserves them (basically telling the CPU to churn through a zillion NOOPs) the CPU gets thrashed, eating up to 15% of each core. Not a pretty sight, and certainly something you want to get rid of, since it seriously confuses the ESXi resource scheduler.

How to go about this is actually quite easy: just disable the nmdtrace service. One small downside though: removing or disabling this service kills all performance stats in the NMS. Not that they are of much use anyway; they are nothing short of pathetic (sorry, Nexenta). To get around that, I will describe in a later post how to extend SNMP and get proper statistics into something like Cacti.

First, let's free up those NOOP cycles and kill nmdtrace. Before doing so you want to remove the NMV service's dependency on it, so here goes (all as root on the console of your Nexenta box, of course):

svccfg -s nmv delpg nmdtrace

Check the state of the NMV service:

svcs nmv

And, if necessary, clear a failure state and refresh:

svcadm clear nmv

svcadm refresh nmv

Now we are good to go and can kill the nmdtrace service by issuing:

svcadm disable -s nmdtrace

If you would like to re-enable it later (God knows why), just issue:

svcadm enable -s nmdtrace

See the pretty graph 🙂

[Graph: ESXi CPU usage]