With all of the tutorial publishing I’ve been doing for Grafana, I have neglected to post what my own dashboard currently looks like.  So, here it is.  This is the 1080p version.  I also built an iPad version that carries a lot of the same info, compressed for the smaller screen.

This is kind of goofy because Ubiquiti doesn’t do a great job of supporting SNMP.  For one thing, they don’t support it through the controller, only directly on each AP.  But you have to enable it at the controller so it flips the switch on the APs and they’ll respond.  They really want you to use the API, which is great if you’re a programmer.  I am not.  I’m a router jockey, so I like SNMP.  Anyway, after finding and downloading the MIBs, I had a look through them and sorted out a couple of OIDs I was interested in: specifically, client count per radio and Eth0 bits in and bits out.  Here’s what I loaded into Telegraf.  Note that you need a separate inputs section for each AP you want to monitor.  Nope, not really an “Enterprise” approach.

[[inputs.snmp]]
  agents = [ "192.168.x.x:161" ]  ## The IP of a single AP.
  timeout = "5s"
  retries = 3
  version = 1
  community = "RO_Community"
  max_repetitions = 10
  name = "UnifiWiFiOffice"
  [[inputs.snmp.field]]
    name = "Bits.Out"
    oid = "1.3.6.1.4.1.41112.1.6.2.1.1.12.0"
  [[inputs.snmp.field]]
    name = "Bits.In"
    oid = "1.3.6.1.4.1.41112.1.6.2.1.1.6.0"
  [[inputs.snmp.field]]
    name = "2.4.Clients"
    oid = "1.3.6.1.4.1.41112.1.6.1.2.1.8.0"
  [[inputs.snmp.field]]
    name = "5.0.Clients"
    oid = "1.3.6.1.4.1.41112.1.6.1.2.1.8.3"
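
If you want to sanity-check an OID before wiring it into Telegraf, a quick snmpget against the AP does the trick.  This is a minimal example assuming the net-snmp tools are installed; substitute your AP’s IP and community string:

snmpget -v1 -c RO_Community 192.168.x.x 1.3.6.1.4.1.41112.1.6.1.2.1.8.0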

Continuing the documentation effort.  This is a shell script you run from Unraid in a cron job to feed stats to InfluxDB, which you can then present in Grafana.  One note about that: I was having a lot of trouble getting the Grafana graphs to render correctly for anything coming from this script.  I had to change the Fill setting from “null” to “none” in the graph.  I’m not sure why that’s necessary, but “none” gets it to behave just like everything else.

#!/bin/bash
## Assembled from this post: https://lime-technology.com/forum/index.php?topic=52220.msg512346#msg512346
##
## Add to cron like:
## * * * * * sleep 10; /boot/custom/influxdb.sh > /dev/null 2>&1
## 0,10 * * * * /boot/custom/influxdb.sh > /dev/null 2>&1

#
# Set Vars
#
DBURL=http://192.168.x.x:8086  ## IP address of your InfluxDB server
DBNAME=dashboard               ## Easier if you pick an existing DB
DEVICE="UNRAID"
CURDATE=`date +%s`

# Current array assignment.
# I could pull these automatically from /var/local/emhttp/disks.ini,
# but parsing it wouldn't be that easy.
DISK_ARRAY=( sdn sdl sdf sdc sdj sde sdo sdh sdi sdd sdk sdm sdg sdp sdb )
DESCRIPTION=( parity disk1 disk2 disk3 disk4 disk5 disk6 disk7 disk8 disk9 disk10 disk11 disk12 disk13 cache )

#
# Added -n standby to the check so smartctl is not spinning up my drives
#
i=0
for DISK in "${DISK_ARRAY[@]}"
do
  smartctl -n standby -A /dev/$DISK | grep "Temperature_Celsius" | awk '{print $10}' | while read TEMP
  do
    curl -is -XPOST "$DBURL/write?db=$DBNAME" --data-binary "DiskTempStats,DEVICE=${DEVICE},DISK=${DESCRIPTION[$i]} Temperature=${TEMP} ${CURDATE}000000000" >/dev/null 2>&1
  done
  ((i++))
done

# Had to increase to 10 samples because I was getting a spike each time I read it.
# This seems to smooth it out more.
top -b -n 10 -d.2 | grep "Cpu" | tail -n 1 | awk '{print $2,$4,$6,$8,$10,$12,$14,$16}' | while read CPUusr CPUsys CPUnic CPUidle CPUio CPUirq CPUsirq CPUst
do
  top -bn1 | head -3 | awk '/load average/ {print $12,$13,$14}' | sed 's/,//g' | while read LAVG1 LAVG5 LAVG15
  do
    curl -is -XPOST "$DBURL/write?db=$DBNAME" --data-binary "cpuStats,Device=${DEVICE} CPUusr=${CPUusr},CPUsys=${CPUsys},CPUnic=${CPUnic},CPUidle=${CPUidle},CPUio=${CPUio},CPUirq=${CPUirq},CPUsirq=${CPUsirq},CPUst=${CPUst},CPULoadAvg1m=${LAVG1},CPULoadAvg5m=${LAVG5},CPULoadAvg15m=${LAVG15} ${CURDATE}000000000" >/dev/null 2>&1
  done
done

if [[ -f byteCount.tmp ]] ; then
  # Read the last values from the tmp file - line "eth0"
  grep "eth0" byteCount.tmp | while read dev lastBytesIn lastBytesOut
  do
    cat /proc/net/dev | grep "eth0" | grep -v "veth" | awk '{print $2, $10}' | while read currentBytesIn currentBytesOut
    do
      # Write out the current stats to the temp file for the next read
      echo "eth0" ${currentBytesIn} ${currentBytesOut} > byteCount.tmp
      totalBytesIn=`expr ${currentBytesIn} - ${lastBytesIn}`
      totalBytesOut=`expr ${currentBytesOut} - ${lastBytesOut}`
      curl -is -XPOST "$DBURL/write?db=$DBNAME" --data-binary "interfaceStats,Interface=eth0,Device=${DEVICE} bytesIn=${totalBytesIn},bytesOut=${totalBytesOut} ${CURDATE}000000000" >/dev/null 2>&1
    done
  done
else
  # Write out a blank file
  echo "eth0 0 0" > byteCount.tmp
fi

#
# Gets the stats for boot, disk#, cache, user
#
df | grep "mnt/\|/boot\|docker" | grep -v "user0\|containers" | sed 's/\/mnt\///g' | sed 's/%//g' | sed 's/\/var\/lib\///g' | sed 's/\///g' | while read MOUNT TOTAL USED FREE UTILIZATION DISK
do
  if [ "${DISK}" = "user" ]; then
    DISK="array_total"
  fi
  curl -is -XPOST "$DBURL/write?db=$DBNAME" --data-binary "drive_spaceStats,Device=${DEVICE},Drive=${DISK} Free=${FREE},Used=${USED},Utilization=${UTILIZATION} ${CURDATE}000000000" >/dev/null 2>&1
done
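
If you want to confirm the script’s writes are landing, you can query InfluxDB directly.  Here’s a minimal check with the InfluxDB 1.x CLI; substitute your server’s IP and database name:

influx -host 192.168.x.x -database dashboard -execute 'SELECT * FROM DiskTempStats ORDER BY time DESC LIMIT 5'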

Following my previous post about Grafana, once everything is installed you’ll want to capture some data.  Otherwise, what’s the point?  Telegraf is a data-gathering tool made by InfluxData, and it’s stupid simple to get working with InfluxDB.  After following the previous install steps, go to /etc/telegraf/ and edit telegraf.conf.  Near the top is the Output Plugins section; make sure that’s modified for your InfluxDB install.  From there, scroll down to Input Plugins.  There’s a ridiculous number of input plugins available.  We’re focused on SNMP today, but it’s worth looking through the list to see if a “need” can be solved with Telegraf before resorting to some other custom script.
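
For reference, the output side is just a couple of lines.  This is a minimal sketch of an outputs.influxdb section for Telegraf 1.x, assuming the same InfluxDB server and database used elsewhere in these posts; adjust the URL and database to match your install:

[[outputs.influxdb]]
  urls = ["http://192.168.x.x:8086"]  ## Your InfluxDB server
  database = "dashboard"              ## The DB Telegraf writes to; Telegraf will try to create it if it doesn't exist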

For me, I needed to add SNMP for my Ubiquiti ER-X firewall and my Nutanix CE cluster.  Here’s my SNMP config section with the obvious security bits redacted:

# Retrieves SNMP values from remote agents
[[inputs.snmp]]
  agents = [ "192.168.x.x:161" ]  ## Nutanix CE CVM IP
  timeout = "5s"
  version = 3
  max_repetitions = 50
  sec_name = "username"
  auth_protocol = "SHA"    # Values: "MD5", "SHA", ""
  auth_password = "password"
  sec_level = "authPriv"   # Values: "noAuthNoPriv", "authNoPriv", "authPriv"
  priv_protocol = "AES"    # Values: "DES", "AES", ""
  priv_password = "password"
  name = "nutanix"
  [[inputs.snmp.field]]
    name = "host1CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.1"
  [[inputs.snmp.field]]
    name = "host2CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.2"
  [[inputs.snmp.field]]
    name = "host3CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.3"
  [[inputs.snmp.field]]
    name = "host4CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.4"
  [[inputs.snmp.field]]
    name = "ClusterIOPS"
    oid = "1.3.6.1.4.1.41263.506.0"
  [[inputs.snmp.field]]
    name = "Host1MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.1"
  [[inputs.snmp.field]]
    name = "Host2MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.2"
  [[inputs.snmp.field]]
    name = "Host3MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.3"
  [[inputs.snmp.field]]
    name = "Host4MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.4"

[[inputs.snmp]]
  agents = [ "192.168.0.1:161" ]  ## Firewall IP
  timeout = "5s"
  retries = 3
  version = 2
  community = "RO_community_string"
  max_repetitions = 10
  name = "ERX"
  [[inputs.snmp.field]]
    name = "Bytes.In"
    oid = "1.3.6.1.2.1.2.2.1.10.2"  ## ifInOctets for ifIndex 2
  [[inputs.snmp.field]]
    name = "Bytes.Out"
    oid = "1.3.6.1.2.1.2.2.1.16.2"  ## ifOutOctets for ifIndex 2
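
The trailing .2 on those last two OIDs is the ifIndex.  If you’re not sure which index maps to the port you care about, a quick walk of ifDescr will show you; this assumes the net-snmp tools and your read-only community string:

snmpwalk -v2c -c RO_community_string 192.168.0.1 1.3.6.1.2.1.2.2.1.2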

You’ll have to get Telegraf to read the config in again.  The sledgehammer method is a reboot; a restart of the Telegraf service also does the trick.  Reboots for me take about 5 seconds (yep, really), so it’s useful to confirm everything comes up clean on a reboot anyway.
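
On a systemd box that looks like the following; the -test flag does a single collection pass and prints the results, so you can confirm the SNMP sections parse before committing to the restart:

sudo systemctl restart telegraf
telegraf -config /etc/telegraf/telegraf.conf -test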

I just went through setting up Grafana on Ubuntu 16.04 and thought I would capture the steps.  I’m using a combination of Telegraf and some custom remote scripts to get data into InfluxDB.

curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update && sudo apt-get install influxdb
sudo service influxdb start
echo "deb https://packagecloud.io/grafana/testing/debian/ wheezy main" | sudo tee /etc/apt/sources.list.d/grafana.list
curl https://packagecloud.io/gpg.key | sudo apt-key add -
sudo apt-get update && sudo apt-get install grafana
sudo service grafana-server start
wget https://dl.influxdata.com/telegraf/releases/telegraf_1.2.1_amd64.deb
sudo dpkg -i telegraf_1.2.1_amd64.deb
telegraf -sample-config > telegraf.conf
nano telegraf.conf
telegraf -config telegraf.conf
sudo cp telegraf.conf /etc/telegraf/telegraf.conf
sudo systemctl enable grafana-server.service
sudo systemctl enable telegraf.service
sudo reboot
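
After the reboot, it’s worth a quick check that all three services came back up; on systemd that’s just:

systemctl status influxdb grafana-server telegraf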

This gets things installed.  I’ll have another post to describe other configuration that’s required.

This is going to be more of a stream of thought than a specific guide.  There are a lot of moving parts in this and no one seems to have the whole answer.  So, here’s what I’ve been working around so far.

Of course, you must have a tenant set up in Office 365.  It must have Azure AD Connect, or whatever they’re deciding to call it these days, functioning correctly.  There are plenty of resources for getting that far, so I won’t rehash it.

Many of the commands are run in the Lync Management Shell on the Front End (FE) server.

“Get-CsHostingProvider” should look like this:

Identity : LyncOnline
Name : LyncOnline
ProxyFqdn : sipfed.online.lync.com
VerificationLevel : UseSourceVerification
Enabled : True
EnabledSharedAddressSpace : True
HostsOCSUsers : True
IsLocal : False
AutodiscoverUrl : https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root

The syntax is:

Set-CsHostingProvider -Identity "LyncOnline" -VerificationLevel UseSourceVerification -HostsOCSUsers $True -EnabledSharedAddressSpace $True -AutodiscoverUrl https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root

Download SkypeOnlinePowershell.exe.  I’m not going to link to it because Microsoft likes to change download locations.  Install it on the same Lync FE.
Then, in Windows PowerShell:

Import-Module SkypeOnlineConnector
$cred = Get-Credential
$CSSession = New-CsOnlineSession -Credential $cred -OverrideAdminDomain "yourcompany.onmicrosoft.com"
Import-PSSession $CSSession -AllowClobber
Get-Service "msoidsvc"
Set-CsTenantFederationConfiguration -SharedSipAddressSpace $true

msoidcli is another download from Microsoft.  That and the SkypeOnlinePowershell package are add-ons that enable this functionality.  You’ll probably need them both.

You then need to move a pilot user to the online system.  The command is:

Move-CsUser -Identity user@sipdomain.com -Target sipfed.online.lync.com -Credential $cred -HostedMigrationOverrideUrl https://adminXX.online.lync.com/HostedMigration/hostedmigrationservice.svc -DomainController dc-internal-name.local

The adminXX URL needs to be grabbed from your S4B Online admin portal; it’s the part of the URL you see when you’re in the S4B dashboard.  The identity is the test user you want to migrate.  I ran into a lot of trouble getting this to work.  I had to figure out the above commands, and then I had to wait for the SharedSipAddressSpace setting to take effect.  It was not immediate.
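
While you’re waiting, you can at least confirm the setting has been applied.  From the same imported online session, something like:

Get-CsTenantFederationConfiguration | Select-Object SharedSipAddressSpace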

At the moment I’m still not able to route calls properly, but the user is showing up in the online S4B admin interface as being migrated.  Lync also shows the user being in LyncOnline.  I’ll edit this post as I make progress.

Unfortunately, I haven’t yet posted about how my Cobra is legal now.  In the meantime, how about this?  I recently got a cheap action camera off of Amazon.  It works pretty well, and the video quality is decent.  However, I needed a way to mount it to the Cobra.  I stumbled across the bracket you see below at a Tractor Supply store.  It’s meant for mounting one of those obscenely bright LED light bars to a truck, but it fits perfectly on the FFR roll bar: rock solid, and a good base for the camera mount.  I think it was <$15 for the pair.

[Photos: IMG_2411, IMG_2416]

Just a quick reminder note about something I’ve run into with Aerohive a couple of times.  If you get too anxious and start changing the config and rebooting quickly, the APs will get confused and seem to go into a waiting period.  Things will behave oddly, and you’ll get error messages like “There’s an admin modifying the config”, or something to that effect.  Just be patient, and either wait for or perform a full reboot.  And then be patient.  It seems like these things just need some time to get caught up occasionally.

Also, I ran into a situation where non-Apple devices would connect fine, but all Apple devices would either say “Unable to join” or “Incorrect password”.  No rhyme or reason to it.  Eventually, after several reboots, the Apple devices magically started working.  Again, just be patient.  It’s not like applying changes to a standalone AP, or even a local controller.  There’s that Internet thing getting in the way!

I’ve been having some trouble with my two Apple AirPort Extremes in the house.  They’re both a couple of generations old, and I got them used off of eBay some years ago.  They’ve served me well, with good throughput and signal coverage.  For some reason I can’t explain, in the last month they’ve become slow and buggy.  Maybe it was an update.  Regardless, I’ve had my eye on the new AC APs from Ubiquiti, and this was a good excuse to pull the trigger.

So, I decided to get a couple of the LR models: partly because I want more coverage out in the yard, partly because they’re less expensive, and partly because they’re readily available.  I set up the Unifi controller in a VM on Nutanix first, and installation could not have been easier.  So far, I’m very happy with the coverage and performance.  I’m getting good coverage in the house, and I can still use them almost 200′ away from it.

I’ve been running on a Nutanix CE install for about a month now.  With the November release they added some much-needed GUI controls for the image service.  You can now import ISOs for install images without having to fiddle with CLI stuff.

I’ve had virtually no problems, and the VMs are performing well.  If there’s one complaint I have with this solution, it’s that the baseline memory utilization is high.  I couldn’t reduce the CVMs to less than 8GB each without running into serious problems with the cluster.  Plus, there seems to be a missing 3GB per host.  I’m assuming this is what CE and the KVM host itself require, but that seems high; I know I can run VMware ESXi in less than 1GB per host.  So, 11GB per host is used up right from the start.  Since I’m running this on a shoestring budget with 16GB per host, I really only have 5GB available for VMs.  That kinda sucks.

On the upside, the CVMs at 8GB work fine, and the IO performance is pretty amazing.  I’ve seen upwards of 1600 IOPS at times.  This is basically a single consumer-grade 240GB SSD in each host for the primary tier and a 640GB HDD for the secondary tier, and I don’t think I’m even touching the secondary yet.  Three hosts with varying levels of i5 CPUs, none of them current gen.

I’m pretty happy with this and I’m looking forward to seeing what Nutanix does next.