ESP8266 Temperature logger for Nagios

In an earlier post I gave a brief review of the ESP8266 microcontroller, a lovely piece of hardware.
Now I'm posting a new schematic and a Nagios plugin that turn it into a real-time temperature logger for a Nagios server in a data centre.

Briefly what it does and how it works:

– The ESP8266 module reads the temperature sensor every 10 seconds and sends the reading via UDP
– The Nagios server processes the UDP data received from the ESP module and compares it with the thresholds configured in Nagios

If the Nagios server picks up a value that crosses a threshold, it sends a warning or a critical alert to the Nagios admin.
Below are the schematic and the related program code for the ESP and the Nagios server.

Connect the ESP and the Dallas DS18B20 sensor as in the picture below. This is the easiest way to wire them up (1-Wire).
I'm not going to go into the details of programming the ESP module; there are dozens of articles on the net about that.
I use LuaLoader myself, which I think is the easiest tool. If you just need the code files, jump to the end of this article, where you can download all the Lua files.
You must change the Nagios server's address in this line: cu:connect(7,"10.0.4.252") and also set your own SSID and password so the module can connect to your access point.

esp8266-ds18b20-2_bb

For the power supply I used an old USB cable to power the ESP module from a server in the data centre. After all, this is meant to check the temperature of the racks and servers in the DC. :)
USB supplies 5V, as we know, so you need to step this down to 3.3V. You can use an AMS1117-3.3 voltage regulator for this.
Temperature Sensor = Dallas
ESP module = ESP
AMS1117-3.3 = AMS

Here is the code for the ESP8266 module, followed by the Nagios server side.
Two files need to be uploaded to the ESP8266:

The first file, init.lua:

#####

function startup()
    if abort == true then
        print('startup aborted')
        return
    end
    print('Starting xmitTemp')
    dofile('xmitTemp.lua')
end

-- set abort = true on the console during the delay to stop xmitTemp from starting
abort = false
print('Startup in 15 seconds')
tmr.alarm(0,15000,0,startup)
#####

Second file xmitTemp.lua:

#####

function getTemp()

    local addr = nil
    local count = 0
    local data = nil
    local pin = 3 -- pin connected to DS18B20
    local s = ''

    -- setup gpio pin for oneWire access
    ow.setup(pin)

    -- do search until addr is returned
    repeat
        count = count + 1
        addr = ow.reset_search(pin)
        addr = ow.search(pin)
        tmr.wdclr()
    until((addr ~= nil) or (count > 100))

    -- if addr was never returned, abort
    if (addr == nil) then
        print('DS18B20 not found')
        return -999999
    end

    s=string.format("Addr:%02X-%02X-%02X-%02X-%02X-%02X-%02X-%02X",
        addr:byte(1),addr:byte(2),addr:byte(3),addr:byte(4),
        addr:byte(5),addr:byte(6),addr:byte(7),addr:byte(8))
    --print(s)

    -- validate addr checksum
    crc = ow.crc8(string.sub(addr,1,7))
    if (crc ~= addr:byte(8)) then
        print('DS18B20 Addr CRC failed');
        return -999999
    end

    -- family code 0x10 or 0x28 identifies a Dallas temperature sensor
    if not((addr:byte(1) == 0x10) or (addr:byte(1) == 0x28)) then
        print('DS18B20 not found')
        return -999999
    end

    ow.reset(pin) -- reset onewire interface
    ow.select(pin, addr) -- select DS18B20
    ow.write(pin, 0x44, 1) -- store temp in scratchpad
    tmr.delay(1000000) -- wait 1 sec

    present = ow.reset(pin) -- returns 1 if dev present
    if present ~= 1 then
        print('DS18B20 not present')
        return -999999
    end

    ow.select(pin, addr) -- select DS18B20 again
    ow.write(pin,0xBE,1) -- read scratchpad

    -- rx data from DS18B20 (9 bytes: 8 data bytes plus CRC)
    data = nil
    data = string.char(ow.read(pin))
    for i = 1, 8 do
        data = data .. string.char(ow.read(pin))
    end

    s=string.format("Data:%02X-%02X-%02X-%02X-%02X-%02X-%02X-%02X",
        data:byte(1),data:byte(2),data:byte(3), data:byte(4),
        data:byte(5),data:byte(6), data:byte(7),data:byte(8))
    --print(s)

    -- validate data checksum
    crc = ow.crc8(string.sub(data,1,8))
    if (crc ~= data:byte(9)) then
        print('DS18B20 data CRC failed')
        -- same error value as above, so that xmitTemp() skips this reading
        return -999999
    end

    -- compute and return temp as 99V9999 (V is implied decimal-a little COBOL there)
    return (data:byte(1) + data:byte(2) * 256) * 625

end -- getTemp

function xmitTemp()
    local temp = 0

    temp = getTemp()
    if temp == -999999 then
        return
    end

    cu:send(tostring(temp))

end -- xmitTemp

function initUDP()

    -- setup UDP port (7 = echo) on the Nagios server
    cu=net.createConnection(net.UDP)
    cu:connect(7,"10.0.4.252")
end -- initUDP

function initWIFI()

    print("Setting up WIFI...")

    wifi.setmode(wifi.STATION)
    wifi.sta.config("Your SSID","SSID Password")
    wifi.sta.connect()

    tmr.alarm(1, 1000, 1,
        function()
            if wifi.sta.getip()== nil then
                print("IP unavailable, Waiting...")
            else
                tmr.stop(1)
                print("Config done, IP is "..wifi.sta.getip())
            end
        end -- function
    )
end -- initWIFI

initWIFI()
initUDP()
-- send a reading every 10 seconds
tmr.alarm(0, 10000, 1, xmitTemp)
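
The value that getTemp() returns is easier to follow with a worked example. The sketch below is written in Python only for illustration and is not part of the ESP code. It repeats the same arithmetic on the two temperature bytes of the scratchpad; the function names are made up for this example, and the handling of negative readings is an addition that the Lua script above does not have.

```python
def ds18b20_raw(lsb: int, msb: int) -> int:
    """Combine the two temperature bytes into one signed 16-bit reading."""
    raw = lsb + msb * 256
    # Assumption: readings with the top bit set are negative (two's complement).
    # The Lua script skips this step, which is fine for a server room.
    if raw >= 0x8000:
        raw -= 0x10000
    return raw


def scaled_value(lsb: int, msb: int) -> int:
    """Value in ten-thousandths of a degree, the number the ESP sends out."""
    return ds18b20_raw(lsb, msb) * 625


def celsius(lsb: int, msb: int) -> float:
    """The same reading as degrees Celsius."""
    return scaled_value(lsb, msb) / 10000


if __name__ == "__main__":
    # Scratchpad bytes 0x68 0x01 are a raw reading of 360 sixteenths of a degree
    print(scaled_value(0x68, 0x01), celsius(0x68, 0x01))
```

One raw step of the DS18B20 is 0.0625 degrees, so multiplying by 625 gives the temperature with four implied decimals, which is the "99V9999" format mentioned in the comment.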

Next comes the Nagios server side:

Add the following configuration to your localhost.cfg.
This is usually in /usr/local/nagios/etc/objects/

define service{
use                             local-service         ; Name of service template to use
host_name                       Telehouse
service_description             Telehouse_Temperature
check_command                   check_temp
#check_interval                 0.5
#retry_interval                 1
#max_check_attempts             5
notification_interval           1
check_interval          1
retry_check_interval    1
max_check_attempts      5
}

#####

Create a file called check_temp in the libexec directory (/usr/local/nagios/libexec) and make it executable.

#!/bin/bash
# Reads the latest reading written by temp.sh (four digits, e.g. 2250 for 22.50)
TEMP_FILE=/home/temp/current_temp.txt

temp1=$(/usr/bin/cut -c 1-2 "$TEMP_FILE")
temp2=$(/usr/bin/cut -c 3-4 "$TEMP_FILE")

# thresholds in hundredths of a degree: warning at 22.00, critical at 25.00
op1=2200
op2=2500

count=$(/usr/bin/tail -n 1 "$TEMP_FILE")

count2=$count

# -lt compares numerically; < inside [[ ]] would compare the values as strings
if [[ "$count2" -lt "$op1" ]] ; then

status=0
statustxt=OK

elif [[ "$count2" -lt "$op2" ]] ; then

status=1
statustxt=WARNING
else

status=2
statustxt=CRITICAL
fi

echo "$status Temperature:$temp1.$temp2; Triggers: 22.00;25.00;0; $statustxt - $count2"
exit $status

######
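
The threshold logic of check_temp can be sketched on its own to make the exit codes clear. This is only an illustration in Python, not a replacement for the plugin; the function name classify is made up, and it assumes the reading arrives as a string of digits in hundredths of a degree, as the op1 and op2 values suggest.

```python
WARNING_AT = 2200   # 22.00 degrees, op1 in check_temp
CRITICAL_AT = 2500  # 25.00 degrees, op2 in check_temp


def classify(reading: str):
    """Return the Nagios exit code and status text for one reading."""
    value = int(reading.strip())
    if value < WARNING_AT:
        return 0, "OK"
    if value < CRITICAL_AT:
        return 1, "WARNING"
    return 2, "CRITICAL"


if __name__ == "__main__":
    for sample in ("2150", "2300", "2600"):
        print(sample, classify(sample))
    # Why the comparison has to be numeric: as strings, "950" sorts after "2200"
    print("950" < "2200", 950 < 2200)
```

Nagios only looks at the exit code (0, 1 or 2) to decide the service state; the text is what shows up in the web interface.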

Add a new crontab entry to run tshark, which captures the UDP echo messages from the ESP module.
If you don't have tshark/wireshark installed, install it on your box first.
CentOS: yum install wireshark
Debian: apt-get install tshark

nano /etc/crontab

01 * * * * root cd /home/temp && /usr/bin/tshark -a duration:3600 -i eth0 -f "src 10.0.4.30" -T fields -e data -w temp2.pcap > /dev/null 2>&1
* * * * * root /home/temp/temp.sh

######

Create a new directory called temp in /home:

mkdir /home/temp

Create a file called temp.sh in it and make it executable:

#!/bin/bash

cat /home/temp/temp2.pcap | tr -dc '[:alnum:]\n\r' | cut -c 2-5 | awk 'length($0) > 2' | tail -n 1 -c 5 > /home/temp/current_temp.txt
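
For reference, the payload itself is plain text: the ESP sends the number as ASCII digits. The Python sketch below shows how such a payload could be decoded. It assumes the payload is available as a hex string (the form in which tshark prints the data field, with or without colons depending on the version); the function names are made up for this example, and it is not what temp.sh does.

```python
def decode_payload(field: str) -> str:
    """Turn a hex string such as '323235303030' back into the digits sent."""
    hex_digits = field.strip().replace(":", "")
    return bytes.fromhex(hex_digits).decode("ascii")


def to_celsius(payload: str) -> float:
    """The ESP sends the temperature with four implied decimals."""
    return int(payload) / 10000


if __name__ == "__main__":
    digits = decode_payload("32:32:35:30:30:30")
    print(digits, to_celsius(digits))
```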

To check that the ESP8266 is sending the correct UDP packets, run this command:

tcpdump -i eth0 udp

You should see similar UDP packets from the ESP module every 10 seconds:

18:10:40.248116 IP 10.0.4.30.45908 > nagiosnew.echo: UDP, length 6

And in Nagios you will hopefully see this:

telehouse

References:

https://bigdanzblog.wordpress.com/2015/04/29/snmp-environmental-monitoring-using-esp8266-based-sensors/
http://www.instructables.com/id/Low-cost-WIFI-temperature-data-logger-based-on-ESP/?ALLSTEPS
http://benlo.com/esp8266/
https://github.com/nodemcu/nodemcu-firmware/tree/master/


http://www.7layer.org/downloads/init.lua

http://www.7layer.org/downloads/xmitTemp.lua
http://www.7layer.org/downloads/esp8266_flasher.exe
http://www.7layer.org/downloads/v0.9.2.2 AT Firmware.bin

 

 

VMware networking setup for vMotion/iSCSI & VM traffic

VMware ESX/ESXi network setup.

In this post I will show you some networking setups for VMware servers.
It involves Cisco switches (2960/3750 series) and HP or Dell servers.
I have had these configurations running in production for quite some time now (2+ years) without any issues.

As we know, the networking setup for VMware servers is much more complicated than the setup we had earlier for “classic” physical Linux or Windows boxes.
The classic single uplink with a regular VLAN is no longer enough for VMware.
You must separate virtual machine traffic from management traffic, and you must also separate storage and vMotion traffic.
VMware says you can have a separate vSwitch for each physical connection with different VLANs, but failover to the other physical connections is then more complicated than with one or two vSwitches. A VMware server needs a minimum of 2 network uplinks for VM traffic and management traffic, but VMware recommends 4 uplinks per physical server.

The following picture shows briefly the current setup.

7layer

So let's take a look at the 4-uplink configuration on the VMware ESXi host:

two

All 4 uplinks are connected to the same vSwitch. With this configuration it is very easy to create failover for the management traffic and to separate the storage and vMotion traffic as well. Let's take a look at the vSwitch properties:

vswitch0-1

Also take a look at the NIC teaming for the vSwitch.
As you can see, all adapters are active on this vSwitch:

nic-teaming

 

Now take a look at the management uplink settings.
The management network has one active adapter and two standby adapters.
If the physical connection of the active vnic0 adapter fails (switch issue or cable issue), the VMkernel activates one of the standby adapters.
With this setup the management network stays available as long as one of these uplinks is up, so a single link failure will not cut you off from the VMware box.
       mgmt-ipstorage

Now we check the vMotion settings.
Here we have an added VMkernel port for vMotion and IP storage, which has an extra IP address for vMotion.
As you can see, it has one active adapter and three unused adapters. To separate this kind of traffic properly, you must tick the option to override the failover order and move down the adapters that you don't want this VMkernel port to use. The setting is the same for iSCSI storage.
vmotionvmotion-ip

Now take a look at the storage IP VMkernel settings.
Here too we have an extra VMkernel port with its own IP address.
In this setup the other active vSwitch adapters have again been moved to unused, as you can see in the picture.
Without this you won't be able to add the software iSCSI storage properly. The VMkernel IP setting creates a one-to-one connection to the storage, so only one active adapter should be enabled in each VMkernel port group. You can still have more than one path to the iSCSI storage, but you need to enable that feature in the iSCSI setup.

storageiscsi

Now take a brief look at the Virtual Machine Port Group settings and their VLAN settings.
You can add new VLANs here and create load balancing and failover for the virtual machines.
I used two of the physical adapters for the virtual machines, and they are activated as vnic5/vnic1 and vnic1/vnic5, in opposite order to each other.
If you have 4 or 6 uplink adapters, you could activate 3-4 adapters for the virtual machines; it's up to you.
This also depends on how heavily loaded your virtual machines are. If they are heavily loaded, it is better to separate the loads and keep vMotion and management traffic off the physical uplinks used by the VMs.
1-x-network-vlan

4-x-network-vlan

 

I know it's getting a bit confusing, so here again is how the VLAN, management traffic and vMotion traffic are bound together:
mgmt-vlan

vmotion-vlan

The storage traffic is not part of any of these; it is just connected via the VMkernel port group IP address as a one-to-one connection:

storage-vlan

In this setup I use the same VLAN for vMotion, because it is only used for maintenance. If you use vMotion heavily, it is better to separate it into a different VLAN. You might also dedicate a physical uplink to this traffic, which separates it not just at the VLAN level but at the physical level as well.
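
The failover orders described above can be modelled in a few lines of Python. This is only a sketch of the selection logic, not how ESXi implements it. Which adapter is active for vMotion and IP storage is a hypothetical choice here (the screenshots define the real assignment), and the names PORT_GROUPS and pick_uplink are made up for this example.

```python
from typing import Optional, Set

# Failover order per port group. Adapter assignments are illustrative only.
PORT_GROUPS = {
    "Management": {"active": ["vnic0"], "standby": ["vnic1", "vnic2"], "unused": []},
    "vMotion": {"active": ["vnic1"], "standby": [], "unused": ["vnic0", "vnic2", "vnic5"]},
    "IP storage": {"active": ["vnic2"], "standby": [], "unused": ["vnic0", "vnic1", "vnic5"]},
}


def pick_uplink(group: str, failed: Set[str]) -> Optional[str]:
    """Return the first healthy adapter: active ones first, then standby.

    Unused adapters are never considered, which is what keeps the
    vMotion and storage traffic on one dedicated uplink.
    """
    policy = PORT_GROUPS[group]
    for nic in policy["active"] + policy["standby"]:
        if nic not in failed:
            return nic
    return None


if __name__ == "__main__":
    print(pick_uplink("Management", set()))
    print(pick_uplink("Management", {"vnic0"}))
    print(pick_uplink("vMotion", {"vnic1"}))
```

The last call returns None: a port group with a single active adapter and no standby has no failover, which is the price of the strict separation.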

And finally the physical uplink ports to the Cisco switch:

interface GigabitEthernet1/0/22
description 10.0.4.92 vnic0
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

interface GigabitEthernet1/0/23
description 10.0.4.92 vnic2
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999

switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

The native vlan 999 command changes the VLAN used for untagged traffic on the trunk, which defaults to VLAN 1.
With this command you can keep unnecessary layer 2 traffic, such as flooding and broadcasts, away from the VMware server.
If your system is already managed by vCenter, you sometimes cannot change the management VLAN, because vCenter can no longer reach the box and the change fails, or the box gets dropped from vSphere. In that case you need to disconnect the server from vCenter, create a second VMkernel interface in a different IP subnet on a different physical interface from the one currently in use, and connect to the box via that VMkernel interface. Then you can make any major change to the main interface (native VLAN, VLAN tagging etc.). A few times when I made such changes I lost the connection to the server and had to reset the VMkernel management network, roll back the switch configuration or change the native VLAN on the switch. So be careful with these changes if you cannot physically reach the box (because the server is in a data centre or a different office).

So now let's take a look at the Cisco switch side after the native VLAN and trunking configuration:

Port        Mode             Encapsulation  Status        Native vlan
Gi1/0/22    on               802.1q         trunking      999

Port        Vlans allowed on trunk
Gi1/0/22    100,200,400

Port        Vlans allowed and active in management domain
Gi1/0/22    100,200,400

 

References:

https://www.vmware.com/files/pdf/support/landing_pages/Virtual-Support-Day-Best-Practices-Virtual-Networking-June-2012.pdf
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2038869
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2045040

 

IoT Temperature logger with ESP8266 and DS18B20 sensor

Current living room temperature:

I will post the circuit schematic and the code shortly; in the meantime, this is the module that I used.
I'm also posting the firmware flasher and the firmware that I used for this project.
There are many variants available on the net and you could easily get confused, so follow these links to see what I bought and used.

ESP8266 used for this project: ESP-01: http://www.esp8266.com/wiki/doku.php?id=esp8266-module-family#esp-01

esp8266-pinout

ESP8266 on ebay: http://www.ebay.co.uk/sch/items/?_nkw=esp8266&_sacat=&_ex_kw=&_mPrRngCbx=1&_udlo=&_udhi=&_sop=12&_fpos=&_fspt=1&_sadis=&LH_CAds=&rmvSB=true

DS18B20 sensor on ebay: http://www.ebay.co.uk/sch/i.html?_fspt=1&_mPrRngCbx=1&_from=R40&_sacat=0&_nkw=DS18B20+sensor&_sop=15

Nodemcu Firmware: https://github.com/nodemcu/nodemcu-firmware/tree/master/pre_build/latest

esp8266_flasher

Programming and testing with Lualoader: http://benlo.com/esp8266/
download: http://benlo.com/esp8266/LuaLoader.zip

Programming and testing with ESPlorer: http://esp8266.ru/esplorer/
download: http://esp8266.ru/esplorer-latest/?f=ESPlorer.zip  

 

 

 

 

Linux server migration with VMware converter

This post shows how to migrate a live Linux/Windows machine remotely from any source to any destination.

I'll do this with the 7layer.org website and post all the related screenshots of the migration.
I am going to use VMware Converter, which takes care of everything.

Requirements: 

– Running VMware server

– Source machine with SSH or RDP connection (Linux/Windows)

– Destination VMware server

– VMware converter (free to download from VMware site)

OK, let's start VMware Converter and connect to the source machine.

Here you need to select "Powered-on machine" as the source type.
You also need to enter the login name and password.

 

convert1

 

On the next tab you need to enter the destination server's IP address and login details.

 

convert2

 

Then on the next tab you must enter the machine name.
On Linux this is picked up automatically from the hosts file; you can leave it as it is if you prefer.

convert3

 

At the next step you must be really careful with the virtual machine version.
VMware automatically offers version 10, which can only be managed from vCenter, and that is not free.
So change this to version 8 or lower, and you will be able to manage the machine from the vSphere client connected to ESXi.
The version refers to the virtual hardware version of the machine. I use version 8, which is the highest version available without any licensing issues or cost.

convert4

convert5

On the destination page you should also choose which datastore to use for the machine.

On the next tab Converter asks for the final conversion parameters.
Here you must edit the Helper VM network tab and add an extra IP address on the local network where the destination VMware server is.
Without this the conversion usually dies at around 1% or 2% without any further notification.

convert6

 

convert7

 

You should also tick "Reconfigure destination virtual machine" under the advanced options.
This fixes the initramdisk on the destination machine.

convert8

After this we can start the real migration process.

convert9

 

This creates the machine on the destination server, starts it up automatically and begins pulling the data from the source machine.

convert10

Destination server console with the running machine while it’s pulling down the data from source:

convert11

You can see the progress is quite quick; it depends on the actual network speed and on the CPU and disk speed of the source and destination machines.

convert12

On the source machine VMware Converter uses the tar command to pack the full disk into an image and sends it over the network to the destination machine as a compressed file.
This process loads the source quite a bit, though of course that depends on the source box. This one is not a physical machine, just a virtual one with 1 CPU socket and 512 MB of RAM.
So it is definitely not a strong box, and it is running other web sites at the moment, which is why top shows a high load.

convert13

Now let’s see the finished converted machine:

converter14

Indeed it reached the final stage; pulling down the whole machine took about an hour.
The destination machine is currently switched off on the destination server, but it contains a full copy of the source.
So let's see if it boots up without any issues:

vmware2

It seems we got a kernel panic because the local disk UUID could not be found.
So now we need a CentOS disk to boot the box in rescue mode and fix the disk UUID.
Upload the CentOS version that you used on the source box to the VMware server and attach it to the machine.
7layer runs the 64-bit version, so I'll use that to fix the destination machine.

convert14

 

 

Boot the machine, but be quick: you only have about 1 second to press Escape at the BIOS screen and choose to boot from CD-ROM.
At the CentOS boot screen choose the "Rescue installed system" option to fix the box.

convert15

 

Then choose Continue, which gives you the option to modify the root file system:
convert16

Then at the next screen you can see the root system mounted under /mnt/sysimage.

convert17

 

 

Then choose the shell option from the menu.

convert18

convert19

convert20

 

Now the whole root file system is mounted, so we can check the fstab and boot entries on the box.
Check the GRUB device map /boot/grub/device.map and /boot/grub/grub.conf for the correct HDD type.
Also check that /etc/fstab is correct.

Old fstab:

convert21

 

New fstab, corrected by VMware Converter during the conversion:

convert22

Also device map looks correct:

convert23

After checking these we need to fix the GRUB loader and the initramdisk.

This procedure can also be found on the VMware site: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002402

Rebuild the initramdisk:

convert25

mkinitrd -v -f /boot/initramfs-2.6.32-431.29.2.el6.x86_64.img 2.6.32-431.29.2.el6.x86_64

The kernel version in this line must match the kernel entry in your GRUB config, so check it in the /boot directory.
Rebuilding the whole initramdisk takes about 5-10 seconds.

We also need to correct the disk UUID in the GRUB boot loader. Follow either of the two methods below:

Correcting grub loader with UUID change:
#############

Run ls -l /dev/disk/by-uuid to check the correct UUID for the sda3 disk.

The UUID is very long and easy to mistype, so it's better to append the output of the ls command to grub.conf and then move it to the correct place.

ls -l /dev/disk/by-uuid >> /boot/grub/grub.conf then move it from the end.

In this case the last line contains the correct UUID, the one that points to /dev/sda3.
So move this UUID to the root=UUID= part of the kernel line and delete the rest of the appended output.
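
The same edit can be expressed as a small Python sketch, which may help to see exactly which part of the kernel line changes. It is an illustration only (the rescue shell may not have Python, so the manual edit above is still the way to do it). The function names are made up, and the sample ls output and the old UUID in the usage lines are invented for the example.

```python
import re

UUID_RE = re.compile(r"root=UUID=[0-9a-fA-F-]+")


def uuid_for_device(ls_output: str, device: str) -> str:
    """Find the UUID whose symlink in /dev/disk/by-uuid points at the device."""
    for line in ls_output.splitlines():
        if "->" not in line:
            continue
        left, target = line.rsplit("->", 1)
        if target.strip().split("/")[-1] == device:
            return left.split()[-1]
    raise LookupError(device)


def set_root_uuid(kernel_line: str, new_uuid: str) -> str:
    """Replace only the root=UUID=... value and keep the other options."""
    return UUID_RE.sub("root=UUID=" + new_uuid, kernel_line)


if __name__ == "__main__":
    # Invented sample of `ls -l /dev/disk/by-uuid` output
    listing = (
        "lrwxrwxrwx 1 root root 10 Feb 27 08:00 "
        "c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 -> ../../sda3"
    )
    old = "kernel /vmlinuz ro root=UUID=11111111-2222-3333-4444-555555555555 rd_NO_LUKS"
    print(set_root_uuid(old, uuid_for_device(listing, "sda3")))
```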

convert26

convert27

 

#############

Correcting the GRUB loader by changing only the kernel line in /boot/grub/grub.conf:

convert30

Now you can run grub-install with the corrected disk name:

convert24

 

When it has finished, try to reboot the box. The reboot and halt commands usually don't work here in rescue mode.
Instead use the VMware console top menu ==>> VM ==>> Guest ==>> Send Ctrl+Alt+del to reboot out of the rescue disk.
Then wait for the box to reboot and see if it works fine.

convert28

convert29

 

Voilà, it's booting up! :)

If you have trouble with /lib/modules/{kernel-number}/modules.dep at boot, you need to rebuild the initramdisk again.
Check the VMware site for details and carefully compare the initramdisk name and the kernel name in the /boot directory.

Here are the important parts that must be correct, otherwise the kernel won't boot:

#############

[root@7layer.org grub]# cat grub.conf | head -n 20

#
# Hetzner Online AG – installimage
# GRUB bootloader configuration file
#

timeout 5
default 0

title CentOS (2.6.32-431.29.2.el6.x86_64)
root (hd0,1)
kernel /vmlinuz-2.6.32-431.29.2.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset crashkernel=auto SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=de
initrd /initramfs-2.6.32-431.29.2.el6.x86_64.img

#############

cat menu.lst | head -n 20
#
# Hetzner Online AG – installimage
# GRUB bootloader configuration file
#

timeout 5
default 0

title CentOS (2.6.32-358.6.1.el6.x86_64)
root (hd0,1)
kernel /boot/vmlinuz-2.6.32-358.6.1.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset
initrd /boot/initramfs-2.6.32-358.6.1.el6.x86_64.img

#############

cat device.map
(hd0) /dev/sda

#############

Fstab:

 

 

 

SPF record setup for mail server

How to set up and test an SPF record for a mail server:

Let's check Google's SPF record first with the dig command.

[root@mail ~]# dig txt google.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52169
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com. IN TXT

;; ANSWER SECTION:
google.com. 3599 IN TXT "v=spf1 include:_spf.google.com ip4:216.73.93.70/31 ip4:216.73.93.72/31 ~all"

;; Query time: 12 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Feb 27 08:47:05 2015
;; MSG SIZE rcvd: 116

[root@mail ~]#

The answer section lists the IP ranges. These are the servers that are allowed to send mail on behalf of google.com.
Say you have a domain, e.g. google.com, and a mail server for it with an A record such as mail.google.com. Without an SPF record, receiving servers cannot tell which hosts are authorised to send mail for the domain. The SPF record lists them: here ip4:216.73.93.70/31 and ip4:216.73.93.72/31 authorise the addresses 216.73.93.70 to 216.73.93.73, include:_spf.google.com pulls in the hosts listed in that domain's SPF record, and ~all marks mail from any other server as a soft fail.
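
To make the ip4 ranges concrete, here is a minimal Python sketch that splits an SPF record into its parts and checks an address against the ip4 mechanisms. It is not a full SPF evaluator: include: entries are only listed, not resolved via DNS, and the function names are made up for this example.

```python
import ipaddress


def parse_spf(record: str) -> dict:
    """Split an SPF TXT record into ip4 networks, includes and the all term."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF record")
    parsed = {"ip4": [], "include": [], "all": None}
    for term in terms[1:]:
        if term.startswith("ip4:"):
            parsed["ip4"].append(ipaddress.ip_network(term[4:], strict=False))
        elif term.startswith("include:"):
            parsed["include"].append(term[len("include:"):])
        elif term.endswith("all"):
            parsed["all"] = term
    return parsed


def ip_listed(ip: str, parsed: dict) -> bool:
    """True if the address falls inside one of the ip4 ranges of the record."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in parsed["ip4"])


if __name__ == "__main__":
    record = "v=spf1 include:_spf.google.com ip4:216.73.93.70/31 ip4:216.73.93.72/31 ~all"
    spf = parse_spf(record)
    print(spf["include"], spf["all"])
    print(ip_listed("216.73.93.71", spf), ip_listed("216.73.93.74", spf))
```

A /31 covers two addresses, so the two ranges together list four senders.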

You can also reference other domains in an SPF record with include:, which tells the receiving server to apply those domains' SPF records as well, instead of listing IP addresses.

[root@mail ~]# dig txt smsnetmonitor.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt smsnetmonitor.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64785
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;smsnetmonitor.com. IN TXT

;; ANSWER SECTION:
smsnetmonitor.com. 21599 IN TXT "v=spf1 ip4:212.23.51.62 include:cloudsupportuk.com include:cctvalarm.net include:7layer.org include:smsgpstracker.com ~all"
smsnetmonitor.com. 21599 IN TXT "v=DMARC1\; p=none\; adkim=r\; aspf=r\; sp=none"

;; Query time: 36 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Feb 27 08:59:31 2015
;; MSG SIZE rcvd: 225

[root@mail ~]#

 

Create and check SPF records:

http://www.spfwizard.net/
http://www.mtgsy.net/dns/spfwizard.php

http://mxtoolbox.com/spf.aspx
http://vamsoft.com/support/tools/spf-syntax-validator

Header check for emails to analyse SPF and other issues:

https://toolbox.googleapps.com/apps/messageheader/
http://mxtoolbox.com/EmailHeaders.aspx

 

 

 

MIPS Development

A MIPS-based development board from Imagination Technologies.

http://blog.imgtec.com/powervr-developers/new-mips-creator-ci20-development-board-for-linux-and-android-debuts

I just received my new CI20 MIPS-based development board from Imgtec and I must say it is a piece of engineering art! :)

I have already set up an Apache2 web server, an SSH server, a MySQL server, a basic firewall (with iptables) and a Postfix mail server. http://86.1.80.160
I'm going to build an SMS server, hook it up to my CCTV system and publish everything about it shortly.

Thank you again Imgtec! http://www.imgtec.com/

Related links to CI20 and other Linux based hardware development:

http://elinux.org/Main_Page
http://elinux.org/MIPS_Creator_CI20

 

Debian Distro Upgrade

So let's get our hands dirty with a Debian Linux distro upgrade!
This week I received a complaint about one of our servers, which had some dodgy, outdated PHP packages installed on it.
I had to investigate what had happened to the box and fix the issue.
I found that it was running Debian lenny, which is quite old and past its end of life.
This release has had no security updates since 2012, so the box must be upgraded to a newer release to fix the issue.
Although the box is behind a firewall, it is still dangerous to have an outdated box sitting on the net.
So I had to do a full distro upgrade on the box, which went as follows:
First I installed the latest packages from the original distro, which was lenny.
After the update I rebooted the box and changed the sources to the next release. For the first pass that is squeeze; the list below shows the entries for the final pass to wheezy:
# nano /etc/apt/sources.list
deb http://ftp.uk.debian.org/debian/ wheezy main contrib non-free
deb-src http://ftp.uk.debian.org/debian/ wheezy main contrib non-free
deb http://security.debian.org/ wheezy/updates main contrib non-free
deb-src http://security.debian.org/ wheezy/updates main contrib non-free
Then I started the upgrade process like this:
# aptitude update
# aptitude safe-upgrade
# aptitude dist-upgrade
Follow the instructions from aptitude; it will ask what you want to do with conflicting packages.
For example, if php.ini has been modified locally, what should happen to it?
Keep the current modified version or use the one provided by the distro?
Sometimes you need to use the distro-provided config file, otherwise the service won't be able to start.
For example, I kept the old mysql-server config and the new version could not start.
So I replaced it with the new one, re-applied some of the old settings, and voilà, it started up just fine.
So to upgrade from lenny to wheezy you must upgrade to squeeze first, then to wheezy:
lenny -> squeeze -> wheezy
Be patient and prepare a few good coffees for the upgrade, because it will take some time!
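
Switching the sources between the two passes is just a matter of replacing the release name in each deb line. Here is a minimal Python sketch of that step; the function names are made up, and it assumes plain deb/deb-src lines without bracketed options. Editing the file by hand with nano works just as well.

```python
UPGRADE_PATH = ["lenny", "squeeze", "wheezy"]


def next_release(current: str) -> str:
    """Return the release to upgrade to next; skipping one is not supported."""
    return UPGRADE_PATH[UPGRADE_PATH.index(current) + 1]


def switch_release(sources: str, old: str, new: str) -> str:
    """Rewrite the suite field (e.g. squeeze, squeeze/updates) of each deb line."""
    out = []
    for line in sources.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] in ("deb", "deb-src"):
            suite = parts[2]
            if suite == old or suite.startswith(old + "/"):
                parts[2] = new + suite[len(old):]
            line = " ".join(parts)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    squeeze_sources = (
        "deb http://ftp.uk.debian.org/debian/ squeeze main contrib non-free\n"
        "deb http://security.debian.org/ squeeze/updates main contrib non-free"
    )
    print(switch_release(squeeze_sources, "squeeze", next_release("squeeze")))
```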
 

VMware free backup solution from virtuallyGhetto

Backup:

VMware free backup solution for ESXi servers: https://communities.vmware.com/docs/DOC-8760

Download the script from GitHub: https://github.com/lamw/ghettoVCB then modify it to fit your system.

I’m going to explain the important parts that I usually change in this script:

In ghettoVCB.sh file:

– Backup path
– Rotation
– Backup format
– Email server
– Email to
– Email from

1: 

# directory that all VM backups should go (e.g. /vmfs/volumes/SAN_LUN1/mybackupdir)
VM_BACKUP_VOLUME=/vmfs/volumes/NAS-Fuji/backup

2:

# Format output of VMDK backup
# zeroedthick
# 2gbsparse
# thin
# eagerzeroedthick
DISK_BACKUP_FORMAT=thin

3:

# Number of backups for a given VM before deleting
VM_BACKUP_ROTATION_COUNT=3

Also in ghettoVCB.conf

VM_BACKUP_VOLUME=/vmfs/volumes/NAS-Fuji
DISK_BACKUP_FORMAT=thin
VM_BACKUP_ROTATION_COUNT=3

4:

EMAIL_SERVER=10.0.100.10

5:

EMAIL_TO=lszabo@7layer.org

6:

EMAIL_FROM=root@ghettoVCB

Once you have uploaded the ghettoVCB.sh and ghettoVCB.conf files, you need to add the execute flag to the ghettoVCB.sh file:

chmod +x ghettoVCB.sh

Then you can start backing up your virtual machines.

To back up only one machine, run this:

./ghettoVCB.sh -m vm_to_backup

To back up all machines:

./ghettoVCB.sh -a

If you want a machine to be excluded from the backup, use an exclusion file:
./ghettoVCB.sh -a -e vm_exclusion_list

The VMware firewall won't allow the script to send outgoing emails, so this needs to be fixed.
Upload the smtp.xml file to the VMware server and refresh the firewall; without this you will receive an error on the VMware SSH console.

Script:  smtp.xml

Upload it to /etc/vmware/firewall and refresh the firewall with esxcli:

esxcli network firewall refresh

Then click on the server name, Configuration, Security Profile and you will see the new outbound SMTP port listed as an outgoing firewall rule that allows SMTP traffic from the server.

backup-smtp

 



Restore:

Although you can use the restore script (https://communities.vmware.com/docs/DOC-10595) to restore machines from backup, you can also use a backup straight away by adding it to the inventory directly from the backup path.
This is much quicker than the restore process, but obviously the machine will then run from the backup path, not the original path. This gets the machine back ASAP; afterwards create a backup onto the original path, shut down the machine on the backup path, add the machine on the original path to the inventory and start it up.

backup

The only thing left to do is to make this process automatic.
Edit the crontab on your server and add this to it:
10 00 * * 1-5 /vmfs/volumes/ghettoVCB.sh -f /vmfs/volumes/Fuji-NAS/backuplist > /vmfs/volumes/ghettoVCB-backup-$(date +\%s).log

On VMware ESXi 5.5 the crontab files are in /var/spool/cron/crontabs/ and the root file contains the current crontab configuration.

cat /var/spool/cron/crontabs/root

#min hour day mon dow command
1 1 * * * /sbin/tmpwatch.py
1 * * * * /sbin/auto-backup.sh
0 * * * * /usr/lib/vmware/vmksummary/log-heartbeat.py
*/5 * * * * /sbin/hostd-probe ++group=host/vim/vmvisor/hostd-probe
10 00 * * 1-5 /vmfs/volumes/datastore1/root/opt/ghettoVCB.sh -f /vmfs/volumes/datastore1/root/opt/vmbackup.txt

This runs the backup at 00:10 (ten past midnight) on Monday to Friday, but you can change it according to your needs.
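
The five schedule fields are easy to misread, so here is a small Python sketch that splits a crontab line into named fields. It is just an illustration with made-up function names, and start_time only handles plain numbers in the minute and hour fields.

```python
FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(line: str):
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)
    schedule = dict(zip(FIELDS, parts[:5]))
    return schedule, parts[5]


def start_time(schedule: dict) -> str:
    """Format the start time as HH:MM (numeric minute and hour only)."""
    return "%02d:%02d" % (int(schedule["hour"]), int(schedule["minute"]))


if __name__ == "__main__":
    entry = "10 00 * * 1-5 /vmfs/volumes/ghettoVCB.sh -f /vmfs/volumes/Fuji-NAS/backuplist"
    schedule, command = split_cron(entry)
    print(start_time(schedule), schedule["day_of_week"], command)
```

The first field is the minute and the second is the hour, so "10 00" means 00:10, and 1-5 in the last field means Monday to Friday.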
 

NAS4Free

NAS4Free highly available iSCSI failover storage for VMware servers.

This post shows how to install and set up a NAS4Free server as iSCSI storage for your ESXi/ESX VMware server.
NAS4Free is based on FreeBSD and has all the services required (HAST and CARP) to act as a highly available storage server.
Of course you can also use this solution in your network as general highly available storage or as a Windows CIFS/Samba server if you adjust the services on NAS4Free.
I'll stick to the iSCSI setup first; later I will show how to set up NFS and Windows (Samba) shares.

The following setup is used here:

Node1 primary IP address for serving iSCSI and CARP services: 192.168.101.165
Node1 secondary IP address for HAST synchronisation: 172.16.100.1

Node2 primary IP address for serving iSCSI and CARP services: 192.168.101.166
Node2 secondary IP address for HAST synchronisation: 172.16.100.2

Virtual IP address(CARP address) for iSCSI service: 192.168.101.167

Node1 host name: has1
Node2 host name: has2

Install the latest NAS4Free edition on both nodes.

– Change the node names according to your setup, for example has1 and has2 as used here.

hostname

– Add the node names to the hosts file on both nodes.

hosts
– Set up the CARP services under Network/Interface management:

carp1

– Advertisement skew on has1 node: 0
– Advertisement skew on has2 node: 10

If has1 node dies then has2 node will take over all the services.
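The skew values are what decide this: with the same advertisement base, the node with the lower advertisement skew advertises more often and is preferred as master. A tiny sketch of that rule only (it does not talk to CARP); `carp_preferred` is a hypothetical helper that takes `name:skew` pairs for the nodes that are still alive:

```shell
# carp_preferred (hypothetical helper): print the node with the lowest advertisement skew.
carp_preferred() {
    best_name=""
    best_skew=""
    for pair in "$@"; do
        name=${pair%%:*}        # part before the colon
        skew=${pair##*:}        # part after the colon
        if [ -z "$best_skew" ] || [ "$skew" -lt "$best_skew" ]; then
            best_name=$name
            best_skew=$skew
        fi
    done
    printf '%s\n' "$best_name"
}

carp_preferred has1:0 has2:10   # prints: has1
carp_preferred has2:10          # prints: has2 (has1 is gone, so has2 takes over)
```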

carp2

You must use the same link up and link down actions on both nodes, otherwise the switch-over won’t work properly!
So everything should be the same on both nodes except the advertisement skew value.

Next step: set up the HAST services:

hast1

hast3

As you can see here, the second network interface card is used for the HAST synchronisation, not the main interface.
After you set up the HAST service, reboot both nodes. For some reason the Apply button alone does not start the services.

– Switch on ssh service and ssh into both nodes.

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

Check both nodes with: hastctl status
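Since the two command lists differ only in the last line, it is easy to run the wrong one on the wrong node. As a guard, here is a minimal dry-run sketch that only prints the three commands for a given role so you can read them before running anything. `hast_commands` is a hypothetical helper, and `disk1` is the HAST resource name used in this post.

```shell
# hast_commands (hypothetical helper): print the HAST init sequence for one node.
# Usage: hast_commands primary|secondary <resource>
hast_commands() {
    role=$1
    res=$2
    case "$role" in
        primary|secondary) ;;
        *) echo "role must be primary or secondary" >&2; return 1 ;;
    esac
    printf 'hastctl role init %s\n' "$res"
    printf 'hastctl create %s\n' "$res"
    printf 'hastctl role %s %s\n' "$role" "$res"
}

hast_commands primary disk1     # what to run on the master
hast_commands secondary disk1   # what to run on the slave
```

It prints text and never calls hastctl itself, so it is safe to try on any machine. Anything other than primary or secondary is rejected.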

Then configure ZFS
On Master:

Add disks (Disks->Management)

disk1: N/A (HAST device)
Advanced Power Management: Level 254
Acoustic level: Maximum performance
S.M.A.R.T.: Checked
Preformatted file system: ZFS storage pool device

Format as zfs (Disks->Format)

Add ZFS Virtual Disks (Disks->ZFS->Pools->Virtual Device)

Add Pools(Disks->ZFS->Pools->Management)

Add a PostInit script on both nodes under the System/Advanced/Command scripts tab:
/usr/local/sbin/carp-hast-switch slave

Shut down the master and, on the slave, import the pool through the GUI (tab: ZFS/Configuration/Detected).
Then synchronise the pool on the slave!

When the slave has finished, start the master and switch the VIP back to the master.

zpool status disk1
hastctl status

Troubleshooting commands from SSH terminal:

zpool status

########
nast1: ~ # zpool status mvda0
  pool: mvda0
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME                   STATE     READ WRITE CKSUM
        mvda0                  UNAVAIL      0     0     0
          2144332937472371213  REMOVED      0     0     0  was /dev/hast/hast
#########

If the state is UNAVAIL, you can try:

zpool clear <pool name>

This clears the error state of the pool. As the output below shows, a scrub of the local disks then runs.

#########
nast1: ~ # zpool status mvda0
  pool: mvda0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Jun  2 15:26:25 2014
        1.19G scanned out of 1.43G at 28.3M/s, 0h0m to go
        0 repaired, 82.75% done
config:

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0
#########

Then check pool again:
zpool status

#########
nast1: ~ # zpool status
  pool: mvda0
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Jun  2 15:27:17 2014
config:

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0
#########
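If you want to check this from a script rather than by eye, the `state:` line is the part to watch. A minimal sketch, assuming the `zpool status` output format shown above; `pool_state` is a hypothetical helper that reads that output on stdin:

```shell
# pool_state (hypothetical helper): print the value of the first "state:" line
# found in `zpool status` output read from stdin.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Sample lines copied from the output above.
# On a node you would run: zpool status mvda0 | pool_state
printf '  pool: mvda0\n state: ONLINE\n' | pool_state     # prints: ONLINE
```

Anything other than ONLINE (for example UNAVAIL, as in the first output) is a reason to look at the full `zpool status` and `hastctl status` output.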

Recreate sync on disks or split brain:

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

If you lost sync because of a disk error or a network error, you can recreate the sync between the HAST disk(s).
Just recreate the roles and the nodes will start syncing the data (use the commands above). Be careful with the roles and the nodes, don’t mix them up!
If you recreate the roles and the disks, you won’t lose data at all. It will only start syncing the disk(s) but won’t overwrite data.

If it is a split-brain scenario, you have to decide which node has the newer data and issue the above commands accordingly. For example, if the secondary node has newer data than the primary, issue role primary on the second node and role secondary on the primary node, and vice versa.