Archive for the ‘Cloud, Devops, Monitoring’ Category

VMware ESXi & vCenter used disk percentage monitoring

GitHub: https://github.com/7layerorg/Monitoring/tree/master/Nagios/ESXi

The next article is about SNMP monitoring with Nagios, checking the used disk space on VMware ESXi and vCenter servers.

I’ve had some trouble recently with our vCenter server because the logs filled up one of the volumes and I was unable to log in to the server at all.
Even the main console didn’t work and kept complaining that the storage was filling up with logs.
Unfortunately, vCenter Server does not rotate the old logs by default, so sooner or later the volume will fill up.

There are some solutions for monitoring used disk space, but a default Nagios setup won’t give you a straightforward answer from the vCenter server or from the ESXi boxes either.
This is why I wrote a script around snmpwalk that can monitor any kind of ESXi or vCenter server.
There are a few “tricks” in the script, because vCenter needs a different OID than the ESXi boxes do, but this is already built into the script.
You can also monitor disk usage with SNMP under a Cacti monitoring server, but to receive emails from its triggers you would have to adjust far too many thresholds, which would take ages if you have several servers.

With this script you only need to add the host names and the disk number and you are all set; Nagios takes care of the rest.
The check_vmware_disk script needs to be uploaded into your Nagios libexec folder, so that Nagios can run it automatically whenever it is scheduled.
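As a rough illustration of the idea (not the exact logic of the GitHub script, which may use other OIDs and options), the used percentage of one disk could be pulled over SNMP like this, assuming the host exposes the standard HOST-RESOURCES-MIB hrStorage table and the community string is public:

# hypothetical sketch: list the storage entries, then compute the used percentage for index 1
snmpwalk -v2c -c public 10.0.4.71 1.3.6.1.2.1.25.2.3.1.3      # hrStorageDescr (find your disk index)
USED=$(snmpget -v2c -c public -Oqv 10.0.4.71 1.3.6.1.2.1.25.2.3.1.6.1)   # hrStorageUsed.1
SIZE=$(snmpget -v2c -c public -Oqv 10.0.4.71 1.3.6.1.2.1.25.2.3.1.5.1)   # hrStorageSize.1
echo "disk 1 used: $(( USED * 100 / SIZE ))%"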

Here are some screenshots of this service:

The second script, which is also available via my GitHub account, checks the ESXi host’s network uplinks:
root@nagios: ./check_esxi_vnic 10.0.4.71 1
up
0 status:0; Triggers: down; OK – up

root@nagios: ./check_esxi_vnic 10.0.4.71 2
down
1 status:1; Triggers: down; CRITICAL – down
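For reference, the Nagios side could look roughly like this; the command name, argument order and host name are just examples, so adjust them to the script and to your own hosts:

# hypothetical command definition in commands.cfg
define command{
    command_name    check_vmware_disk
    command_line    $USER1$/check_vmware_disk $HOSTADDRESS$ $ARG1$
}

# hypothetical service definition: check disk number 1 on host esxi01
define service{
    use                     generic-service
    host_name               esxi01
    service_description     Used disk percentage
    check_command           check_vmware_disk!1
}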

 

 

VMware Replication & Recovery

The following three videos show how to create offsite replication of virtual machines in your vCenter server, onto any storage drive available in your server.
This solution is available for free from VMware and can be integrated into all types of vCenter, even into VMware Small Business Essentials Plus.
To download VMware Replication, follow this link: VMware Replication

The backup storage can be a network share mount or a local drive.
If you use, for example, an NFS share or iSCSI storage, this gives you the benefit of an offsite backup for your virtual machines.
The replication is automatic and scheduled to run in the background on the vCenter server.

The offsite backup can be restored without the vCenter server if vCenter is unavailable for any reason.
In that case you need to manually add the backed-up machine to the inventory of your vCenter or standalone ESXi server.

 

 

VMware networking setup for vMotion/iSCSI & VM traffic

VMware ESX/ESXi network setup.

In the following post I will show you a networking setup for VMware servers.
It involves Cisco switches (2960/3750 series) and HP or Dell servers.
I have had these configurations running in production for quite some time now (2+ years) without any issues.

As we know, the networking setup for VMware servers has become much more complicated than the setup we used to have for “classic” physical Linux or Windows boxes.
The classic single uplink connection with a regular VLAN is not enough for VMware anymore.
You must separate virtual machine traffic from management traffic, and you must also separate storage and vMotion traffic.
Although VMware says you can have separate vSwitches for all physical connections with different VLANs, failover to other physical connections is more complicated that way than with one or two vSwitches. A VMware server needs a minimum of 2 network uplinks for VM traffic and management traffic, but VMware recommends 4 uplinks for physical servers.

The following picture shows briefly the current setup.

7layer

So let’s take a look at the 4-uplink configuration on the VMware ESXi host:

two

We have all 4 uplinks connected to the same vSwitch. With this configuration it is very easy to create failover for the management traffic and to separate the storage and vMotion traffic as well. Let’s take a look at the vSwitch properties:

vswitch0-1

Also take a look at the NIC teaming for the vSwitch.
As you can see, all adapters are active in this vSwitch:

nic-teaming
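If you prefer the command line, the same thing can be verified from the ESXi shell; a quick sketch, assuming the vSwitch is vSwitch0:

# list the physical uplinks and show the teaming/failover policy of vSwitch0
esxcli network nic list
esxcli network vswitch standard policy failover get -v vSwitch0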

 

Now take a look at the management uplink settings.
The management network has one active adapter and two standby adapters.
If the physical connection of the active vnic0 adapter fails (a switch issue or a cable issue), the VMkernel will activate one of the standby adapters.
With this setup the management network will always be available and you cannot lose the connection to the VMware box.
       mgmt-ipstorage

Now we check the vMotion settings.
Here we have an added VMkernel port for vMotion and IP storage, which carries an extra IP address for vMotion.
As you can see, here we have one active adapter and three unused adapters. To properly separate this kind of traffic in the kernel you must override the failover order and move the adapters you don’t want to use down to unused. The same setting applies to iSCSI storage.
vmotionvmotion-ip

Now take a look at the storage IP kernel settings.
Here we also have an extra VMkernel port with its own IP address.
In this setup the other active vSwitch adapters have also been moved to unused, as you can see in the picture.
Without this you won’t be able to add the software iSCSI storage properly. The VMkernel IP settings create a point-to-point, one-to-one connection to the storage, and therefore only one active adapter should be enabled in any such VMkernel port group. With this setup you can have more than one path to the iSCSI storage, but for that you need to enable multipathing in the iSCSI setup.

storageiscsi
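The VMkernel port binding for the software iSCSI adapter can also be checked and added from the ESXi shell; a sketch, where the adapter name vmhba33 and the VMkernel port vmk2 are only examples (check yours first):

# find the software iSCSI adapter, bind the storage VMkernel port to it, then verify
esxcli iscsi adapter list
esxcli iscsi networkportal add -A vmhba33 -n vmk2
esxcli iscsi networkportal list -A vmhba33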

So now let’s take a brief look at the Virtual Machine Port Group settings with regard to VLANs.
You can add new VLANs here and create load balancing and failover for the virtual machines.
I used two of the physical adapters for the virtual machines; they are activated as vnic5/vnic1 and vnic1/vnic5, opposite to each other.
But if you have 4 or 6 uplink adapters, you could activate 3-4 adapters for the virtual machines; it’s up to you.
This also depends on how heavily loaded your virtual boxes are: obviously, if the boxes are quite loaded, it’s better to separate the loads and keep vMotion and management traffic off those physical uplink connections.
1-x-network-vlan

4-x-network-vlan

 

I know it’s getting a bit confusing, so here are the VLAN bindings again for the management traffic and the vMotion traffic:
mgmt-vlan

vmotion-vlan

The storage traffic is not added to any of those VLANs; it is just a one-to-one connection via the VMkernel port group IP address:

storage-vlan

In this setup I use the same VLAN for vMotion, because it is only used for maintenance, but if you use vMotion heavily then it is better to separate it into a different VLAN. You might as well create a new physical uplink for that traffic, which would separate it not just at the VLAN level but at the physical level as well.

And finally the physical uplink ports to the Cisco switch:

interface GigabitEthernet1/0/22
description 10.0.4.92 vnic0
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

interface GigabitEthernet1/0/23
description 10.0.4.92 vnic2
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999

switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

The native vlan 999 command changes the default untagged VLAN, which is VLAN 1.
With this command you can avoid unnecessary layer 2 traffic to the VMware server, such as flooding and broadcasts.
Also, if you already have a system configured in vCenter, you sometimes cannot change the management VLAN, because vCenter would no longer be able to reach the box; the change ends in an error, or the box could get dropped from vSphere. In that case you need to disconnect the server from vCenter and create a second VMkernel interface, on a different IP subnet and a different physical interface than the one currently in use, and connect to the box via that VMkernel. With that in place you can make any major changes to the main interface (native VLAN, VLAN tagging, etc.). A few times when I wanted to make such changes I lost the connection to the server and had to either reset the VMkernel management network, roll back the switch configuration, or change the native VLAN on the switch. So be careful with these changes if you cannot physically reach your box for any reason (the server is in a data centre or a different office).
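A minimal sketch of creating such a temporary second management interface from the ESXi shell, assuming vSwitch0 and using made-up names and addresses (MgmtBackup, vmk9, 192.168.50.10):

# add a spare port group and a second VMkernel interface with a static IP on a different subnet
esxcli network vswitch standard portgroup add -v vSwitch0 -p MgmtBackup
esxcli network ip interface add -i vmk9 -p MgmtBackup
esxcli network ip interface ipv4 set -i vmk9 -t static -I 192.168.50.10 -N 255.255.255.0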

So now let’s take a look at the Cisco switch side after the native VLAN and trunking configuration (for example with show interfaces trunk):

Port        Mode             Encapsulation  Status        Native vlan
Gi1/0/22    on               802.1q         trunking      999

Port        Vlans allowed on trunk
Gi1/0/22    100,200,400

Port        Vlans allowed and active in management domain
Gi1/0/22    100,200,400

 

References:

https://www.vmware.com/files/pdf/support/landing_pages/Virtual-Support-Day-Best-Practices-Virtual-Networking-June-2012.pdf
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2038869
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2045040

Linux server migration with VMware converter

The next post will show you how to migrate a live Linux/Windows machine from any source to any destination remotely.

I’ll do this on the 7layer.org website and post all the related pictures of this migration.
I am going to use VMware Converter, which will deal with everything.

Requirements: 

– Running VMware server

– Source machine with SSH or RDP connection (Linux/Windows)

– Destination VMware server

– VMware converter (free to download from VMware site)

OK, let’s start up VMware Converter and connect to the source machine.

Here you need to select the source type Powered on machine.
You also need to add the login name and password.

 

convert1

 

On the next tab you need to type in the destination machine’s IP address and its login details.

 

convert2

 

Then on the next tab you must add the machine name.
On Linux this is picked up from the hosts file automatically; you can leave it as it is if you prefer.

convert3

 

At the next step you must be really careful with the virtual machine version number.
VMware automatically offers version 10, which can only be managed from vCenter, and that is not free.
So change this to version 8 or lower and you will be able to manage the machine from the ESXi vSphere client.
This version refers to the virtual hardware type. I use version 8, which is the highest version available without any licensing issue or cost.

convert4

convert5

Also on the destination page you should choose which datastore you want to use for the machine.

On the next tab the converter will ask for the final parameters of the conversion.
Here you must edit the Helper VM network tab and add an extra IP on the local network where the destination VMware server is.
Without this the converter usually dies at around 1% or 2%, without any further notification.

convert6

 

convert7

 

You should also check the advanced options: Reconfigure destination virtual machine should be ticked.
This will fix the initial ramdisk on the destination machine.

convert8

After this we can start the real migration process.

convert9

 

This will create the machine on the destination server, automatically start it up and begin pulling the data down from the source machine.

convert10

Destination server console with the running machine while it’s pulling down the data from source:

convert11

You can see the progress is quite quick; it depends on the actual network speed and on the CPU and disk speed of the source and destination machines.

convert12

On the source machine, VMware Converter uses the tar command to compress the full disk into an image and sends it over the network to the destination machine as a compressed file.
The source gets a bit overloaded by this process, but of course that depends on the source box. This one is not a physical machine, just a virtual one with 1 CPU socket and 512 MB of RAM.
So it is definitely not a strong box, and it runs other websites at the moment; that’s why the load is high in top.

convert13

Now let’s see the finished converted machine:

converter14

Indeed it reached the final stage, and pulling down the whole machine took about an hour.
The converted machine is currently switched off on the destination server, but it contains a full copy of the source.
So let’s see if it boots up without any issues:

vmware2

It seems we got a kernel panic because the local disk UUID was missing.
So now we need a CentOS disc and have to boot the box in rescue mode to fix the disk UUID.
Upload the CentOS version that you used on the source box and attach it to the VMware server.
I’ve got 64-bit on 7layer, so I’ll use that to fix the destination machine.

convert14

 

 

Boot up the machine, but be quick: you have only about 1 second to press Escape at the BIOS screen and choose to boot from CD-ROM.
At the CentOS boot screen, choose the Rescue installed system option to fix the box.

convert15

 

Then choose the Continue option; this gives you the option to modify the root file system.
convert16

Then at the next screen you can see the root system mounted under /mnt/sysimage.

convert17

 

 

Then choose the shell screen menu.

convert18

convert19

convert20

 

Now we have the whole root file system mounted, so we can check the fstab and the boot entries on the box.
Check the GRUB device map /boot/grub/device.map and /boot/grub/grub.conf for the HDD type.
Also check /etc/fstab to make sure it is correct.

Old fstab:

convert21

 

New fstab, corrected by VMware during conversion:

convert22

Also device map looks correct:

convert23

After checking these, we need to fix the GRUB loader and the initial ramdisk.

This procedure can be found on VMware site also: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002402

Rebuild the initial ramdisk:

convert25

mkinitrd -v -f /boot/initramfs-2.6.32-431.29.2.el6.x86_64.img 2.6.32-431.29.2.el6.x86_64

The kernel version in this line should match your GRUB kernel config, so check it in the /boot directory.
Rebuilding the whole initial ramdisk takes about 5-10 seconds.

We also need to correct the disk UUID for the GRUB boot loader. Follow one of the steps below:

Correcting grub loader with UUID change:
#############

Run ls -l /dev/disk/by-uuid to check the correct UUID for the sda3 disk.

This is a very long line and it is easy to make mistakes here, so it’s better to append it to grub.conf with the ls command and then move it to the correct place:

ls -l /dev/disk/by-uuid >> /boot/grub/grub.conf, then move it from the end of the file.

In this case the last line contains the correct UUID, which belongs to /dev/sda3.
So move this to the root=UUID= line.

convert26

convert27
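The UUID can also be double-checked from the shell before editing; a quick sketch (adjust the grub.conf path depending on whether you have chrooted into the installed system or are still working under /mnt/sysimage):

# show the UUID of /dev/sda3 and compare it with the root=UUID= entry in grub.conf
blkid /dev/sda3
grep 'root=UUID=' /boot/grub/grub.conf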

 

#############

Correcting the GRUB loader by changing only the kernel line in /boot/grub/grub.conf:

convert30

Now you can run grub-install with the corrected disk name:

convert24
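The typical sequence in rescue mode looks roughly like this, assuming the installed system is mounted under /mnt/sysimage and the disk is /dev/sda (adjust both to your layout):

# switch into the installed system, reinstall GRUB to the MBR, then leave the chroot
chroot /mnt/sysimage
grub-install /dev/sda
exit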

 

When it has finished, try to reboot the box. The reboot and halt commands usually won’t work here in rescue mode.
Use the VMware machine’s top menu ==>> VM ==>> Guest ==>> Send Ctrl+Alt+Del to reboot the rescue disc.
Then wait for the box to reboot and see if it works fine.

convert28

convert29

 

Voila it’s booting up! 🙂

If you have trouble with /lib/modules/{kernel-number}/modules.dep at boot, you need to rebuild the initial ramdisk again.
Investigate via the VMware site and carefully check the initial ramdisk name and the kernel name in the /boot directory.

I’ll mark the important parts which must be correct, otherwise the kernel won’t boot:

#############

[root@7layer.org grub]# cat grub.conf | head -n 20

#
# Hetzner Online AG – installimage
# GRUB bootloader configuration file
#

timeout 5
default 0

title CentOS (2.6.32-431.29.2.el6.x86_64)
root (hd0,1)
kernel /vmlinuz-2.6.32-431.29.2.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset crashkernel=auto SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=de
initrd /initramfs-2.6.32-431.29.2.el6.x86_64.img

#############

cat menu.lst | head -n 20
#
# Hetzner Online AG – installimage
# GRUB bootloader configuration file
#

timeout 5
default 0

title CentOS (2.6.32-358.6.1.el6.x86_64)
root (hd0,1)
kernel /boot/vmlinuz-2.6.32-358.6.1.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset
initrd /boot/initramfs-2.6.32-358.6.1.el6.x86_64.img

#############

cat device.map
(hd0) /dev/sda

#############

Fstab:

 

 

SPF record setup for mail server

How to set up and test an SPF record for a mail server:

Let’s check Google’s SPF record first with dig command.

[root@mail ~]# dig txt google.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52169
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com. IN TXT

;; ANSWER SECTION:
google.com. 3599 IN TXT “v=spf1 include:_spf.google.com ip4:216.73.93.70/31 ip4:216.73.93.72/31 ~all”

;; Query time: 12 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Feb 27 08:47:05 2015
;; MSG SIZE rcvd: 116

[root@mail ~]#

In the answer section you can see the IP addresses. These are the servers that are allowed to send mail for google.com.
So you have your domain name, e.g. google.com, and on it you have your mail server with an A record, mail.google.com. That server can send mail under its own name, but other servers are not normally allowed to send mail for the domain. With the SPF record you declare which additional addresses may send mail for the domain, so the servers 216.73.93.70 and .72 are allowed to send (relay) mail as google.com.

You can also use domain names in an SPF record (with include:) and tell the receiving server to use that domain’s record instead of listing IP addresses.

[root@mail ~]# dig txt smsnetmonitor.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt smsnetmonitor.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64785
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;smsnetmonitor.com. IN TXT

;; ANSWER SECTION:
smsnetmonitor.com. 21599 IN TXT “v=spf1 ip4:212.23.51.62 include:cloudsupportuk.com include:cctvalarm.net include:7layer.org include:smsgpstracker.com ~all”
smsnetmonitor.com. 21599 IN TXT “v=DMARC1\; p=none\; adkim=r\; aspf=r\; sp=none”

;; Query time: 36 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Feb 27 08:59:31 2015
;; MSG SIZE rcvd: 225

[root@mail ~]#
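Publishing your own record is just a TXT entry on the domain. A hypothetical example in BIND zone-file syntax (example.com, the IP address and the included domain are placeholders): allow the domain’s MX hosts, one extra relay IP and another domain’s SPF record, and soft-fail everything else:

; hypothetical SPF TXT record for example.com
example.com.   IN   TXT   "v=spf1 mx ip4:203.0.113.25 include:otherdomain.example ~all"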

 

Create and check SPF records:

http://www.spfwizard.net/
http://www.mtgsy.net/dns/spfwizard.php

http://mxtoolbox.com/spf.aspx
http://vamsoft.com/support/tools/spf-syntax-validator

Header check for emails to analyse SPF and other issues:

https://toolbox.googleapps.com/apps/messageheader/
http://mxtoolbox.com/EmailHeaders.aspx

 

 

MIPS Development

A MIPS-based development board from Imagination Technologies.

http://blog.imgtec.com/powervr-developers/new-mips-creator-ci20-development-board-for-linux-and-android-debuts

I just received my new CI20 MIPS-based development board from Imgtec and I must say it is a piece of engineering art! 🙂

I have already set up an Apache2 web server, an SSH server, a MySQL server, a basic firewall (with iptables) and a Postfix mail server: http://86.1.80.160
I’m going to build an SMS server, hook it up to my CCTV system and publish everything about it shortly.

Thank you again Imgtec! http://www.imgtec.com/

Related links to CI20 and other Linux based hardware development:

http://elinux.org/Main_Page
http://elinux.org/MIPS_Creator_CI20

VMware free backup solution from virtuallyGhetto

Backup:

VMware free backup solution for ESXi servers: https://communities.vmware.com/docs/DOC-8760

Download the script from GitHub: https://github.com/lamw/ghettoVCB then modify it to fit your system.

I’m going to explain the important parts that I usually change in this script:

In ghettoVCB.sh file:

– Backup path
– Rotation
– Backup format
– Email server
– Email to
– Email from

1: 

# directory that all VM backups should go (e.g. /vmfs/volumes/SAN_LUN1/mybackupdir)
VM_BACKUP_VOLUME=/vmfs/volumes/NAS-Fuji/backup

2:

# Format output of VMDK backup
# zeroedthick
# 2gbsparse
# thin
# eagerzeroedthick
DISK_BACKUP_FORMAT=thin

3:

# Number of backups for a given VM before deleting
VM_BACKUP_ROTATION_COUNT=3

Also in ghettoVCB.conf

VM_BACKUP_VOLUME=/vmfs/volumes/NAS-Fuji
DISK_BACKUP_FORMAT=thin
VM_BACKUP_ROTATION_COUNT=3

4:

EMAIL_SERVER=10.0.100.10

5:

EMAIL_TO=lszabo@7layer.org

6:

EMAIL_FROM=root@ghettoVCB

When you have uploaded the ghettoVCB.sh and ghettoVCB.conf files, you need to add the execute flag to the ghettoVCB.sh file:

chmod +x ghettoVCB.sh

Then you can start backing up your VMs.

To back up only one machine, run this:

./ghettoVCB.sh -m vm_to_backup

To back up all machines:

./ghettoVCB.sh -a

If you want a machine to be excluded from the backup, use an exclusion file to achieve this:
./ghettoVCB.sh -a -e vm_exclusion_list

The VMware firewall won’t allow the script to send outgoing emails, so this needs to be fixed.
Upload the smtp.xml file to the VMware server and update the firewall; without this you will receive an error on the VMware SSH console.

Script:  smtp.xml
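In case the linked file is not to hand, a minimal sketch of such a rule file, based on the usual ESXi 5.x custom firewall rule format (the linked smtp.xml may differ in details):

<!-- minimal outbound SMTP (tcp/25) rule set for the ESXi firewall -->
<ConfigRoot>
  <service>
    <id>smtp</id>
    <rule id='0000'>
      <direction>outbound</direction>
      <protocol>tcp</protocol>
      <porttype>dst</porttype>
      <port>25</port>
    </rule>
    <enabled>true</enabled>
    <required>false</required>
  </service>
</ConfigRoot>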

Upload it to /etc/vmware/firewall and refresh the firewall with esxcli:

esxcli network firewall refresh

Then click on the server name, then Configuration, Security Profile, and you will see the new smtp outbound port listed as a new outgoing firewall rule allowing SMTP traffic from the server.

backup-smtp

 



Restore:

Although you can use the restore script (https://communities.vmware.com/docs/DOC-10595) to restore machines from backup, you can also use the backups straight away by adding them to your inventory directly from the backup path.
This is much quicker than the restore process, but obviously the machine will then reside on the backup path, not on the original path. This way you can get the machine back ASAP, then create a backup onto the original path, shut down the backup-path machine, add the original-path machine to the inventory and start it up.

backup

The only thing left to do is to make this process automatic.
Edit the crontab on your server and add this to it:
10 00 * * 1-5 /vmfs/volumes/ghettoVCB.sh -f /vmfs/volumes/Fuji-NAS/backuplist > /vmfs/volumes/ghettoVCB-backup-$(date +\%s).log

On VMware ESXi 5.5 the crontab files are located at /var/spool/cron/crontabs/, and the root file contains the current crontab configuration.

cat /var/spool/cron/crontabs/root

#min hour day mon dow command
1 1 * * * /sbin/tmpwatch.py
1 * * * * /sbin/auto-backup.sh
0 * * * * /usr/lib/vmware/vmksummary/log-heartbeat.py
*/5 * * * * /sbin/hostd-probe ++group=host/vim/vmvisor/hostd-probe
10 00 * * 1-5 /vmfs/volumes/datastore1/root/opt/ghettoVCB.sh -f /vmfs/volumes/datastore1/root/opt/vmbackup.txt

With the schedule above (10 00 * * 1-5) the backup runs at 00:10 on weekdays, but you can change it according to your needs.

NAS4Free

NAS4Free high-availability iSCSI failover storage for a VMware server.

The following post shows how to install and set up a NAS4Free server as iSCSI storage for your ESXi/ESX VMware server.
NAS4Free is based on FreeBSD and has all the services required to act as a high-availability storage server (HAST and CARP).
Of course you can also use this solution in your network as high-availability storage or as a Windows CIFS/Samba server if you adjust the services on NAS4Free.
I’ll stick to the iSCSI setup first; later we will show how to set up NFS and Windows (Samba) shares.

The following setup is used here:

Node1 primary IP address for serving iSCSI and CARP services: 192.168.101.165
Node1 secondary IP address for HAST synchronisation: 172.16.100.1

Node2 primary IP address for serving iSCSI and CARP services: 192.168.101.166
Node2 secondary IP address for HAST synchronisation: 172.16.100.2

Virtual IP address(CARP address) for iSCSI service: 192.168.101.167

Node1 host name: has1
Node2 host name: has2

Install both nodes with the latest NAS4Free edition.

– Change the node names according to your setup, in this example has1 and has2.

hostname

– Add the node names to the hosts file on both nodes.

hosts
– Set up the CARP service under Network/Interface Management:

carp1

– Advertisement skew on has1 node: 0
– Advertisement skew on has2 node: 10

If the has1 node dies, the has2 node will take over all the services.

carp2

You must use the same link-up and link-down actions on both nodes, otherwise the switchover won’t work properly!
So everything should be the same except the advertisement skew value.

Next step: set up the HAST service:

hast1

hast3

As you can see, the second network interface card is used for the HAST synchronisation, not the main interface.
After you set up the HAST service, reboot both nodes; for some reason Apply won’t start the services.

– Switch on the SSH service and SSH into both nodes.

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

Check both nodes with: hastctl status

Then configure ZFS
On Master:

Add disks (Disks->Management)

disk1: N/A (HAST device)
Advanced Power Management: Level 254
Acoustic level: Maximum performance
S.M.A.R.T.: Checked
Preformatted file system: ZFS storage pool device

Format as zfs (Disks->Format)

Add ZFS Virtual Disks (Disks->ZFS->Pools->Virtual Device)

Add Pools(Disks->ZFS->Pools->Management)

Add a PostInit script on both nodes under the System/Advanced/Command Scripts tab:
/usr/local/sbin/carp-hast-switch slave
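For reference only, the GUI steps above boil down to roughly the following shell commands (the pool and device names follow the hastctl examples; note that a pool created purely from the shell is not registered in the NAS4Free GUI configuration, so use this mainly for troubleshooting):

# create a ZFS pool on the HAST device and check it
zpool create pool1 /dev/hast/disk1
zpool status pool1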

Shut down the master, and on the slave import the pool through the GUI (tab: ZFS/Configuration/Detected).
Then synchronise the pool on the slave!

When finished on the slave, start the master and switch the VIP back to the master.

zpool status disk1
hastctl status

Troubleshooting commands from SSH terminal:

zpool status

########
nast1: ~ # zpool status mvda0
  pool: mvda0
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run ‘zpool clear’.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME                   STATE     READ WRITE CKSUM
        mvda0                  UNAVAIL      0     0     0
          2144332937472371213  REMOVED      0     0     0  was /dev/hast/hast
#########

If the status is UNAVAIL then you can try:

zpool clear “pool name”

This clears the device errors; as you can see below, the pool then started scanning and scrubbing the disks.

#########
nast1: ~ # zpool status mvda0
  pool: mvda0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Jun  2 15:26:25 2014
        1.19G scanned out of 1.43G at 28.3M/s, 0h0m to go
        0 repaired, 82.75% done
config:

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0
#########

Then check pool again:
zpool status

#########
nast1: ~ # zpool status
  pool: mvda0
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Jun  2 15:27:17 2014
config:

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0
#########

Recreate the sync on the disks (or recover from a split brain):

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

If you lost sync because of a disk error or a network error, you can recreate the sync between the HAST disk(s).
Just recreate the roles and the nodes will start syncing the data (use the commands above). Be careful with the roles and the nodes, don’t mix them up!
If you recreate the roles and the disks you won’t lose any data; it will only start syncing the disk(s), it won’t overwrite data.

If it is a split-brain scenario, you should decide which node has the newer data and issue the above commands accordingly. For example, if the secondary node has newer data than the primary, you should issue role primary on the secondary node and role secondary on the primary node, and vice versa.

High Availability Postfix mail server on GlusterFS

The next article: a high-availability mail server on GlusterFS, using:

– Two node CentOS Linux
– GlusterFS shared storage
– NFS share for mails on GlusterFS
– Postfix mail server with SquirrelMail web client
– Dovecot IMAP/POP server

 

So let’s get started.

In this article I used two local private nodes for testing.
You should change the IPs according to your real configuration. GlusterFS can sync files/directories across different geo-locations.
But if you want both servers at the same physical location, use a firewall, for example pfSense or Snort, and use local IPs behind the firewall.

GlusterFS part:

First edit the hosts file and insert all the nodes which will be in the cluster.

cat /etc/hosts
127.0.0.1    localhost    localhost.localdomain localhost4 localhost4.localdomain4
::1    localhost    localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.200    test2.local    test2
192.168.1.201    test3.local    test3

yum install glusterfs glusterfs-fuse glusterfs-server postfix dovecot

service glusterd start

gluster peer probe 192.168.1.201

gluster peer probe 192.168.1.200

On every node you should have the other nodes’ UUID peer files.

ls /var/lib/glusterd/peers

878b63e8-5a3c-4746-984a-a14f4918c4b8

cat /var/lib/glusterd/peers/878b63e8-5a3c-4746-984a-a14f4918c4b8

uuid=878b63e8-5a3c-4746-984a-a14f4918c4b8
state=3
hostname1=test3.local

service glusterd status
glusterd (pid  1620) is running…

Start glusterd on the other node too.
Then check the gluster peer status on both nodes:

gluster peer status (node1)
Number of Peers: 1

Hostname: test3.local
Uuid: 878b63e8-5a3c-4746-984a-a14f4918c4b8
State: Peer in Cluster (Connected)

gluster peer status (node2)
Number of Peers: 1

Hostname: test2.local
Uuid: 0d06c152-3966-4938-a1c4-84b624689927
State: Peer in Cluster (Connected)

Now let’s create the glusterfs volume.

Before you run the command, be careful with sysctl! I had some trouble with the net.ipv4.ip_nonlocal_bind setting in sysctl.conf, because I had used these nodes to test heartbeat and corosync, and I could not create the GlusterFS volume.
So change this parameter from 1 back to 0 in sysctl.conf and run sysctl -p to reload the kernel parameter.
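For example, something like this (check the current value first):

# show the current value, set it back to 0 in sysctl.conf, then reload
sysctl net.ipv4.ip_nonlocal_bind
sed -i 's/^net.ipv4.ip_nonlocal_bind.*/net.ipv4.ip_nonlocal_bind = 0/' /etc/sysctl.conf
sysctl -p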

So create the volume:

gluster volume create gv0 replica 2 test2:/export/brick1 test3:/export/brick1

You could check the volume with this command too:

gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: da3d4c48-d168-4b4f-9590-e8d87cf5aa87
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: test2.local:/export/brick1
Brick2: test3.local:/export/brick1

Start the volume sharing with this command:

gluster volume start gv0

XFS part:

Next step: load the XFS module.

modprobe xfs  (CentOS 6.3 already ships with kmod-xfs installed)

Create an XFS file system on the extra disk that you want to use as the GlusterFS brick.

mkfs.xfs -i size=512 /dev/vdb1

NFS part:

Then install nfs services.

yum install nfs-utils

And mount the GlusterFS volume via NFS:

mount -o mountproto=tcp,vers=3 -t nfs test2.local:/gv0 /mnt/

Check the mounts:

mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/vda1 on /boot type ext4 (rw)
/dev/vdb1 on /export/brick1 type xfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
test2.local:/gv0 on /mnt type nfs (rw,mountproto=tcp,vers=3,addr=192.168.1.200)

Start services automatically at boot:

chkconfig nfs on

chkconfig glusterd on

Postfix Part:

Create a symbolic link to /var under /mnt

 ln -s /var/ /mnt/

Then, in /etc/postfix/main.cf, prefix every reference that contains /var/ with an extra /mnt/, like this:

From this: mail_spool_directory = /var/spool/mail
To this: mail_spool_directory = /mnt/var/spool/mail
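The same change can be made with postconf; the second command just lists the non-default settings so you can spot any other /var paths to adjust:

# rewrite the spool path via postconf, then look for remaining /var references
postconf -e 'mail_spool_directory = /mnt/var/spool/mail'
postconf -n | grep '/var/'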

And configure Postfix as usual.

Dovecot Part:

Change the default mail location in /etc/dovecot/conf.d/10-mail.conf

from this: mail_location = maildir:~/Maildir
To this: mail_location = mbox:~/mail:INBOX=/var/mail/%u

With this configuration Dovecot keeps the mails in the old Unix mbox format rather than the Maildir format.
And you can reach the mails from both nodes.

Configure the rest of dovecot as usual.

In this setup you should have a shared mail system on the NFS volume, so users should be able to reach their mail all the time, whatever happens to the other node. The MX records are configured to deliver mail to the second node if the first is unreachable.
You need to use the same Unix users on both nodes, otherwise the mailboxes will get mixed up and the whole setup cannot work.

 

 

VMware ESXi / VMware Player & KVM Preinstalled machines

Quick link to browse the download directory: https://www.7layer.org/downloads

I’ve started making preinstalled, ready-to-use virtual machines for everybody, free of charge.
The machines will be categorised and sorted shortly. They can run on Windows, Linux and Apple, both server and client.
Image formats: OVF (http://en.wikipedia.org/wiki/Open_Virtualization_Format) and KVM (http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine) based.

If you have not used VMware before, follow this link: https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0 and download the Player (for a client) or ESXi (for a server), whichever you want to use. The Player can run on any Windows/Linux/Apple client, and the ESXi server can only be used as a server. Both of them are free, there is no licence to pay for and no time-limit restriction either.
However, on the ESXi server you must enter the free ESXi licence number, which you received when you downloaded it, within 60 days.

You can import these machines and start using them straight away.

#############
Debian based Named DNS master-slave server.

Master server: Debian-master-dns readme
Slave server: Debian-slave-dns readme
References: http://www.bind9.net/arm97.pdf http://www.howtoforge.com/debian_bind9_master_slave_system

#############
pfSense OVF template image with Squid proxy server.
pfSense server: pfSense readme
References: https://doc.pfsense.org/index.php/PfSense_2_on_VMware_ESXi_5#Booting_your_VM_from_CD.2FDVD

#############
CentOS 6.5 based server with Squid proxy and Webmin server.

CentOS 6.5: 64bit 32bit readme
References: http://www.centos.org/docs/5/

#############

 

 

 