Archive for the ‘ Cloud, Devops, Monitoring ’ Category

pfSense download cut off issue on VMware ESXi

Recently I had some trouble with my newly installed pfSense virtual machine.

When I tried to download large files, pfSense cut the downloads off and I could not download anything at all.
The strange thing was that the exact same pfSense installation behaved fine, without cutting off any downloads, on a different uplink provider.

So I tried switching off checksum offload, TCP segmentation offload and large receive offload, as suggested on many different sites (the Proxmox forums, for example).

None of them helped. Finally I found the solution: under System/Advanced/Firewall & NAT I changed the Firewall Optimization Options from Normal to Conservative. After this, all large file downloads went through the firewall with no cut-offs whatsoever.

So again: the same box, the same version and patch level, and the same virtual machine version on VMware behaved differently purely because of the ISP uplink.
If you have this issue, just change the Firewall Optimization Options under System/Advanced/Firewall & NAT.


VMware ESXi & VCenter used disk percentage monitoring


The next article is about SNMP monitoring with Nagios to check used disk size on VMware ESXi and vCenter servers.

I’ve had some trouble recently with our vCenter server: the logs filled up one of the volumes and I was unable to log in to the server at all.
Even the main console didn’t work; it kept complaining about the storage filling up with logs.
Unfortunately vCenter server does not rotate old logs by default, so sooner or later the volume fills up.

Well, there are solutions for monitoring used disk size, but the default Nagios checks won’t give you a straight, appropriate answer from the vCenter server or from the ESXi boxes either.
This is why I made a script around snmpwalk that can monitor any kind of ESXi or vCenter server.
There are some “tricks” in the script, because vCenter needs a different OID than the ESXi boxes do, but this is already built into the script.
There is also a way to monitor disk size with SNMP under a Cacti monitoring server, but to receive emails from its triggers you would need to modify far too many thresholds, which would take ages if you have several servers.

With this script you only need to add the host names and the disk number and you are all set; Nagios takes care of the rest.
The check_vmware_disk script needs to be uploaded into your Nagios libexec folder so that Nagios can run it automatically whenever it is scheduled.
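The heavy lifting in such a check is just arithmetic on two SNMP counters. Here is a minimal sketch of the threshold logic, assuming the raw size/used values have already been fetched with snmpwalk (the OIDs named in the comments — hrStorageSize and hrStorageUsed from HOST-RESOURCES-MIB — are the standard ones; the function name and thresholds are illustrative, not the real script's interface):

```shell
#!/bin/sh
# Hypothetical sketch: compute the used-disk percentage from two SNMP
# values and map it to a Nagios state. In a real check the two numbers
# come from snmpwalk against HOST-RESOURCES-MIB::hrStorageSize and
# hrStorageUsed for the chosen storage index (the allocation units
# cancel out in the percentage).
check_disk_pct() {
    size=$1; used=$2
    warn=${3:-80}; crit=${4:-90}          # thresholds in percent
    pct=$(( used * 100 / size ))
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL - disk ${pct}% used"; return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING - disk ${pct}% used"; return 1
    fi
    echo "OK - disk ${pct}% used"; return 0
}

# Example: a volume with 90% of its allocation units in use
check_disk_pct 104857600 94371840
```

The return codes 0/1/2 are what Nagios expects for OK/WARNING/CRITICAL.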

Here are some screenshots of this service:

The second script, also available via my GitHub account, checks the ESXi host’s network uplink:
root@nagios: ./check_esxi_vnic 1
0 status:0; Triggers: down; OK – up

root@nagios: ./check_esxi_vnic 2
1 status:1; Triggers: down; CRITICAL – down
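To wire either script into Nagios, the usual command/service object pair is enough. A sketch, where the host name and service template are hypothetical, but the vnic-index argument matches the manual runs above:

```
define command{
    command_name    check_esxi_vnic
    command_line    $USER1$/check_esxi_vnic $ARG1$
}

define service{
    use                     generic-service
    host_name               esxi01
    service_description     ESXi vnic1 uplink
    check_command           check_esxi_vnic!1
}
```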



VMware Replication & Recovery

The following three videos show how to create offsite virtual machine replication in your vCenter server, onto any storage drive available in your server.
This solution is available for free from VMware and can be integrated into all types of vCenter, even VMware Small Business Essentials Plus.
To download VMware Replication follow this link: VMware Replication

The backup storage can be a network share mount or a local drive.
If you use, for example, an NFS share or iSCSI storage, this gives you the benefit of an offsite backup for your virtual machines.
The replication is automatic and scheduled to run in the background on the vCenter server.

The offsite backup can be restored without the vCenter server if, for any reason, vCenter is unavailable.
In this case you need to manually add the backed-up machine to your vCenter or standalone ESXi server’s inventory.



VMware networking setup for vMotion/iSCSI & VM traffic

VMware ESX/ESXi network setup.

In the following post I will show you some networking setups for VMware servers.
This involves Cisco switches (2960/3750 series) and HP or Dell servers.
I have had these configurations running in production for quite some time now (2+ years) without any issues.

As we know, the networking setup for VMware servers is much more complicated than the setup of the “classic” physical Linux or Windows boxes we had earlier.
The classic single uplink connection with a regular VLAN is not enough for VMware anymore.
You must separate virtual machine traffic from management traffic, and you must also separate storage and vMotion traffic.
Although VMware says you can have separate vSwitches for all physical connections with different VLANs, failover to the other physical connections is more complicated than with one or two vSwitches. A VMware server needs a minimum of 2 network uplinks for VM traffic and management traffic, but VMware recommends 4 uplinks for physical servers.

The following picture shows briefly the current setup.


So let’s take a look at the 4-uplink configuration on the VMware ESXi host:


We have all 4 uplinks connected to the same vSwitch. With this configuration it is very easy to create failover for the management traffic and to separate the storage and vMotion traffic as well. Let’s take a look at the vSwitch properties:


Also take a look at the NIC teaming for the vSwitch.
As you can see, all adapters are active in this vSwitch:



Now take a look at the management uplink settings.
The management network has one active adapter and two standby adapters.
If the physical connection of the active vnic0 adapter fails (a switch or cabling issue), the VMkernel will activate one of the standby adapters.
With this setup the management network will always be available and you cannot lose the connection to the VMware box.

Now let’s check the vMotion settings.
Here we have an added VMkernel port for vMotion and IP storage, which carries an extra IP address for vMotion.
As you can see, here we have one active adapter and three unused adapters. To properly separate this kind of traffic at the kernel level you must tick the failover order override and move down the adapters that you don’t want this kernel port to use. The setting is the same for iSCSI storage.
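The same failover-order change can also be made from the ESXi shell instead of the vSphere client. A sketch, assuming the port group is named vMotion and the single active uplink should be vmnic1 (both names are placeholders for your own):

```
# Hypothetical port group and uplink names -- check yours first with:
#   esxcli network nic list
esxcli network vswitch standard portgroup policy failover set \
    -p vMotion --active-uplinks=vmnic1

# Verify the resulting policy for the port group:
esxcli network vswitch standard portgroup policy failover get -p vMotion
```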

Now take a look at the storage IP kernel settings.
Here we also have an extra VMkernel port with its own IP address.
In this setup, too, the extra active vSwitch adapters have been moved down to unused, as you can see in the picture.
Without this you won’t be able to add the iSCSI software storage properly. The VMkernel IP setup creates a point-to-point, one-to-one connection to the storage, and therefore only one active adapter should be enabled in any such VMkernel port group. With this setup you can have more than one path to the iSCSI storage, but for that you need to enable multipathing in the iSCSI setup.


Now let’s take a brief look at the Virtual Machine Port Group settings regarding VLANs.
Here you can add new VLANs to the kernel and create load balancing and failover for the virtual machines.
I used two of the physical adapters for the virtual machines, activated as vnic5/vnic1 and vnic1/vnic5 opposite to each other.
But if you have 4 or 6 uplink adapters you could activate 3-4 adapters for the virtual machines; it’s up to you.
This also depends on how heavily loaded your virtual boxes are: if the boxes are under heavy load, it’s better to separate the loads and keep vMotion and management traffic off those physical uplink connections.



I know it’s getting a bit confusing, so here again are the bindings for the VLANs, management traffic and vMotion traffic:


The storage traffic is not added to any of those; it just connects via the VMkernel port group IP address as a one-to-one connection:


In this setup I use the same VLAN for vMotion, because it is only used for maintenance, but if you use vMotion heavily then it is better to separate it into a different VLAN. You might as well create a new physical uplink for this traffic, which would separate it not just at the VLAN level but at the physical level too.

And finally the physical uplink ports to the Cisco switch:

interface GigabitEthernet1/0/22
description vnic0
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

interface GigabitEthernet1/0/23
description vnic2
switchport trunk allowed vlan 100,200,300,400
switchport trunk native vlan 999

switchport mode trunk
switchport nonegotiate
speed 1000
duplex full

The native vlan 999 command changes the default untagged VLAN, which is VLAN 1.
With this command you can keep unnecessary layer 2 traffic, like flooding and broadcasts, away from the VMware server.
Also, if you have a system already configured in vCenter, sometimes you cannot change the management VLAN, because vCenter would no longer be able to reach the box: the change errors out or the box gets dropped from vSphere. In that case you need to disconnect the server from vCenter and create a second VMkernel interface, on a different IP subnet and a different physical interface than the one currently in use, and connect to the box via that VMkernel. With this you can make any major change to the main interface (native VLAN, VLAN tagging, etc.). A few times when I made such changes I lost the connection to the server and had to either reset the VMkernel management network, roll back the switch configuration or change the native VLAN on the switch. So be careful with these changes if you cannot physically reach your box for any reason (the server is in a data centre or a different office).

So now let’s take a look at the Cisco switch side, after the native VLAN and trunking configuration:

Port        Mode             Encapsulation  Status        Native vlan
Gi1/0/22    on               802.1q         trunking      999

Port        Vlans allowed on trunk
Gi1/0/22    100,200,400

Port        Vlans allowed and active in management domain
Gi1/0/22    100,200,400



Linux server migration with VMware converter

This post will show you how to migrate a live Linux/Windows machine from any source to any destination remotely.

I’ll do this on a live website and post all the related pictures of the migration.
I am going to use VMware Converter, which will deal with everything.


– Running VMware server

– Source machine with SSH or RDP connection (Linux/Windows)

– Destination VMware server

– VMware converter (free to download from VMware site)

OK, let’s start up VMware Converter and connect to the source machine.

Here you need to select the source type Powered on machine.
You also need to enter the login name and password.




On the next tab, enter the destination machine’s IP address and its login details.




On the next tab you must set the machine name.
On Linux this is picked up from the hosts file automatically; you can leave it as it is if you prefer.



At the next step you must be really careful with the virtual machine version number.
VMware automatically offers Version 10, which can only be managed from vCenter, and that is not free.
So change this to version 8 or lower; then you will be able to manage the machine from the ESXi vSphere client.
This version refers to the virtual hardware type. I use version 8, which is the highest version available without any licensing issue or cost.



Also, on the destination page, choose which datastore you want to use for the machine.

On the next tab the converter asks for the final parameters of the conversion.
Here you must edit the Helper VM network tab and add an extra IP on the local network where the destination VMware server sits.
Without this the converter usually dies at around 1% or 2% without any further notification.





You should also tick the option Reconfigure destination virtual machine under the advanced options.
This will fix the initial ramdisk on the destination machine.


After this we can start the real migration process.



This creates the machine on the destination server, automatically starts it up, and begins pulling the data down from the source machine.


Destination server console with the running machine while it’s pulling down the data from source:


You can see the progress is quite quick; it depends on the actual network speed and on the CPU and disk speed of the source and destination machines.


On the source machine VMware Converter uses the tar command to compress the full disk into an image and sends it over the network to the destination machine as a compressed stream.
The source gets a bit overloaded by this process, but of course that depends on the source box. This one is not a physical machine, just a virtual one with 1 CPU socket and 512 MB RAM.
So it is definitely not a strong box, and it runs other websites at the moment; that’s why the load is high in top.


Now let’s see the finished converted machine:


Indeed it reached the final stage, and pulling down the whole machine took about an hour.
The destination machine is currently switched off on the destination server, but it contains a full copy of the source.
So let’s see if it boots up without any issues:


It seems we got a kernel panic because the local disk UUID was missing.
So now we need a CentOS disk to boot the box in rescue mode and fix the disk UUID.
Upload the CentOS version that you used on the source box and attach it to the VMware server.
I’ve got the 64-bit version on 7layer, so I’ll use that to fix the destination machine.




Boot up the machine, but be quick: you only have about 1 second to press Escape at the BIOS screen and choose to boot from the CD-ROM.
At the CentOS boot screen choose the Rescue installed system option to fix the box.



Then choose the Continue option; this gives you the option to mount and modify the root file system.

On the next screen you can see the root file system mounted under /mnt/sysimage.




Then choose the shell option from the menu.





Now we have the whole root file system mounted, so we can check the fstab and boot entries on the box.
Check the grub device map in /boot/grub/ and /boot/grub/grub.conf for the HDD type.
Also check that /etc/fstab is correct.

Old fstab:



New fstab, corrected by VMware Converter during conversion:


Also device map looks correct:


After checking these, we need to fix the grub loader and the initial ramdisk.

This procedure can also be found on the VMware site:

Rebuild the initial ramdisk:


mkinitrd -v -f /boot/initramfs-2.6.32-431.29.2.el6.x86_64.img 2.6.32-431.29.2.el6.x86_64

The version in this line must match your grub kernel config, so check it in the /boot directory.
Rebuilding the whole initial ramdisk takes about 5-10 seconds.
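Since the mkinitrd arguments must match the version strings grub actually boots, it is worth extracting them straight from grub.conf rather than typing them by hand. A small sketch, with an inline sample standing in for the real /boot/grub/grub.conf (the file path and helper names are illustrative):

```shell
#!/bin/sh
# Pull the kernel and initrd version strings out of a grub.conf and
# check that they agree -- a mismatch here is what typically causes
# the modules.dep error at boot.
kver_from_grub() {
    sed -n 's#.*/vmlinuz-\([^ ]*\).*#\1#p' "$1" | head -n 1
}
iver_from_grub() {
    sed -n 's#.*/initramfs-\(.*\)\.img.*#\1#p' "$1" | head -n 1
}

# Inline sample for illustration; on the rescue system point the
# functions at /boot/grub/grub.conf instead.
cat > /tmp/grub.conf.sample <<'EOF'
title CentOS (2.6.32-431.29.2.el6.x86_64)
root (hd0,1)
kernel /vmlinuz-2.6.32-431.29.2.el6.x86_64 ro root=UUID=c8fbeb09
initrd /initramfs-2.6.32-431.29.2.el6.x86_64.img
EOF

KVER=$(kver_from_grub /tmp/grub.conf.sample)
IVER=$(iver_from_grub /tmp/grub.conf.sample)
[ "$KVER" = "$IVER" ] && echo "match: $KVER" || echo "MISMATCH: $KVER vs $IVER"
# then: mkinitrd -v -f /boot/initramfs-$KVER.img $KVER
```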

We also need to correct the disk UUID in the grub boot loader. Follow one of the steps below:

Correcting grub loader with UUID change:

Run ls -l /dev/disk/by-uuid to check the correct UUID for the sda3 disk.

The UUID is a very long string and it is easy to make mistakes here, so it’s better to append it to grub.conf with the ls command and then move it into the right place:

ls -l /dev/disk/by-uuid >> /boot/grub/grub.conf, then move it up from the end of the file.

In this case the last line contains the correct UUID, which belongs to /dev/sda3.
So move it into the root=UUID= line.
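Alternatively, once you have read the correct UUID, the substitution can be done with a single sed command instead of hand-editing. A sketch against an inline demo file (on the rescue system you would point it at /boot/grub/grub.conf; the stale UUID below is a made-up placeholder):

```shell
#!/bin/sh
# Swap the stale root=UUID=... value on the kernel line for the one
# reported by ls -l /dev/disk/by-uuid for /dev/sda3.
NEW_UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362

# Demo file standing in for /boot/grub/grub.conf:
cat > /tmp/grub.conf.demo <<'EOF'
kernel /vmlinuz-2.6.32-431.29.2.el6.x86_64 ro root=UUID=00000000-0000-0000-0000-000000000000 rd_NO_LUKS
EOF

sed -i "s/root=UUID=[^ ]*/root=UUID=$NEW_UUID/" /tmp/grub.conf.demo
grep "root=UUID" /tmp/grub.conf.demo
```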





Correcting the grub loader by changing only the kernel line in /boot/grub/grub.conf:


Now you can run grub-install with the corrected disk name:



When it has finished, try to reboot the box. The reboot or halt commands usually won’t work here in rescue mode.
Use the VMware machine’s top menu ==>> VM ==>> Guest ==>> Send Ctrl+Alt+Del to reboot the rescue disk.
Then wait for the box to reboot and see if it works fine.




Voilà, it’s booting up! 🙂

If you have trouble with /lib/modules/{kernel-number}/modules.dep at boot, you need to rebuild the initial ramdisk again.
Investigate via the VMware site and carefully check that the initramdisk name and the kernel name in the boot directory match.

I’ll mark the important parts which must be corrected, otherwise the kernel won’t boot:


[ grub]# cat grub.conf | head -n 20

# Hetzner Online AG – installimage
# GRUB bootloader configuration file

timeout 5
default 0

title CentOS (2.6.32-431.29.2.el6.x86_64)
root (hd0,1)
kernel /vmlinuz-2.6.32-431.29.2.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset crashkernel=auto SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=de
initrd /initramfs-2.6.32-431.29.2.el6.x86_64.img


cat menu.lst | head -n 20
# Hetzner Online AG – installimage
# GRUB bootloader configuration file

timeout 5
default 0

title CentOS (2.6.32-358.6.1.el6.x86_64)
root (hd0,1)
kernel /boot/vmlinuz-2.6.32-358.6.1.el6.x86_64 ro root=UUID=c8fbeb09-a9d6-449f-8a99-6f83b7cf4362 rd_NO_LUKS rd_NO_DM nomodeset
initrd /boot/initramfs-2.6.32-358.6.1.el6.x86_64.img


(hd0) /dev/sda





SPF record setup for mail server

How to set up and test an SPF record for a mail server:

Let’s check Google’s SPF record first with the dig command.

[root@mail ~]# dig txt

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52169
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0


;; ANSWER SECTION: 3599 IN TXT “v=spf1 ip4: ip4: ~all”

;; Query time: 12 msec
;; WHEN: Fri Feb 27 08:47:05 2015
;; MSG SIZE rcvd: 116

[root@mail ~]#

In the answer section you can see the IP addresses. These are the servers which are allowed to send mail for the domain.
So you have your domain name and a mail server on it with an A record. That server can send mail in its own name, but no other server is allowed to send mail for the domain. With the SPF record, the listed IP addresses can also send (relay) mail via Google’s mail servers.

You can also use domain names in an SPF record and tell receiving servers to use that domain’s policy instead of listing IP addresses.

[root@mail ~]# dig txt

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.1 <<>> txt
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64785
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0


;; ANSWER SECTION: 21599 IN TXT “v=spf1 ip4: ~all” 21599 IN TXT “v=DMARC1\; p=none\; adkim=r\; aspf=r\; sp=none”

;; Query time: 36 msec
;; WHEN: Fri Feb 27 08:59:31 2015
;; MSG SIZE rcvd: 225

[root@mail ~]#
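For illustration, here is what a complete SPF TXT record can look like in a zone file; the domain and address below are documentation placeholders, not real records. The mx mechanism allows the domain’s own MX hosts, ip4 allows one extra address, include pulls in another domain’s SPF policy, and ~all soft-fails everything else (use -all for a hard fail):

```
example.com.  IN  TXT  "v=spf1 mx ip4:203.0.113.25 include:_spf.example.net ~all"
```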


Create and check SPF records:

Header check for emails to analyse SPF and other issues:



MIPS Development

A MIPS-based development board from Imagination Technologies.

I just received my new CI20 MIPS-based development board from Imgtec and I must say it is a piece of engineering art! 🙂

I have already set up an Apache2 web server, an SSH server, a MySQL server, a basic firewall (with iptables) and a Postfix mail server.
I’m going to build an SMS server, hook it up to my CCTV system and publish everything about it shortly.

Thank you again Imgtec!

Related links to CI20 and other Linux based hardware development:

VMware free backup solution from virtuallyGhetto


VMware free backup solution for ESXi servers:

Download the script from GitHub: then modify it to fit your system.

I’m going to explain the important parts that I usually change in this script:

In the script file:

– Backup path
– Rotation
– Backup format
– Email server
– Email to
– Email from


# directory that all VM backups should go (e.g. /vmfs/volumes/SAN_LUN1/mybackupdir)


# Format output of VMDK backup
# zeroedthick
# 2gbsparse
# thin
# eagerzeroedthick


# Number of backups for a given VM before deleting

Also in ghettoVCB.conf







When you have uploaded the script and ghettoVCB.conf files, you need to add the execute flag to the script file:

chmod +x

Then you can start backing up your VM machines.

To back up only one machine, run:

./ -m vm_to_backup

To back up all machines:

./ -a

If you want a machine to be excluded from the backup, use an exclusion file to achieve this:
./ -a -e vm_exclusion_list
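The exclusion list itself is just a plain text file with one VM display name per line; a hypothetical example:

```
vm-test-01
old-template
staging-web
```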

The VMware firewall won’t allow the script to send outgoing emails; this needs to be fixed.
Upload an smtp.xml file to the VMware server and update the firewall; without this you will receive an error on the VMware SSH console.

Script:  smtp.xml

Upload it to /etc/vmware/firewall and run esxcli update:

esxcli network firewall refresh
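For reference, ESXi 5.x firewall rule files follow a simple XML format. A sketch of what such an smtp.xml can look like (the service id is an arbitrary unique number; adjust the port if your mail relay listens elsewhere):

```
<ConfigRoot>
  <service id="0100">
    <id>smtp</id>
    <rule id="0000">
      <direction>outbound</direction>
      <protocol>tcp</protocol>
      <porttype>dst</porttype>
      <port>25</port>
    </rule>
    <enabled>true</enabled>
    <required>false</required>
  </service>
</ConfigRoot>
```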

Then click on the server name, then Configuration, then Security Profile, and you will see that the new smtp outbound port has appeared as a new outgoing firewall rule allowing SMTP traffic from the server.




Although you can use the restore script to restore machines from backup, you can also use the backed-up machines straight away by adding them to the inventory from the backup path.
This is much quicker than the restore process, but obviously the machine will then reside on the backup path, not on the original path. With this you can get the machine back ASAP, then create a backup onto the original path, shut down the backup-path machine, add the original-path machine to the inventory and start it up.


The only thing left to do is to make this process automatic.
Edit crontab on your server and add this to it:
10 00 * * 1-5 /vmfs/volumes/ -f /vmfs/volumes/Fuji-NAS/backuplist > /vmfs/volumes/ghettoVCB-backup-$(date +\%s).log

On VMware ESXi 5.5 the crontab files are located under /var/spool/cron/crontabs/, and the root file contains the current crontab configuration.

cat /var/spool/cron/crontabs/root

#min hour day mon dow command
1 1 * * * /sbin/
1 * * * * /sbin/
0 * * * * /usr/lib/vmware/vmksummary/
*/5 * * * * /sbin/hostd-probe ++group=host/vim/vmvisor/hostd-probe
10 00 * * 1-5 /vmfs/volumes/datastore1/root/opt/ -f /vmfs/volumes/datastore1/root/opt/vmbackup.txt

This runs the backup on weekdays at 00:10 (minute 10, hour 00, Monday to Friday), but you can change it according to your needs.


NAS4Free High-Availability iSCSI failover for VMware servers

The following post shows how to install and set up a NAS4Free server as iSCSI storage for your ESXi/ESX VMware server.
NAS4Free is based on FreeBSD and has all the services required to act as a high-availability storage server (HAST and CARP).
Of course you can also use this solution in your network as high-availability storage or as a Windows CIFS/Samba server, if you modify the services on NAS4Free.
I’ll stick to the iSCSI setup first; later we will show how to set up NFS and Windows (Samba) shares.

The setup used here is the following:

Node1 primary IP address for serving iSCSI and CARP services:
Node1 secondary IP address for HAST synchronisation:

Node2 primary IP address for serving iSCSI and CARP services:
Node2 secondary IP address for HAST synchronisation:

Virtual IP address(CARP address) for iSCSI service:

Node1 host name: has1
Node2 host name: has2

Install both nodes with the latest NAS4Free edition.

– Change node names according to your set up for example: node1 and node2.


– Add node names to host file on both nodes.

– Setup carp services under Network/Interface management:


– Advertisement skew on has1 node: 0
– Advertisement skew on has2 node: 10

If has1 node dies then has2 node will take over all the services.


You must use the same link-up and link-down actions on both nodes, otherwise the switchover won’t work properly!
So everything should be the same except the advertisement skew value.

Next step: set up the HAST services:



As you can see, the second network interface card is used for the HAST synchronisation, not the main interface.
After you set up the HAST service, reboot both nodes; Apply won’t start the services for some reason.
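Behind the GUI, NAS4Free generates a standard FreeBSD /etc/hast.conf. Roughly, it looks like the sketch below; the disk device and the sync-interface addresses are placeholders for whatever you configured on the second NIC of each node:

```
resource disk1 {
        on has1 {
                local /dev/ada1
                remote 172.16.0.2
        }
        on has2 {
                local /dev/ada1
                remote 172.16.0.1
        }
}
```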

– Switch on ssh service and ssh into both nodes.

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

Check both nodes with: hastctl status

Then configure ZFS.
On the Master:

Add disks (Disks->Management)

disk1: N/A (HAST device)
Advanced Power Management: Level 254
Acoustic level: Maximum performance
S.M.A.R.T.: Checked
Preformatted file system: ZFS storage pool device

Format as zfs (Disks->Format)

Add ZFS Virtual Disks (Disks->ZFS->Pools->Virtual Device)

Add Pools(Disks->ZFS->Pools->Management)

Add a PostInit script on both nodes under the System/Advanced/Command scripts tab:
/usr/local/sbin/carp-hast-switch slave

Shut down the master and, on the slave, import the pool through the GUI (tab: ZFS/Configuration/Detected).
Then synchronise the pool on the slave!

When finished on the slave, start the master and switch the VIP back to the master.

zpool status disk1
hastctl status

Troubleshooting commands from SSH terminal:

zpool status

nast1: ~ # zpool status mvda0
  pool: mvda0
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run ‘zpool clear’.
  scan: none requested

        NAME                   STATE     READ WRITE CKSUM
        mvda0                  UNAVAIL      0     0     0
          2144332937472371213  REMOVED      0     0     0  was /dev/hast/hast

If the status is unavailable, you can try:

zpool clear “pool name”

This will scan and scrub the local disks.

nast1: ~ # zpool status mvda0
  pool: mvda0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
  scan: scrub in progress since Mon Jun  2 15:26:25 2014
        1.19G scanned out of 1.43G at 28.3M/s, 0h0m to go
        0 repaired, 82.75% done

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0

Then check pool again:
zpool status

nast1: ~ # zpool status
  pool: mvda0
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Jun  2 15:27:17 2014

        NAME         STATE     READ WRITE CKSUM
        mvda0        ONLINE       0     0     0
          hast/hast  ONLINE       0     0     0

Recreate sync on disks or split brain:

On Master issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role primary disk1

On Slave issue these commands:

hastctl role init disk1
hastctl create disk1
hastctl role secondary disk1

If you lost sync because of a disk or network error, you can recreate the sync between the HAST disk(s).
Just recreate the roles and the nodes will start syncing the data (use the commands above). Be careful with the roles and the nodes; don’t mix them up!
If you recreate the roles and the disks you won’t lose any data: it will only start syncing the disk(s), but won’t overwrite data.

If it is a split-brain scenario, you should decide which node has the newer data and issue the above commands accordingly. For example, if the secondary node has newer data than the primary, then obviously you should issue role primary on the second node and role secondary on the primary node, and vice versa.

High Availability Postfix mail server on GlusterFS

The next article: a high-availability mail server on GlusterFS.

– Two node CentOS Linux
– GlusterFS shared storage
– NFS share for mails on GlusterFS
– Postfix mail server with SquirrelMail web client
– Dovecot IMAP/POP server


So let’s get started.

In this article I used two local private nodes for testing.
You should change the IPs according to your real configuration. GlusterFS can manage different geo-locations to sync files/directories.
If you want both servers at the same physical location, use a firewall, for example pfSense or Snort, and use local IPs behind the firewall.

GlusterFS part:

First edit the hosts file and insert all the nodes which will be in the cluster.

cat /etc/hosts    localhost    localhost.localdomain localhost4 localhost4.localdomain4
::1    localhost    localhost.localdomain localhost6 localhost6.localdomain6    test2.local    test2    test3.local    test3

yum install glusterfs glusterfs-fuse glusterfs-server postfix dovecot

service glusterd start

gluster peer probe test3.local    (on test2)

gluster peer probe test2.local    (on test3)

On every node you should now have the other node’s peer UUID:

ls /var/lib/glusterd/peers


cat /var/lib/glusterd/peers/878b63e8-5a3c-4746-984a-a14f4918c4b8


service glusterd status
glusterd (pid  1620) is running…

Start glusterd on the other node too.
Then check the peer status on both nodes:

gluster peer status (node1)
Number of Peers: 1

Hostname: test3.local
Uuid: 878b63e8-5a3c-4746-984a-a14f4918c4b8
State: Peer in Cluster (Connected)

gluster peer status (node2)
Number of Peers: 1

Hostname: test2.local
Uuid: 0d06c152-3966-4938-a1c4-84b624689927
State: Peer in Cluster (Connected)

Now let’s create the GlusterFS volume.

Before you run the command, be careful with sysctl! I had some trouble with net.ipv4.ip_nonlocal_bind in sysctl.conf, because I had used these nodes to test Heartbeat and Corosync, and I could not create the GlusterFS volume while it was set to 1.
So change it from 1 to 0 in sysctl.conf and run sysctl -p to apply the kernel parameter.

So create the volume:

gluster volume create gv0 replica 2 test2:/export/brick1 test3:/export/brick1

You can check the volume with this command:

gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: da3d4c48-d168-4b4f-9590-e8d87cf5aa87
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Brick1: test2.local:/export/brick1
Brick2: test3.local:/export/brick1

Start the volume with this command:

gluster volume start gv0

XFS part:

Next step: install the XFS modules.

modprobe xfs  (CentOS 6.3 already ships kmod-xfs)

Create an XFS file system on the extra disk that you want to use as the GlusterFS brick.

mkfs.xfs -i size=512 /dev/vdb1

NFS part:

Then install the NFS services.

yum install nfs-utils

And mount the GlusterFS volume via NFS:

mount -o mountproto=tcp,vers=3 -t nfs test2.local:/gv0 /mnt/

Check the mounts:

/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/vda1 on /boot type ext4 (rw)
/dev/vdb1 on /export/brick1 type xfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
test2.local:/gv0 on /mnt type nfs (rw,mountproto=tcp,vers=3,addr=

Start services automatically at boot:

chkconfig nfs on

chkconfig glusterd on

Postfix Part:

Create a symbolic link to /var under /mnt:

 ln -s /var/ /mnt/

Then, in the Postfix configuration under /etc/postfix/, insert an extra /mnt in front of every reference that contains /var/, like this:

From this: mail_spool_directory = /var/spool/mail
To this: mail_spool_directory = /mnt/var/spool/mail
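If there are many such lines, the rewrite can be scripted with sed instead of edited by hand. A sketch against an inline demo file (run it on the real configuration under /etc/postfix/ only after taking a backup; the second parameter line is illustrative):

```shell
#!/bin/sh
# Prefix every /var/ path in the Postfix config with /mnt so the
# mail data lands on the GlusterFS-backed mount. The demo file
# stands in for the real configuration.
cat > /tmp/main.cf.demo <<'EOF'
mail_spool_directory = /var/spool/mail
queue_directory = /var/spool/postfix
EOF

sed -i 's# /var/# /mnt/var/#g' /tmp/main.cf.demo
cat /tmp/main.cf.demo
```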

And configure Postfix as usual.

Dovecot Part:

Change the default mail location in /etc/dovecot/conf.d/10-mail.conf

from this: mail_location = maildir:~/Maildir
To this: mail_location = mbox:~/mail:INBOX=/var/mail/%u

In this configuration Dovecot keeps the mail in the old Unix mbox format rather than the Maildir format.
This way you can reach the mail from both nodes.

Configure the rest of dovecot as usual.

In this setup you have a shared mail system on the NFS volume, so users should be able to reach their mail at all times, whatever happens to the other node. The MX records are configured to deliver mail to the second node if the first is unreachable.
You need to use the same Unix users on both nodes, otherwise the mailboxes will get mixed up and the whole setup cannot work.


