This page describes the general process of building an OSR cluster from scratch. It is independent of distribution and file system; where the steps differ, the differences are described here. This page is still UNDER CONSTRUCTION!
0. Prerequisites
- This HOWTO will become the general OSR HOWTO in the future. For now it is still UNDER CONSTRUCTION, so please be patient.
- See assumptions
- We suppose that you already have a system installed with the current version of your OS. Valid options are:
  - RHEL5 based
  - RHEL6 based
  - SLES11
- We suppose that both SELinux and AppArmor are switched OFF.
- No iptables rules are set up.
- Root and boot are set up on two different file systems.
- The necessary software for your root file system is already installed.
- Additional services like ntp and syslog are successfully set up.
- The time source of all servers is the same.
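Some of these assumptions can be checked from a shell before starting. The following is a minimal sketch and not part of the com.oonics tooling: the function name on_separate_filesystems is made up for this example, and it assumes GNU stat as shipped with the distributions listed above.

```shell
# Hypothetical helper: succeeds if the two paths live on different file systems.
# Assumes GNU stat, where "-c %d" prints the device number of the file system
# that contains the path. Returns 2 if one of the paths cannot be examined.
on_separate_filesystems() {
    dev_a=$(stat -c %d "$1" 2>/dev/null) || return 2
    dev_b=$(stat -c %d "$2" 2>/dev/null) || return 2
    [ "$dev_a" != "$dev_b" ]
}

# Example: report whether / and /boot are separate, as assumed above.
if on_separate_filesystems / /boot; then
    echo "root and boot are on different file systems"
else
    echo "root and boot share a file system (or /boot is missing)"
fi
```

The same function can be pointed at any other pair of paths, for example the template root and /mnt/newroot later in this HOWTO.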
1. Prepare storage
1.1 Prepare root file system device
** NOTE: Only valid for EXT3, EXT4, GFS and GFS2. **
Label for lvm usage
# pvcreate /dev/sdb
# vgcreate vg_testcluster1_sr /dev/sdb
# lvcreate -l 90%FREE -n lv_sharedroot vg_testcluster1_sr
1.2 Prepare boot file system device
** NOTE: Only valid for EXT3, EXT4, GFS, OCFS2 and GFS2. **
# parted /dev/sda mkpartfs primary ext3 0 1000M
# grub << EOF
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
EOF
2. Prepare root file system
Mount the root file systems to /mnt/newroot of the template system already installed (see assumptions).
2.1. Create and mount root file system
If necessary, create the root file system and mount it to /mnt/newroot.
NFS
Use nolock as mount option!
# mount -t nfs -o nolock 192.168.0.1:/testcluster1/root /mnt/newroot
GFS
# mkfs.gfs -t testcluster1:lt_sharedroot -j 2 -p lock_dlm /dev/vg_testcluster1_sr/lv_sharedroot
# mount -t gfs -o lockproto=lock_nolock /dev/vg_testcluster1_sr/lv_sharedroot /mnt/newroot
GFS2
# mkfs.gfs2 -t testcluster1:lt_sharedroot -j 2 -p lock_dlm /dev/vg_testcluster1_sr/lv_sharedroot
# mount -t gfs2 -o lockproto=lock_nolock /dev/vg_testcluster1_sr/lv_sharedroot /mnt/newroot
2.2 Copy root file system
Copy the whole root file system onto the newly created one.
# cp -ax / /mnt/newroot
2.3 Clean ups
It is strongly recommended to link /tmp to /var/tmp and to replace /etc/mtab with a symbolic link to /proc/mounts. Do it as follows.
# rm -rf /mnt/newroot/tmp
# cd /mnt/newroot
/mnt/newroot# ln -s var/tmp tmp
/mnt/newroot# cd etc
/mnt/newroot/etc# rm mtab
/mnt/newroot/etc# ln -s /proc/mounts mtab
3. Prepare boot file system
3.1. Create and mount boot file system
Mount the boot file systems to /mnt/newroot/boot of the template system already installed (see assumptions).
NFS
# mount -t nfs 192.168.0.1:/testcluster1/root/boot /mnt/newroot/boot
All Others
# mount /dev/sda1 /mnt/newroot/boot
3.2 Copy boot file system
Copy the whole boot file system onto the newly created one.
# cp -ax /boot/* /mnt/newroot/boot
4. Change root to the new system
FOR ALL ROOT FILE SYSTEMS
Now we can change the root to the new file system in order to apply all changes there. This is done as follows:
4.1 Mount other file systems
To successfully execute all commands on the new system the following additional filesystems need to be mounted.
# mount --rbind /dev /mnt/newroot/dev/
# mount -t proc proc /mnt/newroot/proc
# mount -t sysfs none /mnt/newroot/sys
4.2 Chroot into the new root file system
# chroot /mnt/newroot
ALL FURTHER COMMANDS WILL BE EXECUTED INSIDE THE NEW ROOT FILE SYSTEM!!
5. Install Com.oonics packages
Install the latest comoonics packages from a comoonics yum channel, or take the shortcut via recommended_packages.
6. Create file system environment
Next, the root file system has to be adapted so that it can be used as a shared root file system. For this, a few files and directories have to be made host dependent. This can be done as follows. Depending on the installed software, some of these files or directories might not exist.
6.1 Setup the CDSL environment
Create the CDSL environment for the 2 nodes (--maxnodeidnum) that will use this shared root file system, and temporarily map node ID 1 to this template node.
# com-cdslinvadm create --maxnodeidnum=2
# mount --bind /.cluster/cdsl/1/ /.cdsl.local/
6.2 Create host dependent files and directories
The following files and directories need to be host dependent for a shared root file system.
RHEL5/RHEL6
# com-mkcdsl --hostdependent /var/tmp/ /var/account /var/cache /var/local /var/lock /var/log /var/spool \
  /var/lib/dbus /var/lib/dhclient /var/run /var/lib/logrotate.status /etc/blkid /etc/sysconfig/network
SLES11
# com-mkcdsl --hostdependent /var/tmp/ /var/account /var/cache /var/local /var/lock /var/log /var/spool \
  /var/lib/dbus /var/lib/dhclient /var/run /var/lib/logrotate.status /etc/blkid
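To make the idea of a host dependent path more tangible, the following sketch mimics it inside a scratch directory. It is an illustration only and does not call com-mkcdsl: the directory names mirror the paths used on this page, the file example.conf is made up, and the assumption that a host dependent path is a symbolic link into .cdsl.local is a simplification of whatever layout the real tools create.

```shell
# Illustration only: mimics a context dependent symbolic link (CDSL) in a
# scratch directory. The real layout is created by com-cdslinvadm and com-mkcdsl.
demo=$(mktemp -d)
mkdir -p "$demo/.cluster/cdsl/1/etc" "$demo/.cluster/cdsl/2/etc"
echo "node1 settings" > "$demo/.cluster/cdsl/1/etc/example.conf"
echo "node2 settings" > "$demo/.cluster/cdsl/2/etc/example.conf"

# On a real node /.cdsl.local is a bind mount of /.cluster/cdsl/<nodeid>.
# A symbolic link stands in for the bind mount here so no root rights are needed.
select_node() {
    rm -f "$demo/.cdsl.local"
    ln -s ".cluster/cdsl/$1" "$demo/.cdsl.local"
}

# The shared path is a link into .cdsl.local, so its content follows the node.
mkdir -p "$demo/etc"
ln -s ../.cdsl.local/etc/example.conf "$demo/etc/example.conf"

select_node 1; cat "$demo/etc/example.conf"
select_node 2; cat "$demo/etc/example.conf"
```

Both nodes open the same path, etc/example.conf, but each sees its own copy depending on what .cdsl.local currently points to.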
6.3 Create CDPNs
Because some services do not work on top of symbolic links such as CDSLs, a few files and directories have to be made host dependent with bind mounts instead (Context Dependent Path Names, CDPNs). These cases are rare and apply only to special configurations.
RHEL5/SLES11/RHEL6
If using RHEL5/SLES11 with device mapper multipath (holds for NFS, GFS, OCFS2 and GFS2) the directory /var/run has to be a CDPN. To achieve this proceed as follows.
# rm -rf /var/run
# mkdir /var/run
# cat >> /etc/cdsltab <<EOF
bind /.cluster/cdsl/%(nodeid)s/var/run /var/run __initrd
EOF
Also, /var/run/lvm has to exist for clvmd to run. So if clvmd will be used, also proceed with the following step.
# for i in $(seq 1 $(com-cdslinvadm --get maxnodeidnum )) default;do mkdir /$(com-cdslinvadm --get tree)/$i/var/run/lvm ;done
RHEL6
As RHEL6 changed the whole startup process, /var/lock cannot be a symbolic link either and has to be changed to a CDPN.
# rm -rf /var/lock
# mkdir /var/lock
# cat >> /etc/cdsltab <<EOF
bind /.cluster/cdsl/%(nodeid)s/var/lock /var/lock __initrd
EOF
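The cdsltab entries for /var/run and /var/lock follow the same pattern. As a small sketch, a helper can print such an entry for a given directory so it can be reviewed before it is appended to /etc/cdsltab. The function name cdsltab_bind_entry is hypothetical; the entry format, including the literal %(nodeid)s placeholder and the __initrd flag, is copied from the listings on this page.

```shell
# Hypothetical helper: print a cdsltab bind entry for one directory.
# "%%" in the printf format produces the literal "%" of the %(nodeid)s placeholder.
cdsltab_bind_entry() {
    printf 'bind /.cluster/cdsl/%%(nodeid)s%s %s __initrd\n' "$1" "$1"
}

# Print the two entries used on this page.
cdsltab_bind_entry /var/run
cdsltab_bind_entry /var/lock
```

The output can be appended with a redirection such as `cdsltab_bind_entry /var/run >> /etc/cdsltab` once it looks right.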
7. Create configuration files
7.1 Create cluster configuration file
** NOTE: Only valid for GFS, GFS2 and OCFS2 **
Download the appropriate cluster configuration from cluster configurations or create your own.
Then place the file in /etc/cluster/cluster.conf
# [ -d /etc/cluster ] || mkdir /etc/cluster
# wget -O /etc/cluster/cluster.conf http://open-sharedroot.org/documentation/general-osr-howto/cluster-configurations/cluster.conf
# chmod 640 /etc/cluster/cluster.conf
7.2 Validate networking for cluster communication
Validate that the node names can be resolved to their IP addresses. This is a MUST if the nodes need a cluster layer to communicate with each other (Red Hat Cluster).
# host node1
node1 has address 192.168.100.101
# host node2
node2 has address 192.168.100.102
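With more than two nodes the check is easier to run as a loop. The sketch below is one possible way to do this; the function name all_nodes_resolve is made up, and it uses getent hosts, which consults the sources configured in /etc/nsswitch.conf (typically /etc/hosts and DNS), instead of the host command shown above.

```shell
# Hypothetical helper: succeed only if every given node name resolves.
all_nodes_resolve() {
    status=0
    for node in "$@"; do
        if getent hosts "$node" >/dev/null; then
            echo "$node resolves"
        else
            echo "$node does NOT resolve" >&2
            status=1
        fi
    done
    return "$status"
}

# node1 and node2 are the example node names used on this page.
all_nodes_resolve node1 node2 || echo "fix name resolution before continuing"
```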
7.3 Create network configuration files
The network configuration files are one of the critical parts of a running sharedroot cluster. Network communication is required before the root file system is mounted and therefore has to be set up inside the so-called initial ramdisk. To achieve this, follow these instructions.
Take care that only the network interfaces required for successfully mounting the root file system are configured in this special way. All other network configurations may be left untouched.
This section describes how to create the network configuration files based on the assumptions. If you want to set up a more complex network configuration, refer to the /faq section or use the means of your distribution.
For a network configuration file to be recognized as required for the root file system, it needs to be flagged. This is done by setting the variable STARTMODE=nfsroot inside the appropriate network configuration file.
On the other hand, the interface must not be started again during the regular boot, because it is brought up before the normal startup process takes place. Therefore set ONBOOT=no.
As both nodes have different IP addresses, the network configuration files need to be host dependent. This is done with the com-mkcdsl --hostdependent <file> command.
Red Hat
# cat > /etc/sysconfig/network-scripts/ifcfg-eth0
# com-mkcdsl --hostdependent /etc/sysconfig/network-scripts/ifcfg-eth0
# cat > /.cluster/cdsl/1/etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
STARTMODE=nfsroot
ONBOOT=no
DEVICE=eth0
IPADDRESS=192.168.0.101
NETMASK=255.255.255.0
EOF
# cat > /.cluster/cdsl/2/etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
STARTMODE=nfsroot
ONBOOT=no
DEVICE=eth0
IPADDRESS=192.168.0.102
NETMASK=255.255.255.0
EOF
SuSE
# cat > /etc/sysconfig/network/ifcfg-eth0
# com-mkcdsl --hostdependent /etc/sysconfig/network/ifcfg-eth0
# cat > /.cluster/cdsl/1/etc/sysconfig/network/ifcfg-eth0 <<EOF
STARTMODE=nfsroot
ONBOOT=no
DEVICE=eth0
IPADDRESS=192.168.0.101
NETMASK=255.255.255.0
EOF
# cat > /.cluster/cdsl/2/etc/sysconfig/network/ifcfg-eth0 <<EOF
STARTMODE=nfsroot
ONBOOT=no
DEVICE=eth0
IPADDRESS=192.168.0.102
NETMASK=255.255.255.0
EOF
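The per-node files differ only in the last digit of the address, so they can also be generated. This is a minimal sketch under that assumption: the function name ifcfg_for_node is hypothetical, and the variable names and the 192.168.0.10<nodeid> addressing are simply copied from the example listings on this page.

```shell
# Hypothetical helper: print the interface configuration for one node ID.
ifcfg_for_node() {
    cat <<EOF
STARTMODE=nfsroot
ONBOOT=no
DEVICE=eth0
IPADDRESS=192.168.0.10$1
NETMASK=255.255.255.0
EOF
}

# Print both configurations instead of writing them, so they can be reviewed first.
for nodeid in 1 2; do
    echo "--- node $nodeid ---"
    ifcfg_for_node "$nodeid"
done
```

To write the files, redirect the output of ifcfg_for_node to the corresponding path below /.cluster/cdsl/<nodeid>/ for your distribution.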
7.4 Adapt /etc/fstab
Adapt the /etc/fstab so that it has the root and boot file systems setup as follows:
NFS
# cat /etc/fstab
..
192.168.0.1:/testcluster1/root / nfs defaults,nolock,noatime 0 0
192.168.0.1:/testcluster1/root/boot /boot nfs defaults 0 0
..
GFS
# cat /etc/fstab
..
/dev/vg_testcluster1_sr/lv_sharedroot / gfs noauto,defaults,noatime 0 0
/dev/sda1 /boot ext3 noauto,defaults 1 2
..
GFS2
# cat /etc/fstab
..
/dev/vg_testcluster1_sr/lv_sharedroot / gfs2 noauto,defaults,noatime 0 0
/dev/sda1 /boot ext3 noauto,defaults 1 2
..
OCFS2
# cat /etc/fstab
..
/dev/vg_testcluster1_sr/lv_sharedroot / ocfs2 noauto,defaults,noatime 0 0
/dev/sda1 /boot ext3 noauto,defaults 1 2
..
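After editing, it is worth confirming that the root entry really carries the intended file system type. The following sketch does this with awk; the function name fstab_root_type is made up for this example.

```shell
# Hypothetical helper: print the file system type of the "/" entry in an fstab file.
# Comment lines are skipped; the file defaults to /etc/fstab.
fstab_root_type() {
    awk '$1 !~ /^#/ && $2 == "/" { print $3; exit }' "${1:-/etc/fstab}"
}

echo "root file system type: $(fstab_root_type)"
```

Inside the chroot this should print nfs, gfs, gfs2 or ocfs2, matching the listings above.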
8. Setup boot environment
NFS uses PXE for booting and all other file systems use GRUB, so the two setups are described separately according to their boot method.
Common to both setups is the generation of the initial ramdisk, so that is done first.
8.1 Create initial ramdisk
As com.oonics uses a considerably extended boot image (initrd-ng), building the initial ramdisk might take some time. Nevertheless, nothing more should be needed than calling the following command. The file for the initial ramdisk should be called initrd_sr-KERNELVERSION.img.
# mkinitrd /boot/initrd_sr-$(uname -r).img $(uname -r)
Executing files before mkinitrd.
Sourcing script "01-clean-repository.sh". [ OK ]
Executing script "20-clusterconf-validate.sh". [ OK ]
Sourcing script "30-rootfs-check.sh". [ OK ]
Sourcing script "35-rootdevice-check.sh". [ OK ]
Sourcing script "38-multipath-check.sh". [ OK ]
Executing script "50-bootimage-check.sh". [ OK ]
Executing script "50-cdsl-check.sh". [ OK ]
Sourcing script "60-osr-repository-generate.sh". [ OK ]
Copying files.. [ OK ]
Copying kernelmodules (2.6.32-100.0.19.el5)... [ OK ]
Post settings .. [ OK ]
Executing files after mkinitrd.
Sourcing script "02-create-cdsl-repository.sh". cdsl env [ OK ]
Sourcing script "03-nfs-deps.sh". nfs-deps [ OK ]
Sourcing script "19-copy-network-configurations.sh". [ OK ]
Sourcing script "20-copy-network-configurations.sh". [ OK ]
Sourcing script "21-copy-cdsltab-configurations.sh". [ OK ]
Sourcing script "98-copy-template-repository.sh". [ OK ]
Sourcing script "99-clean-repository.sh". [ OK ]
builddate_file [ OK ]
Creating index file .. [ OK ]
Cpio and compress.. [ OK ]
Cleaning up (/tmp/initrd.mnt.r23612, )... [ OK ]
-rw-r--r-- 1 root root 41615 May 3 2012 /boot/initrd_sr-2.6.32-100.0.19.el5.img
Next, create symbolic links to the current images to make the current versions easy to recognize. The boot configurations below refer to the images by these link names. On SuSE these links might already exist; in that case just verify that they point to their proper counterparts.
# cd /boot
/boot # ln -s vmlinuz-$(uname -r) vmlinuz
/boot # ln -s initrd_sr-$(uname -r).img initrd_sr.img
8.2 PXE Based
The boot configuration files are placed in /boot on the shared root file system. They are called pxe.node1.conf and pxe.node2.conf and look as follows.
# cat > /boot/pxe.node1.conf <<EOF
# This is the com.oonics PXE boot config file
timeout 100
prompt 1
default comoonics
LABEL comoonics
KERNEL /testcluster1/vmlinuz
APPEND initrd=/testcluster1/initrd_sr.img nodeid=1
EOF
# cat > /boot/pxe.node2.conf <<EOF
# This is the com.oonics PXE boot config file
timeout 100
prompt 1
default comoonics
LABEL comoonics
KERNEL /testcluster1/vmlinuz
APPEND initrd=/testcluster1/initrd_sr.img nodeid=2
EOF
For a PXE environment we suppose the following setup. If your configuration differs, adapt the corresponding steps.
We suppose that the NFS export for the boot file system is also mounted on the tftp server that serves the boot images (the DHCP next-server).
We also suppose that dhcpd, tftpd and the PXE environment are basically configured. That means that when node1 or node2 boots, it gets an IP address and a next-server from which to fetch its boot images as defined in the defaults.
Also the next server has an already configured boot environment that resides in this case on the tftp-server under /tftpboot/pxelinux.0/ as proposed by syslinux.
First go to the tftp-server and mount the NFS boot file system into the boot environment at /tftpboot/testcluster1. If the NFS server and the tftp server are the same machine, a bind mount can be used instead. Don't use symbolic links, as tftpd will not follow symbolic links into other file systems.
tftp-server # mkdir /tftpboot/testcluster1
tftp-server # mount 192.168.0.1:/testcluster1/root/boot /tftpboot/testcluster1
tftp-server # cd /tftpboot/pxelinux.cfg
tftp-server /tftpboot/pxelinux.cfg # ln -s ../testcluster1/pxe.node1.conf node1
tftp-server /tftpboot/pxelinux.cfg # ln -s node1 01-52-54-00-01-01-01
tftp-server /tftpboot/pxelinux.cfg # ln -s ../testcluster1/pxe.node2.conf node2
tftp-server /tftpboot/pxelinux.cfg # ln -s node2 01-52-54-00-02-02-02
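The link names starting with 01- are the file names pxelinux looks up for a given network card: the prefix 01 (the ARP hardware type for Ethernet) followed by the MAC address in lower case with dashes. As a sketch, the name can be derived from a MAC address as follows. The function name pxe_cfg_name is hypothetical, and the MAC addresses are inferred from the link names in the listing above.

```shell
# Hypothetical helper: derive the pxelinux.cfg file name for a MAC address.
# Upper case hex digits become lower case and colons become dashes.
pxe_cfg_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

pxe_cfg_name 52:54:00:01:01:01   # name of the link for node1
pxe_cfg_name 52:54:00:02:02:02   # name of the link for node2
```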
8.3 GRUB Based
If GRUB is used for booting (in all other cases), set up a configuration as follows.
# cat > /boot/grub/grub.conf <<EOF
default=0
timeout=20
hiddenmenu
title com.oonics SharedRoot ($(uname -r))
root (hd0,0)
kernel /vmlinuz ro root=/dev/vg_testcluster1_sr/lv_sharedroot
initrd /initrd_sr.img
EOF
9. Afterworks
9.1 Disable some startup services
For the different distributions and file systems different services need to be enabled/disabled at startup.
RHEL5/NFS
# for service in gpm kudzu restorecond smartd pcscd bluetooth hidd irda mdmpd yum-updatesd ip6tables iptables; do chkconfig $service off; done
RHEL5/GFS/GFS2
# for service in cman clvmd gfs gfs2 gpm kudzu restorecond smartd pcscd bluetooth hidd irda mdmpd yum-updatesd ip6tables iptables; do chkconfig $service off; done
# for service in clvmd; do chkconfig $service on; done
RHEL6/NFS
# for service in restorecond smartd ip6tables iptables; do echo $service off; chkconfig $service off; done
RHEL6/GFS2
# for service in restorecond smartd ip6tables iptables; do echo $service off; chkconfig $service off; done
# for service in clvmd ricci modclusterd oddjobd netconsole ntpd snmpd rsyslog; do echo $service on; chkconfig $service on; done
SLES11/NFS
none so far.
SLES11/OCFS2
# for service in o2cb ocfs2; do chkconfig $service off; done
9.2 Other files to be changed
Some cronjobs are better disabled.
RHEL
# rm -f /etc/cron.*/rpm /etc/cron.*/makewhatis.cron /etc/cron.*/mlocate.cron /etc/cron.*/prelink /etc/cron.*/99-raid-check
SLES
# rm -f /etc/cron.*/rpm /etc/cron.*/makewhatis.cron /etc/cron.*/mlocate.cron /etc/cron.*/prelink /etc/cron.*/99-raid-check
10. Clean up
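On the install node, exit the chroot and unmount all file systems that were mounted for the installation.
# exit   # leave the chroot
# umount /mnt/newroot/.cdsl.local
# umount /mnt/newroot/dev
# umount /mnt/newroot/proc
# umount /mnt/newroot/sys
# umount /mnt/newroot/boot
# umount /mnt/newroot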
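Because /dev was mounted with --rbind, further mounts may exist below /mnt/newroot/dev, and an umount fails while anything is still mounted underneath. The following sketch lists everything mounted at or below a directory with children before their parents. The function name mounts_below is made up, and the loop only prints the umount commands as a dry run.

```shell
# Hypothetical helper: list mount points at or below a directory, children first.
# Reads a file in /proc/mounts format, where the second field is the mount point.
mounts_below() {
    awk -v top="$1" '$2 == top || index($2, top "/") == 1 { print $2 }' \
        "${2:-/proc/mounts}" | LC_ALL=C sort -r
}

# Dry run: print the umount commands instead of executing them.
for mp in $(mounts_below /mnt/newroot); do
    echo umount "$mp"
done
```

Reverse sorting works here because a parent path is always a prefix of its children and therefore sorts before them.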
11. Boot Cluster
Have Fun !!