
5. Installing Master Server

The first question you will ask is: what is a master server? Most Beowulf systems have only one server, which is also the gateway to the world outside the cluster, but some have multiple servers for redundancy and reliability. In a large disk-less client cluster you might want to use multiple NFS servers to serve system files to the client nodes. In a more distributed environment it is possible for all nodes to act as both clients and servers. If you are going to use only one server node, you can simply drop the word 'master' and think of the master server as the server.

The master server will be the most important node in your Beowulf system. It will serve file systems to the client nodes over NFS, it will be used for compiling source code and starting parallel jobs, and it will be your access point from the outside world. The following steps describe how to install and configure the master server.

5.1 Partition sizes

Choosing the partition sizes is an important part of the installation process. Pick sizes that suit your needs, because they may be very difficult to change later, once your cluster is running production code.

My recommended partition sizes for the disk-less client configuration using Red Hat Linux 5.2 are as follows :

5.2 Installing Red Hat Linux

I will not go into the details of the Red Hat Linux 5.2 installation, as these are well described in the Red Hat Linux Installation Manual http://www.redhat.com/support/docs/rhl/. I recommend installing the full Red Hat 5.2 distribution. It saves time now, and again later when you look for a package you need but did not install.

5.3 Network configuration

In most cases, nodes in a Beowulf cluster use private IP addresses. The only node which has a "real" IP address, visible from the outside world, is the server node. All other nodes (clients) can only see nodes within the Beowulf cluster. An example of a five-node Beowulf cluster is shown below. As you can see, node1 has two network interfaces, one for the cluster and one for the outside world. I use the 10.0.0.0/8 private IP range, but others can also be used (please see RFC 1918 http://www.alternic.net/rfcs/1900/rfc1918.txt.html).

eth0 123.45.67.89       
----------------[node1]
                   | eth1 10.0.0.1
                   |
   10.0.0.2      ------      10.0.0.5
[node2]---------|SWITCH|---------[node5]
                 ------
                 |    |
                 |    |
        10.0.0.3 |    | 10.0.0.4
            [node3]  [node4]

If you haven't already done so, you should now configure both of your Ethernet cards. One card should have a "real" IP address allocated to you by your network administrator (most probably you :), and the other a private IP address (e.g. 10.0.0.1) visible only to the nodes within the cluster. You can configure your network interfaces either with the GUI tools shipped with Red Hat Linux, or by creating or editing the /etc/sysconfig/network-scripts/ifcfg-eth* files. A simple Beowulf system might take its addresses from the 10.0.0.0/8 private range, with 10.0.0.1 being the server and 10.0.0.2 up to 10.0.0.254 being the client nodes. If you decide to use this range you will probably want a netmask of 255.255.255.0 and a broadcast address of 10.0.0.255. On Topcat, eth0 is the interface connecting the cluster to the outside world, and eth1 connects to the internal cluster network. The routing table looks like this:


[jacek@topcat jacek]$ /sbin/route 
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        *               255.255.255.0   U     0      0        9 eth1
139.x.x.0       *               255.255.248.0   U     0      0        7 eth0
127.0.0.0       *               255.0.0.0       U     0      0        2 lo
default         139.x.x.1       0.0.0.0         UG    0      0       18 eth0
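
For the cluster-side interface (eth1) described above, the following is a minimal sketch of what its ifcfg file might contain. It writes the file into a scratch directory rather than the live configuration directory, so it is safe to run as an ordinary user. The key names are the ones read by the Red Hat network scripts; the write_ifcfg helper and the scratch directory are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: write an ifcfg file for the cluster-side interface.
# The addresses mirror the example in the text; adjust before real use.
write_ifcfg() {
    # $1 = output directory, $2 = device name, $3 = IP address
    cat > "$1/ifcfg-$2" <<EOF
DEVICE=$2
IPADDR=$3
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes
EOF
}

outdir=$(mktemp -d)     # scratch directory (assumption), not the live config
write_ifcfg "$outdir" eth1 10.0.0.1
cat "$outdir/ifcfg-eth1"
rm -rf "$outdir"
```

Once you are happy with the contents, the file would be copied by hand into the network-scripts directory and the interface restarted.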

5.4 Setting up DNS

I no longer run DNS on Topcat (our Beowulf cluster). Originally I thought that having a dedicated DNS domain and server for a Beowulf cluster simplified administration, but I have since configured Topcat without DNS, and it seems to work well. It is up to you to choose your configuration. I have left this section on DNS for reference purposes, but will no longer maintain it. I believe that my DNS configuration files will not work with the latest version of named.

Setting up DNS is very straightforward. Your server (node1) will be the DNS server. It will resolve the names and IP addresses for the whole Beowulf cluster. The DNS configuration files can be downloaded from ftp://ftp.sci.usq.edu.au/pub/jacek/beowulf-utils. These are the files I used on our Topcat system, but you can use them on your system if you don't mind using the same names for your nodes as I do. As you can see, I use the private IP address range 10.0.0.0/8, with the local subnet mask set to 255.255.255.0. Our domain will not be visible from outside (unless someone uses our node1 as their name server), so we can call it whatever we want. I chose beowulf.usq.edu.au for my domain name. There are a few configuration files which you will have to modify for your DNS to work, and you can find them at the FTP address above. After installing the configuration files, restart the named daemon by executing /etc/rc.d/init.d/named restart.

Test your DNS server :


[root@node1 /root]# nslookup node2
Server:  node1.beowulf.usq.edu.au
Address:  10.0.0.1

Name:    node2.beowulf.usq.edu.au
Address:  10.0.0.2

[root@node1 /root]# nslookup 10.0.0.5
Server:  node1.beowulf.usq.edu.au
Address:  10.0.0.1

Name:    node5.beowulf.usq.edu.au
Address:  10.0.0.5

5.5 /etc/hosts

If you decide not to use a DNS server, you will have to enter all of the nodes and their corresponding IP addresses in the /etc/hosts file. If you use the disk-less client configuration, the setup_template and adcn scripts will create hard links to this file, so it will be used by all nodes. An example /etc/hosts file from Topcat is shown below.


127.0.0.1               localhost localhost.localdomain
139.x.x.x               topcat.x.x.x. topcat
10.0.0.1                node1.beowulf.usq.edu.au node1
10.0.0.2                node2.beowulf.usq.edu.au node2
10.0.0.3                node3.beowulf.usq.edu.au node3
10.0.0.4                node4.beowulf.usq.edu.au node4
10.0.0.5                node5.beowulf.usq.edu.au node5
10.0.0.6                node6.beowulf.usq.edu.au node6
10.0.0.7                node7.beowulf.usq.edu.au node7
10.0.0.8                node8.beowulf.usq.edu.au node8
10.0.0.9                node9.beowulf.usq.edu.au node9
10.0.0.10               node10.beowulf.usq.edu.au node10
10.0.0.11               node11.beowulf.usq.edu.au node11
10.0.0.12               node12.beowulf.usq.edu.au node12
10.0.0.13               node13.beowulf.usq.edu.au node13
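
Typing these entries by hand gets tedious as the cluster grows. As a rough sketch, the node lines above could be generated with a small loop. The gen_hosts function is a hypothetical helper, not part of any Beowulf package, and it assumes nodes are numbered consecutively from 10.0.0.1.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries for node1..nodeN on the 10.0.0.x network.
gen_hosts() {
    # $1 = number of nodes, $2 = domain name
    i=1
    while [ "$i" -le "$1" ]; do
        # One line per node: address, fully qualified name, short alias.
        printf '10.0.0.%d\tnode%d.%s node%d\n' "$i" "$i" "$2" "$i"
        i=$((i + 1))
    done
}

gen_hosts 13 beowulf.usq.edu.au
```

The output can be reviewed and then appended to /etc/hosts below the localhost and public-address lines.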

5.6 /etc/resolv.conf

If you have a DNS server running on the master server, then your resolv.conf file should point to the local name server first. This is the /etc/resolv.conf I had when I ran DNS on Topcat:


search beowulf.usq.edu.au eng.usq.edu.au sci.usq.edu.au usq.edu.au
nameserver 127.0.0.1
nameserver 139.x.x.2
nameserver 139.x.x.3

If you don't have a DNS server, then you will have to point to other name servers. This is my current /etc/resolv.conf file:

search eng.usq.edu.au sci.usq.edu.au usq.edu.au
nameserver 139.x.x.2
nameserver 139.x.x.3

5.7 /etc/hosts.equiv

In order to allow remote shells (rsh) from any node to any other node in the cluster, for all users, you should list all hosts in /etc/hosts.equiv.


node1.beowulf.usq.edu.au
node2.beowulf.usq.edu.au
node3.beowulf.usq.edu.au
node4.beowulf.usq.edu.au
node5.beowulf.usq.edu.au
node6.beowulf.usq.edu.au
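
Because this list mirrors the cluster entries in /etc/hosts, one way to keep the two in step is to derive one from the other. The sketch below assumes the hosts file layout shown earlier (address, fully qualified name, alias) and a 10.0.0.x cluster network. The equiv_from_hosts function is a hypothetical helper, and the demonstration runs against a temporary file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Sketch: print the fully qualified name of every 10.0.0.x entry in a
# hosts file, one per line, in the format expected by /etc/hosts.equiv.
equiv_from_hosts() {
    # $1 = path to a hosts file
    awk '$1 ~ /^10\.0\.0\.[0-9]+$/ { print $2 }' "$1"
}

hosts=$(mktemp)         # temporary stand-in for /etc/hosts
cat > "$hosts" <<'EOF'
127.0.0.1               localhost localhost.localdomain
10.0.0.1                node1.beowulf.usq.edu.au node1
10.0.0.2                node2.beowulf.usq.edu.au node2
EOF
equiv_from_hosts "$hosts"
rm -f "$hosts"
```

Only cluster addresses are printed, so localhost and the public address of the server never end up in the trusted list.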

5.8 Security

The general security policy for a Beowulf cluster should be that all the nodes within the cluster fully trust each other. You can relax the security inside the cluster because none of the client nodes are directly connected to the outside world, and all nodes are basically the same. If someone hacks into the master node they will not get any more information from any of the client nodes, so you don't have to worry about security at this level. It is practically impossible for anyone to access any of your client nodes without actually sitting at the console, or going via the server node first. The main advantages of relaxing the security within the cluster are flexibility and ease of use and administration. The server node, on the other hand, should trust its client nodes but not the outside world. There are a few things you can do to relax the security within the cluster and to protect yourself from the outside.

TCP wrappers.

The tcpd daemon, commonly known as the TCP wrapper, is the first line of defense, and is the simplest way of limiting access to your machine and therefore increasing security. It comes as part of the Red Hat installation and is simple to configure. There are three configuration files: /etc/hosts.allow, which lists the hosts that are allowed connections; /etc/hosts.deny, which is read if the host was not found in /etc/hosts.allow and lists the hosts that are to be refused connections; and /etc/inetd.conf, which you should not have to modify to configure tcpd.

/etc/hosts.allow

The hosts_access(5) man page is a good source of information on the syntax of /etc/hosts.allow and /etc/hosts.deny.


#
# hosts.allow   This file describes the names of the hosts which are
#               allowed to use the local INET services, as decided
#               by the '/usr/sbin/tcpd' server.
#

# we fully trust ourself and all the other nodes within the cluster
ALL : localhost, 10.0.0., 10.0.1., 10.0.2.
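
The trailing dot in patterns such as 10.0.0. matters: according to hosts_access(5), a pattern ending in a dot matches any address whose leading fields equal it. The sketch below only imitates that rule in shell to show which addresses the line above would admit. tcpd does the real matching itself. The matches_prefix function and the test addresses are assumptions chosen for illustration.

```shell
#!/bin/sh
# Illustration only: imitate the "pattern ends in a dot" rule.
matches_prefix() {
    # $1 = client address, $2 = pattern ending in "."
    case "$1" in
        "$2"*) return 0 ;;   # address starts with the pattern
        *)     return 1 ;;
    esac
}

# 192.0.2.10 is a documentation address standing in for an outside host.
for addr in 10.0.0.7 10.0.2.40 10.0.3.1 192.0.2.10; do
    allowed=no
    for pat in 10.0.0. 10.0.1. 10.0.2.; do
        if matches_prefix "$addr" "$pat"; then
            allowed=yes
        fi
    done
    echo "$addr allowed=$allowed"
done
```

Without the trailing dot, a pattern like 10.0.1 would no longer be treated as an address prefix, which is why each entry in the listing ends with one.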

/etc/hosts.deny

The /etc/hosts.deny file is checked for matches when no match was found in /etc/hosts.allow. The best way of using the TCP wrappers is to deny everything that has not been allowed by /etc/hosts.allow. In our case we not only match, and therefore deny, everything, but also send an e-mail to the administrator for every denied connection.


ALL: ALL: spawn ( \
echo -e "\n\
TCP Wrappers\:  Connection Refused\n\
By\:                    $(uname -n)\n\
Process\:               %d (pid %p)\n\
User\:                  %u\n\
Host\:                  %c\n\
Date\:                  $(date)\n\
" | /bin/mail -s "From tcpd@$(uname -n).  %u@%h -> %d." root) 

If a connection is attempted from a host not listed in /etc/hosts.allow, the match will occur in /etc/hosts.deny, so the connection will be closed and I will receive an e-mail notification. An example of such an e-mail is shown below.

Date: Sat, 15 Aug 1998 15:31:08 +1000
From: Administrator <root@topcat.eng.usq.edu.au>
Message-Id: <199808150531.PAA20980@topcat.eng.usq.edu.au>
To: jacek@usq.edu.au
Subject: From tcpd@topcat.eng.usq.edu.au
X-Mozilla-Status: 0001
Content-Length: 197

        On Sat Aug 15 15:31:08 EST 1998
        user jacek from host agatka.usq.edu.au attempted an unauthorised connection 
        to topcat.eng.usq.edu.au.
        Attempted connection was to process in.rlogind (pid 20972)

Stopping unused daemons - /etc/inetd.conf

A very simple but effective way of improving your security is to disable unwanted services. The rule of thumb is to disable everything you don't need. Most daemons are started by the inetd super server and should be turned off by commenting out their lines in inetd.conf. The example below shows part of inetd.conf with login, exec, talk, and ntalk disabled.


shell   stream  tcp     nowait  root    /usr/sbin/tcpd  in.rshd
#login  stream  tcp     nowait  root    /usr/sbin/tcpd  in.rlogind
#exec   stream  tcp     nowait  root    /usr/sbin/tcpd  in.rexecd
#talk   dgram   udp     wait    root    /usr/sbin/tcpd  in.talkd
#ntalk  dgram   udp     wait    root    /usr/sbin/tcpd  in.ntalkd
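
If you would rather script this than edit the file by hand, the following sketch comments out named services with sed. It is only an illustration: the disable_services function is a hypothetical helper, and the demonstration works on a temporary copy, never on the live /etc/inetd.conf.

```shell
#!/bin/sh
# Sketch: comment out the given services in an inetd.conf-style file.
disable_services() {
    # $1 = file to edit; remaining arguments = service names
    file=$1
    shift
    for svc in "$@"; do
        # Prefix "#" to lines whose first field is exactly this service,
        # so "talk" does not also hit "ntalk".
        sed "/^$svc[[:space:]]/s/^/#/" "$file" > "$file.new" &&
            mv "$file.new" "$file"
    done
}

conf=$(mktemp)          # temporary stand-in for /etc/inetd.conf
cat > "$conf" <<'EOF'
shell   stream  tcp     nowait  root    /usr/sbin/tcpd  in.rshd
login   stream  tcp     nowait  root    /usr/sbin/tcpd  in.rlogind
talk    dgram   udp     wait    root    /usr/sbin/tcpd  in.talkd
EOF
disable_services "$conf" login talk
cat "$conf"
rm -f "$conf"
```

Keeping a backup of the original file before running anything like this against the real configuration is a sensible precaution.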

After modifying the configuration file you will have to restart the inetd daemon. The simplest way to do this on Linux is to send the daemon a hang-up signal, which forces it to re-read its configuration file.

[root@topcat root]# killall -HUP inetd

Do not try this on other Unix systems without reading the killall man page first! You can check which daemons are running by listing all listening ports:

[root@topcat root]# netstat -a | grep LISTEN | grep -v unix

Servers like httpd are started by rc scripts. Each can normally be disabled by deleting its symbolic links from the run-level directories.
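
Before deleting anything it helps to see which start links exist for a service. The sketch below builds a scratch rc.d tree and lists the matching links. The list_start_links function is a hypothetical helper, and the S85 sequence number in the demonstration is only illustrative, so check your own run-level directories for the real names.

```shell
#!/bin/sh
# Sketch: list the start ("S") links for a service under an rc.d tree.
list_start_links() {
    # $1 = top of the rc.d tree, $2 = service name
    find "$1" -name "S[0-9][0-9]$2" -print | sort
}

root=$(mktemp -d)       # scratch tree standing in for /etc/rc.d
mkdir -p "$root/init.d" "$root/rc3.d" "$root/rc5.d"
: > "$root/init.d/httpd"
ln -s ../init.d/httpd "$root/rc3.d/S85httpd"
ln -s ../init.d/httpd "$root/rc5.d/S85httpd"
list_start_links "$root" httpd
rm -rf "$root"
```

Removing the listed links stops the service from starting at boot while leaving its script in init.d, so it can be re-enabled later by recreating the links.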

ipfwadm

The ipfwadm program can block packets from specific IP addresses to specific ports, and is the most flexible way of controlling security. The example firewall rc script should be started automatically at boot time.

[root@topcat init.d]# cp /home/jacek/firewall /etc/rc.d/init.d
[root@topcat init.d]# chmod u+rx firewall
[root@topcat init.d]# ln -s /etc/rc.d/init.d/firewall /etc/rc.d/rc3.d/S05firewall
[root@topcat init.d]# ln -s /etc/rc.d/init.d/firewall /etc/rc.d/rc5.d/S05firewall

