Thursday, August 23, 2012

How to Configure the Cluster Resources Needed for an NFS Service

This article provides the procedures for configuring the cluster resources needed for an NFS service.


To add a resource to your cluster using Conga, perform the following procedure:
  1. As an administrator of luci, select the cluster tab.
  2. From the Choose a cluster to administer screen, select the cluster to which you will add resources. In this example, that is the cluster with the name nfsclust. 
  3. At the menu for cluster nfsclust (below the clusters menu), click Resources. This causes the display of menu items for resource configuration: Add a Resource and Configure a Resource. 
  4. Click Add a Resource. Clicking Add a Resource causes the Add a Resource page to be displayed. 
The following sections provide instructions for adding the resources you need for an NFS service.


Configuring an IP Address Resource


Use the following procedure to add the IP Address resource 10.15.86.96 to cluster nfsclust.


  1. At the Add a Resource page for cluster nfsclust, click the drop-down box under Select a Resource Type and select IP Address.
  2. For IP Address, enter 10.15.86.96.
  3. Leave the Monitor Link checkbox selected to enable link status monitoring of the IP address resource.
  4. Click Submit. A verification page is displayed. After you verify that you want to add this resource, a progress page is displayed, followed by the Resources page, which lists the resources that have been configured for the cluster.

Configuring a GFS Resource


Use the following procedure to add the GFS file system resource mygfs to cluster nfsclust.


  1. At the Add a Resource page for cluster nfsclust, click the drop-down box under Select a Resource Type and select GFS file system.
  2. For Name, enter mygfs.
  3. For Mount point, enter /mnt/gfs. This is the path to which the GFS file system is mounted.
  4. For Device, enter /dev/myvg/myvol. This is the LVM logical volume on which the GFS file system was created.
  5. The Options field specifies the mount options for the GFS file system. For this example, we are mounting the file system with the rw (read-write) and localflocks options.
  6. Leave the File System ID field blank. Leaving the field blank causes a file system ID to be assigned automatically after you click Submit at the File System Resource Configuration dialog box.
  7. Leave the Force Unmount checkbox unchecked. Force Unmount kills all processes using the mount point to free up the mount when it tries to unmount. With GFS resources, the mount point is not unmounted at service tear-down unless this box is checked.
  8. Click Submit and accept the verification screen.

Configuring an NFS Export Resource


Use the following procedure to add NFS export resource mynfs to cluster nfsclust.


  1. At the Add a Resource page for cluster nfsclust, click the drop-down box under Select a Resource Type and select NFS Export.
  2. For Name, enter mynfs.
  3. Click Submit and accept the verification screen.

The NFS Export resource that this configuration defines will be NFS Version 3 by default. If you need to restrict what NFS protocol your system provides to its clients, you can do this at NFS startup; this is not part of cluster configuration.

Configuring NFS Client Resources


This example procedure configures five NFS client resources for cluster nfsclust. Only the procedures for the first two clients are laid out explicitly.

Use the following procedure to add NFS client resource nfsclient1 to cluster nfsclust.


  1. At the Add a Resource page for cluster nfsclust, click the drop-down box under Select a Resource Type and select NFS client.
  2. For Name, enter nfsclient1.
  3. For Target, enter nfsclient1.example.com. This is the first NFS client system.
  4. The Options field specifies additional client access rights. Specify rw (read-write) in this field. For more information, refer to the General Options section of the exports(5) man page.
  5. Check the Allow Recover checkbox. This indicates that if someone removes the export from the export list, the system will recover the export inline without taking down the NFS service.
  6. Click Submit and accept the verification screen.

Use the following procedure to add NFS client resource nfsclient2 to cluster nfsclust.


  1. At the Add a Resource page for cluster nfsclust, click the drop-down box under Select a Resource Type and select NFS client.
  2. For Name, enter nfsclient2.
  3. For Target, enter nfsclient2.example.com. This is the second NFS client system.
  4. Leave the Options field blank.
  5. Check the Allow Recover checkbox.
  6. Click Submit and accept the verification screen.

Use the same procedure to configure the remaining three NFS client resources, using nfsclient3, nfsclient4, and nfsclient5 as the names of the resources and using nfsclient3.example.com, nfsclient4.example.com, and nfsclient5.example.com as the targets.
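
Because the five client resources follow one naming pattern, the values typed into the forms above can be tabulated with a short script. The following Python sketch is only an illustration: the dictionary keys mirror the Conga field labels and are not part of any luci API, and it assumes that nfsclient3 through nfsclient5 use the same settings as nfsclient2 (blank Options, Allow Recover checked), which the text implies but does not state outright.

```python
def nfs_client_resources(count=5, domain="example.com"):
    """Return the form values for each NFS client resource in this example."""
    resources = []
    for i in range(1, count + 1):
        name = f"nfsclient{i}"
        resources.append({
            "Name": name,
            "Target": f"{name}.{domain}",
            # Only the first client is given explicit rw access in the text;
            # the others are assumed to leave the Options field blank.
            "Options": "rw" if i == 1 else "",
            "Allow Recover": True,
        })
    return resources


if __name__ == "__main__":
    for resource in nfs_client_resources():
        print(resource)
```

A table like this can serve as a checklist while you work through the Add a Resource page once per client.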

Add a Service to the Cluster


To add a service to your cluster using Conga, perform the following procedure:


  1. As an administrator of luci, select the cluster tab.
  2. From the Choose a cluster to administer screen, select the cluster to which you will add the service. In this example, that is the cluster with the name nfsclust.
  3. At the menu for cluster nfsclust (below the clusters menu), click Services. This causes the display of menu items for service configuration: Add a Service and Configure a Service.
  4. Click Add a Service. Clicking Add a Service causes the Add a Service page to be displayed.
  5. For Name, enter nfssvc.
  6. Leave the checkbox labeled Automatically start this service checked, which is the default setting. When the checkbox is checked, the service is started automatically when a cluster is started and running. If the checkbox is not checked, the service must be started manually any time the cluster comes up from the stopped state.
  7. Leave the Run Exclusive checkbox unchecked. The Run Exclusive checkbox sets a policy wherein the service only runs on nodes that have no other services running on them. Since an NFS service consumes few resources, two services could run together on the same node without contention for resources and you do not need to check this.
  8. For Failover Domain, leave the drop-down box default value of None. In this configuration, all of the nodes in the cluster may be used for failover.
  9. For Recovery Policy, the drop-down box displays Select a recovery policy. Click the drop-down box and select relocate. This policy indicates that the system should relocate the service to another node rather than restart it on the node where it is currently located.
  10. Add the NFS service resources to this service, as described in the following sections.
  11. After you have added the NFS resources to the service, click Submit. The system prompts you to verify that you want to create this service. Clicking OK causes a progress page to be displayed, followed by the Services page for the cluster. That page displays the services that have been configured for the cluster.

Adding an IP Address Resource to an NFS Service


Use the following procedure to add an IP Address resource to the NFS cluster service nfssvc.


  1. At the Add a Service page for cluster nfsclust, click Add a resource to this service. Clicking Add a resource to this service causes the display of two drop-down boxes: Add a new local resource and Use an existing global resource.
    For this example, we will use global resources, which are resources that were previously added as global resources. Adding a new local resource would add a resource that is available only to this service.
  2. In the drop-down box underneath the Use an existing global resource display, click on the Select a resource name display. This displays the resources that have been defined for this cluster.
  3. Select 10.15.86.96 (IP Address). This returns you to the Add a Service page with the IP Address resource displayed.
    Leave the Monitor link checkbox selected, which is the default value. This enables link status monitoring of the IP address resource.

Adding a GFS Resource to an NFS Service


Use the following procedure to add a GFS resource to the NFS cluster service nfssvc.


  1. At the Add a Service page for cluster nfsclust, click Add a resource to this service.
  2. In the drop-down box underneath the Use an existing global resource display, click on the Select a resource name display.
  3. Select mygfs (GFS). This returns you to the Add a Service page, which now displays the GFS resource with the parameters that you defined in “Configuring a GFS Resource”.

Adding an NFS Export Resource to an NFS Service


Configure the NFS Export resource as a child of the GFS resource by following this procedure:


  1. At the Add a Service page for cluster nfsclust, below the GFS Resource Configuration display, click Add a child. This causes the display of two drop-down boxes: Add a new local resource and Use an existing global resource.
  2. In the drop-down box underneath the Use an existing global resource display, click on the Select a resource name display.
  3. Select mynfs (NFS Export). This returns you to the Add a Service page with the NFS Export resource displayed.

Adding NFS Client Resources to an NFS Service


Configure the NFS Client resources as children of the NFS export resource by following this procedure for each NFS client:


  1. At the Add a Service page for cluster nfsclust, below the NFS Export Resource Configuration display, click Add a child. This causes the display of two drop-down boxes: Add a new local resource and Use an existing global resource.
  2. Click on the Select a resource name display in the drop-down box underneath the Use an existing global resource display.
  3. Select nfsclient1 (NFS Client). This returns you to the Add a Service page, which now displays the NFS client resource with the parameters you defined in “Configuring NFS Client Resources”.

Follow the same procedure to add a second, third, fourth, and fifth NFS client resource, selecting nfsclient2 (NFS Client), nfsclient3 (NFS Client), nfsclient4 (NFS Client), and nfsclient5 (NFS Client) as the resources to add.

After you have added the NFS client resources to the service, you can click Submit. The system prompts you to verify that you want to create this service. Clicking OK causes a progress page to be displayed, followed by the Services page for the cluster. That page displays the services that have been configured for the cluster.
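
The parent and child relationships built in the preceding sections form a tree: the IP address and the GFS file system sit directly under the service, the NFS export is a child of the GFS resource, and the NFS clients are children of the export. The Python sketch below rebuilds that hierarchy as XML so the nesting is easy to see. The element and attribute names (service, ip, clusterfs, nfsexport, nfsclient, ref, autostart, exclusive, recovery) are assumptions modeled on rgmanager-style cluster.conf entries; Conga writes the real file, so compare against the cluster.conf on your own nodes rather than treating this as its exact output.

```python
import xml.etree.ElementTree as ET


def build_service_tree(client_count=5):
    """Build the nfssvc resource hierarchy described in this article."""
    # Service settings chosen on the Add a Service page (names are assumed).
    service = ET.Element(
        "service",
        {"name": "nfssvc", "autostart": "1", "exclusive": "0", "recovery": "relocate"},
    )
    # The IP address and the GFS file system sit directly under the service.
    ET.SubElement(service, "ip", {"ref": "10.15.86.96"})
    gfs = ET.SubElement(service, "clusterfs", {"ref": "mygfs"})
    # The NFS export is a child of the GFS resource ...
    export = ET.SubElement(gfs, "nfsexport", {"ref": "mynfs"})
    # ... and each NFS client is a child of the NFS export.
    for i in range(1, client_count + 1):
        ET.SubElement(export, "nfsclient", {"ref": f"nfsclient{i}"})
    return service


def describe(element, depth=0):
    """Return one indented line per element, showing the nesting."""
    label = element.get("ref") or element.get("name")
    lines = ["  " * depth + f"{element.tag}: {label}"]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(describe(build_service_tree())))
```

Running the script prints the service on the first line, with each level of children indented beneath its parent, which mirrors the order in which the Add a child links are used in the procedures above.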

Testing the NFS Cluster Service


After you have configured the NFS service, you can check to be sure that the NFS service is working and that it will continue to work as expected if one of the nodes goes down. The following procedure tests an NFS mount on a client, fences the node on which the NFS service is running, and then checks to be sure that the NFS client can still access the file system.


  1. If the GFS file system in the nfsclust cluster is currently empty, populate the file system with test data.
  2. Log in to one of the client systems you defined as a target.
  3. Mount the NFS file system on the client system, and check to see if the data on that file system is available.
  4. On the luci server, select Nodes from the menu for nfsclust. This displays the nodes in nfsclust and indicates which node is running the nfssvc service.
  5. The drop-down box for each node displays Choose a task. For the node on which the nfssvc service is running, select Fence this node.
  6. Refresh the screen. The nfssvc service should now be running on a different node.
  7. On the client system, check whether the file system you mounted is still available. Even though the NFS service is now running on a different node in the cluster, the client system should detect no difference.
  8. Restore the system to its previous state:

    • Unmount the file system from the client system.
    • Delete any test data you created in the GFS file system.
    • Click on Choose a task in the drop-down box for the node which you fenced and select Reboot this node.
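
The client-side check in step 7 can be repeated automatically while the node is being fenced. The following Python sketch, meant to run on the NFS client, lists a directory several times and records whether each attempt succeeded. The function name and its parameters are hypothetical, and the approach has limits: on a hard-mounted NFS file system a call may block until the service has relocated instead of failing, and a listing may be answered from the client's cache.

```python
import os
import time


def poll_mount(path, attempts=10, interval=1.0, sleep=time.sleep):
    """List `path` repeatedly and return one boolean per attempt.

    True means the directory listing succeeded; False means it raised
    an OSError (for example, the mount point was not reachable).
    """
    results = []
    for attempt in range(attempts):
        try:
            os.listdir(path)
            results.append(True)
        except OSError:
            results.append(False)
        # Wait between attempts, but not after the last one.
        if attempt < attempts - 1:
            sleep(interval)
    return results
```

For example, calling poll_mount("/mnt/nfs", attempts=30, interval=2.0) just before you select Fence this node would cover roughly one minute of the failover; /mnt/nfs here is a placeholder for wherever you mounted the file system on the client.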

Troubleshooting


    If you see error messages when you try to configure your system, or if your system does not behave as expected after configuration, you can perform the following checks and examine the following areas.


    • Connect to one of the nodes in the cluster and execute the clustat(8) command. This command runs a utility that displays the status of the cluster. It shows membership information, quorum view, and the state of all configured user services.
      The following example shows the output of the clustat(8) command.
      [root@clusternode4 ~]# clustat
      Cluster Status for nfsclust @ Wed Dec  3 12:37:22 2008
      Member Status: Quorate
      
       Member Name                              ID   Status
       ------ ----                              ---- ------
       clusternode5.example.com          1 Online, rgmanager
       clusternode4.example.com          2 Online, Local, rgmanager
       clusternode3.example.com          3 Online, rgmanager
       clusternode2.example.com          4 Online, rgmanager
       clusternode1.example.com          5 Online, rgmanager
      
       Service Name             Owner (Last)                     State
       ------- ---              ----- ------                     -----
       service:nfssvc           clusternode2.example.com         starting
      
      In this example, clusternode4 is the local node since it is the host from which the command was run. If rgmanager does not appear in the Status column for a node, it could indicate that cluster services are not running on that node.
    • Connect to one of the nodes in the cluster and execute the group_tool(8) command. This command provides information that you may find helpful in debugging your system. The following example shows the output of the group_tool(8) command.
      [root@clusternode1 ~]# group_tool
      type             level name       id       state
      fence            0     default    00010005 none
      [1 2 3 4 5]
      dlm              1     clvmd      00020005 none
      [1 2 3 4 5]
      dlm              1     rgmanager  00030005 none
      [3 4 5]
      dlm              1     mygfs      007f0005 none
      [5]
      gfs              2     mygfs      007e0005 none
      [5]
      
      The state of the group should be none. The numbers in the brackets are the node ID numbers of the cluster nodes in the group. The clustat output shows which node IDs are associated with which nodes. If you do not see a node number in the group, that node is not a member of the group. For example, if a node ID is not in the dlm/rgmanager group, it is not using the rgmanager dlm lock space (and probably is not running rgmanager).
      The level of a group indicates the recovery ordering. 0 is recovered first, 1 is recovered second, and so forth.
    • Connect to one of the nodes in the cluster and execute the cman_tool nodes -f command. This command provides information about the cluster nodes that you may want to look at. The following example shows the output of the cman_tool nodes -f command.
      [root@clusternode1 ~]# cman_tool nodes -f
      Node  Sts   Inc   Joined               Name
         1   M    752   2008-10-27 11:17:15  clusternode5.example.com
         2   M    752   2008-10-27 11:17:15  clusternode4.example.com
         3   M    760   2008-12-03 11:28:44  clusternode3.example.com
         4   M    756   2008-12-03 11:28:26  clusternode2.example.com
         5   M    744   2008-10-27 11:17:15  clusternode1.example.com
      
      The Sts heading indicates the status of a node. A status of M indicates the node is a member of the cluster. A status of X indicates that the node is dead. The Inc heading indicates the incarnation number of a node, which is for debugging purposes only.
    • Check whether the cluster.conf file is identical on each node of the cluster. If you configure your system with Conga, as in the example provided in this document, these files should be identical, but one of the files may have accidentally been deleted or altered.
    • In addition to using Conga to fence a node in order to test whether failover is working properly, as described in “Testing the NFS Cluster Service”, you could disconnect the ethernet connection between cluster members. You might try disconnecting one, two, or three nodes, for example. This could help isolate where the problem is.
    • If you are having trouble mounting or modifying an NFS volume, check whether the cause is one of the following:

      • The network between server and client is down.
      • The storage devices are not connected to the system.
      • More than half of the nodes in the cluster have crashed, rendering the cluster inquorate. This stops the cluster.
      • The GFS file system is not mounted on the cluster nodes.
      • The GFS file system is not writable.
      • The IP address you defined in the cluster.conf file is not bound to the correct interface / NIC (sometimes the ip.sh script does not perform as expected).
    • Execute a showmount -e command on the node running the cluster service. If it shows the correct five exports, check that your firewall configuration allows all of the ports that NFS needs.
    • If SELinux is currently in enforcing mode on your system, check your /var/log/audit/audit.log file for any relevant messages. If you are using NFS to serve home directories, check whether the SELinux boolean use_nfs_home_dirs has been set to 1; this is required if you want to use NFS-based home directories on a client that is running SELinux. If you do not set this value to on, you can mount the directories as root but cannot use them as home directories for your users.
    • Check the /var/log/messages file for error messages from the NFS daemon.
    • If you see the expected results locally at the cluster nodes and between the cluster nodes but not at the defined clients, check the firewall configuration at the clients.
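
The clustat check described in the first item above can also be scripted. The Python sketch below scans the member table in clustat output and reports any node whose Status column lacks the rgmanager flag. It assumes the column layout of the sample output shown in this article; other versions of clustat may format the table differently, and the function name is made up for this example.

```python
import re

# One member row: node name, numeric ID, then the comma-separated status flags.
MEMBER_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s+(.+?)\s*$")

SAMPLE = """\
Cluster Status for nfsclust @ Wed Dec  3 12:37:22 2008
Member Status: Quorate

 Member Name                              ID   Status
 ------ ----                              ---- ------
 clusternode5.example.com          1 Online, rgmanager
 clusternode4.example.com          2 Online, Local, rgmanager

 Service Name             Owner (Last)                     State
 ------- ---              ----- ------                     -----
 service:nfssvc           clusternode2.example.com         starting
"""


def members_without_rgmanager(clustat_output):
    """Return the names of members whose status does not list rgmanager."""
    missing = []
    in_members = False
    for line in clustat_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Member Name"):
            in_members = True  # the member table starts after this header
            continue
        if stripped.startswith("Service Name"):
            break  # the service table follows; stop scanning
        if not in_members or not stripped or stripped.startswith("-"):
            continue  # skip preamble, blank lines, and the dashed rule
        match = MEMBER_RE.match(line)
        if match:
            name, _node_id, status = match.groups()
            flags = [flag.strip() for flag in status.split(",")]
            if "rgmanager" not in flags:
                missing.append(name)
    return missing


if __name__ == "__main__":
    print(members_without_rgmanager(SAMPLE))
```

To use it against a live cluster, capture the output of clustat on a node and pass the text to members_without_rgmanager; any names it returns are the nodes to inspect first.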

