
Veritas cluster Interview Questions-2


Please go through the questions and answers below. If you have any doubts, let me know by leaving a comment.
Adding and removing a cluster node
 Q-1 How to add a node to an existing cluster?
Ans:   Adding a node to an existing cluster is a multi-step process.

1:       Set up the hardware
Before a node can be added to an existing cluster, it must be physically connected to the cluster.
      1: Connect the VCS private Ethernet controllers
      2: Connect the node to the shared storage

2:       Install the VCS software on the node
          Install the VCS software and the license on the new node.

3:       Configure LLT and GAB
Create the LLT and GAB configuration files (/etc/llthosts, /etc/llttab and /etc/gabtab) on the new node and update the files on the existing nodes.

4:       Add the node to an existing cluster
Perform the tasks below on any one of the existing nodes of the cluster, unless a step says otherwise.
         1: Make the cluster configuration read/write
          # haconf -makerw

          2: Add the new node to the cluster
          # hasys -add <new node name>

          3: Copy the main.cf file from an existing node to the new node
          # scp /etc/VRTSvcs/conf/config/main.cf new_node:/etc/VRTSvcs/conf/config/main.cf

          4: Start VCS on the new node (run this command on the new node)
          # hastart

          5: Make the configuration read-only again
          # haconf -dump -makero
5:       Verify that VCS is running on the new node
          1: If VCS was not started on the new node in the previous step, start it now
          # hastart

6:       Run the GAB configuration command on each node to verify that port a and port h include the new node in the membership.
          # /sbin/gabconfig -a
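
Step 3 above names three files without showing what goes in them. The sketch below writes sample contents for a two-node cluster into a scratch directory rather than /etc, so it is safe to run anywhere. The node names (node1, node2), the cluster ID (10) and the Solaris-style qfe interface names are illustrative assumptions, not values from a real cluster.

```shell
#!/bin/sh
# Sketch: sample LLT/GAB files for a two-node cluster, written to a scratch
# directory instead of /etc. All names and IDs below are assumptions.
OUT=$(mktemp -d)

# llthosts: one "<node id> <node name>" line per cluster node,
# identical on every node.
cat > "$OUT/llthosts" <<'EOF'
0 node1
1 node2
EOF

# llttab as it would look on node2: local node name, cluster ID, and one
# "link" line per private interconnect.
cat > "$OUT/llttab" <<'EOF'
set-node node2
set-cluster 10
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -
EOF

# gabtab: seed GAB once the expected number of nodes has joined.
node_count=$(wc -l < "$OUT/llthosts" | tr -d ' ')
echo "/sbin/gabconfig -c -n${node_count}" > "$OUT/gabtab"

cat "$OUT/gabtab"
```

On a real cluster the llthosts and gabtab files on the existing nodes would need the same new entry and node count, which is what "update the files on the existing nodes" refers to.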

Q-2 How to remove a node from an existing cluster?
Ans:   Removing a node from a cluster involves several steps, which are given below:

1:       Back up the configuration file
          # cp /etc/VRTSvcs/conf/config/main.cf /etc/VRTSvcs/conf/config/main.cf.orig

2:       Check the status of the nodes and the service groups
          # hastatus -summary

3:       Switch any service group that is online on the node leaving the cluster to another node
          # hagrp -switch <service group> -to <node name>

4:       Delete the node from the VCS configuration
          1:       Make the cluster configuration read/write
                    # haconf -makerw

          2:       Stop the cluster on the leaving node
                    # hastop -sys <node>

          3:       Delete the leaving node from the service group's SystemList attribute.
                    # hagrp -modify <group> SystemList -delete <node>

          4:       Delete the node from the cluster
                    # hasys -delete <node>

          5:       Make the cluster configuration read-only again.
                    # haconf -dump -makero

5:       Modify the LLT and GAB configuration files to reflect the changes
Modify the /etc/llthosts, /etc/llttab and /etc/gabtab files on the remaining nodes of the cluster.

6:       Remove VCS configuration on the node leaving the cluster
                    1:       Unconfigure GAB and LLT
                              # /sbin/gabconfig -U
                              # /sbin/lltconfig -U

                    2:       Unload the GAB and LLT modules (look up each module ID with modinfo first)
                              # modinfo | egrep 'gab|llt'
                              # modunload -i <gab module id>
                              # modunload -i <llt module id>

                    3:       Rename the startup files to prevent LLT, GAB and VCS from starting up in future.
                              # mv /etc/rc2.d/S70llt /etc/rc2.d/s70llt
                              # mv /etc/rc2.d/S92gab /etc/rc2.d/s92gab
                              # mv /etc/rc3.d/S99vcs /etc/rc3.d/s99vcs

                    4:       Remove the VCS packages from the node
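
The file changes in step 5 can be rehearsed on copies before touching /etc. The sketch below is only an illustration: it builds sample llthosts and gabtab files in a scratch directory, drops a departing node, and lowers the GAB seed count to match. The node names and the three-node starting point are assumptions.

```shell
#!/bin/sh
# Sketch: remove a departing node from scratch copies of llthosts and gabtab.
WORK=$(mktemp -d)
printf '0 node1\n1 node2\n2 node3\n' > "$WORK/llthosts"
echo '/sbin/gabconfig -c -n3' > "$WORK/gabtab"

leaving=node3   # hypothetical name of the node being removed

# Keep every llthosts line whose second field is not the leaving node.
awk -v n="$leaving" '$2 != n' "$WORK/llthosts" > "$WORK/llthosts.new"

# Seed GAB with the number of nodes that remain.
remaining=$(wc -l < "$WORK/llthosts.new" | tr -d ' ')
echo "/sbin/gabconfig -c -n${remaining}" > "$WORK/gabtab.new"

cat "$WORK/llthosts.new" "$WORK/gabtab.new"
```

Once the .new files look right, the same edit would be made to the real files on each remaining node.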

Some General Questions:
 Q-1     How to shut down a node in a VCS cluster?
Ans:   Shutting down a VCS node is a multi-step process.
         
1) Make the cluster configuration read/write
          # haconf -makerw

2) Either switch over or fail over all the service groups that are online on the node being shut down to a remaining node
          # hagrp -switch <service group> -to <node name>

3) Freeze all the service groups that are online in the cluster.
          # hagrp -freeze <service group> -persistent
         
4) Stop the cluster on the node that is going down.
          # hastop -local -force

5) Rename the VCS startup script
          # cd /etc/rc3.d
          # mv S99vcs s99vcs

6) Now reboot the box.

Once the system comes up after the reboot, follow the instructions given below.

1) Start VCS on this node
                    # hastart -force
2) Bring the service groups online if they were taken offline before the system went down.
                    # hagrp -online <service group> -sys <node name>

3) Unfreeze all the service groups that are frozen.
                    # hagrp -unfreeze <service group> -persistent

4) Now make the cluster configuration read-only
                    # haconf -dump -makero

5) Finally, rename the VCS startup script back to its original name
                    # cd /etc/rc3.d
                    # mv s99vcs S99vcs
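
Before running the pre-reboot steps on a live cluster, it can help to print the exact commands for review. The sketch below does only that: it echoes the commands from steps 1 to 4 for a given target node and list of service groups, and executes nothing. The function name pre_reboot_plan and the sample group and node names are hypothetical.

```shell
#!/bin/sh
# Print (not execute) the pre-reboot commands from steps 1-4 above.
# Usage: pre_reboot_plan <target node> <service group>...
pre_reboot_plan() {
    target=$1
    shift
    echo "haconf -makerw"
    # Move every group to the surviving node first...
    for sg in "$@"; do
        echo "hagrp -switch $sg -to $target"
    done
    # ...then freeze the groups so they stay put during the reboot.
    for sg in "$@"; do
        echo "hagrp -freeze $sg -persistent"
    done
    echo "hastop -local -force"
}

pre_reboot_plan node2 websg dbsg
```

The printed list can be pasted into a change ticket or run line by line once it has been checked.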

Q-2     How do you check the status of VERITAS Cluster Server?
Ans:   hastatus -sum

Q-3     Which is the main config file for VCS and where is it located?
Ans:   main.cf is the main configuration file for VCS and it is located in /etc/VRTSvcs/conf/config.

Q-4     Which command will you use to check the syntax of main.cf?
Ans:   hacf -verify /etc/VRTSvcs/conf/config

Q-5     How will you check the status of an individual resource of a VCS cluster?
Ans:   hares -state <resource>

Q-6     What is a service group in VCS?
Ans:   A service group is made up of the resources, and the links between them, that are required to keep an application highly available.
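
To make "resources and their links" concrete, the sketch below writes a small service group definition in main.cf style to a temporary file and lists the dependency it declares. The group, resource, node and interface names and the IP address are illustrative assumptions, and the fragment is not a complete main.cf.

```shell
#!/bin/sh
# Sketch: a main.cf-style service group with two resources and one link.
# All names and the address are made up for illustration.
CF=$(mktemp)
cat > "$CF" <<'EOF'
group websg (
    SystemList = { node1 = 0, node2 = 1 }
    AutoStartList = { node1 }
    )

    NIC webnic (
        Device = hme0
        )

    IP webip (
        Device = hme0
        Address = "192.168.1.10"
        )

    webip requires webnic
EOF

# List every "<parent> requires <child>" dependency line in the group.
awk '$2 == "requires" { print $1 " depends on " $3 }' "$CF"
```

Here webip is the parent and webnic the child, so the IP address is brought online only after the network interface resource is online.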

Q-7     What is the use of halink command?
Ans:   halink is used to link the dependencies of the resources. A dependency between two resources can also be created with hares -link <parent resource> <child resource>.

Q-8     What is the difference between switchover and failover?
Ans:   Switchover is a manual task, whereas failover is automatic. An administrator switches a service group over from the node where it is online to another cluster node ahead of an event such as a power outage, hardware maintenance, or a scheduled shutdown or reboot. In a failover, VCS itself moves the service group to another node, for example when the VCS heartbeat links are down, damaged or broken because of some disaster, or when the system hangs.

Q-9     What is the use of hagrp command?
Ans:   hagrp is used for doing administrative actions on service groups like online, offline, switch etc.

Q-10   How to switch over a service group in VCS?
Ans:   hagrp -switch <service group> -to <node>

Q-11   How to bring service groups online in VCS?
Ans:   hagrp -online <service group> -sys <node>

Q-12   How to access the VCS cluster management console?
Ans:   The VCS cluster management console can be accessed at either of the following URLs:
          http://Servername:8181/cmc/
                              or
          https://Servername:8443/cmc

Q-13   How to access the Cluster Manager Java Console?
Ans:   # /opt/VRTSvcs/bin/hagui

Q-14   What is Jeopardy?
Ans:   When a node in the cluster has only one interconnect link remaining, GAB can no longer reliably discriminate between a system failure and a network failure. A special membership category, called jeopardy membership, takes effect in this situation. This membership protects the cluster from a split brain condition. When a system is placed in jeopardy membership, two actions occur:
1:       Service groups running on this node are placed in the autodisabled state. A service group in the autodisabled state may fail over on a resource or group fault, but cannot fail over on a system fault.
2:       VCS operates the node as a single-node cluster. Other systems in the cluster are partitioned off in a separate cluster membership.


Q-15   What is the main daemon of VCS?
Ans:   had (the High Availability Daemon). It is watched by the hashadow daemon, which restarts had if it stops.

Q-16   What is GAB?
Ans:   Group Membership Services/Atomic Broadcast (GAB) is responsible for cluster membership and reliable cluster communication. GAB has two major functions:
          1: Cluster membership
GAB maintains cluster membership by receiving heartbeats from LLT. When a system no longer receives heartbeats from a cluster peer, GAB marks that peer as down.
          2: Cluster communication
GAB provides guaranteed delivery of messages to all the systems. The atomic broadcast functionality is used by HAD to ensure that all systems within the cluster receive configuration change messages.
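
The membership that GAB maintains is what gabconfig -a reports. The sketch below parses sample text in the shape that command prints for a two-node cluster and checks whether a node ID appears under port a (GAB) and port h (had). The generation numbers are made up, the helper name port_has_node is hypothetical, and the check assumes single-digit node IDs (0 to 9).

```shell
#!/bin/sh
# Sample text in the shape printed by "gabconfig -a" on a two-node cluster.
# The generation numbers are made up.
sample='GAB Port Memberships
===============================================================
Port a gen a36e0003 membership 01
Port h gen fd570002 membership 01'

# port_has_node <port> <node id 0-9>
# Reads gabconfig-style output on stdin and succeeds if the node's digit
# appears in the membership string of the given port.
port_has_node() {
    awk -v port="$1" -v id="$2" '
        $1 == "Port" && $2 == port && $5 == "membership" && index($6, id) { found = 1 }
        END { exit !found }'
}

for port in a h; do
    if echo "$sample" | port_has_node "$port" 1; then
        echo "port $port: node 1 present"
    else
        echo "port $port: node 1 missing"
    fi
done
```

On a live node the sample text would be replaced by the real output, for example /sbin/gabconfig -a | port_has_node h 1.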

Q-17   What is LLT?
Ans:   Low Latency Transport (LLT) is used for all cluster communication. LLT has 2 major functions:
          1: Traffic Distribution
LLT works as a backbone for GAB. LLT distributes all internode communication across all configured network links. If a link fails, traffic is redirected to the remaining links.
          2: Heartbeat
LLT is responsible for sending and receiving heartbeat signals.

Q-18   How many network links are supported in LLT?
Ans:   LLT supports up to 8 network links.

Q-19   How many nodes can join a Cluster?
Ans:   A maximum of 32 nodes is supported in VCS.

Q-20   What is heartbeat?
Ans:   Heartbeat is an Ethernet broadcast packet. This packet notifies all other nodes that the sender is functional. This is the only broadcast traffic generated by VCS. Each node sends 2 heartbeat packets per second per interface. Heartbeat is used by GAB to determine cluster membership.
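
The rate above makes the total heartbeat load easy to estimate. The sketch below is just that arithmetic; the function name heartbeats_per_second and the example of 4 nodes with 2 private links each are assumptions for illustration.

```shell
#!/bin/sh
# heartbeats_per_second <nodes> <links per node>
# Uses the rate quoted above: 2 heartbeat packets per second per interface.
heartbeats_per_second() {
    echo $(( 2 * $1 * $2 ))
}

# 4 nodes with 2 private links each: 2 * 4 * 2 = 16 packets per second
heartbeats_per_second 4 2
```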

Q-21   What is split brain condition?
Ans:   When all the cluster interconnect links fail, it is possible for one cluster to separate into 2 subclusters, each of which doesn't know about the other subcluster. The two subclusters could each carry out recovery actions for the departed systems. For example, two systems could try to import the same storage and cause data corruption.

Q-22   From the command line, how do you shut down Veritas Cluster Server while leaving the applications running?
Ans:   # hastop -all -force

Q-23   What is coordinator disk?
Ans:   Coordinator disks are three standard disks or LUNs set aside for I/O fencing during cluster reconfiguration. Coordinator disks do not serve any other storage purpose in the VCS configuration. These disks provide a lock mechanism to determine which nodes get to fence off data drives from other nodes. A node must eject a peer from the coordinator disks before it can fence the peer from the data drives. This concept of racing for control of the coordinator disks to gain the ability to fence data disks is key to understanding prevention of split brain through fencing.

Q-24   What is IO fencing and how to configure IO fencing?
Ans:   IO fencing is a feature that prevents data corruption in the event of a communication breakdown in a cluster. IO fencing is used to remove the risk associated with split brain condition. I/O fencing allows write access for members of the active cluster and blocks access to storage from non-members; even a node that is alive is unable to cause damage.

Q-25   How to upgrade VCS?
Ans:  
 1) Remove any deprecated resource types
 2) Start the installvcs program, which is under the cluster_server directory

Q-26   How to perform a minimal downtime upgrade in VCS?
         
Q-27   How to upgrade Solaris OS in which VCS is running?
Ans:   To upgrade the Solaris OS on a node where VCS is running, follow the instructions below:

1) Stop VCS on this node
Make the VCS configuration read/write
# haconf -makerw

Move all service groups from this node to another node and freeze this node:
# hasys -freeze -persistent -evacuate <node name>

Make the cluster configuration read-only
# haconf -dump -makero

Stop the cluster on this node
# hastop -local -force

2) Stop, unconfigure and uninstall LLT and GAB on this node
Unconfigure GAB
# gabconfig -U

Unconfigure LLT
# lltconfig -U

Now remove the GAB and LLT packages
# pkgrm VRTSgab VRTSllt

3)  Now upgrade Solaris and switch to single user mode

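4)  Now install and configure LLT and GAB (LLT first, because GAB depends on it)
# pkgadd -d . VRTSllt VRTSgab

5)  Now switch to multi-user mode and start VCS
# init 3
# hastart

6) Now unfreeze this node. A persistent unfreeze needs the configuration to be read/write, so open it first and save it afterwards.
# haconf -makerw
# hasys -unfreeze -persistent <node name>
# haconf -dump -makero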
