Red Hat Cluster
1. How can you define a cluster and what are its basic types?
A cluster is two or more computers (called nodes or members) that work together to perform a task. There are four major types of clusters:
Storage
High availability
Load balancing
High performance
2. What is a Storage Cluster?
Storage clusters provide a consistent file system image across servers
in a cluster, allowing the servers to simultaneously read and write to a single
shared file system.
A storage cluster simplifies storage administration by limiting the
installation and patching of applications to one file system.
The High Availability Add-On provides storage clustering in conjunction with Red Hat GFS2.
3. What is a High Availability Cluster?
High availability clusters provide highly available services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative.
Typically, services in a high availability cluster read and write data
(via read-write mounted file systems).
A high availability cluster must maintain data integrity as one
cluster node takes over control of a service from another cluster node.
Node failures in a high availability cluster are not visible from
clients outside the cluster.
High availability clusters are sometimes referred to as failover
clusters.
4. What is a Load Balancing Cluster?
Load-balancing clusters dispatch network service requests to multiple cluster nodes to balance the request
load among the cluster nodes.
Load balancing provides cost-effective scalability because you can
match the number of nodes according to load requirements. If a node in a
load-balancing cluster becomes inoperative, the load-balancing software detects
the failure and redirects requests to other cluster nodes.
Node failures in a load-balancing cluster are not visible from clients
outside the cluster.
Load balancing is available with the Load Balancer Add-On.
5. What is a High Performance Cluster?
High-performance clusters use cluster nodes to perform concurrent
calculations.
A high-performance cluster allows applications to work in parallel,
therefore enhancing the performance of the applications.
High performance clusters are also referred to as computational
clusters or grid computing.
6. How many nodes are supported in a Red Hat 6 Cluster?
A cluster configured with qdiskd supports a maximum of 16 nodes. The reason for the limit is scalability: increasing the node count increases the amount of synchronous I/O contention on the shared quorum disk device.
7. What is the minimum size of the Quorum Disk?
The minimum size of the block device is 10 Megabytes.
8. What is the order in which you will start the Red Hat Cluster services?
In Red Hat 4 :
# service ccsd start
# service cman start
# service fenced start
# service clvmd start (if CLVM has been used to create clustered volumes)
# service gfs start
# service rgmanager start
In Red Hat 5 :
# service cman start
# service clvmd start
# service gfs start
# service rgmanager start
In Red Hat 6 :
# service cman start
# service clvmd start
# service gfs2 start
# service rgmanager start
9. What is the order in which you will stop the Red Hat Cluster services?
In Red Hat 4 :
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service fenced stop
# service cman stop
# service ccsd stop
In Red Hat 5 :
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service cman stop
In Red Hat 6 :
# service rgmanager stop
# service gfs2 stop
# service clvmd stop
# service cman stop
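To have these services come up in the correct order automatically at boot, the usual approach (a sketch, assuming the standard Red Hat 6 init scripts are installed) is to enable them with chkconfig; the init scripts themselves encode the proper start priorities:
# chkconfig cman on
# chkconfig clvmd on
# chkconfig gfs2 on
# chkconfig rgmanager on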
10. What are the performance enhancements in GFS2 as compared to GFS?
Better performance for heavy usage in a single directory
Faster synchronous I/O operations
Faster cached reads (no locking overhead)
Faster direct I/O with preallocated files (provided I/O size is
reasonably large, such as 4M blocks)
Faster I/O operations in general
Faster execution of the df command, because of faster statfs calls
Improved atime mode to reduce the number of write I/O operations
generated by atime when compared with GFS
GFS2 supports the following features:
extended file attributes (xattr)
the lsattr() and chattr() attribute settings via standard ioctl() calls
nanosecond timestamps
GFS2 uses less kernel memory.
GFS2 requires no metadata generation numbers.
Allocating GFS2 metadata does not require reads. Copies of metadata
blocks in multiple journals are managed by revoking blocks from the journal
before lock release.
GFS2 includes a much simpler log manager that knows nothing about
unlinked inodes or quota changes.
The gfs2_grow and gfs2_jadd commands use locking to prevent multiple instances running at the same
time.
The ACL code has been simplified for calls like creat() and mkdir().
Unlinked inodes, quota changes, and statfs changes are recovered
without remounting the journal.
11. What is the maximum supported file system size for GFS2?
GFS2 is based on a 64-bit architecture, which can theoretically accommodate an 8 EB file system.
However, the current supported maximum size of a GFS2 file system for
64-bit hardware is 100 TB.
The current supported maximum size of a GFS2 file system for 32-bit
hardware for Red Hat Enterprise Linux Release 5.3 and later is 16 TB.
NOTE: It is better to have ten 1 TB file systems than one 10 TB file system.
12. What is a journaling filesystem?
A journaling filesystem is a filesystem that maintains a special file
called a journal that is used to repair any inconsistencies that occur as the
result of an improper shutdown of a computer.
In journaling file systems, every time GFS2 writes metadata, the
metadata is committed to the journal before it is put into place.
This ensures that if the system crashes or loses power, you will recover
all of the metadata when the journal is automatically replayed at mount time.
GFS2 requires one journal for each node in the cluster that needs to
mount the file system. For example, if you have a 16-node cluster but need to
mount only the file system from two nodes, you need only two journals. If you
need to mount from a third node, you can always add a journal with the
gfs2_jadd command.
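For example, to add a journal for a third node (a sketch; /mygfs2 is a hypothetical mount point of an already-mounted GFS2 file system):
# gfs2_jadd -j 1 /mygfs2 (adds one more journal to the mounted file system)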
13. What is the default size of journals in GFS2?
When you run mkfs.gfs2 without specifying a journal size, a 128 MB journal is created by default, which is enough for most applications.
Reducing the size of the journal can severely affect performance. If you reduce the journal size to 32 MB, it does not take much file system activity to fill a 32 MB journal, and when the journal is full, performance slows because GFS2 has to wait for writes to the storage.
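The journal count and size are set when the file system is created. A hedged example (the device, cluster name and file system name are hypothetical):
# mkfs.gfs2 -p lock_dlm -t mycluster:mygfs2 -j 2 -J 128 /dev/vg01/lv01
(-p selects the locking protocol, -t sets the cluster:fsname lock table, -j creates 2 journals, -J sets the journal size in MB)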
14. What is a Quorum Disk?
Quorum Disk is a disk-based quorum daemon, qdiskd, that provides
supplemental heuristics to determine node fitness.
With heuristics you can determine factors that are important to the operation of the node in the event of a network partition.
For a 3-node cluster, quorum is maintained as long as 2 of the 3 nodes are active, i.e. more than half. But what if, for some reason, the 2nd node also stops communicating with the 3rd node? In that case, under a normal architecture, the cluster would dissolve and stop working. For mission-critical environments and such scenarios we use a quorum disk: an additional disk is configured which is mounted on all the nodes with the qdiskd service running, and a vote value is assigned to it.
Suppose in the above case I have assigned 1 vote to the quorum disk. Even after 2 nodes stop communicating with the 3rd node, the cluster still has 2 votes (1 from the quorum disk + 1 from the 3rd node), which is more than half of the vote count for a 3-node cluster. The two inactive nodes would then be fenced, and the 3rd node would still be up and running as a part of the cluster.
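A quorum disk is typically initialized with mkqdisk and then declared in /etc/cluster/cluster.conf; a minimal sketch (the device, label, and ping heuristic below are hypothetical):
# mkqdisk -c /dev/sdd1 -l myqdisk (writes the qdisk label onto the shared device)
<quorumd interval="1" tko="10" votes="1" label="myqdisk">
<heuristic program="ping -c1 192.168.1.254" score="1" interval="2"/>
</quorumd>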
15. What is rgmanager in Red Hat Cluster and its use?
This is a service termed Resource Group Manager.
RGManager manages and provides failover capabilities for collections of cluster resources called services, resource groups, or resource trees.
It allows administrators to define, configure, and monitor cluster services. In the event of a node failure, rgmanager will relocate the clustered service to another node with minimal service disruption.
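The state of rgmanager-controlled services can be checked with clustat, for example:
# clustat (shows the cluster members and the state and owner of each service)
# clustat -i 2 (refreshes the display every 2 seconds)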
16. What are luci and ricci in Red Hat Cluster?
luci is the server component of the Conga administration utility.
Conga is an integrated set of software components that provides centralized configuration and management of Red Hat clusters and storage.
luci is a server that runs on one computer and communicates with multiple clusters and computers via ricci.
ricci is the client component of the Conga administration utility.
ricci is an agent that runs on each computer (either a cluster member or a standalone computer) managed by Conga.
This service needs to be running on all the client nodes of the cluster.
17. What is cman in Red Hat Cluster?
This is an abbreviation used for Cluster Manager.
CMAN is a distributed cluster manager and runs on each cluster node.
It is responsible for monitoring, heartbeat, quorum, voting and
communication between cluster nodes.
CMAN keeps track of cluster quorum by monitoring the count of cluster
nodes.
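CMAN state can be inspected with cman_tool, for example:
# cman_tool status (shows the quorum state, expected votes and node counts)
# cman_tool nodes (lists the cluster nodes and their membership state)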
18. What are the different port numbers used in Red Hat Cluster?
IP Port No. | Protocol | Component
5404, 5405 | UDP | corosync/cman
11111 | TCP | ricci
21064 | TCP | dlm (Distributed Lock Manager)
16851 | TCP | modclusterd
8084 | TCP | luci
41966, 41967, 41968, 41969 | TCP | rgmanager
19. How does the NetworkManager service affect Red Hat Cluster?
The use of NetworkManager is not supported on cluster nodes. If you
have installed NetworkManager on your cluster nodes, you should either remove
it or disable it.
# service NetworkManager stop
# chkconfig NetworkManager off
The cman service will not start if NetworkManager is either running or has been configured to run with the chkconfig command.
20. What is the command used to relocate a service to another node?
# clusvcadm -r service_name -m node_name
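Other commonly used clusvcadm operations (service and node names are placeholders):
# clusvcadm -e service_name -m node_name (to enable/start a service on a given node)
# clusvcadm -d service_name (to disable/stop a service)
# clusvcadm -Z service_name (to freeze a service in its current state)
# clusvcadm -U service_name (to unfreeze a service)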
21. What is the split-brain condition in Red Hat Cluster?
We say a cluster has quorum if a majority of nodes are alive,
communicating, and agree on the active cluster members. For example, in a
thirteen-node cluster, quorum is only reached if seven or more nodes are
communicating. If the seventh node dies, the cluster loses quorum and can no
longer function.
A cluster must maintain quorum to prevent split-brain issues.
If quorum were not enforced, a communication error on that same thirteen-node cluster could cause a situation where six nodes are operating on the shared storage, while another six nodes are also operating on it, independently. Because of the communication error, the two partial clusters would overwrite areas of the disk and corrupt the file system.
With quorum rules enforced, only one of the partial clusters can use
the shared storage, thus protecting data integrity.
Quorum doesn't prevent split-brain situations, but it does decide who
is dominant and allowed to function in the cluster.
Quorum can be determined by a combination of communicating messages via Ethernet and through a quorum disk.
22. What are Tie-breakers in Red Hat Cluster?
Tie-breakers are additional heuristics that allow a cluster partition to decide whether or not it is quorate in the event of an even split, prior to fencing.
With such a tie-breaker, nodes not only monitor each other, but also an upstream router that is on the same path as cluster communications. If the two nodes lose contact with each other, the one that wins is the one that can still ping the upstream router. That is why, even when using tie-breakers, it is important to ensure that fencing is configured correctly.
CMAN has no internal tie-breakers for various reasons. However,
tie-breakers can be implemented using the API.
23. What is fencing in Red Hat Cluster?
Fencing is the disconnection of a node from the cluster's shared
storage.
Fencing cuts off I/O from shared storage, thus ensuring data
integrity.
The cluster infrastructure performs fencing through the fence daemon, fenced.
When CMAN determines that a node has failed, it communicates to other
cluster-infrastructure components that the node has failed.
fenced, when
notified of the failure, fences the failed node.
24. What are the various types of fencing supported by the High Availability Add-On?
Power fencing — A fencing method that uses a power controller to power off an inoperable node.
Storage fencing — A fencing method that disables the Fibre Channel port that connects storage to an inoperable node.
Other fencing — Several other fencing methods that disable I/O or power of an inoperable node, including IBM BladeCenters, PAP, DRAC/MC, HP iLO, IPMI, IBM RSA II, and others.
25. What are the lock states in Red Hat Cluster?
A lock state indicates the current status of a lock request. A lock is always in one of three states:
Granted — The lock request succeeded and
attained the requested mode.
Converting — A client attempted to change the lock
mode and the new mode is incompatible with an existing lock.
Blocked — The request for a new lock could not
be granted because conflicting locks exist.
A lock's state is
determined by its requested mode and the modes of the other locks on the same
resource.
26. What is the DLM lock model?
DLM is an abbreviation for Distributed Lock Manager.
A lock manager is a traffic cop that controls access to resources in the cluster, such as access to a GFS file system.
GFS2 uses locks from the lock manager to synchronize access to file system metadata (on shared storage).
CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups (also on shared storage).
In addition, rgmanager uses DLM to synchronize service states.
Without a lock manager, there would be no control over access to your shared storage, and the nodes in the cluster would corrupt each other's data.
Veritas Volume Manager and Veritas Cluster
1. What is the difference between Failing and Failed?
Failing :
Failing means the disk is going to fail. In a failing disk, the private region is available and the public region is not available, so we can recover the data using the private region.
Failed :
Failed means the disk has already failed. In a failed disk, both the private and public regions are unavailable, so we cannot recover the data. The only option is to replace the disk and restore the data from backup.
2. What are the daemons of Veritas Volume Manager?
(a) vxconfigd :
(i) This is the main daemon in Veritas Volume Manager.
(ii) It maintains the Volume Manager configuration information.
(iii) It always resides in the private region of the disk.
(iv) It communicates with the kernel and updates volume states in the configuration database.
(v) It always starts before mounting the root ( / ) file system.
(b) vxiod :
(i) This is used to maintain I/O (input and output) operations.
(ii) It also defines how many I/O operations can run at a time.
(c) vxrelocd :
(i) It always monitors the consistency of the disks and notifies the user of failures via the vxnotifyd daemon.
(ii) It also relocates data and recognizes the new disk.
(d) vxrecoverd :
(i) It passes the lost data onto the new disk.
(ii) It also notifies the Administrators via the vxnotifyd daemon.
(e) vxnotifyd :
(i) It notifies the user (Administrator) about failed disks, and also notifies the Administrator after recovery.
3. How to create the root mirror?
(i) Bring the disk from O/S control to Veritas Volume Manager control using the Veritas advanced management tool,
# vxdiskadm command
(it displays options for easy administration of Veritas Volume Manager).
(ii) Select the 2nd option, i.e., Encapsulation, to preserve the existing data present in the disk, and reboot the system for the encapsulation to take effect; this modifies the /etc/sysconfig file. While encapsulating, it asks for the disk name and disk group (root disk name and rootdg).
(iii) Backup the / (root) and /etc/sysconfig directories.
(iv) Take another disk and initialize it with the # vxdisksetup -i command.
(v) Add the above initialized disk to the disk group, i.e., rootdg, by
# vxdg -g <diskgroup> adddisk mirrordisk=<device>
(vi) # vxmirror -v -g <diskgroup> (disk-level mirroring)
(vii) For individual volume mirroring, use the
# vxassist -g <diskgroup> mirror <volume> or
# vxrootmirr -g <diskgroup> <disk> command.
4. What is the service group in Veritas Cluster?
A service group is made up of resources and their links, which we normally require to maintain high availability for the application.
5. What is the use of the ' halink ' command?
The # halink command is used to link the dependencies of the resources.
6. What are the differences between switchover and failover?
SwitchOver | FailOver
(i) Switchover is a manual task. | (i) Failover is an automatic task.
(ii) We can switch over service groups from an online cluster node to an offline cluster node in case of a power outage, hardware failure, scheduled shutdown or reboot. | (ii) Failover moves the service group to the other node when the Veritas Cluster heartbeat link is down, damaged or broken because of some disaster or a system hang.
7. Which is the main configuration file for VCS (Veritas Cluster) and where is it stored?
' main.cf ' is the main configuration file for VCS and it is located in the /etc/VRTSvcs/conf/config directory.
8. What are the public region and private region?
When we bring a disk from O/S control to Volume Manager control in any format (either CDS, simple or sliced), the disk is logically divided into two parts:
(a) Private region :
It contains Veritas configuration information like disk type and name, disk group name, group ID and configdb. The default size is 2048 KB.
(b) Public region :
It contains the actual user data like applications, databases and others.
9. There are five disks on VxVM (Veritas Volume Manager) and all have failed. What steps do you follow to bring those disks online?
(i) Check the list of disks under Volume Manager control with the # vxdisk list command.
(ii) If the above disks are not present, then bring them from O/S control to VxVM control with
# vxdisksetup -i <device> (if there is no data on those disks), or execute the
# vxdiskadm command and select the 2nd option, i.e., the encapsulation method, if the disks contain data.
(iii) If that is still not possible, check whether the disks are available at the O/S level with the # fdisk -l command.
(a) If the disks are available, execute the above command once again.
(b) If the disks are not available, then recognize them by scanning the hardware.
(iv) If it is still not possible, then reboot the system and follow steps (i) and (ii).
10. What is the basic difference between a private disk group and a shared disk group?
Private disk group :
The disk group is only visible to the host on which we have created it. If the host is part of a cluster, the private disk group will not be visible to the other cluster nodes.
Shared disk group :
The disk group is sharable and visible to the other cluster nodes.
11. How will you create a private disk group and a shared disk group?
# vxdg init <diskgroup> <diskname>=<device> (to create a private disk group)
# vxdg -s init <diskgroup> <diskname>=<device> (to create a shared disk group)
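For example (the disk group, disk and device names are hypothetical):
# vxdg init appsdg disk01=/dev/sdb (creates a private disk group appsdg)
# vxdg -s init shareddg disk01=/dev/sdc (creates a shared disk group; the cluster functionality must be active)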
12. How will you add a new disk to an existing disk group?
We can do this in two ways.
(i) Run the # vxdiskadm command, which opens a menu-driven program to do various disk operations. Select the add disk option and give the disk group name and disk name.
(ii) # vxdg -g <diskgroup> adddisk <diskname>=<device>
Example: # vxdg -g appsdg adddisk disk02=/dev/sdb
13. How will you grow or shrink the volume/file system? What is the meaning of the grow by, grow to, shrink by and shrink to options?
(i) We can grow the volume/file system by,
# vxassist -g appsdg growby appsvol 100GB (or)
# vxassist -g appsdg growto appsvol 100GB (or)
# vxresize -g appsdg appsvol +100GB alloc=<disk>
(ii) We can shrink the volume/file system by,
# vxassist -g appsdg shrinkby appsvol 20GB
# vxassist -g appsdg shrinkto appsvol 20GB (or)
# vxresize -g appsdg appsvol -10GB (to shrink by the size 10GB)
# vxresize -g appsdg appsvol 10GB (to shrink to the size 10GB)
Meanings :
growby :
This is used to grow the file system by adding the new size to the existing file system size.
growto :
This is used to grow the file system up to the specified new size. The new size is not added to the existing one.
shrinkby :
This is used to shrink the file system by reducing the existing file system size by the new size.
shrinkto :
This is used to shrink the file system down to the specified new size. The new size is not subtracted from the existing one.
14. If the vxdisk list command gives you the disk status as " error ", what steps do you follow to bring the respective disk online?
This issue is mainly because of fabric disconnection. So, execute the
# vxdisk scandisks command. Otherwise, unsetup the disks using
# /etc/vx/bin/vxdiskunsetup and set up the disks again using the
# /etc/vx/bin/vxdisksetup command.
Note : /etc/vx/bin/vxdiskunsetup will remove the private region from the disk and destroy the data. So, backup the data before using this command.
15. Which are the different layouts for VxVM?
(i) mirror (ii) stripe (default)
(iii) concat (iv) raid5
(v) stripe-mirror (vi) mirror-stripe
16. How will you set up and unsetup disks explicitly using VxVM?
# /etc/vx/bin/vxdisksetup -i <device> (to set up the disks)
# /etc/vx/bin/vxdiskunsetup <disk> (to unsetup the disks)
17. How will you list the disks which are in different disk groups?
# vxdisk list or # vxprint (to list from the current disk group or an imported disk group)
# vxdisk -o alldgs list (to list all the disks which are in different disk groups)
18. Define LLT and GAB. What are the commands to manage them?
LLT :
(i) LLT means Low Latency Transport protocol.
(ii) It monitors kernel-to-kernel communication.
(iii) It maintains and distributes the network traffic within the cluster.
(iv) It uses heartbeats between the interfaces.
GAB :
(i) GAB means Group Membership Services/Atomic Broadcast.
(ii) It maintains and distributes the configuration information of the cluster.
(iii) It uses heartbeats between the disks.
Commands :
# gabconfig -a (to check the status of GAB, i.e., whether GAB is running or not)
If port ' a ' is listening, GAB is running; otherwise GAB is not running.
If port ' b ' is listening, I/O fencing is enabled; otherwise I/O fencing is disabled.
If port ' h ' is listening, the had daemon is working; otherwise the had daemon is not working.
# gabconfig -c -n 2 (to start GAB with 2 systems in the cluster, where 2 is the seed number)
# gabconfig -U (to stop GAB)
# cat /etc/gabtab (to see the GAB configuration information; it contains)
gabconfig -c -n x (where x is a number, i.e., 1, 2, 3, ...etc.)
# lltconfig (to see the status of LLT)
# lltconfig -c (to start LLT)
# lltconfig -U (to stop LLT)
# lltstat -nvv (to see the traffic status between the interfaces)
# haclus -display (to see all the information on the cluster)
# cat /etc/llttab (to see the LLT configuration; its entries include the cluster ID, host ID, interface MAC address, ...etc.)
# cat /etc/llthosts (to see the number of nodes present in the cluster)
19. How to check the status of the Veritas Cluster?
# hastatus -summary
20. Which command is used to check the syntax of the main.cf?
# hacf -verify /etc/VRTSvcs/conf/config
21. How will you check the status of the individual resources of Veritas Cluster (VCS)?
# hares -state
22. What is the use of the # hagrp command?
The # hagrp command is used for administrative actions on service groups, like bringing a service group online or offline, switching it, ...etc.
23. How to switch over the service group?
# hagrp -switch <service_group> -to <system>
24. How to online the service group in VCS?
# hagrp -online <service_group> -sys <system>
25. What are the steps to follow to switch over the application from System A to System B?
(i) First unmount the file system on System A.
(ii) Stop the volume on System A.
(iii) Deport the disk group from System A.
(iv) Import the disk group on System B.
(v) Start the volume on System B.
(vi) Finally mount the file system on System B.
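A minimal command-level sketch of these steps, assuming a disk group appsdg, volume appsvol, and mount point /apps (all hypothetical names):
On System A:
# umount /apps
# vxvol -g appsdg stop appsvol
# vxdg deport appsdg
On System B:
# vxdg import appsdg
# vxvol -g appsdg start appsvol
# mount -F vxfs /dev/vx/dsk/appsdg/appsvol /apps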
26. How many types of clusters are available?
(i) Hybrid Cluster.
(ii) Parallel Cluster.
(iii) Failover Cluster.
27. What is meant by seeding?
Normally, we define how many nodes must be up to start a cluster, either while booting or explicitly by executing the # gabconfig -c -n 2 command. Here 2 means 2 seeds are required to start the cluster. This number is called seeding.
28. What is the Split brain issue in VCS and how to resolve it?
A split brain issue means multiple systems use the same exclusive resources, usually resulting in data corruption.
Normally VCS is configured with multiple nodes which communicate with each other. On a power loss or system crash, VCS assumes the system has failed and tries to move the service group to another system to maintain high availability. However, communication (heartbeat) can also fail due to network failures.
If the network traffic (connection) between any two groups of systems fails simultaneously, a network partition occurs. When this happens, systems on both sides of the partition can restart the applications from the other side, i.e., resulting in duplicate services. This is the most serious problem caused by a partition, and it affects the data on the shared disks.
This split brain issue normally occurs in VCS 3.5 to VCS 4.0 versions. From VCS 5.0 onwards, I/O fencing (a new feature) is introduced to minimize the split brain issue. If I/O fencing is enabled in a cluster, then we can avoid the split brain issue.
29. What are Admin wait and Stale Admin wait?
ADMIN-WAIT :
If VCS is started on a system with a valid configuration file and other systems are in the ADMIN-WAIT state, the new system transitions to the ADMIN-WAIT state,
(or)
if VCS is started on a system with a stale configuration file and other systems are in the ADMIN-WAIT state, the new system transitions to the ADMIN-WAIT state.
STALE-ADMIN-WAIT :
The configuration files are in read-only mode; to make changes, the file must be made read-write. If any changes occur to the ' main.cf ' file in the cluster, the changes are tracked in a ' .stale ' hidden file under the configuration directory. If the system is restarted or rebooted while changes are being made, the cluster will start with the ' .stale ' file.
So, when VCS is started on a system with a stale configuration file, the system status will be STALE-ADMIN-WAIT until another system in the cluster starts with a valid configuration file, or until you execute
# hasys -force <system>
to start the system forcefully with the correct or valid configuration file.
30. What is meant by a resource and how many types are there?
A resource is a software or hardware component managed by VCS.
Mount points, disk groups, volumes, IP addresses, ...etc., are the software components.
Disks, interfaces (NIC cards), ...etc., are the hardware components.
There are two types of resources:
(i) Non-persistent resources (we can bring them online and take them offline).
(ii) Persistent resources (we cannot bring them online or offline; VCS only monitors them).
If a resource is in a faulted state, then clear the fault and the service group state. Resources can be critical or non-critical. If a critical resource fails, the service group automatically fails over. If a non-critical resource fails, the service group does not fail over automatically, and we have to manually switch the service group over to another available system.
31. What are the dependencies between resources in a Cluster?
If one resource depends on another resource, then there is a dependency between those resources.
Example : A NIC (Network Interface Card) is a hardware component, i.e., a hardware resource. An IP address is a software component, i.e., a software resource, and it depends on the NIC card. The relationship between the NIC and the IP address is a Parent-Child relationship: the dependent resource (the IP address) is the parent, and the resource it depends on (the NIC) is the child. A child resource is brought online before its parent.
32. What are the minimum requirements for VCS?
(i) Minimum two identical (same configuration) systems.
(ii) Two switches (Optical Fibre Channel).
(iii) Minimum three NIC cards (two NICs for the private network and one NIC for the public network).
(iv) One common storage.
(v) Veritas Volume Manager with license.
(vi) Veritas Cluster with license.
33. What are the Veritas Cluster daemons?
(i) had :
(a) It is the main daemon in Veritas Cluster for high availability.
(b) It monitors the cluster configuration and the whole cluster environment.
(c) It interacts with all the agents and resources.
(ii) hashadow :
(a) It always monitors the had daemon and restarts it if it fails.
(b) Its main functionality also includes logging about the cluster.
35. What are the main configuration files in a Cluster?
* /etc/VRTSvcs/conf/config/main.cf and
* /etc/VRTSvcs/conf/config/types.cf
are the main configuration files in a Cluster.
36. What are the main log
files in a Cluster?
(i) /var/VRTSvcs/log/Engine_A.log (logging about when the
cluster started, when failed, when
failover occurs, when switchover forcefully, ...etc.,)
(ii)
/var/VRTSvcs/log/hashadow_A.log (logging about
the hashadow deamon)
(iii)
/var/VRTSvcs/log/agent_A.log (logging bout
agents)
37. What are the Cluster components?
(i) Cluster.
(ii) Service groups.
(iii) Resources.
(iv)
Agents.
(v) Events.
38. What is your role in the Cluster?
Normally
we will get requests like,
(i) Add a node.
(ii) Add a resource.
(iii)
Add a service group.
(iv) Add a resource to the existing
service group.
(v) Add mount points.
And sometimes we get some troubleshooting issues like,
(i) had daemon is not running.
(ii) Split brain issue.
(iii) If the resources are faulted, then restart the service groups and move the service groups from one node to another.
(iv) Cluster is
not running.
(v) Communication failed between two
nodes.
(vi) GAB
and LLT are not running.
(vii)
Resource not started.
(viii)
main.cf and types.cf
files corrupted.
(ix) I/O
fencing (a locking mechanism to
avoid the split brain issue)
is not enabled (at disk level /
SAN level).
(x) And the locks are,
(a) engine.lock
(b) ha.lock
(c) agent.lock
39. What are the statuses of a service group?
(i) online
(ii) offline
(iii) partial
* If a non-critical resource fails, the status of the service group may be partial.
* If a critical resource fails, the status of the service group may be offline.
40. How to move the service group from one node to another node manually?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv)
Stop the volume.
(v) Deport the disk group.
(vi) Import the disk group.
(vii)
Start the volume.
(viii)
Mount the file system.
(ix) Start the database.
(x) Start the application.
41. How to rename a disk group in VxVM, stepwise?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Rename the disk group.
(vii) Import the disk group.
(viii) Start the volume.
(ix) Mount the file system.
(x) Start the database.
(xi) Start the application.
42. How to create a volume with 4 disks?
(i) Bring the disks under O/S control by scanning the LUNs using the following command,
# echo "- - -" > /sys/class/scsi_host/<host no.>/scan (to scan for new LUNs)
(ii) Bring those disks from O/S control to VxVM control.
(a) If we want to preserve the data, then bring the disks to VxVM control using the encapsulation method by
# vxdiskadm (here we get the options to do this; select the 2nd option, i.e., Encapsulation)
(b) If we don't want to preserve the data, then bring the disks to VxVM control using the initialization method:
# vxdisksetup -i /dev/sda
# vxdisksetup -i /dev/sdb
# vxdisksetup -i /dev/sdc
# vxdisksetup -i /dev/sdd
# vxdisk list (to see the VxVM controlled disks)
(iii) Create a disk group.
# vxdg init appsdg disk01=/dev/sda (for example, with the disk group name appsdg)
(iv) Add the remaining three disks to the above disk group.
# vxdg -g appsdg adddisk disk02=/dev/sdb
# vxdg -g appsdg adddisk disk03=/dev/sdc
# vxdg -g appsdg adddisk disk04=/dev/sdd
# vxdisk -g appsdg list (to see all the disks belonging to that disk group, for example appsdg)
(v) Create the volume (with the requested size and requested layout).
# vxassist -g appsdg make appsvol <size> (for example, volume name appsvol and size in TB/GB ...etc.)
(vi) Create a file system on that volume.
# mkfs -F vxfs /dev/vx/rdsk/appsdg/appsvol
(vii) Create the mount point and give the requested permissions to that mount point.
# mkdir /mnt/apps
(viii) Start the volume.
# vxvol -g appsdg start appsvol
(ix) Mount the file system on the above mount point.
# mount -F vxfs -o rw /dev/vx/dsk/appsdg/appsvol /mnt/apps
(where rw means read-write and ro means read-only)
(x) Put the entry into the "/etc/fstab" file for a permanent mount.
* If the volume is created for a cluster, don't put the entry in the /etc/fstab file.
(xi) And finally send the mail to the client or requesting person.
43. What is the difference between a Global Cluster and a Local Cluster? Have you configured a Global Cluster?
Local Cluster :
If all the nodes in a Cluster are placed in the same location, that Cluster is called a Local Cluster.
Global Cluster :
If the nodes in a Cluster are placed in different geographical locations, that Cluster is called a Global Cluster. The main advantage of a global cluster is high availability when natural calamities or disasters occur.
* No, I haven't configured a Global Cluster.
44. How to start and stop the Cluster?
# hastart (to start VCS on the local node in the Cluster; run it on each node to start all the nodes)
# hasys -force <system> (to forcefully start the system in the Cluster)
# hastop -local (to stop VCS on the local node in the Cluster)
# hastop -all (to stop all the systems in the Cluster)
# hastop -sys <system> (to stop the specified system or node in the Cluster)
45. What is the Service group and Resource?
Service group :
(i) A collection or group of physical and logical resources is called the Service group.
(ii) Moving a service group from one system to another system means moving its resources from one system to another system.
Resources :
(i) A resource is a software or hardware component: disk groups, volumes, IP addresses and mount points are software resources, and disks and NIC cards are hardware resources.
(ii) The value of a resource property is known as an Attribute.
Example : (a) The system list is an attribute listing System A and System B.
(b) Auto start is an attribute of the system.
Resource | Attribute | Value
NIC | IP address | 192.168.1.1
Diskgroup | diskgroup name | appsdg
Disk | disk name | disk01
Interface | interface name | eth0
(iii) There are two types of resources.
(a) Persistent Resource :
Those resources which we cannot start or stop are called persistent resources.
Example : We cannot start or stop the NIC card.
(b) Non-Persistent Resource :
Those resources which we can start/stop are called non-persistent resources.
(iv) Resources may be critical or non-critical. We need to design the resources as critical or non-critical, i.e., the customer will insist on which is critical and which is non-critical.
(v) If a critical resource fails, then the service group is moved automatically from one system to another system, i.e., failover; if a non-critical resource fails, then we need to do a manual movement of the service group from one system to another system, i.e., switchover.
46. What are the steps you follow to put the volume in a Cluster?
(i) First create the disk group, volume and file system, and mount and unmount it before putting the volume in a cluster, to test whether that volume is working or not.
(ii) Create the service group and add the attributes to it.
# hagrp -add <service_group>
Example: # hagrp -add appssg
Attributes :
# hagrp -modify appssg SystemList sysA 0 sysB 1 (to add the sysA and sysB system list attribute to the service group)
# hagrp -modify appssg AutoStartList sysA (to start the service group on sysA automatically)
# hagrp -modify appssg Enabled 1 (1 means enabled, 0 means disabled)
(iii) Create resources, add them to the service group and specify their attributes.
For a file system :
(a) /mnt/apps (the mount point)
(b) appsvol (the volume name)
(c) appsdg (the disk group)
# hares -add dg-apps DiskGroup appssg (to add the disk group resource to the service group)
(where dg-apps is the resource name, DiskGroup is the resource type and appssg is the service group name)
# hares -modify dg-apps DiskGroup appsdg (to set the disk group attribute)
# hares -modify dg-apps Enabled 1 (to enable the resource)
# hares -add dg-volume Volume appssg (to add the volume resource to the service group)
# hares -modify dg-volume Volume appsvol (to set the volume attribute)
# hares -modify dg-volume DiskGroup appsdg (to associate the disk group with the volume)
# hares -modify dg-volume Enabled 1 (to enable the volume resource)
# hares -modify dg-volume Critical 1 (to make the resource critical)
# hares -add dg-mnt Mount appssg (to add the mount point resource to the service group)
# hares -modify dg-mnt BlockDevice /dev/vx/dsk/appsdg/appsvol (to set the block device attribute)
# hares -modify dg-mnt FSType vxfs (to set the file system type attribute)
# hares -modify dg-mnt MountPoint /mnt/apps (to set the mount point directory attribute)
# hares -modify dg-mnt FsckOpt %-y (to set the fsck option, either %-y or %-n)
(iv) Create links between the above disk group, volume and mount point resources.
# hares -link <parent_res> <child_res>
# hares -link dg-volume dg-apps (the volume depends on the disk group)
# hares -link dg-mnt dg-volume (the mount point depends on the volume)
47. What is meant by freezing and unfreezing a service group, with the persistent and evacuate options?
Freezing :
If we want to apply patches to a system in a cluster, we have to freeze the service group, because if we simply stop the service group and it is critical, the service group will move automatically to another system in the Cluster. So, when we don't want the service group to move from one system to another system, we freeze the service group.
Unfreeze :
After completing the task, the service group should be unfrozen, because if the system crashes or goes down while the service group is frozen and the resources are critical, the service group cannot move from system 1 to system 2, and the application becomes unavailable. If the service group is unfrozen after maintenance, the service group can move from system 1 to system 2. So, if system 1 fails, system 2 is available and the application is also available.
Persistent option :
If the service group is frozen with the persistent option, then we can stop, shut down, or restart the system; nothing is lost, and after the system restarts, the service group remains in the frozen state.
Example : # hasys -freeze -persistent <system>
# hasys -unfreeze -persistent <system>
Evacuate :
If this option is used when freezing a system, then before the freeze all the service groups are moved (evacuated) from system 1 to another system 2.
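Common freeze/unfreeze forms (group and system names are placeholders; a persistent freeze requires the configuration to be writable):
# haconf -makerw (open the configuration in read-write mode)
# hagrp -freeze appssg -persistent (freeze a service group persistently)
# hagrp -unfreeze appssg -persistent (unfreeze it)
# hasys -freeze -persistent -evacuate sysA (evacuate the service groups, then freeze the system)
# haconf -dump -makero (save the configuration and make it read-only again)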
48. What layouts are available in VxVM, how do they work, and how are they configured?
(i) There are 5 layouts available in VxVM. They are RAID-0, RAID-1, RAID-5, RAID-0+1 and RAID-1+0.
RAID-0 :
We can configure RAID-0 in two ways.
(a) Striped (default).
(b) Concatenation.
Striped :
(i) A minimum of two disks is required to configure it.
(ii) The data is written to both disks in parallel, i.e., one line on one disk and the 2nd line on the 2nd disk, ...etc.
(iii) The data writing speed is fast.
(iv) There is no redundancy for the data.
Concatenation :
(i) A minimum of one disk is required to configure it.
(ii) The data is written to the first disk, and after the first disk fills up, it is written to the 2nd disk.
(iii) The data writing speed is lower.
(iv) In this also there is no redundancy for the data.
RAID-1 :
(i) It is nothing but mirroring.
(ii) A minimum of 2 disks is required to configure it (the example below uses 4).
(iii) With 4 disks, the same data is written on disk 1 and disk 3, and on disk 2 and disk 4.
(iv) If disk 1 fails, then we can recover the data from disk 3, and if disk 2 fails, then we can recover the data from disk 4. So there is no data loss, or we can minimize the data loss.
(v) In this, half of the disk space may be wasted.
RAID-5 :
(i) It is nothing but striping with distributed parity.
(ii) A minimum of 3 disks is required to configure it.
(iii) One line is written on disk 1, the 2nd line on disk 2, and the parity is written on disk 3; the parity is distributed across the disks. If disk 1 fails, then we can recover the data from disk 2 and the parity on disk 3. So the data is more secure.
(iv) Disk utilization is better when compared to RAID-1, i.e., only 1/3rd of the disk space may be consumed by parity.
(v) RAID-5 is configured for critical applications like banking, financial, SAP and insurance applications, ...etc., because the data must be more secure.
Creating a volume with a layout :
# vxassist -g <diskgroup> make <volume> <size> layout=<layout>
Example : # vxassist -g appsdg make appsvol 50GB layout=raid5
Logs :
* If the layout is mirror, then the log is a DRL (Dirty Region Log).
* If the layout is RAID-5, then the log is a RAID-5 log.
* The main purpose of the log is fast recovery operation.
* We have to specify whether the log is required or not in all types of layouts except RAID-5, because logging is the default in RAID-5.
* If we want to configure RAID-5 without logging, then
# vxassist -g <diskgroup> make <volume> 50GB layout=raid5,nolog
* If the layout is other than RAID-5, then
# vxassist -g <diskgroup> make <volume> 50GB layout=mirror,log
* If we want to add a log to an existing volume, then
# vxassist -g <diskgroup> addlog <volume> logtype=drl (or logtype=raid5)
* If we want to remove the log from an existing volume, then
# vxassist -g <diskgroup> remove log <volume>
49. What is a read policy and how many types of read policies are available?
Read policy means how the disk or volume plexes should be read when accessing the data.
Types of read policies :
(i) Select
(ii) Prefer
(iii) Round Robin
* By default the read policy is Select.
# vxvol -g <diskgroup> rdpol <select|prefer|round> <volume>
50. What is your role in VxVM?
Normally,
we get requests from application, development, production and
QA people like,
(i) Create a
volume.
(ii) Increase the volume.
(iii) Decrease the volume.
(iv) Provide Redundancy by
implementing RAID-1 or
RAID-5.
(v) Provide the required permissions.
(vi)
Put the volume in the Virtual machine.
(vii)
Put the volume in the Cluster.
(viii)
Provide high availability to the applications
and databases.
(ix)
Sometimes destroy or remove the volume.
(x) Backup and
restore the data whenever
necessary.
And
sometimes we get some troubleshooting issues
like,
(i) Volume is not started.
(ii) Volume is not accessible.
(iii)
Mount point deleted.
(iv) File system crashed.
(v) One disk failed in a volume.
(vi)
Volume manager deamons are not running.
(vii)
Volume manager configuration files missed or
deleted.
(viii)
VxVM licensing issues.
(ix) Diskgroup not
deporting and not importing.
(x) Volume is started, daemons are running, but users cannot access the data.
(xi) Disks are not detected.
(xii) Hardware and software errors.
51. What is meant by snap backup and how to take a snap backup?
(i) Snap backup means taking a backup using snapshots.
(ii) On 24x7/365 running servers we normally take snap backups, i.e., where no downtime is allowed.
(iii) The volumes used on such servers are called BCVs (Business Continuity Volumes).
Backup :
(i) First stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Import the disk group.
(vii) Join the snap disk group.
(viii) Sync the data.
(ix) Take the backup.
(x) Split the snap disk group.
(xi) Deport the disk group.
(xii) Import the disk group.
(xiii) Start the volume.
(xiv) Mount the file system.
(xv) Start the database.
(xvi) Start the application.
52. What are the steps you follow to rename a disk group?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Rename and import the disk group with the
# vxdg -n <new_name> import <old_name> command.
(vii) Start the volume.
(viii) Mount the file system.
(ix) Start the database.
(x) Start the application.
53. How to install VxVM? What version of Veritas are you using, and how do you find the Veritas version?
(i) Install the Veritas-supplied packages using the # rpm or # yum commands.
(ii) Execute the # vxinstall command to install VxVM, i.e., enable the system to use the Volume Manager.
(iii) # vxinstall will allow us to encapsulate or not encapsulate the root disk.
(iv) Always use option 2, i.e., Custom installation, because if option 1 (Quick installation) is used, it takes all the disks for rootdg.
License :
(i) All the licenses are stored in the /etc/vx/licenses directory, and we can take a backup of this directory and restore it back if we need to reinstall the server.
(ii) Removing the VxVM package will not remove the installed license.
(iii) To install a license, the # vxlicinst command is used.
(iv) To see the VxVM license information, use the # vxlicrep command.
(v) To remove the VxVM licenses, use the # vxkeyless set NONE command.
(vi) The license utilities are installed in the /opt/VRTSvlic/bin directory.
(vii) The license keys are stored in the /etc/vx/licenses/lic directory.
(viii) We can see the licenses by executing the commands below,
# cat /etc/vx/licenses/lic/<key> or
# /opt/VRTSvlic/bin/vxlicrep | grep "License Key"
(ix) To see the features of a license key, use the # vxdctl license command.
Version :
(i) We are using the VxVM 6.2 version.
(ii) To know the version of VxVM, use the # rpm -qa VRTSvxvm command.
54. What formats are available to take control of disks from the O/S to Veritas in VxVM?
We can take control of disks from the O/S to Veritas in 3 formats:
(i) CDS (Cross-platform Data Sharing, the default format in VxVM).
(ii) Sliced.
(iii) Simple.
(i) CDS :
(a) We can share the data between different Unix flavours.
(b) Both the private and public regions are in the 7th partition.
(c) The entire space is in the 7th partition.
(d) So, there is a chance of losing the data, because if the disk fails, i.e., partition 7 is corrupted or damaged, then the data may be lost.
(e) This is the default in Veritas Volume Manager.
(ii) Sliced :
(a) It is always used for the root disk only.
(b) In this format we cannot share the data between different Unix flavours. Normally sliced is used for the root disk and CDS is used for data.
(c) The private region is in the 4th partition and the public region is in the 3rd partition.
(d) So, if the public region fails, we can recover the data using the private region, i.e., minimizing the data loss.
(iii) Simple :
(a) This format is not widely used now because it belongs to the old VxVM 3.5.
(b) In this format the private and public regions are both in the 3rd partition.
Specifying the format while setting up :
# vxdisksetup -i /dev/sda (to set up the disk; the default format is CDS)
# vxdisksetup -i /dev/sdb format=sliced (to specify the sliced or simple format)
55. In how many ways can we manage VxVM?
(i) The command line tools.
(ii) The GUI (VEA tool).
(iii) The # vxdiskadm command (it gives the options to manage the disks).