Red Hat Cluster
1. How can you define a cluster and what are its basic types?
A cluster is two or more computers (called nodes or members) that work together to perform a task. There are four major types of clusters:
Storage
High availability
Load balancing
High performance
2. What is a Storage Cluster?
Storage clusters provide a consistent file system image across servers
in a cluster, allowing the servers to simultaneously read and write to a single
shared file system.
A storage cluster simplifies storage administration by limiting the
installation and patching of applications to one file system.
The High Availability Add-On provides storage clustering in conjunction with Red Hat GFS2.
3. What is a High Availability Cluster?
High availability clusters provide highly available services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative.
Typically, services in a high availability cluster read and write data
(via read-write mounted file systems).
A high availability cluster must maintain data integrity as one
cluster node takes over control of a service from another cluster node.
Node failures in a high availability cluster are not visible from
clients outside the cluster.
High availability clusters are sometimes referred to as failover
clusters.
4. What is a Load Balancing Cluster?
Load-balancing clusters dispatch network service requests to multiple cluster nodes to balance the request
load among the cluster nodes.
Load balancing provides cost-effective scalability because you can
match the number of nodes according to load requirements. If a node in a
load-balancing cluster becomes inoperative, the load-balancing software detects
the failure and redirects requests to other cluster nodes.
Node failures in a load-balancing cluster are not visible from clients
outside the cluster.
Load balancing is available with the Load Balancer Add-On.
5. What is a High Performance Cluster?
High-performance clusters use cluster nodes to perform concurrent
calculations.
A high-performance cluster allows applications to work in parallel,
therefore enhancing the performance of the applications.
High performance clusters are also referred to as computational
clusters or grid computing.
6. How many nodes are supported in a Red Hat 6 Cluster?
A cluster configured with qdiskd supports a maximum of 16 nodes. The reason for the limit is scalability: increasing the node count increases the amount of synchronous I/O contention on the shared quorum disk device.
7. What is the minimum size of the Quorum Disk?
The minimum size of the block device is 10 Megabytes.
8. What is the order in which you will start the Red Hat Cluster services?
In Red Hat 4 :
# service ccsd start
# service cman start
# service fenced start
# service clvmd start (if CLVM has been used to create clustered volumes)
# service gfs start
# service rgmanager start
In Red Hat 5 :
# service cman start
# service clvmd start
# service gfs start
# service rgmanager start
In Red Hat 6 :
# service cman start
# service clvmd start
# service gfs2 start
# service rgmanager start
9. What is the order in which you will stop the Red Hat Cluster services?
In Red Hat 4 :
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service fenced stop
# service cman stop
# service ccsd stop
In Red Hat 5 :
# service rgmanager stop
# service gfs stop
# service clvmd stop
# service cman stop
In Red Hat 6 :
# service rgmanager stop
# service gfs2 stop
# service clvmd stop
# service cman stop
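To have these services come up in the correct order automatically at boot, the usual approach (a sketch, assuming the standard Red Hat 6 init scripts are installed) is to enable them with chkconfig; the init scripts themselves encode the proper start priorities:
# chkconfig cman on
# chkconfig clvmd on
# chkconfig gfs2 on
# chkconfig rgmanager on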
10. What are the performance enhancements in GFS2 as compared to GFS?
Better performance for heavy usage in a single directory
Faster synchronous I/O operations
Faster cached reads (no locking overhead)
Faster direct I/O with preallocated files (provided I/O size is
reasonably large, such as 4M blocks)
Faster I/O operations in general
Faster execution of the df command, because of faster statfs calls
Improved atime mode to reduce the number of write I/O operations
generated by atime when compared with GFS
GFS2 supports the following features:
extended file attributes (xattr)
the lsattr() and chattr() attribute settings via standard ioctl() calls
nanosecond timestamps
GFS2 uses less kernel memory.
GFS2 requires no metadata generation numbers.
Allocating GFS2 metadata does not require reads. Copies of metadata
blocks in multiple journals are managed by revoking blocks from the journal
before lock release.
GFS2 includes a much simpler log manager that knows nothing about
unlinked inodes or quota changes.
The gfs2_grow and gfs2_jadd commands use locking to prevent multiple instances running at the same
time.
The ACL code has been simplified for calls like creat() and mkdir().
Unlinked inodes, quota changes, and statfs changes are recovered
without remounting the journal.
11. What is the maximum supported file system size for GFS2?
GFS2 is based on a 64-bit architecture, which can theoretically accommodate an 8 EB file system.
However, the current supported maximum size of a GFS2 file system for
64-bit hardware is 100 TB.
The current supported maximum size of a GFS2 file system for 32-bit
hardware for Red Hat Enterprise Linux Release 5.3 and later is 16 TB.
NOTE: It is better to have ten 1 TB file systems than one 10 TB file system.
12. What is a journaling filesystem?
A journaling filesystem is a filesystem that maintains a special file
called a journal that is used to repair any inconsistencies that occur as the
result of an improper shutdown of a computer.
In journaling file systems, every time GFS2 writes metadata, the
metadata is committed to the journal before it is put into place.
This ensures that if the system crashes or loses power, you will recover
all of the metadata when the journal is automatically replayed at mount time.
GFS2 requires one journal for each node in the cluster that needs to
mount the file system. For example, if you have a 16-node cluster but need to
mount only the file system from two nodes, you need only two journals. If you
need to mount from a third node, you can always add a journal with the
gfs2_jadd command.
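For example, to add a journal for a third node (a sketch; /mygfs2 is a hypothetical mount point of an already-mounted GFS2 file system):
# gfs2_jadd -j 1 /mygfs2 (adds one more journal to the mounted file system)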
13. What is the default size of journals in GFS2?
When you run mkfs.gfs2 without specifying a journal size, a 128 MB journal is created by default, which is enough for most applications.
Reducing the size of the journal can severely affect performance. If you reduce the journal size to 32 MB, it does not take much file system activity to fill a 32 MB journal, and when the journal is full, performance slows because GFS2 has to wait for writes to the storage.
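The journal count and size are set when the file system is created. A hedged example (the device, cluster name and file system name are hypothetical):
# mkfs.gfs2 -p lock_dlm -t mycluster:mygfs2 -j 2 -J 128 /dev/vg01/lv01
(-p selects the locking protocol, -t sets the cluster:fsname lock table, -j creates 2 journals, -J sets the journal size in MB)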
14. What is a Quorum Disk?
Quorum Disk is a disk-based quorum daemon, qdiskd, that provides
supplemental heuristics to determine node fitness.
With heuristics you can determine factors that are important to the operation of the node in the event of a network partition.
For a 3-node cluster, quorum is maintained as long as 2 of the 3 nodes are active, i.e. more than half. But what if, for some reason, the 2nd node also stops communicating with the 3rd node? In that case, under a normal architecture, the cluster would dissolve and stop working. For mission-critical environments and such scenarios we use a quorum disk: an additional disk is configured which is mounted on all the nodes with the qdiskd service running, and a vote value is assigned to it.
Suppose in the above case I have assigned 1 vote to the quorum disk. Even after 2 nodes stop communicating with the 3rd node, the cluster still has 2 votes (1 from the quorum disk + 1 from the 3rd node), which is more than half of the vote count for a 3-node cluster. The two inactive nodes would then be fenced, and the 3rd node would still be up and running as a part of the cluster.
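A quorum disk is typically initialized with mkqdisk and then declared in /etc/cluster/cluster.conf; a minimal sketch (the device, label, and ping heuristic below are hypothetical):
# mkqdisk -c /dev/sdd1 -l myqdisk (writes the qdisk label onto the shared device)
<quorumd interval="1" tko="10" votes="1" label="myqdisk">
<heuristic program="ping -c1 192.168.1.254" score="1" interval="2"/>
</quorumd>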
15. What is rgmanager in Red Hat Cluster and its use?
This is a service termed Resource Group Manager.
RGManager manages and provides failover capabilities for collections of cluster resources called services, resource groups, or resource trees.
It allows administrators to define, configure, and monitor cluster services. In the event of a node failure, rgmanager will relocate the clustered service to another node with minimal service disruption.
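The state of rgmanager-controlled services can be checked with clustat, for example:
# clustat (shows the cluster members and the state and owner of each service)
# clustat -i 2 (refreshes the display every 2 seconds)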
16. What are luci and ricci in Red Hat Cluster?
luci is the server component of the Conga administration utility.
Conga is an integrated set of software components that provides centralized configuration and management of Red Hat clusters and storage.
luci is a server that runs on one computer and communicates with multiple clusters and computers via ricci.
ricci is the client component of the Conga administration utility.
ricci is an agent that runs on each computer (either a cluster member or a standalone computer) managed by Conga.
This service needs to be running on all the client nodes of the cluster.
17. What is cman in Red Hat Cluster?
This is an abbreviation used for Cluster Manager.
CMAN is a distributed cluster manager and runs on each cluster node.
It is responsible for monitoring, heartbeat, quorum, voting and
communication between cluster nodes.
CMAN keeps track of cluster quorum by monitoring the count of cluster
nodes.
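CMAN state can be inspected with cman_tool, for example:
# cman_tool status (shows the quorum state, expected votes and node counts)
# cman_tool nodes (lists the cluster nodes and their membership state)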
18. What are the different port numbers used in Red Hat Cluster?
IP Port No. | Protocol | Component
5404, 5405 | UDP | corosync/cman
11111 | TCP | ricci
21064 | TCP | dlm (Distributed Lock Manager)
16851 | TCP | modclusterd
8084 | TCP | luci
41966, 41967, 41968, 41969 | TCP | rgmanager
19. How does the NetworkManager service affect Red Hat Cluster?
The use of NetworkManager is not supported on cluster nodes. If you
have installed NetworkManager on your cluster nodes, you should either remove
it or disable it.
# service NetworkManager stop
# chkconfig NetworkManager off
The cman service will not start if NetworkManager is either running or has been configured to run with the chkconfig command.
20. What is the command used to relocate a service to another node?
# clusvcadm -r service_name -m node_name
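Other commonly used clusvcadm operations (service and node names are placeholders):
# clusvcadm -e service_name -m node_name (to enable/start a service on a given node)
# clusvcadm -d service_name (to disable/stop a service)
# clusvcadm -Z service_name (to freeze a service in its current state)
# clusvcadm -U service_name (to unfreeze a service)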
21. What is the split-brain condition in Red Hat Cluster?
We say a cluster has quorum if a majority of nodes are alive,
communicating, and agree on the active cluster members. For example, in a
thirteen-node cluster, quorum is only reached if seven or more nodes are
communicating. If the seventh node dies, the cluster loses quorum and can no
longer function.
A cluster must maintain quorum to prevent split-brain issues.
If quorum were not enforced, a communication error on that same thirteen-node cluster could cause a situation where six nodes are operating on the shared storage, while another six nodes are also operating on it, independently. Because of the communication error, the two partial clusters would overwrite areas of the disk and corrupt the file system.
With quorum rules enforced, only one of the partial clusters can use
the shared storage, thus protecting data integrity.
Quorum doesn't prevent split-brain situations, but it does decide who
is dominant and allowed to function in the cluster.
Quorum can be determined by a combination of communicating messages via Ethernet and through a quorum disk.
22. What are Tie-breakers in Red Hat Cluster?
Tie-breakers are additional heuristics that allow a cluster partition to decide whether or not it is quorate in the event of an even split, prior to fencing.
With such a tie-breaker, nodes not only monitor each other, but also an upstream router that is on the same path as cluster communications. If the two nodes lose contact with each other, the one that wins is the one that can still ping the upstream router. That is why, even when using tie-breakers, it is important to ensure that fencing is configured correctly.
CMAN has no internal tie-breakers for various reasons. However,
tie-breakers can be implemented using the API.
23. What is fencing in Red Hat Cluster?
Fencing is the disconnection of a node from the cluster's shared
storage.
Fencing cuts off I/O from shared storage, thus ensuring data
integrity.
The cluster infrastructure performs fencing through the fence daemon, fenced.
When CMAN determines that a node has failed, it communicates to other
cluster-infrastructure components that the node has failed.
fenced, when
notified of the failure, fences the failed node.
24. What are the various types of fencing supported by the High Availability Add-On?
Power fencing — A fencing method that uses a power controller to power off an inoperable node.
Storage fencing — A fencing method that disables the Fibre Channel port that connects storage to an inoperable node.
Other fencing — Several other fencing methods that disable I/O or power of an inoperable node, including IBM BladeCenters, PAP, DRAC/MC, HP iLO, IPMI, IBM RSA II, and others.
25. What are the lock states in Red Hat Cluster?
A lock state indicates the current status of a lock request. A lock is always in one of three states:
Granted — The lock request succeeded and
attained the requested mode.
Converting — A client attempted to change the lock
mode and the new mode is incompatible with an existing lock.
Blocked — The request for a new lock could not
be granted because conflicting locks exist.
A lock's state is
determined by its requested mode and the modes of the other locks on the same
resource.
26. What is the DLM lock model?
DLM is an abbreviation for Distributed Lock Manager.
A lock manager is a traffic cop that controls access to resources in the cluster, such as access to a GFS file system.
GFS2 uses locks from the lock manager to synchronize access to file system metadata (on shared storage).
CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups (also on shared storage).
In addition, rgmanager uses DLM to synchronize service states.
Without a lock manager, there would be no control over access to your shared storage, and the nodes in the cluster would corrupt each other's data.
Veritas Volume Manager and Veritas Cluster
1. What is the difference between Failing and Failed?
Failing :
Failing means the disk is going to fail. In a failing disk, the private region is available and the public region is not available, so we can recover the data using the private region.
Failed :
Failed means the disk has already failed. In a failed disk, both the private and public regions are unavailable, so we cannot recover the data. The only option is to replace the disk and restore the data from backup.
2. What are the daemons of Veritas Volume Manager?
(a) vxconfigd :
(i) This is the main daemon in Veritas Volume Manager.
(ii) It maintains the Volume Manager configuration information.
(iii) It always resides in the private region of the disk.
(iv) It communicates with the kernel and updates volume states in the configuration database.
(v) It always starts before mounting the root ( / ) file system.
(b) vxiod :
(i) This is used to maintain I/O (input and output) operations.
(ii) It also defines how many I/O operations can run at a time.
(c) vxrelocd :
(i) It always monitors the consistency of the disks and notifies the user of failures via the vxnotifyd daemon.
(ii) It also relocates data and recognizes the new disk.
(d) vxrecoverd :
(i) It passes the lost data onto the new disk.
(ii) It also notifies the Administrators via the vxnotifyd daemon.
(e) vxnotifyd :
(i) It notifies the user (Administrator) about failed disks, and also notifies the Administrator after recovery.
3. How to create the root mirror?
(i) Bring the disk from O/S control to Veritas Volume Manager control using the Veritas advanced management tool,
# vxdiskadm command
(it displays options for easy administration of Veritas Volume Manager).
(ii) Select the 2nd option, i.e., Encapsulation, to preserve the existing data present in the disk, and reboot the system for the encapsulation to take effect; this modifies the /etc/sysconfig file. While encapsulating, it asks for the disk name and disk group (root disk name and rootdg).
(iii) Backup the / (root) and /etc/sysconfig directories.
(iv) Take another disk and initialize it with the # vxdisksetup -i command.
(v) Add the above initialized disk to the disk group, i.e., rootdg, by
# vxdg -g <diskgroup> adddisk mirrordisk=<device>
(vi) # vxmirror -v -g <diskgroup> (disk-level mirroring)
(vii) For individual volume mirroring, use the
# vxassist -g <diskgroup> mirror <volume> or
# vxrootmirr -g <diskgroup> <disk> command.
4. What is the service group in Veritas Cluster?
A service group is made up of resources and their links, which we normally require to maintain high availability for the application.
5. What is the use of the ' halink ' command?
The # halink command is used to link the dependencies of the resources.
6. What are the differences between switchover and failover?
SwitchOver | FailOver
(i) Switchover is a manual task. | (i) Failover is an automatic task.
(ii) We can switch over service groups from an online cluster node to an offline cluster node in case of a power outage, hardware failure, scheduled shutdown or reboot. | (ii) Failover moves the service group to the other node when the Veritas Cluster heartbeat link is down, damaged or broken because of some disaster or a system hang.
7. Which is the main configuration file for VCS (Veritas Cluster) and where is it stored?
' main.cf ' is the main configuration file for VCS and it is located in the /etc/VRTSvcs/conf/config directory.
8. What are the public region and private region?
When we bring a disk from O/S control to Volume Manager control in any format (either CDS, simple or sliced), the disk is logically divided into two parts:
(a) Private region :
It contains Veritas configuration information like disk type and name, disk group name, group ID and configdb. The default size is 2048 KB.
(b) Public region :
It contains the actual user data like applications, databases and others.
9. There are five disks on VxVM (Veritas Volume Manager) and all have failed. What steps do you follow to bring those disks online?
(i) Check the list of disks under Volume Manager control with the # vxdisk list command.
(ii) If the above disks are not present, then bring them from O/S control to VxVM control with
# vxdisksetup -i <device> (if there is no data on those disks), or execute the
# vxdiskadm command and select the 2nd option, i.e., the encapsulation method, if the disks contain data.
(iii) If that is still not possible, check whether the disks are available at the O/S level with the # fdisk -l command.
(a) If the disks are available, execute the above command once again.
(b) If the disks are not available, then recognize them by scanning the hardware.
(iv) If it is still not possible, then reboot the system and follow steps (i) and (ii).
10. What is the basic difference between a private disk group and a shared disk group?
Private disk group :
The disk group is only visible to the host on which we have created it. If the host is part of a cluster, the private disk group will not be visible to the other cluster nodes.
Shared disk group :
The disk group is sharable and visible to the other cluster nodes.
11. How will you create a private disk group and a shared disk group?
# vxdg init <diskgroup> <diskname>=<device> (to create a private disk group)
# vxdg -s init <diskgroup> <diskname>=<device> (to create a shared disk group)
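For example (the disk group, disk and device names are hypothetical):
# vxdg init appsdg disk01=/dev/sdb (creates a private disk group appsdg)
# vxdg -s init shareddg disk01=/dev/sdc (creates a shared disk group; the cluster functionality must be active)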
12. How will you add a new disk to an existing disk group?
We can do this in two ways.
(i) Run the # vxdiskadm command, which opens a menu-driven program to do various disk operations. Select the add disk option and give the disk group name and disk name.
(ii) # vxdg -g <diskgroup> adddisk <diskname>=<device>
Example: # vxdg -g appsdg adddisk disk02=/dev/sdb
13. How will you grow or shrink the volume/file system? What is the meaning of the grow by, grow to, shrink by and shrink to options?
(i) We can grow the volume/file system by,
# vxassist -g appsdg growby appsvol 100GB (or)
# vxassist -g appsdg growto appsvol 100GB (or)
# vxresize -g appsdg appsvol +100GB alloc=<disk>
(ii) We can shrink the volume/file system by,
# vxassist -g appsdg shrinkby appsvol 20GB
# vxassist -g appsdg shrinkto appsvol 20GB (or)
# vxresize -g appsdg appsvol -10GB (to shrink by the size 10GB)
# vxresize -g appsdg appsvol 10GB (to shrink to the size 10GB)
Meanings :
growby :
This is used to grow the file system by adding the new size to the existing file system size.
growto :
This is used to grow the file system up to the specified new size. The new size is not added to the existing one.
shrinkby :
This is used to shrink the file system by reducing the existing file system size by the new size.
shrinkto :
This is used to shrink the file system down to the specified new size. The new size is not subtracted from the existing one.
14. If the vxdisk list command gives you the disk status as " error ", what steps do you follow to bring the respective disk online?
This issue is mainly because of fabric disconnection. So, execute the
# vxdisk scandisks command. Otherwise, unsetup the disks using
# /etc/vx/bin/vxdiskunsetup and set up the disks again using the
# /etc/vx/bin/vxdisksetup command.
Note : /etc/vx/bin/vxdiskunsetup will remove the private region from the disk and destroy the data. So, backup the data before using this command.
15. Which are the different layouts for VxVM?
(i) mirror (ii) stripe (default)
(iii) concat (iv) raid5
(v) stripe-mirror (vi) mirror-stripe
16. How will you set up and unsetup disks explicitly using VxVM?
# /etc/vx/bin/vxdisksetup -i <device> (to set up the disks)
# /etc/vx/bin/vxdiskunsetup <disk> (to unsetup the disks)
17. How will you list the disks which are in different disk groups?
# vxdisk list or # vxprint (to list from the current disk group or an imported disk group)
# vxdisk -o alldgs list (to list all the disks which are in different disk groups)
18. Define LLT and GAB. What are the commands to manage them?
LLT :
(i) LLT means Low Latency Transport protocol.
(ii) It monitors kernel-to-kernel communication.
(iii) It maintains and distributes the network traffic within the cluster.
(iv) It uses heartbeats between the interfaces.
GAB :
(i) GAB means Group Membership Services/Atomic Broadcast.
(ii) It maintains and distributes the configuration information of the cluster.
(iii) It uses heartbeats between the disks.
Commands :
# gabconfig -a (to check the status of GAB, i.e., whether GAB is running or not)
If port ' a ' is listening, GAB is running; otherwise GAB is not running.
If port ' b ' is listening, I/O fencing is enabled; otherwise I/O fencing is disabled.
If port ' h ' is listening, the had daemon is working; otherwise the had daemon is not working.
# gabconfig -c -n 2 (to start GAB with 2 systems in the cluster, where 2 is the seed number)
# gabconfig -U (to stop GAB)
# cat /etc/gabtab (to see the GAB configuration information; it contains)
gabconfig -c -n x (where x is a number, i.e., 1, 2, 3, ...etc.)
# lltconfig (to see the status of LLT)
# lltconfig -c (to start LLT)
# lltconfig -U (to stop LLT)
# lltstat -nvv (to see the traffic status between the interfaces)
# haclus -display (to see all the information on the cluster)
# cat /etc/llttab (to see the LLT configuration; its entries include the cluster ID, host ID, interface MAC address, ...etc.)
# cat /etc/llthosts (to see the number of nodes present in the cluster)
19. How to check the status of the Veritas Cluster?
# hastatus -summary
20. Which command is used to check the syntax of the main.cf?
# hacf -verify /etc/VRTSvcs/conf/config
21. How will you check the status of the individual resources of Veritas Cluster (VCS)?
# hares -state
22. What is the use of the # hagrp command?
The # hagrp command is used for administrative actions on service groups, like bringing a service group online or offline, switching it, ...etc.
23. How to switch over the service group?
# hagrp -switch <service_group> -to <system>
24. How to online the service group in VCS?
# hagrp -online <service_group> -sys <system>
25. What are the steps to follow to switch over the application from System A to System B?
(i) First unmount the file system on System A.
(ii) Stop the volume on System A.
(iii) Deport the disk group from System A.
(iv) Import the disk group on System B.
(v) Start the volume on System B.
(vi) Finally mount the file system on System B.
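A minimal command-level sketch of these steps, assuming a disk group appsdg, volume appsvol, and mount point /apps (all hypothetical names):
On System A:
# umount /apps
# vxvol -g appsdg stop appsvol
# vxdg deport appsdg
On System B:
# vxdg import appsdg
# vxvol -g appsdg start appsvol
# mount -F vxfs /dev/vx/dsk/appsdg/appsvol /apps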
26. How many types of clusters are available?
(i) Hybrid Cluster.
(ii) Parallel Cluster.
(iii) Failover Cluster.
27. What is meant by seeding?
Normally, we define how many nodes must be up to start a cluster, either while booting or explicitly by executing the # gabconfig -c -n 2 command. Here 2 means 2 seeds are required to start the cluster. This number is called seeding.
28. What is the Split brain issue in VCS and how to resolve it?
A split brain issue means multiple systems use the same exclusive resources, usually resulting in data corruption.
Normally VCS is configured with multiple nodes which communicate with each other. On a power loss or system crash, VCS assumes the system has failed and tries to move the service group to another system to maintain high availability. However, communication (heartbeat) can also fail due to network failures.
If the network traffic (connection) between any two groups of systems fails simultaneously, a network partition occurs. When this happens, systems on both sides of the partition can restart the applications from the other side, i.e., resulting in duplicate services. This is the most serious problem caused by a partition, and it affects the data on the shared disks.
This split brain issue normally occurs in VCS 3.5 to VCS 4.0 versions. From VCS 5.0 onwards, I/O fencing (a new feature) is introduced to minimize the split brain issue. If I/O fencing is enabled in a cluster, then we can avoid the split brain issue.
29. What are Admin wait and Stale Admin wait?
ADMIN-WAIT :
If VCS is started on a system with a valid configuration file and other systems are in the ADMIN-WAIT state, the new system transitions to the ADMIN-WAIT state,
(or)
if VCS is started on a system with a stale configuration file and other systems are in the ADMIN-WAIT state, the new system transitions to the ADMIN-WAIT state.
STALE-ADMIN-WAIT :
The configuration files are in read-only mode; to make changes, the file must be made read-write. If any changes occur to the ' main.cf ' file in the cluster, the changes are tracked in a ' .stale ' hidden file under the configuration directory. If the system is restarted or rebooted while changes are being made, the cluster will start with the ' .stale ' file.
So, when VCS is started on a system with a stale configuration file, the system status will be STALE-ADMIN-WAIT until another system in the cluster starts with a valid configuration file, or until you execute
# hasys -force <system>
to start the system forcefully with the correct or valid configuration file.
30. What is meant by a resource and how many types are there?
A resource is a software or hardware component managed by VCS.
Mount points, disk groups, volumes, IP addresses, ...etc., are the software components.
Disks, interfaces (NIC cards), ...etc., are the hardware components.
There are two types of resources:
(i) Non-persistent resources (we can bring them online and take them offline).
(ii) Persistent resources (we cannot bring them online or offline; VCS only monitors them).
If a resource is in a faulted state, then clear the fault and the service group state. Resources can be critical or non-critical. If a critical resource fails, the service group automatically fails over. If a non-critical resource fails, the service group does not fail over automatically, and we have to manually switch the service group over to another available system.
31. What are the dependencies between resources in a Cluster?
If one resource depends on another resource, then there is a dependency between those resources.
Example : A NIC (Network Interface Card) is a hardware component, i.e., a hardware resource. An IP address is a software component, i.e., a software resource, and it depends on the NIC card. The relationship between the NIC and the IP address is a Parent-Child relationship: the dependent resource (the IP address) is the parent, and the resource it depends on (the NIC) is the child. A child resource is brought online before its parent.
32. What are the minimum requirements for VCS?
(i) Minimum two identical (same configuration) systems.
(ii) Two switches (Optical Fibre Channel).
(iii) Minimum three NIC cards (two NICs for the private network and one NIC for the public network).
(iv) One common storage.
(v) Veritas Volume Manager with license.
(vi) Veritas Cluster with license.
33. What are the Veritas Cluster daemons?
(i) had :
(a) It is the main daemon in Veritas Cluster for high availability.
(b) It monitors the cluster configuration and the whole cluster environment.
(c) It interacts with all the agents and resources.
(ii) hashadow :
(a) It always monitors the had daemon and restarts it if it fails.
(b) Its main functionality also includes logging about the cluster.
35. What are the main configuration files in a Cluster?
* /etc/VRTSvcs/conf/config/main.cf and
* /etc/VRTSvcs/conf/config/types.cf
are the main configuration files in a Cluster.
36. What are the main log
files in a Cluster?
(i) /var/VRTSvcs/log/Engine_A.log (logging about when the
cluster started, when failed, when
failover occurs, when switchover forcefully, ...etc.,)
(ii)
/var/VRTSvcs/log/hashadow_A.log (logging about
the hashadow deamon)
(iii)
/var/VRTSvcs/log/agent_A.log (logging bout
agents)
37. What are the Cluster components?
(i) Cluster.
(ii) Service groups.
(iii) Resources.
(iv)
Agents.
(v) Events.
38. What is your role in the Cluster?
Normally
we will get requests like,
(i) Add a node.
(ii) Add a resource.
(iii)
Add a service group.
(iv) Add a resource to the existing
service group.
(v) Add mount points.
And sometimes we get some troubleshooting issues like,
(i) had daemon is not running.
(ii) Split brain issue.
(iii) If the resources are faulted, then restart the service groups and move the service groups from one node to another.
(iv) Cluster is
not running.
(v) Communication failed between two
nodes.
(vi) GAB
and LLT are not running.
(vii)
Resource not started.
(viii)
main.cf and types.cf
files corrupted.
(ix) I/O
fencing (a locking mechanism to
avoid the split brain issue)
is not enabled (at disk level /
SAN level).
(x) And the locks are,
(a) engine.lock
(b) ha.lock
(c) agent.lock
39. What are the statuses of a service group?
(i) online
(ii) offline
(iii) partial
* If a non-critical resource fails, the status of the service group may be partial.
* If a critical resource fails, the status of the service group may be offline.
40. How to move the service group from one node to another node manually?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv)
Stop the volume.
(v) Deport the disk group.
(vi) Import the disk group.
(vii)
Start the volume.
(viii)
Mount the file system.
(ix) Start the database.
(x) Start the application.
41. How to rename a disk group in VxVM, stepwise?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Rename the disk group.
(vii) Import the disk group.
(viii) Start the volume.
(ix) Mount the file system.
(x) Start the database.
(xi) Start the application.
42. How to create a volume with 4 disks?
(i) Bring the disks under O/S control by scanning the LUNs using the following command,
# echo "- - -" > /sys/class/scsi_host/<host no.>/scan (to scan for new LUNs)
(ii) Bring those disks from O/S control to VxVM control.
(a) If we want to preserve the data, then bring the disks to VxVM control using the encapsulation method by
# vxdiskadm (here we get the options to do this; select the 2nd option, i.e., Encapsulation)
(b) If we don't want to preserve the data, then bring the disks to VxVM control using the initialization method:
# vxdisksetup -i /dev/sda
# vxdisksetup -i /dev/sdb
# vxdisksetup -i /dev/sdc
# vxdisksetup -i /dev/sdd
# vxdisk list (to see the VxVM controlled disks)
(iii) Create a disk group.
# vxdg init appsdg disk01=/dev/sda (for example, with the disk group name appsdg)
(iv) Add the remaining three disks to the above disk group.
# vxdg -g appsdg adddisk disk02=/dev/sdb
# vxdg -g appsdg adddisk disk03=/dev/sdc
# vxdg -g appsdg adddisk disk04=/dev/sdd
# vxdisk -g appsdg list (to see all the disks belonging to that disk group, for example appsdg)
(v) Create the volume (with the requested size and requested layout).
# vxassist -g appsdg make appsvol <size> (for example, volume name appsvol and size in TB/GB ...etc.)
(vi) Create a file system on that volume.
# mkfs -F vxfs /dev/vx/rdsk/appsdg/appsvol
(vii) Create the mount point and give the requested permissions to that mount point.
# mkdir /mnt/apps
(viii) Start the volume.
# vxvol -g appsdg start appsvol
(ix) Mount the file system on the above mount point.
# mount -F vxfs -o rw /dev/vx/dsk/appsdg/appsvol /mnt/apps
(where rw means read-write and ro means read-only)
(x) Put the entry into the "/etc/fstab" file for a permanent mount.
* If the volume is created for a cluster, don't put the entry in the /etc/fstab file.
(xi) And finally send the mail to the client or requesting person.
43. What is the difference between a Global Cluster and a Local Cluster? Have you configured a Global Cluster?
Local Cluster :
If all the nodes in a Cluster are placed in the same location, that Cluster is called a Local Cluster.
Global Cluster :
If the nodes in a Cluster are placed in different geographical locations, that Cluster is called a Global Cluster. The main advantage of a global cluster is high availability when natural calamities or disasters occur.
* No, I haven't configured a Global Cluster.
44. How to start and stop the Cluster?
# hastart (to start VCS on the local node in the Cluster; run it on each node to start all the nodes)
# hasys -force <system> (to forcefully start the system in the Cluster)
# hastop -local (to stop VCS on the local node in the Cluster)
# hastop -all (to stop all the systems in the Cluster)
# hastop -sys <system> (to stop the specified system or node in the Cluster)
45. What is the Service group and Resource?
Service group :
(i) A collection or group of physical and logical resources is called the Service group.
(ii) Moving a service group from one system to another system means moving its resources from one system to another system.
Resources :
(i) A resource is a software or hardware component: disk groups, volumes, IP addresses and mount points are software resources, and disks and NIC cards are hardware resources.
(ii) The value of a resource property is known as an Attribute.
Example : (a) The system list is an attribute listing System A and System B.
(b) Auto start is an attribute of the system.
Resource | Attribute | Value
NIC | IP address | 192.168.1.1
Diskgroup | diskgroup name | appsdg
Disk | disk name | disk01
Interface | interface name | eth0
(iii) There are two types of resources.
(a) Persistent Resource :
Those resources which we cannot start or stop are called persistent resources.
Example : We cannot start or stop the NIC card.
(b) Non-Persistent Resource :
Those resources which we can start/stop are called non-persistent resources.
(iv) Resources may be critical or non-critical. We need to design the resources as critical or non-critical, i.e., the customer will insist on which is critical and which is non-critical.
(v) If a critical resource fails, then the service group is moved automatically from one system to another system, i.e., failover; if a non-critical resource fails, then we need to do a manual movement of the service group from one system to another system, i.e., switchover.
46. What are the steps you follow to put the volume in a Cluster?
(i) First create the disk group, volume and file system, and mount and unmount it before putting the volume in a cluster, to test whether that volume is working or not.
(ii) Create the service group and add the attributes to it.
# hagrp -add <service_group>
Example: # hagrp -add appssg
Attributes :
# hagrp -modify appssg SystemList sysA 0 sysB 1 (to add the sysA and sysB system list attribute to the service group)
# hagrp -modify appssg AutoStartList sysA (to start the service group on sysA automatically)
# hagrp -modify appssg Enabled 1 (1 means enabled, 0 means disabled)
(iii) Create resources, add them to the service group and specify their attributes.
For a file system :
(a) /mnt/apps (the mount point)
(b) appsvol (the volume name)
(c) appsdg (the disk group)
# hares -add dg-apps DiskGroup appssg (to add the disk group resource to the service group)
(where dg-apps is the resource name, DiskGroup is the resource type and appssg is the service group name)
# hares -modify dg-apps DiskGroup appsdg (to set the disk group attribute)
# hares -modify dg-apps Enabled 1 (to enable the resource)
# hares -add dg-volume Volume appssg (to add the volume resource to the service group)
# hares -modify dg-volume Volume appsvol (to set the volume attribute)
# hares -modify dg-volume DiskGroup appsdg (to associate the disk group with the volume)
# hares -modify dg-volume Enabled 1 (to enable the volume resource)
# hares -modify dg-volume Critical 1 (to make the resource critical)
# hares -add dg-mnt Mount appssg (to add the mount point resource to the service group)
# hares -modify dg-mnt BlockDevice /dev/vx/dsk/appsdg/appsvol (to set the block device attribute)
# hares -modify dg-mnt FSType vxfs (to set the file system type attribute)
# hares -modify dg-mnt MountPoint /mnt/apps (to set the mount point directory attribute)
# hares -modify dg-mnt FsckOpt %-y (to set the fsck option, either %-y or %-n)
(iv) Create links between the above disk group, volume and mount point resources.
# hares -link <parent_res> <child_res>
# hares -link dg-volume dg-apps (the volume depends on the disk group)
# hares -link dg-mnt dg-volume (the mount point depends on the volume)
47. What is meant by freezing and unfreezing a service group, with the persistent and evacuate options?
Freezing :
If we want to apply patches to a system in a cluster, we have to freeze the service group, because if we simply stop the service group and it is critical, the service group will move automatically to another system in the Cluster. So, when we don't want the service group to move from one system to another system, we freeze the service group.
Unfreeze :
After completing the task, the service group should be unfrozen, because if the system crashes or goes down while the service group is frozen and the resources are critical, the service group cannot move from system 1 to system 2, and the application becomes unavailable. If the service group is unfrozen after maintenance, the service group can move from system 1 to system 2. So, if system 1 fails, system 2 is available and the application is also available.
Persistent option :
If the service group is frozen with the persistent option, then we can stop, shut down, or restart the system; nothing is lost, and after the system restarts, the service group remains in the frozen state.
Example : # hasys -freeze -persistent <system>
# hasys -unfreeze -persistent <system>
Evacuate :
If this option is used when freezing a system, then before the freeze all the service groups are moved (evacuated) from system 1 to another system 2.
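Common freeze/unfreeze forms (group and system names are placeholders; a persistent freeze requires the configuration to be writable):
# haconf -makerw (open the configuration in read-write mode)
# hagrp -freeze appssg -persistent (freeze a service group persistently)
# hagrp -unfreeze appssg -persistent (unfreeze it)
# hasys -freeze -persistent -evacuate sysA (evacuate the service groups, then freeze the system)
# haconf -dump -makero (save the configuration and make it read-only again)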
48. What layouts are available in VxVM, how do they work, and how are they configured?
(i) There are 5 layouts available in VxVM. They are RAID-0, RAID-1, RAID-5, RAID-0+1 and RAID-1+0.
RAID-0 :
We can configure RAID-0 in two ways.
(a) Striped (default).
(b) Concatenation.
Striped :
(i) A minimum of two disks is required to configure it.
(ii) The data is written to both disks in parallel, i.e., one line on one disk and the 2nd line on the 2nd disk, ...etc.
(iii) The data writing speed is fast.
(iv) There is no redundancy for the data.
Concatenation :
(i) A minimum of one disk is required to configure it.
(ii) The data is written to the first disk, and after the first disk fills up, it is written to the 2nd disk.
(iii) The data writing speed is lower.
(iv) In this also there is no redundancy for the data.
RAID-1 :
(i) It is nothing but mirroring.
(ii) A minimum of 2 disks is required to configure it (the example below uses 4).
(iii) With 4 disks, the same data is written on disk 1 and disk 3, and on disk 2 and disk 4.
(iv) If disk 1 fails, then we can recover the data from disk 3, and if disk 2 fails, then we can recover the data from disk 4. So there is no data loss, or we can minimize the data loss.
(v) In this, half of the disk space may be wasted.
RAID-5 :
(i) It is nothing but striping with distributed parity.
(ii) A minimum of 3 disks is required to configure it.
(iii) One line is written on disk 1, the 2nd line on disk 2, and the parity is written on disk 3; the parity is distributed across the disks. If disk 1 fails, then we can recover the data from disk 2 and the parity on disk 3. So the data is more secure.
(iv) Disk utilization is better when compared to RAID-1, i.e., only 1/3rd of the disk space may be consumed by parity.
(v) RAID-5 is configured for critical applications like banking, financial, SAP and insurance applications, ...etc., because the data must be more secure.
Creating a volume with a layout :
# vxassist -g <diskgroup> make <volume> <size> layout=<layout>
Example : # vxassist -g appsdg make appsvol 50GB layout=raid5
Logs :
* If the layout is mirror, then the log is a DRL (Dirty Region Log).
* If the layout is RAID-5, then the log is a RAID-5 log.
* The main purpose of the log is fast recovery operation.
* We have to specify whether the log is required or not in all types of layouts except RAID-5, because logging is the default in RAID-5.
* If we want to configure RAID-5 without logging, then
# vxassist -g <diskgroup> make <volume> 50GB layout=raid5,nolog
* If the layout is other than RAID-5, then
# vxassist -g <diskgroup> make <volume> 50GB layout=mirror,log
* If we want to add a log to an existing volume, then
# vxassist -g <diskgroup> addlog <volume> logtype=drl (or logtype=raid5)
* If we want to remove the log from an existing volume, then
# vxassist -g <diskgroup> remove log <volume>
49. What is a read policy and how many types of read policies are available?
Read policy means how the disk or volume plexes should be read when accessing the data.
Types of read policies :
(i) Select
(ii) Prefer
(iii) Round Robin
* By default the read policy is Select.
# vxvol -g <diskgroup> rdpol <select|prefer|round> <volume>
50. What is your role in VxVM?
Normally,
we get requests from application, development, production and
QA people like,
(i) Create a
volume.
(ii) Increase the volume.
(iii) Decrease the volume.
(iv) Provide Redundancy by
implementing RAID-1 or
RAID-5.
(v) Provide the required permissions.
(vi)
Put the volume in the Virtual machine.
(vii)
Put the volume in the Cluster.
(viii)
Provide high availability to the applications
and databases.
(ix)
Sometimes destroy or remove the volume.
(x) Backup and
restore the data whenever
necessary.
And
sometimes we get some troubleshooting issues
like,
(i) Volume is not started.
(ii) Volume is not accessible.
(iii)
Mount point deleted.
(iv) File system crashed.
(v) One disk failed in a volume.
(vi)
Volume manager deamons are not running.
(vii)
Volume manager configuration files missed or
deleted.
(viii)
VxVM licensing issues.
(ix) Diskgroup not
deporting and not importing.
(x) Volume is started, daemons are running, but users cannot access the data.
(xi) Disks are not detected.
(xii) Hardware and software errors.
51. What is meant by snap backup and how to take a snap backup?
(i) Snap backup means taking a backup using snapshots.
(ii) On 24x7/365 running servers we normally take snap backups, i.e., where no downtime is allowed.
(iii) The volumes used on such servers are called BCVs (Business Continuity Volumes).
Backup :
(i) First stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Import the disk group.
(vii) Join the snap disk group.
(viii) Sync the data.
(ix) Take the backup.
(x) Split the snap disk group.
(xi) Deport the disk group.
(xii) Import the disk group.
(xiii) Start the volume.
(xiv) Mount the file system.
(xv) Start the database.
(xvi) Start the application.
52. What are the steps you follow to rename a disk group?
(i) Stop the application.
(ii) Stop the database.
(iii) Unmount the file system.
(iv) Stop the volume.
(v) Deport the disk group.
(vi) Rename and import the disk group with the
# vxdg -n <new_name> import <old_name> command.
(vii) Start the volume.
(viii) Mount the file system.
(ix) Start the database.
(x) Start the application.
53. How to install VxVM? What version of Veritas are you using, and how do you find the Veritas version?
(i) Install the Veritas-supplied packages using the # rpm or # yum commands.
(ii) Execute the # vxinstall command to install VxVM, i.e., enable the system to use the Volume Manager.
(iii) # vxinstall will allow us to encapsulate or not encapsulate the root disk.
(iv) Always use option 2, i.e., Custom installation, because if option 1 (Quick installation) is used, it takes all the disks for rootdg.
License :
(i) All the licenses are stored in the /etc/vx/licenses directory, and we can take a backup of this directory and restore it back if we need to reinstall the server.
(ii) Removing the VxVM package will not remove the installed license.
(iii) To install a license, the # vxlicinst command is used.
(iv) To see the VxVM license information, use the # vxlicrep command.
(v) To remove the VxVM licenses, use the # vxkeyless set NONE command.
(vi) The license utilities are installed in the /opt/VRTSvlic/bin directory.
(vii) The license keys are stored in the /etc/vx/licenses/lic directory.
(viii) We can see the licenses by executing the commands below,
# cat /etc/vx/licenses/lic/<key> or
# /opt/VRTSvlic/bin/vxlicrep | grep "License Key"
(ix) To see the features of a license key, use the # vxdctl license command.
Version :
(i) We are using the VxVM 6.2 version.
(ii) To know the version of VxVM, use the # rpm -qa VRTSvxvm command.
54. What formats are available to take control of disks from the O/S to Veritas in VxVM?
We can take control of disks from the O/S to Veritas in 3 formats:
(i) CDS (Cross-platform Data Sharing, the default format in VxVM).
(ii) Sliced.
(iii) Simple.
(i) CDS :
(a) We can share the data between different Unix flavours.
(b) Both the private and public regions are in the 7th partition.
(c) The entire space is in the 7th partition.
(d) So, there is a chance of losing the data, because if the disk fails, i.e., partition 7 is corrupted or damaged, then the data may be lost.
(e) This is the default in Veritas Volume Manager.
(ii) Sliced :
(a) It is always used for the root disk only.
(b) In this format we cannot share the data between different Unix flavours. Normally sliced is used for the root disk and CDS is used for data.
(c) The private region is in the 4th partition and the public region is in the 3rd partition.
(d) So, if the public region fails, we can recover the data using the private region, i.e., minimizing the data loss.
(iii) Simple :
(a) This format is not widely used now because it belongs to the old VxVM 3.5.
(b) In this format the private and public regions are both in the 3rd partition.
Specifying the format while setting up :
# vxdisksetup -i /dev/sda (to set up the disk; the default format is CDS)
# vxdisksetup -i /dev/sdb format=sliced (to specify the sliced or simple format)
55. In how many ways can we manage VxVM?
(i) The command line tools.
(ii) The GUI (VEA tool).
(iii) The # vxdiskadm command (it gives the options to manage the disks).