Tuesday, September 28, 2010

Beware of making changes when someone is monitoring you!!!

Making changes to a system is a daily activity in a sysadmin's life. It cannot be avoided. There are plenty of tools that help you make a change easily and without mistakes, but you still need to be careful and stay aware of the tools that monitor and help you. These tools are there to help, but if you ignore them, they might turn against you.

Always check whether any monitoring process is running before making changes to a system. Be extra careful if there is an application like VCS, which not only monitors but also takes corrective action when it notices something abnormal.

Consequences of overlooking a tool like VCS:

VCS normally defines dependencies between resources.
For example, an application depends on a file system, and the file system in turn depends on a disk group. An IP can also depend on a file system, and a whole application may in turn depend on that IP.
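
To make the chain concrete, here is a minimal sketch of such a dependency map in Python. The resource names and the `all_requirements` helper are hypothetical, made up for illustration; this is not how VCS stores its configuration.

```python
# Hypothetical resource names; each key (parent) lists the children it requires.
REQUIRES = {
    "app": ["ip"],
    "ip": ["filesystem"],
    "filesystem": ["diskgroup"],
    "diskgroup": [],
}


def all_requirements(resource, requires=REQUIRES):
    """Return every resource that must be online before `resource` can run."""
    needed = []
    stack = list(requires.get(resource, []))
    while stack:
        child = stack.pop()
        if child not in needed:
            needed.append(child)
            # Follow the chain further down (file system -> disk group, ...).
            stack.extend(requires.get(child, []))
    return needed


print(all_requirements("app"))
```

Even in this toy model, the application at the top quietly relies on everything below it, which is why touching a lower-level component is never a purely local change.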

Changing any of these components without the knowledge of VCS may lead to some bad things, such as unplanned downtime!!!

Let me explain a little about VCS first so that the scenario below is clearer.

Veritas Cluster Server (VCS) is a high-availability solution from Symantec. It monitors resources (file systems, disk groups, applications, IPs, horc, etc.) and can fail them over when one system fails, which keeps the application available. It is one of the best ways to minimize application downtime.

In practice, resources are configured in VCS and monitoring is enabled for them. Dependencies are also specified, so that a particular resource cannot be online without the other resources it needs. VCS has an agent for each resource type, which monitors the status of the associated resources and can take appropriate actions, such as bringing them online or offline, according to the rules specified.

The Scenario:
There is a file system /opt/MyApps/Billlogs on a machine named Server1. The file system is configured in VCS with some dependencies. ip-MyApps2, which is an IP resource of MyApps2, depends on this file system.


fsSubApp1   SubApp2
   |        |
   |        |
   ip-MyApps2                         (parent)
         |
         |
         |
 /opt/MyApps/Billlogs                 (child)
         |
         |
    DG-MyApps


In the above configuration, the file system /opt/MyApps/Billlogs is required by every resource that depends on it, directly or indirectly: ip-MyApps2, fsSubApp1 and SubApp2. If /opt/MyApps/Billlogs is unmounted, it causes a cascading effect that pulls down the resources that depend on it.
VCS continually monitors every type of resource. If a particular resource is taken offline, the dependent resources are dealt with accordingly. So if VCS is running and the resources are monitored by it, taking a resource offline or online without the knowledge of VCS might lead to unforeseen impacts. For example, if /opt/MyApps/Billlogs is unmounted directly from the operating system, VCS thinks something has gone wrong and will try to take the dependent resources offline.
Now suppose the dependency was wrongly specified during configuration, i.e. ip-MyApps2 does not in fact need /opt/MyApps/Billlogs but the dependency was declared anyway. Unmounting the file system then leads to downtime of ip-MyApps2 that nobody expected, purely because of the declared dependency. The mistake is twofold: the dependency was configured improperly, and the resource was taken offline without consulting VCS.
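
The cascade can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: the graph is copied from the diagram above, the function names `affected_by_offline` and `unlink` are hypothetical, and real VCS behaviour also depends on attributes and agent settings that are ignored here.

```python
# parent -> set of children it depends on, taken from the diagram above.
DEPENDS_ON = {
    "fsSubApp1": {"ip-MyApps2"},
    "SubApp2": {"ip-MyApps2"},
    "ip-MyApps2": {"/opt/MyApps/Billlogs"},
    "/opt/MyApps/Billlogs": {"DG-MyApps"},
    "DG-MyApps": set(),
}


def affected_by_offline(resource, depends_on):
    """Return the resources that get pulled down when `resource` goes offline."""
    down = set()
    frontier = [resource]
    while frontier:
        current = frontier.pop()
        for parent, children in depends_on.items():
            if current in children and parent not in down:
                down.add(parent)
                frontier.append(parent)  # its own parents are affected too
    return down


def unlink(parent, child, depends_on):
    """Return a copy of the graph without the parent -> child dependency."""
    updated = {name: set(children) for name, children in depends_on.items()}
    updated[parent].discard(child)
    return updated


# Unmounting the file system with the (wrong) dependency still in place:
print(sorted(affected_by_offline("/opt/MyApps/Billlogs", DEPENDS_ON)))

# The same unmount after the unnecessary dependency has been removed:
fixed = unlink("ip-MyApps2", "/opt/MyApps/Billlogs", DEPENDS_ON)
print(sorted(affected_by_offline("/opt/MyApps/Billlogs", fixed)))
```

With the dependency in place, the unmount reaches ip-MyApps2 and everything above it; once the link is gone, nothing else is touched.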

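So how to do it?
The safe way to remove a resource without affecting the other running applications that depend on it is to unlink them first, so that no dependency remains between the resources, and only then take offline the resource that is no longer required.

hagrp -unlink parent_group child_group

Note that hagrp -unlink removes a dependency between service groups. For a dependency between individual resources, as in the scenario above, the equivalent command is hares -unlink parent_resource child_resource.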
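
For a resource-level dependency like the one in the scenario, the whole sequence might look like the sketch below. It only builds and prints the commands and executes nothing. The helper name `safe_removal_commands` and the resource name `fs-Billlogs` are hypothetical (the post does not give the VCS name of the file system resource), and the exact `hares` and `haconf` options should be checked against the documentation for your VCS version before use.

```python
def safe_removal_commands(parent, child, system):
    """Build the command sequence as strings; nothing is executed here."""
    return [
        f"hares -dep {child}",                    # review the existing dependencies first
        "haconf -makerw",                         # open the cluster configuration for writing
        f"hares -unlink {parent} {child}",        # remove the parent -> child dependency
        "haconf -dump -makero",                   # save the configuration, back to read-only
        f"hares -offline {child} -sys {system}",  # take the resource offline through VCS
    ]


# fs-Billlogs is an assumed resource name for /opt/MyApps/Billlogs.
for command in safe_removal_commands("ip-MyApps2", "fs-Billlogs", "Server1"):
    print(command)
```

The point of the ordering is that VCS learns about the change before the resource disappears, so it has no reason to react to the file system going away.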
