Thursday, January 17, 2013

Clearing a minor fmadm faulty alert



(MySolaris:/)# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Dec 16 22:36:07 96c775c2-6764-6eae-ea5b-ea57f62cc2c0  FMD-8000-0W    Minor

Host        : MySolaris
Platform    : SUNW,Sun-SPARC Enterprise T5240        Chassis_id  :
Product_sn  :

Fault class : defect.sunos.fmd.nosub
FRU         : None
                  faulty

Description : The Solaris Fault Manager received an event from a component to
              which no automated diagnosis software is currently subscribed.
              Refer to http://sun.com/msg/FMD-8000-0W for more information.

Response    : Error reports from the component will be logged for examination
              by Sun.

Impact      : Automated diagnosis and response for these events will not occur.

Action      : Run pkgchk -n SUNWfmd to ensure that fault management software is
              installed properly.  Contact Sun for support.

(MySolaris:/)#



The fmadm fault currently logged on this system is caused by a logical inconsistency in the checkpointed data, which causes the system to disable cpumem-diagnosis. As described in the attached document, this in turn causes the FMD-8000-0W defect.sunos.fmd.nosub fault on the next transient memory error, which should have been handled by the cpumem-diagnosis module. We can clear the FMD-8000-0W fault, but any minor event that cpumem-diagnosis would normally handle will trigger another FMD-8000-0W defect.sunos.fmd.nosub. The resolution for this issue is below:

First roll the logs and restart the FMA daemon to keep the history.

logadm -p now -s 1b /var/fm/fmd/errlog
logadm -p now -s 1b /var/fm/fmd/fltlog

svcadm restart fmd


...wait two minutes...

Now scrub the checkpoint files


svcadm disable -st fmd
find /var/fm/fmd/ckpt -type f | xargs rm

svcadm enable fmd


...wait 2 minutes...
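
The delete step above pipes find into xargs rm, which can complain when there are no checkpoint files (rm is then run with no operands) and can mishandle file names containing whitespace. A minimal sketch of an alternative, assuming your find supports the POSIX `-exec ... {} +` form; the function name scrub_ckpt is made up for illustration, and it should still be run only while fmd is disabled:

```shell
# scrub_ckpt is a hypothetical helper name, not part of Solaris.
# It removes regular files under the checkpoint directory and leaves
# the directories themselves in place, like the original -type f filter.
# The default path is the one used in this post; pass another directory
# as the first argument to try it somewhere harmless first.
scrub_ckpt() {
    dir=${1:-/var/fm/fmd/ckpt}
    # -exec ... {} + batches file names without going through xargs,
    # so an empty directory tree is simply a no-op.
    find "$dir" -type f -exec rm -f {} +
}
```

Called with no argument between the svcadm disable and svcadm enable commands, it acts on /var/fm/fmd/ckpt.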

Now see if everything is clear

fmadm config - check that cpumem-diagnosis is active

fmadm faulty -a - shouldn't return anything

Also check whether any new errors were logged on fmd startup; if there were, we'll need to investigate further...

fmdump -e

should return nothing
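
The cpumem-diagnosis check above can be scripted instead of read by eye. This is a rough sketch that assumes the fmadm config listing has the module name in the first column and its status in the third (MODULE, VERSION, STATUS, DESCRIPTION); verify that against the output on your own system. The function name cpumem_active is hypothetical.

```shell
# cpumem_active is a hypothetical helper name, not part of Solaris.
# It reads an `fmadm config` listing on stdin and exits 0 only if a
# line names cpumem-diagnosis in column 1 with "active" in column 3
# (assumed column layout: MODULE VERSION STATUS DESCRIPTION).
cpumem_active() {
    awk '$1 == "cpumem-diagnosis" && $3 == "active" { found = 1 }
         END { exit (found ? 0 : 1) }'
}

# Intended use on the affected host (not run here):
#   fmadm config | cpumem_active && echo "cpumem-diagnosis is active"
```

A non-zero exit status means the module is either missing from the listing or not active, in which case the FMD-8000-0W fault is likely to come back.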
