Monday, October 4, 2010

Breaking an unresponsive system

The system is unresponsive and unreachable. Checking its status from the console shows the domain as running, but it still does not respond, so it appears to be hung. To recover, we need to send a break from the console, force a core dump, and let the machine reset.

This is done from the M9000 XSCF console. The unresponsive machine is domain 0.

XSCF> sendbreak -d 0
Send break signal to DomainID 0?[y|n] :y
XSCF>

Open another console session, connect to the domain, and type sync at the ok prompt to start the core dump.

XSCF> console -f -d 0
Connect to DomainID 0?[y|n] :y

### System reaches the ok prompt. Type sync to force a core dump

{a7} ok sync
panic[cpu167]/thread=2a174821ca0: sync initiated
sched: software trap 0x7f
pid=0, pc=0xf005d18c, sp=0x2a174820cb1, tstate=0x4400001407, context=0x0
g1-g7: 10511c4, 18de000, 60, 0, 0, 0, 2a174821ca0
00000000fdb79cd0 unix:sync_handler+144 (182e400, f7, 3, 1, 1, 109f400)
%l0-3: 0000000001893e80 00000000018dddd0 00000000018ddc00 000000000000017f
%l4-7: 00000000018c1000 0000000000000000 00000000018bac00 0000000000000037
00000000fdb79da0 unix:vx_handler+80 (fdb02078, 183e038, 7fffffffffffffff, 1, 183e140, f006d515)
%l0-3: 000000000183e140 0000000000000000 0000000000000001 0000000000000001
%l4-7: 000000000182ec00 00000000f0000000 0000000001000000 0000000001019734
00000000fdb79e50 unix:callback_handler+20 (fdb02078, fdfea400, 0, 0, 0, 0)
%l0-3: 0000000000000016 00000000fdb79701 0000000000000000 0000000000000000
%l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
syncing file systems... 570 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 done (not all i/o completed)
dumping to /dev/md/dsk/d11, offset 21476081664, content: kernel
100% done: 2916208 pages dumped, compression ratio 2.90, dump succeeded

### System reboots to init level 3
rebooting...
Resetting...
.POST Sequence 01 CPU Check
LSB#02 (XSB#01-0): POST 2.11.0 (2009/06/18 09:30)
LSB#06 (XSB#03-1): POST 2.11.0 (2009/06/18 09:30)
LSB#07 (XSB#03-2): POST 2.11.0 (2009/06/18 09:30)
LSB#03 (XSB#02-0): POST 2.11.0 (2009/06/18 09:30)
LSB#01 (XSB#00-1): POST 2.11.0 (2009/06/18 09:30)
LSB#04 (XSB#02-1): POST 2.11.0 (2009/06/18 09:30)
POST Sequence 02 Banner

The machine dumps core and then reboots.
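The dump summary line in the console output gives a rough idea of how big the dump is, which helps when checking that there is enough free space to save it. Below is a minimal sketch of that arithmetic in Python. The function name estimate_dump_size is made up for illustration. The 8 KiB page size is an assumption (it is the usual base page size on SPARC; running pagesize on the host should confirm it), and the compression ratio is assumed to mean uncompressed size divided by compressed size.

```python
import re

# Summary line copied from the console output in this post.
SUMMARY = "100% done: 2916208 pages dumped, compression ratio 2.90, dump succeeded"

# Assumption: 8 KiB base page size (usual on SPARC; confirm with `pagesize`).
PAGE_SIZE_BYTES = 8192


def estimate_dump_size(summary, page_size=PAGE_SIZE_BYTES):
    """Return (uncompressed_bytes, compressed_bytes) estimated from a dump summary line."""
    match = re.search(r"(\d+) pages dumped, compression ratio ([\d.]+)", summary)
    if match is None:
        raise ValueError("not a dump summary line: %r" % summary)
    pages = int(match.group(1))
    ratio = float(match.group(2))
    uncompressed = pages * page_size
    # Assumption: ratio = uncompressed size / compressed size.
    return uncompressed, int(uncompressed / ratio)


if __name__ == "__main__":
    raw, packed = estimate_dump_size(SUMMARY)
    gib = 1024 ** 3
    print("uncompressed: %.1f GiB, compressed on dump device: about %.1f GiB"
          % (raw / gib, packed / gib))
```

Under these assumptions the dump above works out to roughly 22 GiB of kernel pages, held in a bit under 8 GiB on the dump device. Treat the numbers as estimates only.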
