Tuesday, August 30, 2011

Force a Panic and collect a core dump - M Series SPARC




Scenario: The system is not reachable and there is no response through the console.
Action taken: Initiate a forced panic and collect the crash dump (forced savecore).

Send a break signal from the chassis (XSCF) to the domain. There are two ways to do this.

Before that, check the domain mode: the domain honors break signals only when secure mode is off.
setdomainmode -d 0 -m secure=off
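The current setting can be confirmed first with the matching XSCF display command (a quick check; the exact output fields vary by firmware level, and changing the mode with setdomainmode may require the domain to be powered off -- see the XSCF manual):

XSCF> showdomainmode -d 0

The secure setting shown should read off before sendbreak will work.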

Ways to send the break:
reset -d 0 panic     (forces a panic on the domain directly)
sendbreak -d 0 -y    (drops the domain to the ok prompt, where sync can be run)



#### Send break to domain 0

XSCF> sendbreak -d 0
Send break signal to DomainID 0?[y|n] :y
XSCF>

#### Open a new session and connect to the console again

XSCF> console -f -d 0
Connect to DomainID 0?[y|n] :y 


### The system reaches the ok prompt. Type sync to force a crash dump

{a7} ok sync
panic[cpu167]/thread=2a174821ca0: sync initiated
sched: software trap 0x7f
pid=0, pc=0xf005d18c, sp=0x2a174820cb1, tstate=0x4400001407, context=0x0
g1-g7: 10511c4, 18de000, 60, 0, 0, 0, 2a174821ca0
00000000fdb79cd0 unix:sync_handler+144 (182e400, f7, 3, 1, 1, 109f400)
%l0-3: 0000000001893e80 00000000018dddd0 00000000018ddc00 000000000000017f
%l4-7: 00000000018c1000 0000000000000000 00000000018bac00 0000000000000037
00000000fdb79da0 unix:vx_handler+80 (fdb02078, 183e038, 7fffffffffffffff, 1, 183e140, f006d515)
%l0-3: 000000000183e140 0000000000000000 0000000000000001 0000000000000001
%l4-7: 000000000182ec00 00000000f0000000 0000000001000000 0000000001019734
00000000fdb79e50 unix:callback_handler+20 (fdb02078, fdfea400, 0, 0, 0, 0)
%l0-3: 0000000000000016 00000000fdb79701 0000000000000000 0000000000000000
%l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
syncing file systems... 570 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 done (not all i/o completed)
dumping to /dev/md/dsk/d11, offset 21476081664, content: kernel
100% done: 2916208 pages dumped, compression ratio 2.90, dump succeeded

### System reboots to init level 3
rebooting...
Resetting...
.POST Sequence 01 CPU Check
LSB#02 (XSB#01-0): POST 2.11.0 (2009/06/18 09:30)
LSB#06 (XSB#03-1): POST 2.11.0 (2009/06/18 09:30)
LSB#07 (XSB#03-2): POST 2.11.0 (2009/06/18 09:30)
LSB#03 (XSB#02-0): POST 2.11.0 (2009/06/18 09:30)
LSB#01 (XSB#00-1): POST 2.11.0 (2009/06/18 09:30)
LSB#04 (XSB#02-1): POST 2.11.0 (2009/06/18 09:30)
POST Sequence 02 Banner
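
Once the domain is back up, savecore (run automatically during boot) extracts the dump from the dump device into the savecore directory. A quick verification, assuming the default dumpadm configuration (savecore directory /var/crash/`hostname`):

# dumpadm                       # shows the dump device and savecore directory
# ls -l /var/crash/`hostname`   # unix.N and vmcore.N are the saved crash dump

The unix.N/vmcore.N file pair is the crash dump, ready for analysis or upload to support.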



Wednesday, August 24, 2011

IPMP failover




(From the man page)
NAME
     if_mpadm - change operational status of interfaces within  a
     multipathing group

SYNOPSIS
     /usr/sbin/if_mpadm -d interface_name

     /usr/sbin/if_mpadm -r interface_name



If the interface is operational, you can use if_mpadm -d to detach or off-line the interface. If the interface is off-lined, use if_mpadm -r to revert it to its original state. When a network interface is off-lined, all network access fails over to a different interface in the IP multipathing group. Any addresses that do not failover are brought down. Addresses marked with IFF_NOFAILOVER do not failover. They are marked down. After an interface is off-lined, the system will not use the interface for any outbound or inbound traffic, and the interface can be safely removed from the system without any loss of network access.

The if_mpadm utility can be applied only to interfaces  that are part of an IP multipathing group.
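
In short, the workflow looks like this (a minimal sketch, using the interface names from the example below):

/usr/sbin/if_mpadm -d igb1     # detach igb1; its addresses fail over within the group
/usr/sbin/if_mpadm -r igb1     # revert igb1 to its original state when done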

Here the backup IP 174.48.22.36 is hosted on igb1, in the IPMP group backup-lan:


(SolarisHost1:/root)# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
igb2: flags=69000842 mtu 0 index 2
        inet 0.0.0.0 netmask 0
        groupname backup-lan
        ether 3d:4a:97:76:8d:00
igb722000: flags=269000842 mtu 0 index 3
        inet 0.0.0.0 netmask 0
        groupname main-lan
        ether 3d:4a:97:76:8d:01
igb1: flags=1000843 mtu 1500 index 4
        inet 174.48.22.36 netmask fffffe00 broadcast 10.48.233.255
        groupname backup-lan
        ether 3d:4a:97:76:8d:0e
igb1:1: flags=1000843 mtu 1496 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
igb722000: flags=201000843 mtu 1500 index 5
        inet 174.10.11.201 netmask ffffff00 broadcast 10.120.10.255
        groupname main-lan
        ether 3d:4a:97:76:8d:0f
igb722000:1: flags=201000843 mtu 1496 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
(SolarisHost1:/root)#


We use if_mpadm to perform a manual failover within an IPMP group.



Now perform the failover:

(SolarisHost1:/root)# if_mpadm -d igb1

The IP has now failed over to the other interface in the group, igb2:

(SolarisHost1:/root)# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
igb2: flags=21000843 mtu 1496 index 2
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname backup-lan
        ether 3d:4a:97:76:8d:00
igb2:1: flags=21000843 mtu 1496 index 2
        inet 174.48.22.36 netmask fffffe00 broadcast 10.48.233.255
igb722000: flags=269000842 mtu 0 index 3
        inet 0.0.0.0 netmask 0
        groupname main-lan
        ether 3d:4a:97:76:8d:01
igb1: flags=89000842 mtu 0 index 4
        inet 0.0.0.0 netmask 0
        groupname backup-lan
        ether 3d:4a:97:76:8d:0e
igb722000: flags=201000843 mtu 1500 index 5
        inet 174.10.11.201 netmask ffffff00 broadcast 10.120.10.255
        groupname main-lan
        ether 3d:4a:97:76:8d:0f
igb722000:1: flags=201000843 mtu 1496 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
(SolarisHost1:/root)#



Reattach the IP to its original interface:

(SolarisHost1:/root)# if_mpadm -r igb1


(SolarisHost1:/root)# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
igb2: flags=69000842 mtu 0 index 2
        inet 0.0.0.0 netmask 0
        groupname backup-lan
        ether 3d:4a:97:76:8d:00
igb722000: flags=269000842 mtu 0 index 3
        inet 0.0.0.0 netmask 0
        groupname main-lan
        ether 3d:4a:97:76:8d:01
igb1: flags=1000843 mtu 1500 index 4
        inet 174.48.22.36 netmask fffffe00 broadcast 10.48.233.255
        groupname backup-lan
        ether 3d:4a:97:76:8d:0e
igb1:1: flags=1000843 mtu 1496 index 4
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
igb722000: flags=201000843 mtu 1500 index 5
        inet 174.10.11.201 netmask ffffff00 broadcast 10.120.10.255
        groupname main-lan
        ether 3d:4a:97:76:8d:0f
igb722000:1: flags=201000843 mtu 1496 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
(SolarisHost1:/root)#





Wednesday, August 10, 2011

Sendmail - load average



The mail queue is full and mail is no longer being delivered from the system.

SolarisBox1:/root# mailq
                /var/spool/mqueue (845 requests)
----Q-ID---- --Size-- -----Q-Time----- ------------Sender/Recipient------------
q86892K23102X       0 Thu Sep  6 10:09 root
                                       schweitzer@mydomain.com
q8687N218303X       0 Thu Sep  6 10:07 root
                                       schweitzer@mydomain.com
q8688X121871X       0 Thu Sep  6 10:08 root
                                       schweitzer@mydomain.com
q868FH944906X       0 Thu Sep  6 10:15 root
                                       schweitzer@mydomain.com
..... (output truncated -- the queue count is very high)



SolarisBox1:/root# tail -f /var/log/syslog
Sep  6 10:21:03 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] accepting connections again for daemon MTA-IPv4
Sep  6 10:21:03 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] accepting connections again for daemon MTA-IPv6
Sep  6 10:21:03 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] accepting connections again for daemon MSA
Sep  6 10:21:10 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv4: load average: 192
Sep  6 10:21:10 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv6: load average: 192
Sep  6 10:21:10 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 192
Sep  6 10:21:10 SolarisBox1 sendmail[8293]: [ID 801593 mail.info] NOQUEUE: [10.1.5.17] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4


SolarisBox1:/root# mailx -v -s "test" schweitzer@mydomain.com
EOT
SolarisBox1:/root# schweitzer@mydomain.com... queued



Restarting sendmail did not help:

Sep  6 10:15:48 SolarisBox1 sendmail[46426]: [ID 801593 mail.info] NOQUEUE: [10.3.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:16:16 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] starting daemon (8.11.7p3+Sun): SMTP+queueing@00:15:00
Sep  6 10:16:16 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] runqueue: Skipping queue run -- load average too high
Sep  6 10:16:17 SolarisBox1 sendmail[48395]: [ID 801593 mail.info] NOQUEUE: [10.1.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:16:18 SolarisBox1 sendmail[48572]: [ID 801593 mail.info] NOQUEUE: [10.3.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4


The load average is reported as too high:

Sep  6 10:21:25 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv4: load average: 193
Sep  6 10:21:25 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv6: load average: 193
Sep  6 10:21:25 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 193
Sep  6 10:21:40 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv4: load average: 192
Sep  6 10:21:40 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv6: load average: 192
Sep  6 10:21:40 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 192
Sep  6 10:21:55 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv4: load average: 194
Sep  6 10:21:55 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv6: load average: 194
Sep  6 10:21:55 SolarisBox1 sendmail[48275]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 194
Sep  6 10:21:55 SolarisBox1 sendmail[10488]: [ID 702911 mail.info] runqueue: Skipping queue run -- load average too high



SolarisBox1:/root# grep QueueLA /etc/mail/sendmail.cf
O QueueLA=128
SolarisBox1:/root#
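
sendmail skips queue runs once the load average exceeds QueueLA, and refuses connections once it exceeds RefuseLA, so compare the live load against both thresholds (a quick check):

SolarisBox1:/root# uptime
SolarisBox1:/root# egrep 'QueueLA|RefuseLA' /etc/mail/sendmail.cf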


Temporarily raise the load-average thresholds above the current load and restart sendmail:

# load average at which we just queue messages
O QueueLA=228

# load average at which we refuse connections
O RefuseLA=292


SolarisBox1:/root# /etc/init.d/sendmail stop
SolarisBox1:/root# /etc/init.d/sendmail start
SolarisBox1:/root#
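
Rather than waiting for the next scheduled queue run (every 15 minutes here, per SMTP+queueing@00:15:00), an immediate verbose queue run can also be forced once the load is below the raised QueueLA (a quick sketch; /usr/lib/sendmail is the usual Solaris path):

SolarisBox1:/root# /usr/lib/sendmail -q -v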


Sep  6 10:27:19 SolarisBox1 sendmail[30901]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 202
Sep  6 10:27:34 SolarisBox1 sendmail[30901]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv4: load average: 200
Sep  6 10:27:34 SolarisBox1 sendmail[30901]: [ID 702911 mail.info] rejecting connections on daemon MTA-IPv6: load average: 200
Sep  6 10:27:34 SolarisBox1 sendmail[30901]: [ID 702911 mail.info] rejecting connections on daemon MSA: load average: 200




Sep  6 10:27:59 SolarisBox1 sendmail[33515]: [ID 702911 mail.info] starting daemon (8.11.7p3+Sun): SMTP+queueing@00:15:00
Sep  6 10:28:00 SolarisBox1 sendmail[33540]: [ID 801593 mail.info] NOQUEUE: [10.3.1.17] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:01 SolarisBox1 sendmail[33571]: [ID 801593 mail.info] NOQUEUE: [10.1.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:02 SolarisBox1 sendmail[33653]: [ID 801593 mail.info] NOQUEUE: [10.1.5.17] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:02 SolarisBox1 sendmail[33702]: [ID 801593 mail.info] NOQUEUE: [10.3.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:05 SolarisBox1 sendmail[33922]: [ID 801593 mail.info] NOQUEUE: [10.3.5.17] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:06 SolarisBox1 sendmail[34017]: [ID 801593 mail.info] NOQUEUE: [10.1.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:07 SolarisBox1 sendmail[34139]: [ID 801593 mail.info] NOQUEUE: [10.1.5.17] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:07 SolarisBox1 sendmail[34197]: [ID 801593 mail.info] NOQUEUE: [10.3.5.16] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA-IPv4
Sep  6 10:28:08 SolarisBox1 sendmail[33516]: [ID 801593 mail.info] q86892K23102: to=schweitzer@mydomain.com, ctladdr=root (0/1), delay=00:19:06, xdelay=00:00:00, mailer=relay, pri=120058, relay=mailrelay [10.5.253.35], dsn=2.0.0, stat=Sent ( 201209060809.q86892K23102@SolarisBox1.sub.mydomain.com Queued mail for delivery)
Sep  6 10:28:09 SolarisBox1 sendmail[33516]: [ID 801593 mail.info] q8687N218303: to=schweitzer@mydomain.com, ctladdr=root (0/1), delay=00:20:46, xdelay=00:00:01, mailer=relay, pri=120058, relay=mailrelay [10.5.253.35], dsn=2.0.0, stat=Sent ( 201209060807.q8687N218303@SolarisBox1.sub.mydomain.com Queued mail for delivery)


SolarisBox1:/root#
SolarisBox1:/root# mailq
/var/spool/mqueue is empty
SolarisBox1:/root#



Messages delivered.

Now change the thresholds back to their default values and restart sendmail:


# load average at which we just queue messages
O QueueLA=128

# load average at which we refuse connections
O RefuseLA=192


SolarisBox1:/root# /etc/init.d/sendmail stop
SolarisBox1:/root# /etc/init.d/sendmail start
SolarisBox1:/root#
SolarisBox1:/root#


Friday, August 5, 2011

NFS4 nobody permissions






When an NFS share is mounted on a client, file ownership may be displayed as nobody. With NFSv4 this is caused by the new representation of user and group information between the systems: identities are exchanged as user@domain strings rather than numeric IDs.


(MyClient:/)# mount -F nfs MyServer:/root/application/archive /application/archive
nfs mount: mount: /application/archive: Permission denied

(MyClient:/)# nslookup MyClient
Server:         175.21.86.11
Address:        175.21.86.11#53

Name:   MyClient.bc
Address: 177.10.7.11

The share on the server is not set up correctly.


(MyServer:/)# vi /etc/dfs/dfstab
"/etc/dfs/dfstab" 12 lines, 763 characters

#       Place share(1M) commands here for automatic execution
#       on entering init state 3.
#
#       Issue the command '/etc/init.d/nfs.server start' to run the NFS
#       daemon processes and the share commands, after adding the very
#       first entry to this file.
#
#       share [-F fstype] [ -o options] [-d ""] [resource]
#       .e.g,
#       share  -F nfs  -o rw=engineering  -d "home dirs"  /export/home2
#
share -F nfs -o rw=MyClient.bc,anon=0 /root/application/archive
~
~
~
"/etc/dfs/dfstab" 12 lines, 775 characters
(MyServer:/)#
(MyServer:/)# shareall
(MyServer:/)#


Now the directory is shared.
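
The export can be double-checked on the server with the share command, which lists everything currently shared; expect an entry along these lines:

(MyServer:/)# share
-               /root/application/archive   rw=MyClient.bc,anon=0   ""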



On the client, the NFS mount now succeeds:

(MyClient:/)# mount -F nfs MyServer:/root/application/archive /application/archive
(MyClient:/)#
(MyClient:/)#
(MyClient:/)# df -h /application/archive
Filesystem             size   used  avail capacity  Mounted on
MyServer:/root/application/archive
                       880G   720G   153G    83%    /application/archive
(MyClient:/)# grep /application/archive /etc/mnttab
MyServer:/root/application/archive     /application/archive   nfs     rw,nodevices,xattr,zone=MyClient,dev=59412eb    1361790688
(MyClient:/)# ls -ld /application/archive
drwxr-xr-x+ 36 nobody   nobody      1024 Oct  5 18:55 /application/archive



Solaris handles a single NFSv4 domain.
If the client or server receives a user/group string whose domain part does not match its own NFSv4 domain, it maps that user/group to uid/gid "nobody" (60001).
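
So the first thing to verify is that the NFSv4 domain agrees on both ends (the client side is walked through in detail below; the same check applies on the server):

(MyClient:/)# grep NFSMAPID_DOMAIN /etc/default/nfs
(MyServer:/)# grep NFSMAPID_DOMAIN /etc/default/nfs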



(MyClient:/)# grep NFSMAPID_DOMAIN /etc/default/nfs
NFSMAPID_DOMAIN=mywrongdomain.com //wrong domain
(MyClient:/)#
(MyClient:/)#
(MyClient:/)# cp /etc/default/nfs /etc/default/nfs.old
(MyClient:/)# vi /etc/default/nfs


# Specifies to nfsmapid daemon that it is to override its default
# behavior of using the DNS domain, and that it is to use 'domain' as
# the domain to append to outbound attribute strings, and that it is to
# use 'domain' to compare against inbound attribute strings.
NFSMAPID_DOMAIN=nfscorrectdomain.nfs //Correct nfs domain that can map the user/group
~
~


Restart the nfsmapid service:

(MyClient:/)# svcs -a|grep mapid
online         Feb_07   svc:/network/nfs/mapid:default
(MyClient:/)# svcadm restart /network/nfs/mapid
(MyClient:/)#
(MyClient:/)# svcs -a|grep mapid
online         12:13:36 svc:/network/nfs/mapid:default
(MyClient:/)#


(MyClient:/)# ls -ld /application/archive
drwxr-xr-x+ 36 applicationload applicationgr      1024 Oct  5 18:55 /application/archive
(MyClient:/)#

Monday, August 1, 2011

rsh and rlogin port secrets

rsh and rlogin are two utilities for establishing sessions and executing commands on a remote host.
Both are insecure and do not encrypt the data they transfer.


The default port used by rsh is 514
The default port used by rlogin is 513
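
Both services are defined in /etc/services, which is a quick way to confirm the port mapping (stock Solaris entries shown):

sunhost1:/# egrep '^(login|shell)' /etc/services
login           513/tcp
shell           514/tcp         cmd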


The issue:
The firewall has blocked port 513, but 514 is open.




We observe the following behavior:


// No response. Snooping shows no reply from the remote host 10.56.41.11

sunhost1:/# rsh 10.56.41.11
^Csunhost1:/#


// With a command argument, we get the desired result

sunhost1:/# rsh 10.56.41.11 "uname -a"
SunOS MDSSMP01 5.10 Generic_127127-11 sun4v sparc SUNW,Sun-Fire-T200
sunhost1:/#


Why the difference?
As per the rsh man page: "If you omit command, instead of executing a single command, rsh logs you in on the remote host using rlogin(1)."


Thus rsh without a command falls back to rlogin, which uses port 513, and those packets were dropped by the firewall.
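
A quick way to confirm which port the firewall is dropping, assuming telnet is available on the client, is to connect to each port directly:

sunhost1:/# telnet 10.56.41.11 514    // connects - the rsh port is open
sunhost1:/# telnet 10.56.41.11 513    // hangs and times out - the rlogin port is blocked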