Thursday, October 28, 2010

/etc/passwd and /etc/shadow

The /etc/passwd file holds the user accounts on a Unix machine. Each entry has 7 colon-separated fields.
1:2:3:4:5:6:7
  1. Username: The user login name. Length is between 1 and 32 characters.
  2. Password: An x character indicates that the encrypted password is stored in the /etc/shadow file.
  3. User ID (UID): Each user must be assigned a user ID (UID). UID 0 (zero) is reserved for root, and UIDs 1-99 are reserved for other predefined accounts. UIDs 100-999 are typically reserved by the system for administrative and system accounts/groups; the exact ranges vary between systems.
  4. Group ID (GID): The primary group ID (stored in the /etc/group file). The group must exist before it can be used.
  5. User ID Info: The comment field, used to hold more information about the user.
  6. Home directory: The absolute path to the directory the user will be in when they log in. If this directory does not exist, the user's directory becomes /.
  7. Command/shell: The absolute path of a command or shell (/bin/bash). Typically, this is a shell.
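To see how the seven fields map onto an entry, here is a minimal sketch that splits passwd-style lines on colons and labels each field. The function name and the sample entry are made up for illustration.

```shell
# describe_passwd_entry: read passwd-format lines on stdin and label the 7 fields.
# The function name and the sample entry below are illustrative only.
describe_passwd_entry() {
    awk -F: 'NF == 7 {
        printf "username=%s password=%s uid=%s gid=%s info=%s home=%s shell=%s\n",
               $1, $2, $3, $4, $5, $6, $7
    }'
}

# Example with a made-up entry:
printf '%s\n' 'aaron:x:1320:1300:Aaron Example:/home/aaron:/bin/ksh' | describe_passwd_entry
```

To run it against the live file: describe_passwd_entry < /etc/passwd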
The /etc/shadow file maintains the user password information. The encrypted password is stored in this file, which is accessible only to the root account. It has 8 fields described here, followed by a ninth reserved field.
1:2:3:4:5:6:7:8
  1. Username: The user login name.
  2. Password: The encrypted password. A password should be at least 6-8 characters long and include special characters/digits. The minimum length can be altered by changing configuration files.
  3. Last password change (lastchanged): The day the password was last changed, counted in days since Jan 1, 1970.
  4. Minimum: The minimum number of days required between password changes, i.e. the number of days that must pass before the user is allowed to change the password again.
  5. Maximum: The maximum number of days the password is valid (after that the user is forced to change the password).
  6. Warn: The number of days before the password expires that the user is warned to change it.
  7. Inactive: The number of days after the password expires before the account is disabled.
  8. Expire: The day the account is disabled, counted in days since Jan 1, 1970, i.e. an absolute date after which the login may no longer be used.
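Since fields 3 and 5 are both counted in days, the day a password expires is simply lastchanged + maximum. The sketch below works out the days remaining for each entry. The function name and the sample entry are hypothetical, and today's day number is passed in as an argument so the result is easy to check.

```shell
# shadow_days_left: given today's day number (days since Jan 1, 1970), read
# shadow-format lines on stdin and print how many days remain before each
# password expires (field 3 "lastchanged" + field 5 "maximum" - today).
# Entries with an empty lastchanged or maximum field are skipped.
# The function name and the sample entry are illustrative only.
shadow_days_left() {
    awk -F: -v today="$1" '$3 != "" && $5 != "" {
        printf "%s %d\n", $1, $3 + $5 - today
    }'
}

# Made-up entry: changed on day 14900, valid for 90 days, checked on day 14910
# (day 14910 is October 28, 2010).
printf '%s\n' 'aaron:HASH:14900:7:90:14:30::' | shadow_days_left 14910
```

On systems where date supports +%s (an assumption, not every Unix does), today's day number can be obtained with $(( $(date +%s) / 86400 )).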
Editing these files manually is not advised. Users should be added or modified with the useradd/usermod commands.
pwconv is used to synchronize the /etc/passwd and /etc/shadow files.

Friday, October 22, 2010

Upgrading zones to match the global zone's patch level

The global zone and the non-global zones are on different patch levels. To bring them to the same patch level, follow the steps below.

If the zone is configured in a cluster (VCS), stop all resources running in the service group (SG) through VCS, including the zone. Do not stop the zone root file system or the disk group (DG). If VCS has trouble stopping the zone, halt it manually, e.g. zoneadm -z zone04 halt

Server1:/# zoneadm list -icv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   1 zone04           running    /zones/zone04                  native   shared
After the zone is down, put it in the configured state. VCS normally does this automatically when it brings the zone down; if the zone was halted manually, edit the /etc/zones/index file.

Server1:/# zoneadm list -icv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   1 zone04           down       /zones/zone04                  native   shared


Server1:/# zoneadm list -icv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   - zone04           configured /zones/zone04                  native   shared

Server1:/# cat /etc/zones/index
# Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# ident "@(#)zones-index        1.2     04/04/01 SMI"
#
# DO NOT EDIT: this file is automatically generated by zoneadm(1M)
# and zonecfg(1M).  Any manual changes will be lost.
#
global:configured:/:

Don't offline the whole DG; only stop the zone. The zone root must be mounted for the attach to work.

Attach the zone

      zoneadm -z zone04 attach -u

      - If a package inconsistency error is thrown while attaching, remove the offending packages using pkgrm

Server1:/# zoneadm -z zone04 attach -u
/zones/zone04 must not be group readable.
/zones/zone04 must not be group executable.
/zones/zone04 must not be world readable.
/zones/zone04 must not be world executable.

Check if the zoneroot is mounted properly

Server1:/# ls -ld /zones/zone04
drwxr-xr-x   3 root     root         512 Mar 18  2010 /zones/zone04

Server1:/# df -k /zones/zone04
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d10      12396483 10032378 2240141    82%    /


Not mounted, so mount the zone root properly (in this case the whole SG was brought down, so both the zone root fs and the DG were stopped).

After mounting


Server1:/# ls -ld /zones/zone04
drwx------   5 root     root        1024 Mar 29  2010 /zones/zone04
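The permission check can be scripted before retrying the attach. This is a minimal sketch: the function name is made up, and it only looks at the mode string printed by ls, so it says nothing about whether the right file system is mounted there.

```shell
# zoneroot_mode_ok: succeed only if the directory's mode is exactly drwx------
# (owner-only access), which is what the attach errors above ask for.
# Only the first 10 characters of the ls mode column are compared, so a
# trailing ACL marker such as "+" is ignored.
# The function name is illustrative only.
zoneroot_mode_ok() {
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "drwx------" ]
}

# Usage (not run here):
#   zoneroot_mode_ok /zones/zone04 || echo "check that the zone root is mounted"
```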

Try attaching now.

Server1:/# zoneadm -z zone04 attach -u
zoneadm: zone 'zone04': ERROR: attempt to downgrade package SUNWlur, the source had patch 121430-43 but this system only has 121430-42

zoneadm: zone 'zone04': ERROR: attempt to downgrade package SUNWluu, the source had patch 121430-43 but this system only has 121430-42

So now we have to remove the two packages SUNWlur and SUNWluu.
After removing the packages, attach again.
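When the attach reports many such errors, the package names can be pulled out of the messages instead of being read off by eye. A minimal sketch, assuming the error lines have exactly the wording shown above; the function name is made up.

```shell
# downgrade_packages: read "zoneadm attach -u" output on stdin and print the
# unique package names mentioned in "attempt to downgrade package" errors.
# The function name is illustrative only.
downgrade_packages() {
    sed -n 's/.*attempt to downgrade package \([^,]*\),.*/\1/p' | sort -u
}

# Fed with the two error lines shown above:
downgrade_packages <<'EOF'
zoneadm: zone 'zone04': ERROR: attempt to downgrade package SUNWlur, the source had patch 121430-43 but this system only has 121430-42
zoneadm: zone 'zone04': ERROR: attempt to downgrade package SUNWluu, the source had patch 121430-43 but this system only has 121430-42
EOF
```

The resulting list is meant to be reviewed by hand before any package is passed to pkgrm.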

Server1:/# zoneadm -z zone04 attach -u
Getting the list of files to remove
Removing 1208 files
Remove 24 of 24 packages
Installing 23631 files
Add 415 of 415 packages
Installation of these packages generated warnings: SUNWgssc SUNWinstall-patch-utils-root SUNWkrbr SUNWmconr SUNWnisu SUNWntpr SUNWpkgcmdsr SUNWsacom SUNWwbcor VRTSjre15
Updating editable files
The file within the zone contains a log of the zone update.

Now boot the zone and bring up the resources

Server1:/# zoneadm -z zone04 boot

Verify the patch levels of both global and non-global zones

zone04:/root# uname -a
SunOS zone04 5.10 Generic_142900-02 sun4u sparc SUNW,Sun-Fire-15000
zone04:/root#
zone04:/root# cat /etc/release
                      Solaris 10 10/09 s10s_u8wos_08a SPARC
           Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 16 September 2009

Server1:/# uname -a
SunOS Server1 5.10 Generic_142900-02 sun4u sparc SUNW,Sun-Fire-15000
Server1:/# cat /etc/release
                      Solaris 10 10/09 s10s_u8wos_08a SPARC
           Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 16 September 2009

Friday, October 8, 2010

proc Utilities

Let's look at some process-related commands.

The pgrep command displays a list of the process IDs of active processes on the system that match the pattern specified in the command line.

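[Displays PIDs of processes matching the pattern cro*. pgrep treats the pattern as a regular expression, so this matches any process name containing "cr"]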
server1:/root# pgrep cro*
5399
20171
20146
20178


[Display pid of process matching cron]
server1:/root# pgrep cron
5399

[Display pid and name of process]
server1:/root# pgrep -l cron
5399 cron


[Display process associated with a user]
server1:/root# pgrep -u pagent
15267
14798
19430
19429
1731
25549

 
[Display pid and process name associated with a username]
server1:/root# pgrep -l -u pagent
15267 PatrolAgent
14798 ds_listener
19430 bgscollect
19429 bgsagent
1731 ksh
25549 ksh



pflags - Print the /proc tracing flags, the pending and held signals, and other /proc status information for each lwp in each process.

server1:/root# pflags 28453
28453: /usr/lib/ssh/sshd
data model = _ILP32 flags = ORPHAN|MSACCT|MSFORK
/1: flags = ASLEEP pollsys(0xffbff3f0,0x1,0x0,0x0)



pcred - Print the credentials (effective, real, saved UIDs and GIDs) of each process.

server1:/root# pcred 15267
15267: euid=1320 ruid=1320 suid=0 e/r/sgid=1300
groups: 1300 7929 32506 7211 32502 13500 32505 7156 32504 32503


pldd - List the dynamic libraries linked into each process, including shared objects explicitly attached using dlopen(3C).

server1:/root# pldd 1731
1731: /bin/ksh
/lib/libc.so.1
/platform/sun4u-us3/lib/libc_psr.so.1



psig - List the signal actions and handlers of each process.

server1:/root# psig 1731
1731: /bin/ksh
HUP ignored
INT caught sh_fault RESTART
QUIT ignored
ILL caught sh_done RESTART
TRAP caught sh_done RESTART
ABRT caught sh_done RESTART
EMT caught sh_done RESTART
FPE ignored
KILL default
BUS caught sh_done RESTART
SEGV default
SYS caught sh_done RESTART
PIPE ignored
ALRM caught sh_fault RESTART
TERM caught sh_done RESTART
USR1 caught sh_done RESTART
USR2 caught sh_done RESTART
CLD caught sh_fault NOCLDSTOP
PWR default
WINCH default
URG default
POLL default
STOP default
TSTP ignored
CONT default
TTIN ignored
TTOU ignored
VTALRM default
PROF default
XCPU caught sh_done RESTART
XFSZ ignored
WAITING default
LWP default
FREEZE default
THAW default
CANCEL default
LOST default
XRES default
JVM1 default
JVM2 default
RTMIN default
RTMIN+1 default
RTMIN+2 default
RTMIN+3 default
RTMAX-3 default
RTMAX-2 default
RTMAX-1 default
RTMAX default



pstack - Print a hex+symbolic stack trace for each lwp in each process.

server1:/root# pstack 1731
1731: /bin/ksh
ff2cc400 read (0, ff339c44, 1)
000233fc io_readbuff (0, ff339c44, 1, 24400, 527e0, 400) + 314
000248c4 ???????? (0, ff339c44, 53444, 1, 527e0, 5)
00024adc io_readc (2, ffbff908, 53d78, 0, ffbff90b, 53000) + 2c
00029f5c ???????? (300000, 0, 0, 53000, 53000, 0)
000299cc main (20000000, 2bc00, ffbffc24, 53000, 53000, ffff8000) + a30
00016b20 _start (0, 0, 0, 0, 0, 0) + 108



pfiles - Report information for all open files in each process. In addition, a path to the file is reported if the information is available from /proc/pid/path. This is not necessarily the same name used to open the file.

server1:/root# pfiles 1731
1731: /bin/ksh
Current rlimit: 4096 file descriptors
0: S_IFIFO mode:0000 dev:368,0 ino:751569929 uid:1320 gid:1300 size:0
O_RDWR
1: S_IFIFO mode:0000 dev:368,0 ino:751569928 uid:1320 gid:1300 size:0
O_RDWR
2: S_IFIFO mode:0000 dev:368,0 ino:751569928 uid:1320 gid:1300 size:0
O_RDWR



pwdx - Print the current working directory of each process.

server1:/proc# pwdx 1731
1731: /opt/patrol


pstop - Stop each process (PR_REQUESTED stop).

prun - Set each process running (inverse of pstop).

pwait - Wait for all of the specified processes to terminate.

ptime - Time the command, like time(1), but using microstate accounting for reproducible precision. Unlike time(1), children of the command are not timed.

server1:/# ptime cat /var/tmp/1

real 0.005
user 0.001
sys 0.003



ptree - Print the process trees containing the specified pids or users, with child processes indented from their respective parent processes.

server1:/proc# ptree 23662
28453 /usr/lib/ssh/sshd
23650 /usr/lib/ssh/sshd
23652 /usr/lib/ssh/sshd
23662 -ksh
14517 isql -syb_dba -SDRSBDT5 -w0000000000000000000000000000000000000000000000000000

Thursday, October 7, 2010

su

The su command is used to switch to another user. It is most commonly used to change the ownership of a login session from an ordinary user to root.

su [options] [commands] [-] [username]

#su root
If the correct password is provided, ownership of the session is changed to root.

whoami command displays the current user.

The default behavior of su is to keep the current directory and the environment variables of the original user, which means variables like PATH will still hold the original user's values. For ordinary users PATH is usually something like /usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/username/bin:/sbin:/usr/sbin:/bin:/usr/bin For root it generally resembles

To overcome this use,
su -

The hyphen has two effects: (1) it switches the current directory to the home directory of the new user (e.g., to /root in the case of the root user) and (2) it changes the environment variables to those of the new user.

A commonly used option of su is -c, which tells su to execute the command that directly follows it on the same line and then return to the original user.

eg) su -c "ls -l /home" - aaron
This will attempt to switch to user 'aaron', execute the command, and then return to the original user, ending aaron's session.

Monitoring usage of su:

Normally su attempts are logged in the /var/adm/sulog file. This has to be set up when the system is commissioned, by editing the file /etc/default/su.

#SULOG=/var/adm/sulog  => This line should be un-commented.
 
eg)# tail /var/adm/sulog
SU 10/07 10:35 + pts/3 winsel-root
SU 10/07 15:05 - console root-daemon
SU 10/07 15:54 + console root-daemon
SU 10/07 16:28 - pts/3 winsel-root
SU 10/08 08:23 + console root-daemon
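In this log a "+" in the fourth column marks a successful su and a "-" marks a failed one. A minimal sketch for listing only the failed attempts, using the sample entries above; the function name is made up.

```shell
# failed_su_attempts: read sulog-format lines on stdin and print the date,
# time, terminal and from-to users of every failed attempt. Failed attempts
# carry "-" in the fourth column; successful ones carry "+".
# The function name is illustrative only.
failed_su_attempts() {
    awk '$1 == "SU" && $4 == "-" { print $2, $3, $5, $6 }'
}

# Fed with the sample entries shown above:
failed_su_attempts <<'EOF'
SU 10/07 10:35 + pts/3 winsel-root
SU 10/07 15:05 - console root-daemon
SU 10/07 15:54 + console root-daemon
SU 10/07 16:28 - pts/3 winsel-root
SU 10/08 08:23 + console root-daemon
EOF
```

Against the live log this would be: failed_su_attempts < /var/adm/sulog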

MeasureWare Agent (MWA)

MeasureWare Agent uses data source integration (DSI) technology to receive, alarm on, and log data from external data sources such as applications, databases, networks, and other operating systems.

MeasureWare Agent installs in the /opt/perf/ directory and creates its log and status files in the /var/opt/perf/ directory.

root@server1:/hroot# which mwa
/opt/perf/bin/mwa


Starting the agent:-

The mwa script starts MeasureWare Agent and all its processes, including the scopeux data collector, the midaemon (measurement interface daemon), the perflbd, the rep.server, the ttd and the alarm generator.

root@server1:/hroot# mwa start

The Perf Agent scope collector is being started.
         The ARM registration daemon ttd is already running.
         It will be signaled to reprocess its configuration file.

         The Performance collection daemon
         /opt/perf/bin/scopeux has been started.

         The coda daemon /opt/OV/lbin/perf/coda has been started.
         It will be fully operational in a few minutes.


The Perf Agent server daemons are being started.
         The Perf Agent Location Broker daemon
         /opt/perf/bin/perflbd has been started.


Stopping the agent:-

root@server1:/hroot# mwa stop

Shutting down Perf Agent collection software
NOTE:   The ARM registration daemon ttd will be left running.

Shutting down coda daemon
         Shutting down coda, pid(s) 7953


Shutting down the Perf Agent server daemons
         Shutting down the alarmgen process.  This may take a while
         depending upon how many monitoring systems have to be
         notified that Perf Agent Server is shutting down.


         The alarmgen process has terminated

         Shutting down the perflbd process

         The perflbd process has terminated

         The agdbserver process terminated

         The rep_server processes have terminated

         The Perf Agent Server has been shut down successfully


To start individual components:-

root@server1:/hroot# mwa restart scope

Shutting down Perf Agent collection software
NOTE:   The ARM registration daemon ttd will be left running.

The Perf Agent scope collector is being started.
         The ARM registration daemon ttd is already running.
         It will be signaled to reprocess its configuration file.

         The Performance collection daemon
         /opt/perf/bin/scopeux has been started.

root@server1:/hroot# ps -ef | grep /opt/perf/bin/scopeux
    root 21769  5322  1 10:22:16 pts/2     0:00 grep /opt/perf/bin/scopeux
root@server1:/var/opt/perf#


But the process has not started, so we have to check why.
The status of each mwa component is recorded in the /var/opt/perf/status.* files. Each component has its own file.

root@server1:/var/opt/perf# ls
.gp                aldxc09.log        perfd              reptfile           status.perfalarm   ttd.pid
adviser.syntax     aldxd09            perfd.ini          repthead           status.perfd-5227  vppa.env
alarmdef           app-defaults       perflbd.rc         repthist           status.perflbd
alarmdef.old       datafiles          pkey               rxitemid           status.rep_server
alarmdef.org       gkey               reptSASstd         rxshorts           status.scope
aldlog09           mwakey             reptTBL            status.alarmgen    status.ttd
aldxc09            parm               reptall            status.mi          ttd.conf




Checking the status.scope file will show why the process has not started.

eg)
root@server1:/var/opt/perf# tail -f status.scope

A FILE, GROUP or USER parameter is limited to 15 characters.
A parameter was truncated.


**** /opt/perf/bin/scopeux : 09/09/10 15:11:51 ****
ERROR: Unable to read from logfile '/var/opt/perf/datafiles/logproc' - corrupted data. (PE221-24)

**** /opt/perf/bin/scopeux : 09/09/10 15:11:51 ****
COLLECTOR END. program terminated abnormally.
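When it is not obvious which component failed, all the status files can be scanned in one go. A minimal sketch, assuming that problems are flagged with the word ERROR as in the status.scope excerpt above; the function name is made up.

```shell
# components_with_errors: list the status.* files under a directory
# (default /var/opt/perf, as in the listing above) that contain "ERROR".
# The function name is illustrative only.
components_with_errors() {
    dir=${1:-/var/opt/perf}
    grep -l 'ERROR' "$dir"/status.* 2>/dev/null
}

# Usage (not run here):
#   components_with_errors              # scans /var/opt/perf
#   components_with_errors /some/dir    # scans another directory
```

Old errors stay in these files, so a match only says where to look, not that the component is failing right now.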



Action Taken:-
Move the corrupt file and restart. First stop the process

>>mwa stop
>>mv /var/opt/perf/datafiles/logproc /var/opt/perf/datafiles/logproc.bkp
>>mwa start

Wednesday, October 6, 2010

EFI Disk Label

A disk label is the place where the disk geometry is stored. There are 2 types of labels: VTOC and EFI.

The EFI label provides support for physical disks and virtual disk volumes. It is used to support disks larger than 2 TB.

The UFS file system is compatible with the EFI disk label, and you can create a UFS file system greater than 2 TB.

You can use the format -e command to label a disk smaller than 1 TB with an EFI label.

An EFI-labeled disk cannot be used for booting.



For more information
http://docs.sun.com/app/docs/doc/817-5093/disksconcepts-14?a=view

How to find out the current run level

$ who -r
 .    run-level 3  Sep 13 10:18  3  0 S
$
 
run-level 3
Identifies the current run level
Sep 13 10:18
Identifies the date of last run level change
3
Also identifies the current run level
0
Identifies the number of times the system has been at this run level since the last reboot
S
Identifies the previous run level
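For use in scripts, the two values that matter most can be picked out of that line by position. A minimal sketch, assuming the Solaris-style layout shown above (other systems print who -r differently); the function name is made up.

```shell
# run_levels: read "who -r" output on stdin and print the current and previous
# run level, using the field positions described above (third and last field).
# The function name is illustrative only.
run_levels() {
    awk '$2 == "run-level" { print "current=" $3, "previous=" $NF }'
}

# With the sample line shown above:
printf '%s\n' ' .    run-level 3  Sep 13 10:18  3  0 S' | run_levels
```

On a live system this would be: who -r | run_levels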
  
In Solaris 10, check the SMF milestones and make sure the multi-user server service is enabled and running.

Restricting an ftp user within his home directory

1. Go to the host and check the /etc/ftpd/ftpaccess file.

2. Add the below entry

    restricted-uid [login-id]


 eg)restricted-uid ftpuseraaron

3. This will restrict the ftp user to his home directory and deny navigation through filesystems.

Tuesday, October 5, 2010

Why syslog stops working?

1. Could be because of space issue in /var
2. Could be because of spaces in /etc/syslog.conf

Run dmesg and see when the last log entry was made. Also check whether any space issue is reported. If space issues were logged, free up space in the affected file systems and restart the daemon.

Using a space instead of a tab between facility.level and destination in /etc/syslog.conf will stop syslogd from logging anything. Restore a fresh version of the syslog.conf file and restart the daemon.
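Before restoring the whole file, it can help to find the offending lines. A minimal sketch that flags every active line with no tab in it at all; the function name and the sample lines are made up, and a line that mixes tabs and spaces would not be caught.

```shell
# lines_without_tab: read syslog.conf-style text on stdin and print, with line
# numbers, every active line (not blank, not a comment) that contains no tab.
# The function name and the sample input are illustrative only.
lines_without_tab() {
    awk '!/^[ \t]*(#|$)/ && index($0, "\t") == 0 { print NR ": " $0 }'
}

# Made-up sample: line 2 uses a tab (fine), line 3 uses a space (flagged).
printf '# sample\n*.err\t/var/adm/messages\nmail.debug /var/log/syslog\n' | lines_without_tab
```

Against the live file this would be: lines_without_tab < /etc/syslog.conf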

1. Check the syslog daemon and restart it if necessary.

(server1:/)# ps -ef | grep syslog
    root   226  7244   0 10:22:41 ?           0:01 /usr/sbin/syslogd
    root  3068  1691   0 10:07:58 pts/2       0:00 grep syslog

(server1:/)# svcs svc:/system/system-log:default
STATE          STIME    FMRI
online         Sep_28   svc:/system/system-log:default
(server1:/)#
(server1:/)# svcadm restart svc:/system/system-log:default
(server1:/)#
(server1:/)# svcs svc:/system/system-log:default
STATE          STIME    FMRI
online         12:39:27 svc:/system/system-log:default
(server1:/)#



Check the messages file and confirm if the logging is working fine after the corrections.

dmesg (or)
(server1:/)# ls -l /var/adm/messages
-rw-r--r--   1 root     root       46666 Oct  4 12:40 /var/adm/messages
(server1:/)#

Monday, October 4, 2010

Breaking an unresponsive system

The system is unresponsive and unreachable. Checking its status through the console shows the system as running, but there is still no response from it; the system appears to be hung. To recover, we need to send a break from the console and reset the machine.

M9000 console. The unresponsive machine is domain 0

XSCF> sendbreak -d 0
Send break signal to DomainID 0?[y|n] :y
XSCF>

Open another console, log in to the domain, and type sync at the ok prompt to initiate a core dump.

XSCF> console -f -d 0
Connect to DomainID 0?[y|n] :y

### System reaches OK prompt. Give sync to force coredump

{a7} ok sync
panic[cpu167]/thread=2a174821ca0: sync initiated
sched: software trap 0x7f
pid=0, pc=0xf005d18c, sp=0x2a174820cb1, tstate=0x4400001407, context=0x0
g1-g7: 10511c4, 18de000, 60, 0, 0, 0, 2a174821ca0
00000000fdb79cd0 unix:sync_handler+144 (182e400, f7, 3, 1, 1, 109f400)
%l0-3: 0000000001893e80 00000000018dddd0 00000000018ddc00 000000000000017f
%l4-7: 00000000018c1000 0000000000000000 00000000018bac00 0000000000000037
00000000fdb79da0 unix:vx_handler+80 (fdb02078, 183e038, 7fffffffffffffff, 1, 183e140, f006d515)
%l0-3: 000000000183e140 0000000000000000 0000000000000001 0000000000000001
%l4-7: 000000000182ec00 00000000f0000000 0000000001000000 0000000001019734
00000000fdb79e50 unix:callback_handler+20 (fdb02078, fdfea400, 0, 0, 0, 0)
%l0-3: 0000000000000016 00000000fdb79701 0000000000000000 0000000000000000
%l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
syncing file systems... 570 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 568 done (not all i/o completed)
dumping to /dev/md/dsk/d11, offset 21476081664, content: kernel
100% done: 2916208 pages dumped, compression ratio 2.90, dump succeeded

### System reboots to init level 3
rebooting...
Resetting...
.POST Sequence 01 CPU Check
LSB#02 (XSB#01-0): POST 2.11.0 (2009/06/18 09:30)
LSB#06 (XSB#03-1): POST 2.11.0 (2009/06/18 09:30)
LSB#07 (XSB#03-2): POST 2.11.0 (2009/06/18 09:30)
LSB#03 (XSB#02-0): POST 2.11.0 (2009/06/18 09:30)
LSB#01 (XSB#00-1): POST 2.11.0 (2009/06/18 09:30)
LSB#04 (XSB#02-1): POST 2.11.0 (2009/06/18 09:30)
POST Sequence 02 Banner

Machine dumps core and reboots.

Sunday, October 3, 2010

Changing UMASK for an ftp user account

UMASK is used to set the default permissions of a newly created file.

This value is defined as a system-wide property in the /etc/profile file. By default this value is set to 022.
Each user can set their own customized value in their ~/.profile file.

To calculate the permissions that result from a specific umask value, remove the umask bits from the base mode: 666 for files and 777 for directories.
For common values such as 022 this gives the same answer as subtracting the umask from the base mode. If the umask is 022, files are created with permissions of 644 (rw-r--r--) and directories with permissions of 755 (rwxr-xr-x).
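The calculation can be checked with shell arithmetic. This sketch clears the umask bits from the base modes rather than subtracting, which matters for values like 033: files come out as 644, not the 633 that plain subtraction would suggest. The function name is made up.

```shell
# umask_to_modes: print the file mode and directory mode that result from a
# given umask. The umask bits are cleared from 666 (files) and 777 (directories).
# The function name is illustrative only.
umask_to_modes() {
    printf '%03o %03o\n' $(( 0666 & ~0$1 )) $(( 0777 & ~0$1 ))
}

umask_to_modes 022   # prints: 644 755
umask_to_modes 006   # prints: 660 771
```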

To set the umask value for FTP users to 006 globally:

1. Edit the file /etc/inetd.conf and change the umask value as below

$vi /etc/inetd.conf

EDIT==> ftp          stream tcp6 nowait root /usr/lbin/ftpd     ftpd -l -u 006

2. Save and reinitialize the daemon.

*** Do not restart the inetd daemon *** Instead use the command below to re-initialize it.

$inetd -c


Saturday, October 2, 2010

Replacing a faulted disk in a SVM - hotswap

Let's see how to replace a defective disk that is in the 'Maintenance' state in SVM.

This is a hot swap in which the old failed disk is pulled out of the live system and a new disk is attached back into it. Before removing a disk, it must be un-configured from SVM.

The disk is part of a concatenated mirror. Six disks are organized as a mirror, with 3 disks forming a concatenation on each side.

Mirror - d6
Sub-Mirror 1 - d15
Sub-Mirror 2 - d16

           d 6
           | |
           | |
       d15 | | d16


One of the failed disks is in the sub-mirror d16. To replace it, we need to detach the sub-mirror, clear it, replace the disk, and recreate the sub-mirror.

Step-1

$metadetach d6 d16
[Once detached, the logging device is no longer part of the trans, thus the trans is no longer logging and all benefits of logging are lost. Use the -f option if the device is busy. Use it only if the disk is in maintenance state]
$metadetach -f d6 d16
Step-2
$metaclear d16
[The metaclear command deletes the specified metadevice or deletes all hotspares/soft partitions. Once cleared, we need to create again using metainit to be able to use again]
Step-3
Here the disk is pulled out and the new disk is inserted. The necessary un-configuration is done using cfgadm, and after the replacement the disk is made visible at the OS level.

$cfgadm -al

Step-4

The newly attached disk is now partitioned to match the configuration of the disk in the opposite sub-mirror.

$prtvtoc /dev/dsk/c2t1d0s2|fmthard -s - /dev/rdsk/c0t1d0s2 
[Copy the VTOC from the existing disk to the new one using the fmthard command]

Step-5

$metainit d16 3 1 c0t0d0s6 1 c0t1d0s2 1 c0t2d0s2
[Recreate the sub-mirror using metainit]

Step-6

$metattach d6 d16
[This reattaches the sub-mirror and immediately starts the synchronization of data from the other sub-mirror]

Step-7

$metastat -C
[This checks the metadevice states]
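The whole sequence can be written down as a dry run before touching the system. The sketch below only prints the commands from the steps above for review and executes nothing. The function name and argument order are made up, and the -f variant of metadetach is left out.

```shell
# svm_replace_plan: print (not run) the command sequence from the steps above.
# Arguments: mirror, sub-mirror, source slice for the VTOC copy, the new
# disk's slice, then the metainit layout for the sub-mirror.
# The function name and argument order are illustrative only.
svm_replace_plan() {
    mirror=$1 submirror=$2 vtoc_src=$3 vtoc_dst=$4
    shift 4
    echo "metadetach $mirror $submirror"
    echo "metaclear $submirror"
    echo "cfgadm -al"
    echo "prtvtoc /dev/dsk/$vtoc_src | fmthard -s - /dev/rdsk/$vtoc_dst"
    echo "metainit $submirror $*"
    echo "metattach $mirror $submirror"
    echo "metastat -C"
}

# The values used in the steps above:
svm_replace_plan d6 d16 c2t1d0s2 c0t1d0s2 3 1 c0t0d0s6 1 c0t1d0s2 1 c0t2d0s2
```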