Thursday, October 13, 2011

Collectl, an all-in-one tool for collecting Linux statistical data

Collectl (collect for Linux) is a single tool that integrates the functions of various tools: sar, iostat, mpstat, top, slabtop, netstat, nfsstat, ps and others.
- Supported: Linux
- Requirement: Perl
Collectl features:
- Runs from the command line or as a daemon
- Various output formats: raw, gnuplot, gexpr (ganglia), sexpr, lexpr, csv (--sep ,)
- Sends data to other programs (ganglia) remotely via a socket instead of writing to a file
- IPMI monitoring for fans and temperature sensors
- Supports modules (Perl scripts) for customized checks
- Monitors per-process disk reads/writes to find the top processes keeping the disk busy
The last one is the most impressive feature; I haven't found another Linux tool that can do it. (DTrace can on Solaris.)
collectl examples
#help, all options
$collectl -x
#-s: subsystems to monitor, e.g. c - cpu, d - disk; run "collectl --showsubsys" for the full list
#-c 5: collect 5 samples and exit
#-i 2: take a sample every 2 seconds
#-oT: T - preface output with time only; run "collectl --showoptions" for the full list
$collectl -sc -c5 -i2 --verbose -oT
waiting for 2 second sample...
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time      User  Nice   Sys  Wait   IRQ  Soft Steal  Idle  CPUs  Intr  Ctxsw  Proc  RunQ   Run   Avg1  Avg5 Avg15
12:39:34      0     0     0     0     0     1     0    97     1  1082     23     0    76     1   0.42  0.42  0.44
12:39:36      0     0     0     0     0     1     0    97     1  1088     24     0    76     1   0.42  0.42  0.44
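
The verbose output above is easy to post-process. Here is a minimal sketch, assuming the column layout shown above, where Idle is the 9th whitespace-separated field once -oT adds the time column. The function name low_idle_samples and the 20% threshold are my own illustrations, not part of collectl.

```shell
# Print the timestamp and idle percentage of every sample whose Idle
# value is below a threshold. Reads "collectl -sc --verbose -oT" output
# on stdin.
# Assumption: Idle is field 9, as in the header shown above.
low_idle_samples() {
    threshold=$1
    awk -v t="$threshold" '
        /^#/ || /^waiting/ || NF < 9 { next }   # skip headers and notices
        $9 + 0 < t + 0 { print $1, $9 }
    '
}

# Usage (hypothetical threshold of 20% idle):
#   collectl -sc -c5 -i2 --verbose -oT | low_idle_samples 20
```

With the two samples above nothing is printed, because both are 97% idle.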

The following demonstrates how collectl identifies the process that reads or writes the most data to disk.
#Hammer the disk by writing 50 MB of data with dd
$dd if=/dev/urandom of=test bs=1k count=50000
#collectl identifies the "dd" process
#in top mode, sort by "iokb" (total I/O KB); run "collectl --showtopopts" for the other sort keys
$collectl -i2  --top iokb
TOP PROCESSES sorted by iokb (counters are /sec) 12:50:31
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
6861  root     18  6784    0 R    3M  572K  0  0.91  0.00  45   0:00.91    0 3680    0   97 dd
1  root     15     0    0 S    2M  632K  0  0.00  0.00   0   0:28.21    0    0    0    0 init
2  root     RT     1    0 S     0     0  0  0.00  0.00   0   0:00.00    0    0    0    0 migration/0
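
If you capture one screen of the top output above to a file, the busiest writer can be pulled out with awk. This is only a sketch under the assumption that the columns are exactly as shown above (WKB is field 15 and Command is field 18 on data lines) and that WKB is a plain number. top_writer is a made-up helper name.

```shell
# Print "PID WKB Command" for the process with the highest write rate.
# Reads one screen of "collectl --top" output on stdin.
# Assumption: the column layout shown above, where WKB is field 15
# and Command is field 18 on data lines.
top_writer() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 18 {
            if (best == "" || $15 + 0 > max) {
                max = $15 + 0
                best = $1 " " $15 " " $18
            }
        }
        END { if (best != "") print best }
    '
}
```

Fed the three process lines above, it reports the dd process (PID 6861, 3680 KB written per second).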

Tuesday, October 11, 2011

Understanding Red Hat Linux recovery runlevels

If a Linux system can boot but hangs while starting a service, booting into a "recovery runlevel" skips the service and gives you a shell for troubleshooting.
If the system can't boot at all, boot from the rescue CD (the first installation medium) and type "linux rescue" at the boot prompt to get a shell for troubleshooting.
Red Hat Linux boot order
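BIOS -> MBR -> boot loader -> kernel -> /sbin/init ->
/etc/inittab ->
/etc/rc.d/rc.sysinit ->
/etc/rc.d/rcX.d/ #where X is the default runlevel set in /etc/inittab
In rcX.d, the scripts whose names start with K are run first (to stop services), then the scripts whose names start with S (to start services).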
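
To preview the order for a given runlevel directory, a small helper can list the scripts the same way: K scripts first, then S scripts, each group sorted by name so the two-digit prefix decides the sequence. This is a sketch of the ordering only (the real rc script also passes stop or start to each script), and rc_run_order is a name I made up.

```shell
# Print the scripts in an rcX.d-style directory in the order described
# above: K (kill) scripts first, then S (start) scripts. Shell globs
# expand in sorted order, so the numeric prefixes set the sequence.
rc_run_order() {
    dir=$1
    for script in "$dir"/K* "$dir"/S*; do
        [ -e "$script" ] || continue   # skip a glob that matched nothing
        basename "$script"
    done
}

# Usage:
#   rc_run_order /etc/rc.d/rc3.d
```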
Recovery runlevels
- runlevel  1
Executes everything up to and including /etc/rc.d/rc.sysinit and /etc/rc.d/rc1.d/.
Runlevel 1 is effectively the same as single-user mode: the last script in rc1.d switches to single-user mode, and only a few minor scripts run before it.
 $ls  /etc/rc.d/rc1.d/S*
 /etc/rc.d/rc1.d/S02lvm2-monitor  /etc/rc.d/rc1.d/S13cpuspeed  /etc/rc.d/rc1.d/S99single

- single
Executes up to /etc/rc.d/rc.sysinit only.

- emergency
Does not execute /etc/rc.d/rc.sysinit.
 Because rc.sysinit is not executed, the root file system is mounted read-only. You need to run "mount -o rw,remount /" to remount it read-write.
The emergency runlevel is a Red Hat term; it is roughly equivalent to booting with "init=/bin/sh" on any Linux distribution.
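
Before remounting, you can confirm how the root file system is currently mounted. A minimal sketch, assuming a mounts table in the usual /proc/mounts layout (device, mount point, type, options). root_mount_mode is a hypothetical helper, and the optional file argument is only there so it can be tried on a saved copy.

```shell
# Print "ro" or "rw" for the root file system, based on a mounts table
# in /proc/mounts format. If "/" is listed more than once, the last
# entry wins, since that is the mount currently on top.
root_mount_mode() {
    mounts_file=${1:-/proc/mounts}
    awk '
        $2 == "/" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
        }
        END { if (mode != "") print mode }
    ' "$mounts_file"
}

# Usage:
#   [ "$(root_mount_mode)" = ro ] && mount -o rw,remount /
```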
How to boot into a recovery runlevel
In the GRUB menu, press "a" to append one of the following options to the kernel boot line.
1  
single  
emergency   
init=/bin/sh

Advanced RPM topics

Query
The "--queryformat" option can query every piece of information about an rpm package; the available information tags are listed by the "rpm --querytags" command.
#list the 2 most recently installed rpm packages (a single rpm call, no need to run rpm once per package)
$rpm -qa --queryformat "%{NAME}-%{VERSION}-%{RELEASE}\t%{INSTALLTIME}\n" | sort -rn -k2 | head -2
collectl-3.5.1-1        1317864013
git-1.7.4.1-1.el5        1316484590
#unfortunately the time returned is Unix time (seconds since the epoch). You can convert it to a human-readable format with "date -d @timestamp", e.g.
$date -d @1317864013
Thu Oct  6 12:20:13 EST 2011
#but there is a shortcut  “--last”
$ rpm -qa --last  | head -2
collectl-3.5.1-1                              Thu 06 Oct 2011 12:20:13 PM EST
git-1.7.4.1-1.el5                             Tue 20 Sep 2011 12:09:50 PM EST
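
If you do want to keep the queryformat output, the conversion can be applied to every line at once. A small sketch, assuming GNU date (for "-d @SECONDS") and input lines of the form "package epoch" as printed above. humanize_install_times is just an illustrative name.

```shell
# Read "package epoch" lines on stdin and print each package name with
# a human-readable install time in the local time zone.
# Relies on GNU date's "-d @SECONDS" syntax.
humanize_install_times() {
    while read -r pkg epoch; do
        [ -n "$epoch" ] || continue   # skip lines without a timestamp
        printf '%s\t%s\n' "$pkg" "$(date -d "@$epoch" '+%Y-%m-%d %H:%M:%S')"
    done
}

# Usage:
#   rpm -qa --queryformat "%{NAME}\t%{INSTALLTIME}\n" | humanize_install_times
```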

"rpm -qa" supports regular expression itself, rather than pipe to grep e.g “rpm -qa | grep perl”
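"rpm -qa perl\*" also works. It is no faster, but there is less to type. Strictly speaking the pattern is a shell-style glob rather than a regular expression, so "perl\*" only matches package names that start with perl, whereas grep matches perl anywhere in the name.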
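
The difference between the two forms is easiest to see with a name that merely contains "perl". The sketch below uses the shell's own case matching as a stand-in for the wildcard matching that "rpm -qa PATTERN" performs, which is an assumption about equivalent behaviour, not rpm's actual code. The package names in the usage comments are made-up examples.

```shell
# Succeed when NAME matches the shell-style PATTERN as a whole, the way
# a glob such as perl* is matched against a package name.
# name_matches_glob is an illustrative helper, not an rpm command.
name_matches_glob() {
    case $1 in
        $2) return 0 ;;   # unquoted, so $2 is treated as a pattern
        *)  return 1 ;;
    esac
}

# Usage (hypothetical package names):
#   name_matches_glob perl-5.8.8-32.el5 'perl*'      # matches
#   name_matches_glob mod_perl-2.0.4-6.el5 'perl*'   # does not match
#   echo mod_perl-2.0.4-6.el5 | grep perl            # grep still matches
```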
requires and provides
#You can check a package's dependencies before installing it
$ rpm -qp --requires git-1.7.6-1.el5.rf.i386.rpm
..
libssl.so.6 
#To meet the dependency, check which package provides libssl.so.6
$yum whatprovides libssl.so.6
openssl-0.9.8e-20.el5.i686 : The OpenSSL toolkit
Repo        : base
Matched from:
Other       : libssl.so.6
#if openssl is already installed, "rpm -q --whatprovides" can also give the answer
$rpm -q --whatprovides libssl.so.6
openssl-0.9.8e-12.el5_4.6
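
The two queries above can be chained to check a whole package file in one go. This is a rough sketch, not a replacement for yum's dependency resolution: it only looks at the capability name (version constraints such as ">= 1.0" are ignored) and skips the internal rpmlib(...) capabilities. check_requires is a hypothetical name.

```shell
# For each capability required by an uninstalled package file, report
# whether some installed package provides it.
# Only the capability name (first field) is checked; version
# constraints are ignored in this sketch.
check_requires() {
    pkgfile=$1
    rpm -qp --requires "$pkgfile" | awk '{ print $1 }' | sort -u |
    while read -r cap; do
        case $cap in 'rpmlib('*) continue ;; esac   # satisfied by rpm itself
        if provider=$(rpm -q --whatprovides "$cap" 2>/dev/null); then
            printf 'OK      %s (%s)\n' "$cap" "$provider"
        else
            printf 'MISSING %s\n' "$cap"
        fi
    done
}

# Usage:
#   check_requires git-1.7.6-1.el5.rf.i386.rpm
```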

rpm scriptlets

#query all scriptlets (preinstall, postinstall, preuninstall, postuninstall)
$rpm -q --scripts xinetd
postinstall scriptlet (using /bin/sh):
if [ $1 = 1 ]; then
/sbin/chkconfig --add xinetd
fi
preuninstall scriptlet (using /bin/sh):
if [ $1 = 0 ]; then
/sbin/service xinetd stop > /dev/null 2>&1
/sbin/chkconfig --del xinetd
fi
postuninstall scriptlet (using /bin/sh):
if [ $1 -ge 1 ]; then
/sbin/service xinetd condrestart >/dev/null 2>&1
fi
#query postinstall script only
$ rpm -q --queryformat "%{POSTIN}"  xinetd
if [ $1 = 1 ]; then
/sbin/chkconfig --add xinetd
fi
#Don't run the scripts during install/remove
rpm -i [--noscripts | --nopre | --nopost] pkgfile
rpm -e [--noscripts | --nopreun | --nopostun] pkgname
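
The $1 tested in the xinetd scriptlets above is the number of instances of the package that will remain installed once the operation finishes. That is why preuninstall checks for 0 (a real removal) and postuninstall checks for 1 or more (an upgrade). The sketch below mirrors that logic but only prints what each branch would do; the function names and messages are mine.

```shell
# Mimic the xinetd preuninstall and postuninstall logic shown above,
# printing what each branch would do instead of running it.
# $1 is the number of package instances left after the operation:
#   0 = package is being erased, 1 or more = package is being upgraded.
preun_action() {
    if [ "$1" -eq 0 ]; then
        echo "stop service and remove from chkconfig"
    else
        echo "nothing (upgrade keeps the service registered)"
    fi
}

postun_action() {
    if [ "$1" -ge 1 ]; then
        echo "condrestart service"
    else
        echo "nothing (package is gone)"
    fi
}

# Usage:
#   preun_action 0    # what "rpm -e xinetd" would trigger
#   postun_action 1   # what an upgrade would trigger
```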

Extract rpm contents without installing

#use rpm2cpio to extract everything
$mkdir /tmp/epel
$ cd /tmp/epel
$ rpm2cpio /root/epel-release-5-4.noarch.rpm | cpio -ivd
./etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL
..
#use rpm2cpio to extract a particular file or directory
$rpm2cpio /root/epel-release-5-4.noarch.rpm | cpio -ivd  ./usr/share/doc/epel-release-5
#another way is to install the rpm under an alternative root
$ rpm --root /tmp/epel/ -ivh  --nodeps /root/epel-release-5-4.noarch.rpm
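
The mkdir, cd and rpm2cpio steps above can be wrapped in one function so the extraction never touches your current directory. A minimal sketch: extract_rpm is an illustrative name, and the rpm file should be given as an absolute path because the function changes directory before unpacking.

```shell
# Unpack an rpm file into a target directory without installing it.
# The subshell keeps the caller's working directory unchanged.
# Pass the rpm file as an absolute path, since the unpacking happens
# after changing into the target directory.
extract_rpm() {
    rpmfile=$1
    destdir=$2
    mkdir -p "$destdir" || return 1
    ( cd "$destdir" && rpm2cpio "$rpmfile" | cpio -id )
}

# Usage:
#   extract_rpm /root/epel-release-5-4.noarch.rpm /tmp/epel
```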
Recover corrupted rpm database
Build RPM from source file