This article shows how a Linux process can be traced and profiled. Using the “last” command as an example, profiling is used to explain why a process was very slow, and why another, very similar, process (dump-utmp) was much faster. “Tracing” here means seeing what a process is doing at any moment. “Profiling” means showing (afterwards) how long it spent doing different things.
“Last” is a Linux command that reads and summarizes the utmp file, where login records are stored. I had a “last” command taking hours to complete because the utmp file had grown large (1.9 Gb). Used strace to see what it was doing.
Pid was 14974:
[root@pluto log]# strace -p 14974 Process 14974 attached - interrupt to quit _llseek(3, 1975549952, [1975549952], SEEK_SET) = 0 read(3, " "..., 16384) = 16384 _llseek(3, 1975533568, [1975533568], SEEK_SET) = 0 read(3, " "..., 16384) = 16384 write(1, "00:00) nskapasi vsftpd:8313 "..., 4096) = 4096 _llseek(3, 1975517184, [1975517184], SEEK_SET) = 0
Last processes the wtmp file in reverse, which is why it is calling llseek() all the time to move backwards in the file (by 16 k chunks). However this does not explain the slowness. The “last” process was maxing out one CPU, ie. it was CPU bound. Why so ?
Using strace with the -c flag to profile it:
# strace -c -p 14974
Process 14974 attached - interrupt to quit Process 14974 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 46.85 0.000699 19 36 read 45.17 0.000674 42 16 write 7.10 0.000106 3 36 _llseek 0.87 0.000013 7 2 brk ------ ----------- ----------- --------- --------- ---------------- 100.00 0.001492 90 total
The process is spending only about 7% of its time in llseek. Whatever is thrashing the CPU, it isn’t llseek.
There is another profiling command called ltrace. It is like strace but traces library calls rather than system calls. Used it thus. NB the PID changed because I restarted “last“.
# ltrace -p 15459 strncmp("vsftpd:24274", "vsftpd:10592", 32) = 1 strncmp("vsftpd:24281", "vsftpd:10592", 32) = 1 strncmp("vsftpd:24288", "vsftpd:10592", 32) = 1 strncmp("vsftpd:24295", "vsftpd:10592", 32) = 1 strncmp("vsftpd:24302", "vsftpd:10592", 32) = 1 strncmp("vsftpd:24309", "vsftpd:10592", 32) = 1
…and so on. The “last” process sure is making a lot of calls to strncmp().
ltrace offers profiling too, again with addition of a -c flag:
# ltrace -c -p 15459 % time seconds usecs/call calls function ------ ----------- ----------- --------- -------------------- 99.92 5.655910 81 69222 strncmp 0.06 0.003652 23 154 _IO_putc 0.01 0.000288 24 12 memcpy 0.00 0.000163 32 5 ctime 0.00 0.000126 25 5 free 0.00 0.000099 24 4 sprintf 0.00 0.000098 24 4 malloc 0.00 0.000096 24 4 strncat 0.00 0.000052 26 2 snprintf 0.00 0.000050 25 2 strcpy 0.00 0.000026 26 1 printf ------ ----------- ----------- --------- -------------------- 100.00 5.660560 69415 total
The “last” process is spending nearly all of its time (99.92%) in strncmp(), a compute-intensive C library function for comparing 2 strings. Which explains why the last process was hogging a whole CPU and executing so slowly.
The wmtp file had grown so large because the system (an FTP server) was receiving up to 300 FTP connections per minute and logging them all.
Regarding wtmp, there exists another command called dump-utmp, which can process the whole 1.9 Gb utmp file in 2.5 minutes (“last” would take at least 3 hours, possibly many more, if left to complete. The full command was:
# dump-utmp ./wtmp > test.out
Using strace and ltrace on dump-utmp shows that the command processes the wtmp file forwards and so does not need to call lseek() all the time like “last“. Also it just seems to read and write the data, not bothering with comparing strings or other processing, and produces a much more “raw” report than the output of “last“.
So Linux contains some pretty cool tracing tools. The above was on Red Hat. A quick shufty indicates ltrace and strace are readily available in Fedora 16 and Debian Squeeze, for example.
Footnote
Truss offers strace-like functionality in Solaris. Use truss -p
Pingback: Linux Capabilities and Ping | Unix etc.