【Linux】命令 - strace

Posted by 西维蜀黍 on 2021-02-18, Last Modified on 2023-05-02

strace

strace is a diagnostic, debugging and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state. The operation of strace is made possible by the kernel feature known as ptrace.

在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核态模式,通过系统调用访问硬件设备。strace可以跟踪到一个进程产生的系统调用,包括参数,返回值,执行消耗的时间。

输出含义

每一行都是一条系统调用,左边是系统调用的函数名及其参数,右边是该调用的返回值。 strace 显示这些调用的参数并返回符号形式的值。

$ strace cat /dev/null
execve("/bin/cat", ["cat", "/dev/null"], [/* 22 vars */]) = 0
brk(0)                                  = 0xab1000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f29379a7000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
...

Errors (typically a return value of -1) have the errno symbol and error string appended.

open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory)

Signals are printed as signal symbol and decoded siginfo structure. An excerpt from stracing and interrupting the command “sleep 666” is:

  sigsuspend([] <unfinished ...>
 --- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=...} ---
 +++ killed by SIGINT +++

If a system call is being executed and meanwhile another one is being called from a different thread/process then strace will try to prese rve the order of those events and mark the ongoing call as being unfinished. When the call returns it will be marked as resumed.

[pid 28772] select(4, [3], NULL, NULL, NULL <unfinished ...>
[pid 28779] clock_gettime(CLOCK_REALTIME, {1130322148, 939977000}) = 0
[pid 28772] <... select resumed> )      = 1 (in [3])

Usage

$ strace [-ACdffhiqrtttTvVxxy] [-I n] [-b execve] [-e expr]... [-a column] [-o file] [-s strsize] [-X format] [-P path]... [-p pid]... { -p pid | [-D] [-E var[=val]]...
              [-u username] command [args] }

$ strace -c [-df] [-I n] [-b execve] [-e expr]... [-O overhead] [-S sortby] [-P path]... [-p pid]... { -p pid | [-D] [-E var[=val]]... [-u username] command [args] }

Startup:

  • -p [pid]: trace process with process id PID

Filtering:

  • -P [path]: trace vfork 出现来进程
  • -f: follow forks(fork出来的进程)
  • -e expr: A qualifying expression which modifies which events to trace or how to trace them. The format of the expression is: [qualifier=][!][?]value1[,[?]value2]... where qualifier is one of trace, abbrev, verbose, raw, signal, read, write, fault, inject, or kvm and value is a qualifier-dependent symbol or number. The default qualifier is trace.
    • Using an exclamation mark(感叹号) negates the set of values.
    • For example, -e open means literally -e trace=open which in turn means trace only the open system call.
    • By contrast, -e trace=!open means to trace every system call except open.
    • Question mark before the syscall qualification allows suppression of error in case no syscalls matched the qualification provided.
    • Appending one of “@64”, “@32”, or “@x32” suffixes to the syscall qualification allows specifying syscalls only for the 64-bit, 32-bit, or 32-on-64-bit personality, respectively. In addition, the special values all and none have the obvious meanings.
    • Note that some shells use the exclamation point for history expansion even inside quoted arguments. If so, you must escape the exclamation point with a backslash.
  • -e trace=set: Trace only the specified set of system calls. The -c option is useful for determining which system calls might be useful to trace. For example, trace=open,close,read,write means to only trace those four system calls. Be careful when making inferences about the user/kernel boundary if only a subset of system calls are being monitored. The default is trace=all.

Output format

  • -o [file]: Write the trace output to the file filename rather than to stderr
  • -A: Open the file provided in the -o option in append mode.
  • -t: Prefix each line of the trace with the wall clock time(经过的时间),e.g., 22:50:09 close(1) = 0
  • -tt: If given twice, the time printed will include the microseconds (e.g., 22:49:50.738093 close(2) = 0).
  • -ttt: If given thrice, the time printed will include the microseconds and the leading portion will be printed as the number of secon ds since the epoch (e.g., 1613746151.937596 close(2) = 0).
  • -T: Show the time spent in system calls. This records the time difference between the beginning and the end of each system call, e.g., close(1) = 0 <0.000022>
  • -r: Print a relative timestamp upon entry to each system call. This records the time difference between the beginning of successive system calls. Note that since -r option uses the monotonic clock time for measuring time difference and not the wall clock time, its measurements can differ from the difference in time reported by the -t option.

Demo

观察常规 System Call

一个测试的Python 程序:

import time

print("111")
time.sleep(1)
$ python sw.py
111
...
$ strace -ttp 28649
strace: Process 28649 attached
17:32:18.756573 _newselect(0, NULL, NULL, NULL, {tv_sec=3, tv_usec=219410}) = 0 (Timeout)

17:32:21.979863 write(1, "111\n", 4)    = 4
17:32:21.980421 clock_gettime(CLOCK_MONOTONIC, {tv_sec=766504, tv_nsec=210350862}) = 0
17:32:21.980850 _newselect(0, NULL, NULL, NULL, {tv_sec=10, tv_usec=0}) = 0 (Timeout)

17:32:31.991693 write(1, "111\n", 4)    = 4
17:32:31.992677 clock_gettime(CLOCK_MONOTONIC, {tv_sec=766514, tv_nsec=222688674}) = 0
17:32:31.993379 _newselect(0, NULL, NULL, NULL, {tv_sec=10, tv_usec=0}) = 0 (Timeout)
...
# 如果只想看 read system call
$ strace -ttp 28649 -e write
strace: Process 28649 attached
17:35:22.109868 write(1, "111\n", 4)    = 4
17:35:32.114985 write(1, "111\n", 4)    = 4
...

只关注特定 Sytem Call

# Trace a process and filter output by system call:
$ strace -p [pid] -e [system_call_name]
# 只关注 write 
$ strace -e write ls
...
write(1, "Desktop  Documents  Downloads  M"..., 89Desktop  Documents  Downloads  MagPi  Music  mymiio  Pictures  Public  Templates  Videos
) = 89
+++ exited with 0 +++

# 只关注 read 
$ strace -e read ls
...
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\254\3\0\0004\0\0\0"..., 512) = 512

# 只关注 gettimeofday 
$ strace -p 13031 -e gettimeofday
...
gettimeofday({tv_sec=1613748737, tv_usec=39467}, NULL) = 0
gettimeofday({tv_sec=1613748737, tv_usec=40644}, NULL) = 0

跟踪可执行程序

$ strace -f -F -o ~/straceout.txt myserver

-f -F 选项告诉strace同时跟踪fork和vfork出来的进程,-o选项把所有strace输出写到 ~/straceout.txt 里面,myserver是要启动和调试的程序。

跟踪服务程序

$ strace -o output.txt -T -tt -e trace=all -p 28979

跟踪28979进程的所有系统调用(-e trace=all。注意到,-e trace=all 是默认值,因此这里加不加这个对于输出的内容来说没有区别),并统计系统调用的花费时间,以及开始时间(并以可视化的时分秒格式显示),最后将记录结果存在output.txt文件里面。

观察 Signal

起一个HTTP server:

$ python2 -m SimpleHTTPServer 8888&
# 通过 SIGINT 杀掉这个 process
$ kill -2 12486
$ strace -p 12469 -e signal
strace: Process 12469 attached
--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
sigreturn({mask=[]})                    = -1 EINTR (Interrupted system call)
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6cc0120}, {sa_handler=0x114584, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6cc0120}, 8) = 0
+++ exited with 1 +++

-p - 观察特定进程

如果一个进程已经在运行,你可以通过它的pid进行追踪,它会显示追踪后这个进程的系统调用,使用CTRL+C退出。

# Start tracing a specific process by its PID:
$ strace -p [pid]

$ sudo strace -p 3569
strace: Process 3569 attached
restart_syscall(<... resuming interrupted poll ...>) = 1
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"U\2\24\300!\247\330\0\3\24\4\0\20\0\0\0\0\0\0\24\24\24\24\24\0\0\3\37%\2\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 32
recvmsg(4, 0x7ffee4dbf870, 0)           = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(4, 0x7ffee4dbf850, 0)           = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, -1) = 1 ([{fd=31, revents=POLLIN}])
read(31, "\372", 1)                     = 1
recvmsg(4, 0x7ffee4dbf850, 0)           = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, 0) = 1 ([{fd=31, revents=POLLIN}])
read(31, "\372", 1)                     = 1
recvmsg(4, 0x7ffee4dbf850, 0)           = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, 0) = 0 (Timeout)
mprotect(0x207faa20000, 8192, PROT_READ|PROT_WRITE) = 0
mprotect(0x207faa20000, 8192, PROT_READ|PROT_EXEC) = 0
mprotect(0x207faa21000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0x207faa21000, 4096, PROT_READ|PROT_EXEC) = 0
...

-c - summary syscalls

使用-c参数,可以得到追踪的每一种系统调用的耗时、次数和失败数,如下所示:

# Count time, calls, and errors for each system call and report a summary on program exit:
$ strace -p [pid] -c

$ sudo strace -c -p 3569

strace: Process 3569 attached

^Cstrace: Process 3569 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.73    0.016000           8      1971           poll
  0.16    0.000025           0       509        75 futex
  0.06    0.000010           0      1985      1966 recvmsg
  0.06    0.000009           0      2336           mprotect
  0.00    0.000000           0       478           read
  0.00    0.000000           0        13           write
  0.00    0.000000           0        29           mmap
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        18           writev
  0.00    0.000000           0       351           madvise
  0.00    0.000000           0         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.016044                  7700      2041 total

-T - time spent

# Show the time spent in every system call:
$ strace -p [pid] -T

显示每一次调用的时间戳

-t参数可以显示时间戳。

$ sudo strace -t df -h
15:19:25 execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0
15:19:25 brk(NULL)                      = 0x234c000
15:19:25 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
15:19:25 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8c7f1d9000
15:19:25 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
15:19:25 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
15:19:25 fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0
15:19:25 mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8c7f1b4000
15:19:25 close(3)                       = 0
15:19:25 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
15:19:25 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
15:19:25 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
15:19:25 fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
15:19:25 mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8c7ebec000
15:19:25 mprotect(0x7f8c7edac000, 2097152, PROT_NONE) = 0
...

macOS - dtruss

Usage

  • -p PID: examine this PID
  • -n name: processes with this name
  • -f: follow children as they are forked
  • -e: print elapsed times, us
  • -o: print on-cpu times, us
# examine PID 1871
$ dtruss -p 1871 

# examine all processes called "tar"
$ dtruss -n tar

# run test.sh and follow children
$ dtruss -f test.sh

# run the "date" command and print elapsed and on cpu times,
$ dtruss -eo date

Demo

$ sudo dtruss ls
Password:
SYSCALL(args) 		 = return
...
issetugid(0x0, 0x0, 0x0)		 = 0 0
getentropy(0x7FFEE11F6000, 0x20, 0x0)		 = 0 0
getentropy(0x7FFEE11F6050, 0x40, 0x0)		 = 0 0
getpid(0x0, 0x0, 0x0)		 = 16330 0
stat64("/AppleInternal\0", 0x7FFEE11F65F0, 0x0)		 = -1 Err#2
csops_audittoken(0x3FCA, 0x7, 0x7FFEE11F6140)		 = 0 0
proc_info(0x2, 0x3FCA, 0xD)		 = 64 0
csops_audittoken(0x3FCA, 0x7, 0x7FFEE11F59C0)		 = 0 0
open_nocancel("/usr/share/locale/UTF-8/LC_CTYPE\0", 0x0, 0x7)		 = 3 0
fcntl_nocancel(0x3, 0x3, 0x0)		 = 0 0
getrlimit(0x1008, 0x7FFEE11F6D10, 0x0)		 = 0 0
fstat64(0x3, 0x7FFEE11F6D88, 0x0)		 = 0 0
fstat64(0x3, 0x7FFEE11F6B88, 0x0)		 = 0 0
lseek(0x3, 0x0, 0x1)		 = 0 0
lseek(0x3, 0x0, 0x0)		 = 0 0
read_nocancel(0x3, "RuneMagAUTF-8\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "\0", 0x1000)		 = 4096 0
read_nocancel(0x3, "@\004\211\0", 0xF5D0)		 = 62928 0
close_nocancel(0x3)		 = 0 0
ioctl(0x1, 0x4004667A, 0x7FFEE11F72E4)		 = 0 0
ioctl(0x1, 0x40087468, 0x7FFEE11F7958)		 = 0 0
getuid(0x0, 0x0, 0x0)		 = 0 0
fstatat64(0xFFFFFFFFFFFFFFFE, 0x7FC9A1704298, 0x7FFEE11F71F8)		 = 0 0
open_nocancel(".\0", 0x1000000, 0x0)		 = 3 0
fchdir(0x3, 0x0, 0x0)		 = 0 0
open_nocancel(".\0", 0x1000000, 0x0)		 = 4 0
open_nocancel(".\0", 0x1100004, 0x0)		 = 5 0
getattrlistbulk(0x5, 0x7FFEE11F7248, 0x7FC9A280A000)		 = 92 0
getattrlistbulk(0x5, 0x7FFEE11F7248, 0x7FC9A280A000)		 = 0 0
close_nocancel(0x5)		 = 0 0
fchdir(0x4, 0x0, 0x0)		 = 0 0
close_nocancel(0x4)		 = 0 0
fstat64(0x1, 0x7FFEE11F6AA8, 0x0)		 = 0 0
ioctl(0x1, 0x4004667A, 0x7FFEE11F6AF4)		 = 0 0
write_nocancel(0x1, ".CFUserTextEncoding\t\t.config\t\t\t\t.npm\t\t\t\t.tldr\t\t\t\tDocuments\t\t\tgo\n@\004(\0", 0x40)		 = 64 0
write_nocancel(0x1, ".DS_Store\t\t\t.dlv\t\t\t\t.oh-my-zsh\t\t\t.viminfo\t\t\tDownloads\t\t\tgoogle-cloud-sdk\n\004(\0", 0x49)		 = 73 0
write_nocancel(0x1, ".ShadowsocksX-NG\t\t.docker\t\t\t\t.oracle_jre_usage\t\t.vnc\t\t\t\tGolandProjects\t\t\tgrep\n(\0", 0x4E)		 = 78 0
write_nocancel(0x1, ".Trash\t\t\t\t.gitconfig\t\t\t.pip\t\t\t\t.vscode\t\t\t\tIdeaProjects\t\t\tjava_error_in_goland_646.log\n\b\0", 0x56)		 = 86 0
write_nocancel(0x1, ".TrojanX\t\t\t.gitflow_export\t\t\t.profile\t\t\t.wget-hsts\t\t\tLibrary\t\t\t\tlogs\n$\b\0", 0x45)		 = 69 0
write_nocancel(0x1, ".VenGO\t\t\t\t.gitignore_global\t\t.python_history\t\t\t.zcompdump\t\t\tMovies\t\t\t\tmymiio\n\004(\0", 0x4D)		 = 77 0
write_nocancel(0x1, ".account\t\t\t.gitkraken\t\t\t.qshell\t\t\t\t.zcompdump-weishi-mac-5.3\tMusic\t\t\t\tnode_modules\n\0", 0x53)		 = 83 0
write_nocancel(0x1, ".adobe\t\t\t\t.gvm\t\t\t\t.rediscli_history\t\t.zcompdump-weishi-mac-5.7.1\tPictures\t\t\tnohup.out\n\0", 0x56)		 = 86 0
write_nocancel(0x1, ".android\t\t\t.hgignore_global\t\t.s3cfg\t\t\t\t.zsh_history\t\t\tPostman\t\t\t\tpackage-lock.json\n\0", 0x53)		 = 83 0
write_nocancel(0x1, ".anyconnect\t\t\t.hidpi-disable\t\t\t.smc_config.json\t\t.zshrc\t\t\t\tPublic\t\t\t\tpprof\n\0", 0x4B)		 = 75 0
write_nocancel(0x1, ".atom\t\t\t\t.homeassistant\t\t\t.smc_docker\t\t\t.zshrc.backup\t\t\tPycharmProjects\t\t\tsme\n\0", 0x4E)		 = 78 0
write_nocancel(0x1, ".bash_history\t\t\t.jetbrains\t\t\t.sogouinput\t\t\t0\t\t\t\tSW\t\t\t\tspex\n\0", 0x3B)		 = 59 0
write_nocancel(0x1, ".bash_profile\t\t\t.local\t\t\t\t.spcli\t\t\t\t0#\t\t\t\\n@\004\0", 0x40)		 = 64 0
write_nocancel(0x1, ".bash_sessions\t\t\t.matplotlib\t\t\t.spex\t\t\t\t1.html\t\t\t\tWorking\n\0", 0x3A)		 = 58 0
write_nocancel(0x1, ".bashrc\t\t\t\t.mysql_history\t\t\t.ssh\t\t\t\tApplications\t\t\tdelete_disable_sdsf.sql\n\0", 0x4B)		 = 75 0
write_nocancel(0x1, ".cisco\t\t\t\t.node_repl_history\t\t.stCommitMsg\t\t\tDesktop\t\t\t\tdump.rdb\n\004\0", 0x41)		 = 65 0
fchdir(0x3, 0x0, 0x0)		 = 0 0
close_nocancel(0x3)		 = 0 0

Reference