[Operating System] Processes - Several Ways to Start a Process on Linux

Posted by 西维蜀黍 on 2019-07-18, Last Modified on 2022-12-10

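Sometimes a program (process) needs to start another program (process) to do part of its work. How can we start another process from within our own? Linux offers several ways to do this. This article introduces them and the differences between them.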

The system() function

Function

The system() function hands the argument command to the command interpreter sh(1). The calling process waits for the shell to finish executing the command, ignoring SIGINT and SIGQUIT, and blocking SIGCHLD.

If command is a NULL pointer, system() will return non-zero if the command interpreter sh(1) is available, and zero if it is not.

The system() library function uses fork(2) to create a child process that executes the shell command specified in command using execl(3) as follows:

execl("/bin/sh", "sh", "-c", command, (char *) NULL);

system() returns after the command has been completed.

SYNOPSIS

#include <stdlib.h>
int system (const char *string);

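It runs the command passed to it as a string and waits for that command to finish, just as if sh -c string had been run from a shell. If the child process cannot be created, or its status cannot be retrieved, system() returns -1. If the shell cannot be executed in the child process, the return value is as though the shell had terminated by calling _exit(127). Otherwise, system() returns the termination status of the shell: a wait status from which the command's exit code can be extracted with WEXITSTATUS().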

Demo

Run the following example first. The source file is sw.c:

#include <stdlib.h>
#include <stdio.h>
 
int main()
{
    printf("Running test with system\n");
    system("echo $SW");

    // The code below continues only after the sleep process has finished
    int res = system("sleep 100");
    printf("test Done %d\n", res);
    exit(0);
}
$ gcc sw.c
$ SW=aaa ./a.out
Running test with system
aaa
test Done 0

Observation:

# Inspect the environment variables of each process
$ ps eww 21530
  PID   TT  STAT      TIME COMMAND
21530 s001  S+     0:00.01 ./a.out TERM_SESSION_ID=w0t1p0:A303486F-BF80-4546-84E2-99859FED4FEF ... SW=aaa _=/Users/wei.shi/Downloads/./a.out
$ ps eww 21532
  PID   TT  STAT      TIME COMMAND
21532 s001  S+     0:00.01 sleep 100 TERM_PROGRAM=iTerm.app ... SW=aaa

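Notes

  • system() starts a child process via fork to execute the given command
    • As a result, environment variables set in the parent process are also visible in the child process (in the example above, the child process sleep 100, with PID 21532, can also read $SW)
  • system() is blocking: if the child process takes a long time to run, the parent process is blocked for that long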

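Replacing the process image: the exec family of functions

In short, exec replaces the program running in the current process with a new program: execution switches from the current program to another one, while the process itself (and its PID) stays the same.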

Function

In computing, exec is a functionality of an operating system that runs an executable file in the context of an already existing process, replacing the previous executable. This act is also referred to as an overlay. It is especially important in Unix-like systems, although it exists elsewhere. As no new process is created, the process identifier (PID) does not change, but the machine code, data, heap, and stack of the process are replaced by those of the new program.

SYNOPSIS

#include <unistd.h>

extern char **environ;

int execl(const char *path, const char *arg, ...
                /* (char  *) NULL */);
int execlp(const char *file, const char *arg, ...
                /* (char  *) NULL */);
int execle(const char *path, const char *arg, ...
                /*, (char *) NULL, char * const envp[] */);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[],
                char *const envp[]);

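The exec family consists of a group of related functions that differ in how they locate the program and how they take its arguments.

What they all have in common is that they replace the current process image with a new one, so execution switches from the current program to the new program.

Once the new program starts, the original program no longer runs. The new program is specified by the path or file argument. The exec functions are more efficient than system(), because they neither start a shell nor create an additional process.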

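The exec functions in detail

The exec functions fall into two main groups:

  • execl(), execlp(), and execle() take a variable-length argument list terminated by a null pointer
  • execv(), execvp(), and execve() take an array of strings as their second argument (the array is terminated by a null pointer)
    • When the new program starts, argv becomes the argument vector of its main function. For the functions that accept it, envp is passed to the new program as its environment, which changes the environment variables available to it.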

The execle() and execvpe() functions allow the caller to specify the environment of the executed program via the argument envp. The envp argument is an array of pointers to null-terminated strings and must be terminated by a null pointer. The other functions take the environment for the new process image from the external variable environ in the calling process.


Special semantics for execlp(), execvp() and execvpe()

The execlp(), execvp(), and execvpe() functions duplicate the actions of the shell in searching for an executable file if the specified filename does not contain a slash (/) character. The file is sought in the colon-separated list of directory pathnames specified in the PATH environment variable. If this variable isn’t defined, the path list defaults to a list that includes the directories returned by confstr(_CS_PATH) (which typically returns the value “/bin:/usr/bin”) and possibly also the current working directory; see NOTES for further details.

Demo

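Continuing from the previous example, here is how each of these six functions would be called to start ps:

Note: arg0 is the name of the program, so it is ps in every call here.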

char *const ps_envp[] = {"PATH=/bin:/usr/bin", "TERM=console", NULL};
char *const ps_argv[] = {"ps", "au", NULL};
 
// The variadic forms need an explicit (char *) cast on the terminating null pointer
execl("/bin/ps", "ps", "au", (char *)NULL);
execlp("ps", "ps", "au", (char *)NULL);
execle("/bin/ps", "ps", "au", (char *)NULL, ps_envp);
 
execv("/bin/ps", ps_argv);
execvp("ps", ps_argv);
execve("/bin/ps", ps_argv, ps_envp);

Here is a complete example. The source file is new_ps_exec.c:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
 
int main()
{
    printf("Running ps with execlp\n");
    execlp("ps", "ps", "au", (char*)0);
    // Not reached unless execlp() fails: on success this program has been replaced by ps
    printf("ps Done\n");
    exit(0);
}
$ gcc -o new_ps_exec new_ps_exec.c
$ ./new_ps_exec
Running ps with execlp
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       496  0.0  3.1 185612 29576 tty7     Ssl+ Feb09   5:31 /usr/lib/xorg/Xorg :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
root       497  0.0  0.1   5620  1572 tty1     Ss   Feb09   0:00 /bin/login -f
pi         713  0.0  0.1   8492  1520 tty1     S+   Feb09   0:00 -bash
pi         918  0.0  0.4   8488  3808 pts/0    Ss   12:31   0:00 -bash
pi        2584  0.0  0.2   9788  2448 pts/0    R+   13:46   0:00 ps au

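Note that the exec functions normally do not return; they return -1 only if an error occurs. The new program started by exec inherits many attributes of the original process: file descriptors that were open in the original process remain open in the new program (unless their close-on-exec flag is set), but any directory streams that were open in the original process are closed.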

Duplicating the process image: the fork() function

Function

fork() causes creation of a new process. The new process (child process) is an exact copy of the calling process (parent process) except for the following:

  • The child process has a unique process ID.
  • The child process has a different parent process ID (i.e., the process ID of the parent process).
  • The child process has its own copy of the parent’s descriptors. These descriptors reference the same underlying objects, so that, for instance, file pointers in file objects are shared between the child and the parent, so that an lseek(2) on a descriptor in the child process can affect a subsequent read or write by the parent. This descriptor copying is also used by the shell to establish standard input and output for newly created processes as well as to set up pipes.
  • The child process's resource utilizations are set to 0; see setrlimit(2).

SYNOPSIS

   #include <sys/types.h>
   #include <unistd.h>

   pid_t fork(void);

fork() creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process.

The child process and the parent process run in separate memory spaces. At the time of fork() both memory spaces have the same content. Memory writes, file mappings, and unmappings performed by one of the processes do not affect the other.

The child process is an exact duplicate of the parent process except for the following points:

  • The child has its own unique process ID, and this PID does not match the ID of any existing process group (setpgid(2)) or session.
  • The child’s parent process ID is the same as the parent’s process ID.
  • The child does not inherit its parent’s memory locks (mlock(2), mlockall(2)).
  • Process resource utilizations (getrusage(2)) and CPU time counters (times(2)) are reset to zero in the child.
  • The child’s set of pending signals is initially empty (sigpending(2)).
  • The child does not inherit semaphore adjustments from its parent (semop(2)).
  • The child does not inherit process-associated record locks from its parent (fcntl(2)). (On the other hand, it does inherit fcntl(2) open file description locks and flock(2) locks from its parent.)
  • The child does not inherit timers from its parent (setitimer(2), alarm(2), timer_create(2)).
  • The child does not inherit outstanding asynchronous I/O operations from its parent (aio_read(3), aio_write(3)), nor does it inherit any asynchronous I/O contexts from its parent (see io_setup(2)).

RETURN VALUE

On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set appropriately.

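Using fork()

The exec functions replace the program in the current process, whereas fork() duplicates the current process into a new one. The new process is almost identical to the original and executes the same code, but it has its own data space, environment, and file descriptors.

The prototype of fork() is:

#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);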

Waiting for a process

The prototypes of wait() and waitpid() are:

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *stat_loc);
pid_t waitpid(pid_t pid, int *stat_loc, int options);

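wait() is called in the parent process. It suspends the parent until one of its child processes ends and returns the PID of that child. If stat_loc is not a null pointer, the status information is written to the location it points to.

waitpid() waits for the child process whose process ID is pid to end (if pid is -1, it returns information about any child process). The stat_loc argument works the same way as for wait(). The options argument changes the behaviour of waitpid(); one very important option is WNOHANG, which prevents the caller of waitpid() from being suspended. With WNOHANG, waitpid() returns 0 if the child has not yet terminated; otherwise it returns the PID of the child, or -1 on error.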

Demo

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
  
    // fork() creates a second process; both processes
    // continue running this program from the next statement
    fork();
  
    printf("Hello world!\n");
    sleep(100);
    return 0;
}
$ gcc sw.c
$ ./a.out
Hello world!
Hello world!

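Summary

First, the simplest option is system(). It has to start a new shell and run the command inside that shell, so it depends heavily on the environment and is not very efficient. In addition, system() waits for the child process to finish before the statements after it can run.

The exec functions replace the current program with a new one. This is more efficient, but control never returns to the original program: none of the code after the exec call is executed unless the call fails. The new program inherits many attributes of the original process: file descriptors that were open remain open, but note that any open directory streams are closed.

fork() duplicates the current process into a new one. The new process is almost identical to the original and executes the same code, but it has its own data space, environment variables, and file descriptors. We usually use the return value of fork() to tell whether we are in the child or the parent. Unlike exec, fork() does return: it returns a pid_t value that tells the two apart, and both processes continue with the code after the fork() call.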
