Understanding the Linux fork() Function in Depth

Date: June 20, 2015

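I. Introducing the Problem
At work, one of our system designers tossed out the following question: how many "-" characters does the code below print?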

/****************************************************************************** 
Copyright by Thomas Hu, All rights reserved! 
Filename    : fork01.c 
Author      : Thomas Hu 
Date        : 2012-8-5 
Version     : 1.0 
Description : Original form of the fork puzzle 
******************************************************************************/  
#include <unistd.h>  
#include <stdio.h>  
  
int main()  
{  
    int i = 0;  
    for(i = 0; i < 2; i++)  
    {  
        fork();  
        printf("-");  
    }  
  
    return 0;  
}

A long time passed and still nobody had answered (perhaps everyone was too busy to bother ^_^).

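If your answer is 2, I suggest you first read up on how fork is used on Linux.
If your answer is 6, you already understand fork reasonably well, but you should keep reading.
If your answer is 8 and you understand the reason behind it (rather than having simply run the program), you do not need this article and can move right along ^_^.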

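I worked through it roughly, then typed in the code, compiled it, and ran it. It printed 8 dashes, which baffled me, because by my reasoning there should have been 6. Only after some reading did I realize that I had not understood what fork really does underneath, so I wrote this article for us to explore together.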

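To understand how fork executes, we first need a clear picture of what an operating system means by a "process". A process consists of three main elements:
1. An executable program;
2. All of the data associated with the process (variables, memory space, buffers, and so on);
3. The program's execution context.
Put simply, a process represents the state of one particular run of an executable program. An operating system typically manages processes through a process table, in which each entry records the state of one process currently in the system. On a single-CPU machine only one process occupies the CPU at any given moment, but many active processes (waiting to run or to resume) can exist in the system at once.
A register called the program counter (PC) holds the location of the next instruction to be executed by the process currently on the CPU.
When a process has used up its CPU time slice, the operating system saves the values of that process's registers into its process table entry, reads the context of the process that will run next from the process table, and loads the registers accordingly. This is called a context switch. A real context switch involves more data than this, but that is unrelated to fork and I will not go into it. The point to remember is that the program counter records how far the program has executed and is an essential part of the process context: it must be saved when a process is switched off the CPU, and restored from the process table entry when the process is switched back on.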

II. fork in Detail

    #include <unistd.h>
    #include <sys/types.h>
Function prototype:
    pid_t fork(void);
(pid_t is not a macro but a typedef for a signed integer type, declared in <sys/types.h>; on Linux with glibc it is an int.)
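Return value:
A successful call returns twice: 0 in the child, and the child's process ID in the parent. On failure, -1 is returned in the parent and no child is created.
fork can fail for two main reasons: (1) the number of processes has reached a system-imposed limit, in which case errno is set to EAGAIN; (2) the system is short of memory, in which case errno is set to ENOMEM.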
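Description:
An existing process can call fork to create a new process, which is called the child process. fork is called once but returns twice. The only difference between the two returns is that the child gets 0 while the parent gets the child's process ID. The child's ID is returned to the parent because a process can have more than one child, and there is no function that lets a process obtain the IDs of all its children. The child gets 0 because it can always call getpid() to obtain its own PID and getppid() to obtain its parent's. (Process ID 0 is always used by the swapper process, so a child's process ID can never be 0.)
The child is a copy of the parent: it receives copies of the parent's data space, heap, and stack. Note that these are copies, which means that parent and child do not share these regions of memory.
Linux gives the child a copy of the contents of the parent's address space, so the child has an address space of its own. (The copy is made lazily, using copy-on-write, as the NOTES section of the manual page quoted below explains.) You can picture two processes that had been running in lockstep all along and that, after the fork, go off to do different work. They diverge, which is where the name fork comes from.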
The Linux manual page describes fork in great detail:

DESCRIPTION
fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points:

* The child has its own unique process ID, and this PID does not match the ID of any existing process group (setpgid(2)).

* The child's parent process ID is the same as the parent's process ID.

* The child does not inherit its parent's memory locks (mlock(2), mlockall(2)).

* Process resource utilizations (getrusage(2)) and CPU time counters (times(2)) are reset to zero in the child.

* The child's set of pending signals is initially empty (sigpending(2)).

* The child does not inherit semaphore adjustments from its parent (semop(2)).

* The child does not inherit record locks from its parent (fcntl(2)).

* The child does not inherit timers from its parent (setitimer(2), alarm(2), timer_create(2)).

* The child does not inherit outstanding asynchronous I/O operations from its parent (aio_read(3), aio_write(3)), nor does it inherit any asynchronous I/O contexts from its parent (see io_setup(2)).

The process attributes in the preceding list are all specified in POSIX.1-2001. The parent and child also differ with respect to the following Linux-specific process attributes:

* The child does not inherit directory change notifications (dnotify) from its parent (see the description of F_NOTIFY in fcntl(2)).

* The prctl(2) PR_SET_PDEATHSIG setting is reset so that the child does not receive a signal when its parent terminates.

* Memory mappings that have been marked with the madvise(2) MADV_DONTFORK flag are not inherited across a fork().

* The termination signal of the child is always SIGCHLD (see clone(2)).

Note the following further points:

* The child process is created with a single thread: the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

* The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).

* The child inherits copies of the parent's set of open message queue descriptors (see mq_overview(7)). Each descriptor in the child refers to the same open message queue description as the corresponding descriptor in the parent. This means that the two descriptors share the same flags (mq_flags).

* The child inherits copies of the parent's set of open directory streams (see opendir(3)). POSIX.1-2001 says that the corresponding directory streams in the parent and child may share the directory stream positioning; on Linux/glibc they do not.

RETURN VALUE
On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set appropriately.

ERRORS
EAGAIN fork() cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child.

EAGAIN It was not possible to create a new process because the caller's RLIMIT_NPROC resource limit was encountered. To exceed this limit, the process must have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.

ENOMEM fork() failed to allocate the necessary kernel structures because memory is tight.

CONFORMING TO
SVr4, 4.3BSD, POSIX.1-2001.

NOTES
Under Linux, fork() is implemented using copy-on-write pages, so the only penalty that it incurs is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child.

Since version 2.3.3, rather than invoking the kernel's fork() system call, the glibc fork() wrapper that is provided as part of the NPTL threading implementation invokes clone(2) with flags that provide the same effect as the traditional system call. The glibc wrapper invokes any fork handlers that have been established using pthread_atfork(3).

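I trust everyone can follow the English above ^_^ (if not, programming may not be quite the right line of work for you). If there is demand, I will translate it next time I have a moment ^_^.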

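III. Analyzing the Problem
All of the preamble above is there to explain the "weird" output of 8 dashes. With fork, the child gets a copy of the parent's entire virtual address space (including the state of mutexes, condition variables, and other pthreads objects) and inherits the parent's open file descriptors, open message queue descriptors, and open directory streams. It does not inherit memory locks, CPU time counters, semaphore adjustments, record locks, or timers.
Let us step through the source, starting at the for loop.
1. When i = 0, the loop body calls fork, and the parent process (call it P) creates a child (call it A). A has the same variable values as its parent, so i is also 0 in A. Both P and A then execute the printf statement. From here on there are two processes in the system, analyzed separately below.
2. In P, i is incremented to 1, which satisfies the loop condition, so the body runs again. fork creates another child, B, and i = 1 in both P and B. P and B then each execute the printf statement.
3. In A, i is incremented to 1, which satisfies the loop condition, so the body runs again. fork creates a child of A (call it AA), and i = 1 in both A and AA. A and AA then each execute the printf statement.
4. In P, A, AA, and B, i is incremented once more to 2. The loop condition now fails in all four processes, so each leaves the loop, executes the return statement that follows, and terminates.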
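The analysis above is illustrated in the figure below (boxes of the same color are the same process):

[Figure: process diagram for the fork loop; image not available]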
Attentive readers may object: four processes executed printf only 6 times in total, so how can 8 dashes be printed? Indeed, printf ran exactly 6 times, no question about it!
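The culprit is the printf("-"); statement. Linux/Unix distinguishes block devices, which transfer data a block at a time (disks, for example), from character devices, which transfer data a character at a time (keyboards, serial ports, and terminals). What matters here, however, is not a device cache but the C standard library's own buffer for stdout, which lives in the process's memory: standard output is line-buffered when it refers to a terminal and fully buffered otherwise.
So in the program above, printf("-"); only places the "-" in that buffer without actually writing it out. At the second fork the buffer, which already holds one "-", is copied into the child along with the rest of the address space. Each of the two children created at that point therefore carries one extra "-", and when the four processes flush their buffers on exit the total is 8 rather than 6.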
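If we change the printf statement in the original program to:
printf("-\n");
or to:
printf("-");
fflush(stdout);

the problem goes away and the program prints only 6 dashes. A stdio buffer is written out when the program calls fflush, when the buffer fills up, when the stream is closed or the program exits normally, and, for a line-buffered stream such as stdout on a terminal, when a "\n" is written. Note that the "\n" variant relies on line buffering: if the output is redirected to a file or a pipe, stdout is fully buffered and the duplicated output reappears.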

The complete code is as follows:

/****************************************************************************** 
Copyright by Thomas Hu, All rights reserved! 
Filename    : fork02.c 
Author      : Thomas Hu 
Date        : 2012-8-5 
Version     : 1.0 
Description : fork puzzle; prints process IDs so that the process tree can be inspected with pstree -p 
******************************************************************************/  
#include <unistd.h>  
#include <stdio.h>  
  
int main()  
{  
    int i = 0;  
    for(i = 0; i < 2; i++)  
    {  
        fork();  
          
        /* Note: the printf below ends with "\n" */  
        printf("ppid=%d, pid=%d, i=%d \n", getppid(), getpid(), i);  
    }  
  
    sleep(10); /* Keep the process alive for ten seconds so that we can inspect the process tree with pstree -p */  
  
    return 0;  
}

The output is as follows:

[Figure: program output; image not available]

Viewed as a process tree:

[Figure: process tree from pstree -p; image not available]

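As the figure below shows, the two child processes drawn shaded and with a double border are the ones that copied the contents of the parent's standard output buffer, which is what caused the extra output.

[Figure: process tree with the two duplicating children highlighted; image not available]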

Note: the two process-tree figures above are taken from http://coolshell.cn/articles/7965.html . Copyright belongs to the original author, to whom I am grateful!

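IV. Summary
In programming there is no such thing as a truly "weird" event: every effect has a cause, and every cause has an effect. When something "weird" happens, it means that in some hidden corner there is something we did not think of or did not understand deeply enough, and that is what makes the behavior seem "unbelievable".
Once we look past the symptom to the underlying mechanism, such "weird" problems resolve themselves, and the "weird" behavior turns out to be perfectly natural. It was our own ignorance that created the "ghost story" ^_^.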
