6.S081:3.素数筛Primes实验


6.S081:3.素数筛Primes实验

Problem:primes (moderate)/(hard)

Write a concurrent version of prime sieve using pipes. This idea is due to Doug McIlroy, inventor of Unix pipes. The picture halfway down this page and the surrounding text explain how to do it. Your solution should be in the file user/primes.c.

Your goal is to use pipe and fork to set up the pipeline. The first process feeds the numbers 2 through 35 into the pipeline. For each prime number, you will arrange to create one process that reads from its left neighbor over a pipe and writes to its right neighbor over another pipe. Since xv6 has limited number of file descriptors and processes, the first process can stop at 35.

primes

Some hints:

  • Be careful to close file descriptors that a process doesn't need, because otherwise your program will run xv6 out of resources before the first process reaches 35.
  • Once the first process reaches 35, it should wait until the entire pipeline terminates, including all children, grandchildren, &c. Thus the main primes process should only exit after all the output has been printed, and after all the other primes processes have exited.
  • Hint: read returns zero when the write-side of a pipe is closed.
  • It's simplest to directly write 32-bit (4-byte) ints to the pipes, rather than using formatted ASCII I/O.
  • You should create the processes in the pipeline only as they are needed.
  • Add the program to UPROGS in Makefile.

Your solution is correct if it implements a pipe-based sieve and produces the following output:

$ make qemu
...
init: starting sh
$ primes
prime 2
prime 3
prime 5
prime 7
prime 11
prime 13
prime 17
prime 19
prime 23
prime 29
prime 31
$

思考

解决这道题的关键是让管道连续流动,父进程写数据,子进程读数据即可。那势必需要多次fork(),因为没遇到一个素数就需要创建一个新进程,管道流动起来。

首先明确,父进程的职能是将素数写入管道,子进程负责从管道读出素数,接着进行一次筛选,再创建新进程,此时子进程变成了父进程,重复上述操作即可,上述操作的伪代码可以这样描述。

while(condition){
    if(fork() == 0){
        read primes from pipe;
        prime sieve;
        print min prime;
    }
    else{
        write primes into pipe;
    }
}

只有fork()出来的父子进程才能通过一个管道通信,为了完成父进程写,子进程读,需要我们关闭父进程的读端和子进程的写端,管道通信的示意图如下

pipe

由于管道通信仅涉及父子进程,那么一个想法是在每一次创建进程之前,都先进行pipe(fd),这样父子进程的fd都指向同一管道。

while(condition){
    int fd[2];
    pipe(fd);
    if(fork() == 0){
        close(fd[1]);
        read(fd[0], buf, n);//read primes from pipe;
        prime sieve;
        print min prime;
        exit(0);
    }
    else{
        //write primes into pipe;
        close(fd[0]);
        write(fd[1], buf, n);
        wait()
        exit(0);
    }
}

这样基本上整个问题的大致思路就对了,只需要把细节考虑进去

  • 由于需要一组数在管道中流动,那么实际上需要传送一个数组,为了简单起见,每次只传送一个整数,这样就需要,一个循环来读写。
  • while停止的条件,可以记录每次传送数据的个数,当数据个数为0时,认为素数筛选完毕,此时退出循环,程序结束。
  • 值得注意的是,由于每次循环都执行fork()操作,会导致进程指数级增长(多叉树),我们期望的是这棵树长成一个单链表,那就需要wait()机制,wait可以让父进程等待子进程的退出,在父进程的代码里,加上wait()则会让父进程阻塞在这里,从而不会执行循环,也就不会形成多叉树。
#include "kernel/types.h"
#include "user/user.h"
#define MAX 34


int main(){
    int num[MAX];
    for(int i = 0; i < MAX; i++)
        num[i] = i + 2;
    int len = MAX;
    while(len != 0){
        int fd[2];
        if(pipe(fd) < 0){
            printf("pipe error!\n");
            exit(-1);
        }
        int pid = fork();
        if(pid < 0){
            printf("fork error\n");
            exit(-2);
        }
        else if(pid > 0){
            close(fd[0]);
            for(int i = 0 ; i < len; i++)
                write(fd[1], &num[i], sizeof(int));
            close(fd[1]);
            wait(0);	// d
            exit(0);
        }
        else{
            close(fd[1]);
            len = 0;
            while(read(fd[0], &num[len++], sizeof(int)) != 0);
            len--;
            int prime = num[0];
            printf("prime %d\n", prime);
            int index = 0;
            for(int i = 0; i < len; i++)
                if(num[i] % prime != 0)
                    num[index++] = num[i];
            len = index;
        }
    }
    exit(0);
}

彩蛋

其实没有wait(),让父进程直接退出也能达到不产生多叉树的效果,但是如果注释掉wait(),会发现一个意想不到的结果。

会发现,命令提示符$被打印了出来?这里的原因需要追溯到shell的运行机制,实际上shell也是一个c程序,中间也是一段fork(),子进程执行命令,父进程等待子进程的退出(显示命令提示符$),由于cpu调度时,一次性让primes.c的主进程(shell的子进程)执行完了,进程退出后,会回到shell程序,接着打印出$,从而有了上述的结果。


Author: Paranoid
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Paranoid !
评论
  TOC