进程调度函数schedule()解读

在linux系统中,单处理器也是多线程处理信号、事件等。这就需要一个核心算法来进行进程调度。这个算法就是CFS(Completely Fair Scheduler)。在 LInux Kernel Development 一书中用一句话总结CFS进程调度:

运行rbtree树中最左边叶子节点所代表的那个进程。

在一个自平衡二叉搜索树红黑树rbtree的树节点中,存储了下一个应该运行进程的数据。在这里我们看到了二叉搜索树的完美运用。具体可参见Introduction to Algorithms Page 174~182。

而进程调度的主要入口函数就是schedule()。它定义在文件kernel/sched.c中。

我们先看一个在等待队列中进行进程调度的例子:

    DEFINE_WAIT(wait); //申明等待队列

    add_wait_queue(q,&wait); //把我们用的q队列加入到wait等待队列中
    while(!condition){ //当等待事件没有来临时
         prepare_to_wait(&q,&wait,TASK_INTERRUPTIBLE);
         //将q从TASK_RUNNING或者其他状态置为TASK_INTERRUPTIBLE不可运行的休眠状态。
         //同时接受信号&&事件来唤醒它
         if(signal_pending(current))  //如果有来自从处理器的信号
         { processingsignal();}//处理信号
         schedule(); //调用红黑树中的下一个进程
    }
    finish_wait(&q,&wait); //将进程设置为TASK_RUNNING并移出等待队列.

其实我们可以这么理解这段代码。现在有一个任务要等待事件到来才能运行,怎么实现呢?就是阻塞加查询。但是这样会使得这段代码独占整个操作系统。为了解决这个问题,就在阻塞查询之中加入了队列和进程调度schedule(),从而不耽误其它线程的执行。

再来看一看schedule()函数的结构:


##schedule()函数结构##

 
    asmlinkage void __sched schedule(void)  ///定义通过堆栈传值
    {
    struct task_struct *prev, *next;
    unsigned long *switch_count;
    struct rq *rq;
    int cpu;

    /*At the end of this function, it will check if need_resched() return
    true, if that indeed happen, then goto here.*/
    need_resched:

    /*current process won't be preempted after call preemept_disable()*/
    preempt_disable(); //不让优先占有当前进程
    cpu = smp_processor_id();
    rq = cpu_rq(cpu);
    /* rcu_sched_qs ? */
    rcu_sched_qs(cpu);

    /* prev point to current task_struct */
    prev = rq->curr;

    /* get current task_struct's context switch count */
    switch_count = &prev->nivcsw;

    /* kernel_flag is "the big kernel lock". 
     * This spinlock is taken and released recursively by lock_kernel()
     * and unlock_kernel(). It is transparently dropped and reacquired
     * over schedule(). It is used to protect legacy code that hasn't
     * been migrated to a proper locking design yet.
     * In task_struct, there is a member lock_depth, which is inited -1,
     * indicates that the current task have no kernel lock.
     * When lock_depth >=0 indicate that it own kernel lock.
     * During context switching, it is not permitted that the task  
     * switched away remain own kernel lock , so in scedule(),it
     * call release_kernel_lock(), release kernel lock.
     */
    release_kernel_lock(prev);

    need_resched_nonpreemptible:

    schedule_debug(prev);

    if (sched_feat(HRTICK))
        hrtick_clear(rq);

    /* occupy current rq's lock */
    raw_spin_lock_irq(&rq->lock); //占有rq自旋锁

    /* update rq's clock,this function will call sched_clock_cpu() */
    update_rq_clock(rq);

    /* clear bit in task_struct's thread_struct's flag TIF_NEED_RESCHED.
     * In case that it will be rescheduled, because it prepare to give
     * up cpu.
     */
    clear_tsk_need_resched(prev);



    if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
        if (unlikely(signal_pending_state(prev->state, prev)))
            prev->state = TASK_RUNNING;
        else
            deactivate_task(rq, prev, 1);
        switch_count = &prev->nvcsw;
    }

    /* For none-SMP, pre_schedule is NULL */
    pre_schedule(rq, prev);

  
    if (unlikely(!rq->nr_running))
        idle_balance(cpu, rq);

    put_prev_task(rq, prev);


    next = pick_next_task(rq);


    if (likely(prev != next)) {
        sched_info_switch(prev, next);
        perf_event_task_sched_out(prev, next);

        rq->nr_switches++;
        rq->curr = next;
        ++*switch_count;

        context_switch(rq, prev, next); /* unlocks the rq */
        /*
         * the context switch might have flipped the stack from under
         * us, hence refresh the local variables.
         */
        cpu = smp_processor_id();
        rq = cpu_rq(cpu);
    } else
      raw_spin_unlock_irq(&rq->lock);//current task still occupy cpu

    post_schedule(rq);

    if (unlikely(reacquire_kernel_lock(current) < 0)) {
        prev = rq->curr;
        switch_count = &prev->nivcsw;
        goto need_resched_nonpreemptible;
    }

    preempt_enable_no_resched();
    if (need_resched())
        goto need_resched;
    }
    EXPORT_SYMBOL(schedule);

schedule()函数的目的在于用另一个进程替换当前正在运行的进程。因此,这个函数的主要结果就是设置一个名为next的变量,以便它指向所选中的 代替current的进程的描述符。如果在系统中没有可运行进程的优先级大于current的优先级,那么,结果是next与current一致,没有进程切换发生。


##References##

[1].UNDERSTANDING THE LINUX KERNEL. Page 276

[2].Linux Kernel Development. Page 52

[3].http://hi.baidu.com/zengzhaonong/item/20d9e8207b04cb8f6e2cc323



Previous     Next
tuzhi /
Published under (CC) BY-NC-SA in categories Linux  tagged with Kernel