Swarmkit笔记(3)——swarmd程序框架

agent.Node结构体有4channel,理解它们的作用就可以理解swarmd程序的框架:

// Node implements the primary node functionality for a member of a swarm
// cluster. Node handles workloads and may also run as a manager.
type Node struct {
    ......
    started              chan struct{}
    stopped              chan struct{}
    ready                chan struct{} // closed when agent has completed registration and manager(if enabled) is ready to receive control requests
    ......
    closed               chan struct{}
    ......
}

swarmd程序的框架(其中executor通过engine-addr得到,代表最终运行task的实体,实际是一个Docker engineapi.APIClient。其它参数都通过命令行直接得到。):

        ......
        // Create a context for our GRPC call
        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()

        ......

        n, err := agent.NewNode(&agent.NodeConfig{
            Hostname:         hostname,
            ForceNewCluster:  forceNewCluster,
            ListenControlAPI: unix,
            ListenRemoteAPI:  addr,
            JoinAddr:         managerAddr,
            StateDir:         stateDir,
            JoinToken:        joinToken,
            ExternalCAs:      externalCAOpt.Value(),
            Executor:         executor,
            HeartbeatTick:    hb,
            ElectionTick:     election,
        })
        if err != nil {
            return err
        }

        if err := n.Start(ctx); err != nil {
            return err
        }

        c := make(chan os.Signal, 1)
        signal.Notify(c, os.Interrupt)
        go func() {
            <-c
            n.Stop(ctx)
        }()

        go func() {
            select {
            case <-n.Ready():
            case <-ctx.Done():
            }
            if ctx.Err() == nil {
                logrus.Info("node is ready")
            }
        }()

        return n.Err(ctx)

(1)

if err := n.Start(ctx); err != nil {
    return err
}

看一下Node.Start()函数的实现:

// Start starts a node instance.
func (n *Node) Start(ctx context.Context) error {
    select {
    case <-n.started:
        select {
        case <-n.closed:
            return n.err
        case <-n.stopped:
            return errAgentStopped
        case <-ctx.Done():
            return ctx.Err()
        default:
            return errAgentStarted
        }
    case <-ctx.Done():
        return ctx.Err()
    default:
    }

    close(n.started)
    go n.run(ctx)
    return nil
}

如果执行Node.Start()时没有任何异常发生,就会把Node.started这个channel关掉(close(n.started)),然后启动这个节点初始化过程:go n.run(ctx)

(2)

c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
go func() {
    <-c
    n.Stop(ctx)
}()

这段代码的含义是用户按Ctrl+C可以中断程序。Node.Stop()函数实现如下:

// Stop stops node execution
func (n *Node) Stop(ctx context.Context) error {
    select {
    case <-n.started:
        select {
        case <-n.closed:
            return n.err
        case <-n.stopped:
            select {
            case <-n.closed:
                return n.err
            case <-ctx.Done():
                return ctx.Err()
            }
        case <-ctx.Done():
            return ctx.Err()
        default:
            close(n.stopped)
            // recurse and wait for closure
            return n.Stop(ctx)
        }
    case <-ctx.Done():
        return ctx.Err()
    default:
        return errAgentNotStarted
    }
}

由于此时Node.started这个channel已经被关掉,所以会永远执行select的第一个case分支:case <-n.started。然后会根据当时的情况,再决定执行哪个分支。

(3)

go func() {
    select {
        case <-n.Ready():
        case <-ctx.Done():
    }
    if ctx.Err() == nil {
        logrus.Info("node is ready")
    }
}()

Node.Ready()函数会返回Node.ready这个channel

// Ready returns a channel that is closed after node's initialization has
// completes for the first time.
func (n *Node) Ready() <-chan struct{} {
    return n.ready
}

Node初始化完成后,Node.ready这个channel就会被关掉。因此如果一切顺利的话,就会看到“node is ready”的log

(4)

return n.Err(ctx)

Node.Err()函数的实现:

// Err returns the error that caused the node to shutdown or nil. Err blocks
// until the node has fully shut down.
func (n *Node) Err(ctx context.Context) error {
    select {
    case <-n.closed:
        return n.err
    case <-ctx.Done():
        return ctx.Err()
    }
}

Node.Err()函数阻塞在这里,等待Node关闭。

 

发表评论

电子邮件地址不会被公开。 必填项已用*标注