Kubernetes Notes (7)—— Things to watch out for when setting up “k8s on Mesos”

When setting up the k8s on Mesos project locally, the Mesos client script must be run as root. Also, if your local environment uses a proxy, be aware that you may need to add the k8s and Mesos IP addresses to the no_proxy/NO_PROXY environment variable (there may be exceptions; it probably depends on which proxy or operating system you use). A quick way to check how a Go program resolves the proxy is sketched after the issue list below. For details, see the following issues:
The kubernetes on Mesos can’t run successfully on the same machine
Why does “km controller-manager” think it is an invalid event?
The “km controller-manager” command doesn’t work successfully behind proxy.
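
The k8s and Mesos components are Go programs; assuming their HTTP clients follow Go's default behavior, the proxy for each request is resolved from http_proxy/https_proxy/no_proxy via http.ProxyFromEnvironment. The following minimal sketch (the address 10.0.0.2:8080 is just a placeholder) lets you check whether a given master/etcd URL would be sent through the proxy or go direct:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Replace with your API server / Mesos master / etcd address.
	req, err := http.NewRequest("GET", "http://10.0.0.2:8080/api/v1/nodes", nil)
	if err != nil {
		panic(err)
	}

	// ProxyFromEnvironment consults http_proxy/https_proxy/no_proxy,
	// which is what Go's default HTTP transport does for every request.
	proxyURL, err := http.ProxyFromEnvironment(req)
	if err != nil {
		panic(err)
	}
	if proxyURL == nil {
		fmt.Println("request goes direct (covered by no_proxy, or no proxy configured)")
	} else {
		fmt.Println("request would go through proxy:", proxyURL)
	}
}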


Kubernetes Notes (6)—— kubectl code analysis (1)

The function that kubectl actually runs (k8s.io/kubernetes/pkg/kubectl/app/kubectl.go):

func Run() error {
    cmd := cmd.NewKubectlCommand(cmdutil.NewFactory(nil), os.Stdin, os.Stdout, os.Stderr)
    return cmd.Execute()
}
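
For context, the entry point that calls this Run function has roughly the following shape (a sketch, not a verbatim copy of the kubectl main package):

package main

import (
	"fmt"
	"os"

	// The package containing Run(), per the path mentioned above.
	"k8s.io/kubernetes/pkg/kubectl/app"
)

func main() {
	if err := app.Run(); err != nil {
		fmt.Fprintf(os.Stderr, "%v\n", err)
		os.Exit(1)
	}
}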

The code of NewKubectlCommand (k8s.io/kubernetes/pkg/kubectl/cmd/cmd.go):

// NewKubectlCommand creates the `kubectl` command and its nested children.
func NewKubectlCommand(f *cmdutil.Factory, in io.Reader, out, err io.Writer) *cobra.Command {
    // Parent command to which all subcommands are added.
    cmds := &cobra.Command{
        Use:   "kubectl",
        Short: "kubectl controls the Kubernetes cluster manager",
        Long: `kubectl controls the Kubernetes cluster manager.

Find more information at https://github.com/kubernetes/kubernetes.`,
        Run: runHelp,
        BashCompletionFunction: bash_completion_func,
    }

    f.BindFlags(cmds.PersistentFlags())

    // From this point and forward we get warnings on flags that contain "_" separators
    cmds.SetGlobalNormalizationFunc(util.WarnWordSepNormalizeFunc)

    cmds.AddCommand(NewCmdGet(f, out))
    cmds.AddCommand(NewCmdDescribe(f, out))
    cmds.AddCommand(NewCmdCreate(f, out))
    ......
    return cmds
}

So the cmd obtained from cmd := cmd.NewKubectlCommand(cmdutil.NewFactory(nil), os.Stdin, os.Stdout, os.Stderr) is a pointer to a cobra.Command.

The code of cmdutil.NewFactory (k8s.io/kubernetes/pkg/kubectl/cmd/util/factory.go):

// NewFactory creates a factory with the default Kubernetes resources defined
// if optionalClientConfig is nil, then flags will be bound to a new clientcmd.ClientConfig.
// if optionalClientConfig is not nil, then this factory will make use of it.
func NewFactory(optionalClientConfig clientcmd.ClientConfig) *Factory {
    mapper := kubectl.ShortcutExpander{RESTMapper: api.RESTMapper}

    flags := pflag.NewFlagSet("", pflag.ContinueOnError)
    flags.SetNormalizeFunc(util.WarnWordSepNormalizeFunc) // Warn for "_" flags

    generators := map[string]kubectl.Generator{
        "run/v1":                          kubectl.BasicReplicationController{},
        "run-pod/v1":                      kubectl.BasicPod{},
        "service/v1":                      kubectl.ServiceGeneratorV1{},
        "service/v2":                      kubectl.ServiceGeneratorV2{},
        "horizontalpodautoscaler/v1beta1": kubectl.HorizontalPodAutoscalerV1Beta1{},
        "deployment/v1beta1":              kubectl.DeploymentV1Beta1{},
        "job/v1beta1":                     kubectl.JobV1Beta1{},
    }

    clientConfig := optionalClientConfig
    if optionalClientConfig == nil {
        clientConfig = DefaultClientConfig(flags)
    }

    clients := NewClientCache(clientConfig)

    return &Factory{
        ......
    }
}

Because cmdutil.NewFactory(nil) is called with a nil argument, the DefaultClientConfig function is the one that gets called.

clientcmd.ClientConfig is defined in k8s.io/kubernetes/pkg/client/unversioned/clientcmd/client_config.go:

// ClientConfig is used to make it easy to get an api server client
type ClientConfig interface {
    // RawConfig returns the merged result of all overrides
    RawConfig() (clientcmdapi.Config, error)
    // ClientConfig returns a complete client config
    ClientConfig() (*client.Config, error)
    // Namespace returns the namespace resulting from the merged
    // result of all overrides and a boolean indicating if it was
    // overridden
    Namespace() (string, bool, error)
}
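
As an aside, code that holds a ClientConfig typically resolves the rest config and the namespace before building an API client. The sketch below is self-contained and simplified; RestConfig and fileClientConfig are stand-ins invented for illustration, not the real clientcmd/client types:

package main

import (
	"errors"
	"fmt"
)

// RestConfig is a hypothetical stand-in for client.Config (host + credentials).
type RestConfig struct {
	Host        string
	BearerToken string
}

// ClientConfig mirrors the shape of the clientcmd.ClientConfig interface above.
type ClientConfig interface {
	ClientConfig() (*RestConfig, error)
	Namespace() (string, bool, error)
}

// fileClientConfig is a toy implementation that pretends to have loaded a kubeconfig.
type fileClientConfig struct{ host, namespace string }

func (f *fileClientConfig) ClientConfig() (*RestConfig, error) {
	if f.host == "" {
		return nil, errors.New("no server address configured")
	}
	return &RestConfig{Host: f.host}, nil
}

func (f *fileClientConfig) Namespace() (string, bool, error) {
	if f.namespace == "" {
		return "default", false, nil // not overridden, fall back to default
	}
	return f.namespace, true, nil
}

func main() {
	var cc ClientConfig = &fileClientConfig{host: "http://127.0.0.1:8080"}

	cfg, err := cc.ClientConfig()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ns, overridden, _ := cc.Namespace()
	fmt.Printf("talking to %s in namespace %q (overridden=%v)\n", cfg.Host, ns, overridden)
}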

The code of DefaultClientConfig (k8s.io/kubernetes/pkg/kubectl/cmd/util/factory.go):

func DefaultClientConfig(flags *pflag.FlagSet) clientcmd.ClientConfig {
    loadingRules := clientcmd.NewDefaultClientConfigLoadingRules()
    flags.StringVar(&loadingRules.ExplicitPath, "kubeconfig", "", "Path to the kubeconfig file to use for CLI requests.")

    overrides := &clientcmd.ConfigOverrides{}
    flagNames := clientcmd.RecommendedConfigOverrideFlags("")
    // short flagnames are disabled by default.  These are here for compatibility with existing scripts
    flagNames.ClusterOverrideFlags.APIServer.ShortName = "s"

    clientcmd.BindOverrideFlags(overrides, flags, flagNames)
    clientConfig := clientcmd.NewInteractiveDeferredLoadingClientConfig(loadingRules, overrides, os.Stdin)

    return clientConfig
}

In the end it returns a DeferredLoadingClientConfig struct (defined in k8s.io/kubernetes/pkg/client/unversioned/clientcmd/merged_client_builder.go):

// DeferredLoadingClientConfig is a ClientConfig interface that is backed by a set of loading rules
// It is used in cases where the loading rules may change after you've instantiated them and you want to be sure that
// the most recent rules are used.  This is useful in cases where you bind flags to loading rule parameters before
// the parse happens and you want your calling code to be ignorant of how the values are being mutated to avoid
// passing extraneous information down a call stack
type DeferredLoadingClientConfig struct {
    loadingRules   *ClientConfigLoadingRules
    overrides      *ConfigOverrides
    fallbackReader io.Reader
}
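
The point of this indirection is that the struct only holds pointers to the loading rules and overrides, so flag values parsed after the struct is constructed are still visible when the config is finally resolved. Below is a minimal, self-contained sketch of that idea; LoadingRules, Overrides and DeferredConfig here are illustrative names, not the real clientcmd types:

package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

type LoadingRules struct{ ExplicitPath string }
type Overrides struct{ APIServer string }

// DeferredConfig holds pointers only; nothing is resolved at construction time.
type DeferredConfig struct {
	rules     *LoadingRules
	overrides *Overrides
}

// Resolve is only called after flag parsing, so it sees the final values.
func (d *DeferredConfig) Resolve() string {
	if d.overrides.APIServer != "" {
		return d.overrides.APIServer
	}
	return "kubeconfig at " + d.rules.ExplicitPath
}

func main() {
	rules := &LoadingRules{}
	overrides := &Overrides{}

	flags := pflag.NewFlagSet("demo", pflag.ContinueOnError)
	flags.StringVar(&rules.ExplicitPath, "kubeconfig", "", "Path to the kubeconfig file")
	flags.StringVar(&overrides.APIServer, "server", "", "The address of the API server")

	// The deferred config is built *before* the flags are parsed...
	cfg := &DeferredConfig{rules: rules, overrides: overrides}

	// ...yet it still observes the parsed values, because it holds pointers.
	_ = flags.Parse([]string{"--server=http://10.0.0.1:8080"})
	fmt.Println(cfg.Resolve()) // http://10.0.0.1:8080
}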


Kubernetes Notes (5)—— Kubernetes layout


A node (also called a minion) runs Docker containers, while the master schedules and manages those containers.

The master runs the following services:

  1. API Server—nearly all the components on the master and nodes accomplish their respective tasks by making API calls. These are handled by the API Server running on the master.
  2. Etcd—Etcd is a service whose job is to keep and replicate the current configuration and run state of the cluster. It is implemented as a lightweight distributed key-value store and was developed inside the CoreOS project.
  3. Scheduler and Controller Manager—These processes schedule containers (actually, pods—but more on them later) onto target nodes. They also make sure that the correct numbers of these things are running at all times.

A node runs the following processes:

  1. Kubelet—A special background process (daemon) that runs on each node, whose job is to respond to commands from the master to create, destroy, and monitor the containers on that host.
  2. Proxy—This is a simple network proxy that’s used to separate the IP address of a target container from the name of the service it provides.
  3. cAdvisor (optional)—Container Advisor (cAdvisor, http://bit.ly/1izYGLi) is a special daemon that collects, aggregates, processes, and exports information about running containers. This includes information about resource isolation, historical usage, and key network statistics.


Kubernetes Notes (4)—— application vs. service

A service is a process that:
1. is designed to do a small number of things (often just one).
2. has no user interface and is invoked solely via some kind of API.
An application, on the other hand, is pretty much the opposite of that. It has a user interface (even if it’s just a command line) and often performs lots of different tasks. It can also expose an API, but that’s just bonus points.

My understanding: a service usually focuses on doing one thing, has no user interface, and interacts with applications through an API, whereas an application has a user interface and usually performs many different tasks. For example, a web browser is an application, while a web server is a service.

A Kubernetes cluster does not manage a fleet of applications. It manages a cluster of services. A service running in a container managed by Kubernetes is designed to do a very small number of discrete things.

If your services are small and of limited purpose, then they can more easily be scheduled and re-arranged as your load demands. Otherwise, the dependencies become too much to manage and either your scale or your stability suffers.

That is what k8s is for: managing services.


Kubernetes Notes (3)—— kubectl and cobra

kubectl is the command-line tool that controls the k8s cluster manager:

$ kubectl
kubectl controls the Kubernetes cluster manager.

Find more information at https://github.com/kubernetes/kubernetes.

Usage:
  kubectl [flags]
  kubectl [command]

Available Commands:
  get            Display one or many resources
  describe       Show details of a specific resource or group of resources
......

kubectl uses the cobra project as its command-line building tool; cobra is built around the concepts of commands, args, and flags:

Commands represent actions, Args are things and Flags are modifiers for those actions.

The pattern to follow is APPNAME VERB NOUN --ADJECTIVE. or APPNAME COMMAND ARG --FLAG

For example:

$ kubectl get node --v=10

Here kubectl is the APPNAME, get is the COMMAND, node is the ARG, and --v=10 is the FLAG.
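
The following minimal cobra sketch (not kubectl's actual code) reproduces this APPNAME COMMAND ARG --FLAG pattern with an imaginary demo binary:

package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	// The parent command: the APPNAME.
	root := &cobra.Command{
		Use:   "demo",
		Short: "demo controls an imaginary cluster",
	}

	var verbosity int
	// A subcommand: the COMMAND, taking positional ARGs.
	get := &cobra.Command{
		Use:   "get [resource]",
		Short: "Display one or many resources",
		Run: func(cmd *cobra.Command, args []string) {
			fmt.Printf("getting %v with verbosity %d\n", args, verbosity)
		},
	}
	// A FLAG that modifies the command.
	get.Flags().IntVar(&verbosity, "v", 0, "log level verbosity")

	root.AddCommand(get)

	// e.g. `demo get node --v=10`
	if err := root.Execute(); err != nil {
		os.Exit(1)
	}
}

Running demo get node --v=10 would then print the resource and verbosity, mirroring the kubectl example above.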


Kubernetes Notes (2)—— the workspace used at build time

When you compile the k8s code, an _output folder is generated under the k8s root directory, and it contains a local folder:

~/kubernetes/_output/local$ ls
bin  go

The go folder is a standard Go workspace:

~/kubernetes/_output/local/go$ ls -alt
total 20
drwxrwxr-x 4 nan nan 4096 Dec  9 22:09 ..
drwxrwxr-x 2 nan nan 4096 Dec  9 22:09 bin
drwxrwxr-x 4 nan nan 4096 Dec  9 22:08 pkg
drwxrwxr-x 5 nan nan 4096 Dec  9 22:07 .
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 src

Enter the src folder:

~/kubernetes/_output/local/go/src$ ls -alt
total 12
drwxrwxr-x 5 nan nan 4096 Dec  9 22:07 ..
drwxrwxr-x 2 nan nan 4096 Dec  9 22:06 k8s.io
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 .
nan@ubuntu:~/kubernetes/_output/local/go/src$ cd k8s.io/
nan@ubuntu:~/kubernetes/_output/local/go/src/k8s.io$ ls -alt
total 8
drwxrwxr-x 2 nan nan 4096 Dec  9 22:06 .
lrwxrwxrwx 1 nan nan   20 Dec  9 22:06 kubernetes -> /home/nan/kubernetes
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 ..

As you can see, src/k8s.io/kubernetes is simply a symlink pointing back to the outer working directory.

With that, it is clear why import statements like the following work in the code:

import (
    "k8s.io/kubernetes/contrib/mesos/pkg/controllermanager"
    "k8s.io/kubernetes/contrib/mesos/pkg/hyperkube"
)


Kubernetes Notes (1)—— hyperkube

k8s uses a module called hyperkube; its purpose is described as follows:

// Package hyperkube is a framework for kubernetes server components. It
// allows us to combine all of the kubernetes server components into a single
// binary where the user selects which components to run in any individual
// process.
//
// Currently, only one server component can be run at once. As such there is
// no need to harmonize flags or identify logs across the various servers. In
// the future we will support launching and running many servers — either by
// managing processes or running in-proc.
//
// This package is inspired by https://github.com/spf13/cobra. However, as
// the eventual goal is to run multiple servers from one call, a new package
// was needed.

Put simply, the hyperkube module bundles various components into a single executable, and you specify which component to run at runtime. The km program, for example, is built on hyperkube:

$ km --help
This is an all-in-one binary that can run any of the various Kubernetes-Mesos
servers.

Usage

  km <server> [flags]

Servers

  apiserver
    The main API entrypoint and interface to the storage system. The API server
    is also the focal point for all authorization decisions.
......

The definition of the HyperKube struct:

type HyperKube struct {
    Name string // The executable name, used for help and soft-link invocation
    Long string // A long description of the binary.  It will be word wrapped before output.

    servers     []Server
    baseFlags   *pflag.FlagSet
    out         io.Writer
    helpFlagVal bool
}

The definition of the Server struct it uses:

// Server describes a server that this binary can morph into.
type Server struct {
    SimpleUsage string        // One line description of the server.
    Long        string        // Longer free form description of the server
    Run         serverRunFunc // Run the server.  This is not expected to return.

    flags *pflag.FlagSet // Flags for the command (and all dependents)
    name  string
    hk    *HyperKube
}

Now look at the main function of the km program:

func main() {
    hk := HyperKube{
        Name: "km",
        Long: "This is an all-in-one binary that can run any of the various Kubernetes-Mesos servers.",
    }

    hk.AddServer(NewKubeAPIServer())
    hk.AddServer(NewControllerManager())
    hk.AddServer(NewScheduler())
    hk.AddServer(NewKubeletExecutor())
    hk.AddServer(NewKubeProxy())
    hk.AddServer(NewMinion())

    hk.RunToExit(os.Args)
}

The main function relies on an important HyperKube method, AddServer:

// AddServer adds a server to the HyperKube object.
func (hk *HyperKube) AddServer(s *Server) {
    hk.servers = append(hk.servers, *s)
    hk.servers[len(hk.servers)-1].hk = hk
}

As you can see, inside this method the line hk.servers[len(hk.servers)-1].hk = hk makes the hk field of each Server struct point back to the HyperKube in the same binary, which is how these components are tied together. HyperKube's RunToExit is then called to run the selected component.
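
To make the overall shape concrete, here is a self-contained sketch (not the real hyperkube code) of the all-in-one-binary idea: several servers are registered under one executable, and the first command-line argument selects which one runs:

package main

import (
	"fmt"
	"os"
)

// Server is a component this binary can morph into.
type Server struct {
	Name string
	Run  func(args []string) error
}

// HyperKube holds all registered servers.
type HyperKube struct {
	Name    string
	servers []Server
}

func (hk *HyperKube) AddServer(s Server) {
	hk.servers = append(hk.servers, s)
}

// RunToExit picks the server named by args[1] and runs it.
func (hk *HyperKube) RunToExit(args []string) {
	if len(args) < 2 {
		fmt.Fprintf(os.Stderr, "usage: %s <server> [flags]\n", hk.Name)
		os.Exit(1)
	}
	for _, s := range hk.servers {
		if s.Name == args[1] {
			if err := s.Run(args[2:]); err != nil {
				fmt.Fprintln(os.Stderr, err)
				os.Exit(1)
			}
			return
		}
	}
	fmt.Fprintf(os.Stderr, "%s: unknown server %q\n", hk.Name, args[1])
	os.Exit(1)
}

func main() {
	hk := HyperKube{Name: "km-demo"}
	hk.AddServer(Server{Name: "apiserver", Run: func([]string) error {
		fmt.Println("would start the API server here")
		return nil
	}})
	hk.AddServer(Server{Name: "scheduler", Run: func([]string) error {
		fmt.Println("would start the scheduler here")
		return nil
	}})
	hk.RunToExit(os.Args) // e.g. `km-demo apiserver`
}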


A brief introduction to etcd

I have recently been setting up Kubernetes on Mesos, which uses etcd. etcd is a distributed, reliable key-value store; its project home page is here.

Setting up etcd is simple since prebuilt binaries are available. What struck me as different from other projects is that, by default, it listens on two ports for the same kind of traffic. For example:

-listen-peer-urls

List of URLs to listen on for peer traffic. This flag tells the etcd to accept incoming requests from its peers on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.

default: “http://localhost:2380,http://localhost:7001”
env variable: ETCD_LISTEN_PEER_URLS
example: “http://10.0.0.1:2380”
invalid example: “http://example.com:2380” (domain name is invalid for binding)

-listen-client-urls

List of URLs to listen on for client traffic. This flag tells the etcd to accept incoming requests from the clients on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.
default: “http://localhost:2379,http://localhost:4001”
env variable: ETCD_LISTEN_CLIENT_URLS
example: “http://10.0.0.1:2379”
invalid example: “http://example.com:2379” (domain name is invalid for binding)

-listen-peer-urls listens for requests from peers, with default ports 2380 and 7001, while -listen-client-urls listens for requests from clients, with default ports 2379 and 4001.
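
As a quick way to verify that the client port is up, the sketch below writes and then reads a key through etcd's v2 HTTP keys API on the default client URL. It assumes an etcd is already running locally with the default -listen-client-urls; the key name message is just an example:

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"strings"
)

func main() {
	url := "http://127.0.0.1:2379/v2/keys/message"

	// Write a key: PUT with a form-encoded "value=..." body, per the v2 keys API.
	req, err := http.NewRequest("PUT", url, strings.NewReader("value=hello"))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	putResp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err) // most likely etcd is not listening on the client port
	}
	putResp.Body.Close()

	// Read it back with a plain GET against the same client URL.
	getResp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer getResp.Body.Close()
	body, _ := ioutil.ReadAll(getResp.Body)
	fmt.Println(string(body)) // JSON describing the key and its value
}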

Another interesting option is -advertise-client-urls, which advertises the client URLs of the current etcd member to the other members of the cluster.

-advertise-client-urls

List of this member’s client URLs to advertise to the rest of the cluster. These URLs can contain domain names. default: “http://localhost:2379,http://localhost:4001”
env variable: ETCD_ADVERTISE_CLIENT_URLS
example: “http://example.com:2379, http://10.0.0.1:2379”
Be careful if you are advertising URLs such as http://localhost:2379 from a cluster member and are using the proxy feature of etcd. This will cause loops, because the proxy will be forwarding requests to itself until its resources (memory, file descriptors) are eventually depleted.

References:
Configuration Flags


The difference between Mesos and Kubernetes

This post on Stack Overflow describes the difference between Kubernetes and Mesos:

Kubernetes is an open source project that brings ‘Google style’ cluster management capabilities to the world of virtual machines, or ‘on the metal’ scenarios. It works very well with modern operating system environments (like CoreOS or Red Hat Atomic) that offer up lightweight computing ‘nodes’ that are managed for you. It is written in Golang and is lightweight, modular, portable and extensible. We (the Kubernetes team) are working with a number of different technology companies (including Mesosphere who curate the Mesos open source project) to establish Kubernetes as the standard way to interact with computing clusters. The idea is to reproduce the patterns that we see people needing to build cluster applications based on our experience at Google. Some of these concepts include:

pods — a way to group containers together
replication controllers — a way to handle the lifecycle of containers
labels — a way to find and query containers, and
services — a set of containers performing a common function.
So with Kubernetes alone you will have something that is simple, easy to get up-and-running, portable and extensible that adds ‘cluster’ as a noun to the things that you manage in the lightest weight manner possible. Run an application on a cluster, and stop worrying about an individual machine. In this case, cluster is a flexible resource just like a VM. It is a logical computing unit. Turn it up, use it, resize it, turn it down quickly and easily.

With Mesos, there is a fair amount of overlap in terms of the basic vision, but the products are at quite different points in their lifecycle and have different sweet spots. Mesos is a distributed systems kernel that stitches together a lot of different machines into a logical computer. It was born for a world where you own a lot of physical resources to create a big static computing cluster. The great thing about it is that lots of modern scalable data processing application run well on Mesos (Hadoop, Kafka, Spark) and it is nice because you can run them all on the same basic resource pool, along with your new age container packaged apps. It is somewhat more heavy weight than the Kubernetes project, but is getting easier and easier to manage thanks to the work of folks like Mesosphere.

Now what gets really interesting is that Mesos is currently being adapted to add a lot of the Kubernetes concepts and to support the Kubernetes API. So it will be a gateway to getting more capabilities for your Kubernetes app (high availability master, more advanced scheduling semantics, ability to scale to a very large number of nodes) if you need them, and is well suited to run production workloads (Kubernetes is still in an alpha state).

When asked, I tend to say:

Kubernetes is a great place to start if you are new to the clustering world; it is the quickest, easiest and lightest way to kick the tires and start experimenting with cluster oriented development. It offers a very high level of portability since it is being supported by a lot of different providers (Microsoft, IBM, Red Hat, CoreOs, MesoSphere, VMWare, etc).

If you have existing workloads (Hadoop, Spark, Kafka, etc), Mesos gives you a framework that lets you interleave those workloads with each other, and mix in some of the new stuff including Kubernetes apps.

Mesos gives you an escape valve if you need capabilities that are not yet implemented by the community in the Kubernetes framework.

In short, Mesos is a distributed systems kernel; you can think of it as abstracting a group of physical machines into one logical machine for programs to use. Kubernetes is a framework that runs on top of Mesos, and it is a cluster manager for containers.