Kubernetes Notes (2): The Build-Time Workspace

Building the Kubernetes source generates an _output directory under the repository root, which in turn contains a local directory:

~/kubernetes/_output/local$ ls
bin  go

The go directory is a standard Go workspace:

~/kubernetes/_output/local/go$ ls -alt
total 20
drwxrwxr-x 4 nan nan 4096 Dec  9 22:09 ..
drwxrwxr-x 2 nan nan 4096 Dec  9 22:09 bin
drwxrwxr-x 4 nan nan 4096 Dec  9 22:08 pkg
drwxrwxr-x 5 nan nan 4096 Dec  9 22:07 .
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 src

Enter the src directory:

~/kubernetes/_output/local/go/src$ ls -alt
total 12
drwxrwxr-x 5 nan nan 4096 Dec  9 22:07 ..
drwxrwxr-x 2 nan nan 4096 Dec  9 22:06 k8s.io
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 .
nan@ubuntu:~/kubernetes/_output/local/go/src$ cd k8s.io/
nan@ubuntu:~/kubernetes/_output/local/go/src/k8s.io$ ls -alt
total 8
drwxrwxr-x 2 nan nan 4096 Dec  9 22:06 .
lrwxrwxrwx 1 nan nan   20 Dec  9 22:06 kubernetes -> /home/nan/kubernetes
drwxrwxr-x 3 nan nan 4096 Dec  9 22:04 ..

As you can see, src/k8s.io/kubernetes is a symlink pointing back to the outer working directory.

Since the build scripts point GOPATH at _output/local/go, it is now clear why import statements like the following work:

import (
    "k8s.io/kubernetes/contrib/mesos/pkg/controllermanager"
    "k8s.io/kubernetes/contrib/mesos/pkg/hyperkube"
)


The Go workspace

A typical Go workspace contains three directories, described in the official Workspaces document as follows:

src contains Go source files organized into packages (one package per directory),  
pkg contains package objects, and  
bin contains executable commands.

src holds all of the project's source files (including the source of third-party dependencies), pkg holds the compiled package object files, and bin holds the final executables. Here is an example:

bin/
    hello                          # command executable
    outyet                         # command executable
pkg/
    linux_amd64/
        github.com/golang/example/
            stringutil.a           # package object
src/
    github.com/golang/example/
        .git/                      # Git repository metadata
        hello/
            hello.go               # command source
        outyet/
            main.go                # command source
            main_test.go           # test source
        stringutil/
            reverse.go             # package source
            reverse_test.go        # test source
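
To make the layout concrete, here are the two kinds of source file from that tree, essentially as they appear in the official Go documentation: a library package, and a command that imports it by its full workspace path.

// src/github.com/golang/example/stringutil/reverse.go

// Package stringutil contains utility functions for working with strings.
package stringutil

// Reverse returns its argument string reversed rune-wise left to right.
func Reverse(s string) string {
    r := []rune(s)
    for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
        r[i], r[j] = r[j], r[i]
    }
    return string(r)
}

// src/github.com/golang/example/hello/hello.go

package main

import (
    "fmt"

    "github.com/golang/example/stringutil"
)

func main() {
    fmt.Println(stringutil.Reverse("!oG ,olleH"))
}

With GOPATH pointing at the workspace root, go install github.com/golang/example/hello compiles stringutil into pkg/ and places the hello executable into bin/.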

References:
Workspaces


Kubernetes Notes (1): hyperkube

Kubernetes uses a hyperkube package, whose purpose is described in its package comment:

// Package hyperkube is a framework for kubernetes server components. It
// allows us to combine all of the kubernetes server components into a single
// binary where the user selects which components to run in any individual
// process.
//
// Currently, only one server component can be run at once. As such there is
// no need to harmonize flags or identify logs across the various servers. In
// the future we will support launching and running many servers — either by
// managing processes or running in-proc.
//
// This package is inspired by https://github.com/spf13/cobra. However, as
// the eventual goal is to run multiple servers from one call, a new package
// was needed.

In plain terms, hyperkube bundles several components into a single executable, and you select which one to run at startup. The km program, for example, is built on hyperkube:

$ km --help
This is an all-in-one binary that can run any of the various Kubernetes-Mesos
servers.

Usage

  km <server> [flags]

Servers

  apiserver
    The main API entrypoint and interface to the storage system. The API server
    is also the focal point for all authorization decisions.
......

The HyperKube struct is defined as:

type HyperKube struct {
    Name string // The executable name, used for help and soft-link invocation
    Long string // A long description of the binary.  It will be world wrapped before output.

    servers     []Server
    baseFlags   *pflag.FlagSet
    out         io.Writer
    helpFlagVal bool
}

The Server struct it uses is defined as:

// Server describes a server that this binary can morph into.
type Server struct {
    SimpleUsage string        // One line description of the server.
    Long        string        // Longer free form description of the server
    Run         serverRunFunc // Run the server.  This is not expected to return.

    flags *pflag.FlagSet // Flags for the command (and all dependents)
    name  string
    hk    *HyperKube
}
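
The Run field's type serverRunFunc is unexported and not shown above; judging from the comments, its signature takes the Server and the remaining arguments and returns an error. Under that assumption (and assuming fmt is imported), a hypothetical server could be assembled like this; newEchoServer is an invented name, not part of the real code:

// A hypothetical server, assuming serverRunFunc is
// func(s *Server, args []string) error. The real servers, such as the
// one returned by NewKubeAPIServer(), block inside Run serving requests.
func newEchoServer() *Server {
    return &Server{
        SimpleUsage: "echo",
        Long:        "Prints its arguments and exits.",
        Run: func(s *Server, args []string) error {
            fmt.Println(args)
            return nil
        },
    }
}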

Now look at km's main function:

func main() {
    hk := HyperKube{
        Name: "km",
        Long: "This is an all-in-one binary that can run any of the various Kubernetes-Mesos servers.",
    }

    hk.AddServer(NewKubeAPIServer())
    hk.AddServer(NewControllerManager())
    hk.AddServer(NewScheduler())
    hk.AddServer(NewKubeletExecutor())
    hk.AddServer(NewKubeProxy())
    hk.AddServer(NewMinion())

    hk.RunToExit(os.Args)
}

The main function relies on an important HyperKube method, AddServer:

// AddServer adds a server to the HyperKube object.
func (hk *HyperKube) AddServer(s *Server) {
    hk.servers = append(hk.servers, *s)
    hk.servers[len(hk.servers)-1].hk = hk
}

As you can see, in this method the line hk.servers[len(hk.servers)-1].hk = hk makes each Server's hk field point back to the HyperKube instance in the same binary, which is what ties all the components together. main then calls hyperkube's RunToExit to run the selected component.
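
RunToExit itself is not shown here. As a simplified sketch (not the actual source; runSketch is a hypothetical name, the serverRunFunc signature is assumed as above, and errors and fmt are assumed imported), the dispatch it performs boils down to finding the server named by the first argument and invoking its Run function:

// Simplified sketch of the dispatch RunToExit performs; the real method
// also parses flags, prints help, and turns errors into an exit code.
func (hk *HyperKube) runSketch(args []string) error {
    if len(args) < 2 {
        return errors.New("no server specified")
    }
    name := args[1] // e.g. "apiserver" in "km apiserver --flag=..."
    for i := range hk.servers {
        if hk.servers[i].name == name {
            return hk.servers[i].Run(&hk.servers[i], args[2:])
        }
    }
    return fmt.Errorf("unknown server %q", name)
}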


An Introduction to etcd

I have recently been setting up Kubernetes on Mesos, which uses etcd. etcd is a distributed, reliable key-value store; its project homepage is here.

Setting etcd up is simple: prebuilt binaries are available. One thing that sets it apart from other projects, though, is that it listens on two ports for each service by default. For example:

-listen-peer-urls

List of URLs to listen on for peer traffic. This flag tells the etcd to accept incoming requests from its peers on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.

default: “http://localhost:2380,http://localhost:7001”
env variable: ETCD_LISTEN_PEER_URLS
example: “http://10.0.0.1:2380”
invalid example: “http://example.com:2380” (domain name is invalid for binding)

-listen-client-urls

List of URLs to listen on for client traffic. This flag tells the etcd to accept incoming requests from the clients on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.
default: “http://localhost:2379,http://localhost:4001”
env variable: ETCD_LISTEN_CLIENT_URLS
example: “http://10.0.0.1:2379”
invalid example: “http://example.com:2379” (domain name is invalid for binding)

-listen-peer-urls listens for requests from peers, on ports 2380 and 7001 by default, while -listen-client-urls listens for requests from clients, on ports 2379 and 4001 by default.

Another interesting option is -advertise-client-urls, which announces the client URLs of the running etcd member to the rest of the cluster.

-advertise-client-urls

List of this member’s client URLs to advertise to the rest of the cluster. These URLs can contain domain names. default: “http://localhost:2379,http://localhost:4001”
env variable: ETCD_ADVERTISE_CLIENT_URLS
example: “http://example.com:2379, http://10.0.0.1:2379”
Be careful if you are advertising URLs such as http://localhost:2379 from a cluster member and are using the proxy feature of etcd. This will cause loops, because the proxy will be forwarding requests to itself until its resources (memory, file descriptors) are eventually depleted.
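
The advertised client URLs are what client programs actually dial. As a minimal sketch, using the modern Go client package go.etcd.io/etcd/client/v3 (not the client library contemporary with these flags) and assuming a member advertising http://10.0.0.1:2379:

package main

import (
    "context"
    "fmt"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    // Dial the member's advertised client URL.
    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{"http://10.0.0.1:2379"},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), time.Second)
    defer cancel()

    // Write a key, then read it back.
    if _, err := cli.Put(ctx, "foo", "bar"); err != nil {
        panic(err)
    }
    resp, err := cli.Get(ctx, "foo")
    if err != nil {
        panic(err)
    }
    for _, kv := range resp.Kvs {
        fmt.Printf("%s = %s\n", kv.Key, kv.Value)
    }
}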

References:
Configuration Flags


Haskell Notes (7): Types

(1)

Haskell requires type names to start with an uppercase letter, and variable names must start with a lowercase letter.

For example:

> let C = "foo"

<interactive>:20:5: Not in scope: data constructor ‘C’
> let c = "foo"
> c
"foo"
> :type c
c :: [Char]

(2)

The combination of :: and the type after it is called a type signature.

For example:

> :type 'a'
'a' :: Char

Here, :: Char is a type signature.


Haskell Notes (5): ghci

(1)

ghci stores the result of the last expression into a variable whose name is “it”. (This isn’t a Haskell language feature; it’s specific to ghci alone.)

For example:

> "foo"
"foo"
> it ++ "bar"
"foobar"
> it
"foobar"

(2) You can use :set <option> and :unset <option> to turn ghci options on and off. As an example, take printing the type of each evaluated result:

> :set +t
> fst ('a', 'b')
'a'
it :: Char
> :unset +t
> fst ('a', 'b')
'a'

(3) Defining a variable in ghci is different from defining one in a source file: the variable must be preceded by let:

ghci> let a = 10
ghci> b = 100

<interactive>:9:3: parse error on input ‘=’


The Difference Between Mesos and Kubernetes

This post on Stack Overflow describes the difference between Kubernetes and Mesos:

Kubernetes is an open source project that brings ‘Google style’ cluster management capabilities to the world of virtual machines, or ‘on the metal’ scenarios. It works very well with modern operating system environments (like CoreOS or Red Hat Atomic) that offer up lightweight computing ‘nodes’ that are managed for you. It is written in Golang and is lightweight, modular, portable and extensible. We (the Kubernetes team) are working with a number of different technology companies (including Mesosphere who curate the Mesos open source project) to establish Kubernetes as the standard way to interact with computing clusters. The idea is to reproduce the patterns that we see people needing to build cluster applications based on our experience at Google. Some of these concepts include:

pods — a way to group containers together
replication controllers — a way to handle the lifecycle of containers
labels — a way to find and query containers, and
services — a set of containers performing a common function.
So with Kubernetes alone you will have something that is simple, easy to get up-and-running, portable and extensible that adds ‘cluster’ as a noun to the things that you manage in the lightest weight manner possible. Run an application on a cluster, and stop worrying about an individual machine. In this case, cluster is a flexible resource just like a VM. It is a logical computing unit. Turn it up, use it, resize it, turn it down quickly and easily.

With Mesos, there is a fair amount of overlap in terms of the basic vision, but the products are at quite different points in their lifecycle and have different sweet spots. Mesos is a distributed systems kernel that stitches together a lot of different machines into a logical computer. It was born for a world where you own a lot of physical resources to create a big static computing cluster. The great thing about it is that lots of modern scalable data processing application run well on Mesos (Hadoop, Kafka, Spark) and it is nice because you can run them all on the same basic resource pool, along with your new age container packaged apps. It is somewhat more heavy weight than the Kubernetes project, but is getting easier and easier to manage thanks to the work of folks like Mesosphere.

Now what gets really interesting is that Mesos is currently being adapted to add a lot of the Kubernetes concepts and to support the Kubernetes API. So it will be a gateway to getting more capabilities for your Kubernetes app (high availability master, more advanced scheduling semantics, ability to scale to a very large number of nodes) if you need them, and is well suited to run production workloads (Kubernetes is still in an alpha state).

When asked, I tend to say:

Kubernetes is a great place to start if you are new to the clustering world; it is the quickest, easiest and lightest way to kick the tires and start experimenting with cluster oriented development. It offers a very high level of portability since it is being supported by a lot of different providers (Microsoft, IBM, Red Hat, CoreOs, MesoSphere, VMWare, etc).

If you have existing workloads (Hadoop, Spark, Kafka, etc), Mesos gives you a framework that lets you interleave those workloads with each other, and mix in some of the new stuff including Kubernetes apps.

Mesos gives you an escape valve if you need capabilities that are not yet implemented by the community in the Kubernetes framework.

In short, Mesos is a distributed systems kernel: its job can be understood as abstracting a set of physical machines into one logical machine for programs to use. In this setup, Kubernetes runs as a framework on top of Mesos; it is a cluster manager for containers.

SystemTap Notes (16): Probe Aliases

The syntax for a probe alias:

probe <alias> = <probepoint> { <prologue_stmts> }
probe <alias> += <probepoint> { <epilogue_stmts> }

(1) The <prologue_stmts> in the first form run before the probe handler executes, while the <epilogue_stmts> in the second form run after it. Note that the lines above only define probe aliases; they do not activate them (see Re: How does stap execute probe aliases?):

# cat timer_test.stp
#!/usr/bin/stap

probe timer_alias = timer.s(3) {printf("Entering timer\n")}
# ./timer_test.stp
semantic error: no probes found
Pass 2: analysis failed.  [man error::pass2]

The following, by contrast, runs correctly:

# cat timer_test.stp
#!/usr/bin/stap

probe timer_alias = timer.s(3) {printf("Entering timer\n")}
probe timer_alias {}
# ./timer_test.stp
Entering timer
......

(2) Consider the output of the following script:

# cat timer_test.stp
#!/usr/bin/stap

probe timer_alias = timer.s(3) {printf("Entering timer\n")}
probe timer_alias += timer.s(3) {printf("Leaving timer\n")}
probe timer_alias {printf("In timer \n")}
# ./timer_test.stp
Entering timer
In timer
In timer
Leaving timer
......

It is equivalent to running the following script (see Re: Why is the same log printed twice when using probe alias?):

# cat timer_test.stp
#!/usr/bin/stap

probe timer.s(3)
{
        printf("Entering timer\n")
        printf("In timer\n")
}
probe timer.s(3)
{
        printf("In timer\n")
        printf("Leaving timer\n")
}

(3) Alias suffixes

It is possible to include a suffix with a probe alias invocation. If only the initial part of a probe point matches an alias, the remainder is treated as a suffix and attached to the underlying probe point(s) when the alias is expanded. For example:

/* Define an alias: */
probe sendrecv = tcp.sendmsg, tcp.recvmsg { … }

/* Use the alias in its basic form: */
probe sendrecv { … }

/* Use the alias with an additional suffix: */
probe sendrecv.return { … }

Here, the second use of the probe alias is equivalent to writing probe tcp.sendmsg.return, tcp.recvmsg.return.


Mesos Notes (1): Architecture

The content of this post is excerpted from the following articles:
APACHE MESOS: THE TRUE OS FOR THE SOFTWARE DEFINED DATA CENTER?
Mesos Architecture

Imagine if instead of individual physical servers, we could aggregate all the resources in a data center into a single large virtual pool, exposing not virtual machines but primitives such as CPU, RAM, and I/O? In conjunction, imagine if we could break applications into small isolated units of tasks that could be dynamically assigned resources from our virtual data center pool, based on the needs of the applications in our data center? The analogy here would be a PC with an operating system that is pooling the PC’s processors and RAM and coordinating the allocation and deallocation of those resources for use by different processes. Now extend that analogy to make the data center the PC with Mesos as the operating system kernel. That, in a nutshell, is how Mesos is transforming the data center and making true SDDC a reality.

Put simply, all the hardware resources in a data center can be viewed as a single pool, and Mesos's job is to manage those resources and allocate them to applications.

The Mesos architecture is shown in the figure below:

[Figure: Mesos two-level scheduling architecture (mesos-arch1)]

The modified diagram above from the Apache Mesos website shows how Mesos implements its two-level scheduling architecture for managing multiple types of applications. The first level is the master daemon which manages slave daemons running on each node in the Mesos cluster. The cluster consists of all servers, physical or virtual, that will be running application tasks, such as Hadoop and MPI jobs. The second level consists of a component called a framework. A framework includes a scheduler and an executor process, the latter of which also runs on each node. Mesos is able to communicate with different types of frameworks with each one managing a different clustered application. The diagram above shows Hadoop and MPI but other frameworks have been written as well for other types of applications.

Every node in a Mesos cluster runs a slave daemon, and all of them are managed by a single master daemon. A program running on Mesos is called a framework, and it consists of two parts: a scheduler and an executor process. A distributed system typically has one controller and many workers, where the workers can run independently of the controller. In a framework, the scheduler is the controller and the executor processes are the workers.


A framework running on top of Mesos consists of two components: a scheduler that registers with the master to be offered resources, and an executor process that is launched on slave nodes to run the framework’s tasks (see the App/Framework development guide for more details about application schedulers and executors). While the master determines how many resources are offered to each framework, the frameworks’ schedulers select which of the offered resources to use. When a framework accepts offered resources, it passes to Mesos a description of the tasks it wants to run on them. In turn, Mesos launches the tasks on the corresponding slaves.

The scheduler obtains the resources it needs from the master, while the executor processes run the framework's tasks on the slave nodes.
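
As a rough illustration of this division of labor, here is a hypothetical sketch in Go. These types are invented for illustration and are not the mesos-go API: the scheduler turns resource offers into task descriptions, and executors launch those tasks on the slaves.

package framework

// Hypothetical types, invented for illustration; not the mesos-go API.

// Offer describes resources the master offers to a framework's scheduler.
type Offer struct {
    CPUs  float64
    MemMB float64
}

// Task describes work the framework wants to run on an accepted offer.
type Task struct {
    Name string
}

// Scheduler registers with the master, decides which offers to accept,
// and returns descriptions of the tasks to run on them.
type Scheduler interface {
    ResourceOffers(offers []Offer) []Task
}

// Executor runs on each slave node and launches the framework's tasks.
type Executor interface {
    LaunchTask(t Task) error
}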