A brief intro of TCP keep-alive in Go’s HTTP implementation

Let’s see a Go web program:

package main

import (
        "fmt"
        "io"
        "io/ioutil"
        "net/http"
        "os"
        "time"
)

func main() {
        for {
                resp, err := http.Get("https://www.google.com/")
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
                        os.Exit(1)
                }
                _, err = io.Copy(ioutil.Discard, resp.Body)
                resp.Body.Close()
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: reading: %v\n", err)
                        os.Exit(1)
                }
                time.Sleep(45 * time.Second)
        }
}

The logic of above application is not hard, just retrieve information from the specified Website URL. I am a layman of web development, and think the HTTP communication should be “short-lived”, which means when HTTPclient issues a request, it will start a TCP connection with HTTP server, once the client receives the response, this session should end, and the TCP connection should be demolished. But is the fact really like this? I use lsof command to check my guess:

# lsof -P -n -p 907
......
fetch   907 root    3u    IPv4 0xfffff80013677810              0t0     TCP 192.168.80.129:32618->xxx.xxx.xxx.xxx:8080 (ESTABLISHED)
......

Oh! My assumption is wrong, and there is a “long-lived” TCP connection. Why does this happen? When I come across the network related troubles, I will always seek help from tcpdump and wireshark and try to capture packets for analysis. This time should not be exception, and the communication flow is as the following picture:

1

(1) The start number of packet is 4, that’s because the first 3 packets are TCP handshake, and it is safe to ignore them;
(2) Packet 4 ~ 43 is the first HTTP GET flow, and this process lasts about 2 seconds, and ends at 19:20:37;
(3) Then after half a minute, at 19:21:07, there occurs a TCP keep-alive packet on the wire. Oh dear! The root cause has been found! Although the HTTP session is over, the TCP connection still exists and uses keep-alive mechanism to make the TCP passway alive, so this TCP route can be reused by following HTTP messages;
(4) As expected, 15 seconds later, a new HTTP session begins at packet 46, which is exactly 45 seconds pass after the first HTTPconversation.

That’s the effect of TCP keep-alive, which keeps the TCP connection alive and can be reused by following HTTP sessions. Another thing you should pay attention of reusing connection is the Response.Body must be read to completion and closed (Please refer here):

type Response struct {
    ......

    // Body represents the response body.
    //
    // The http Client and Transport guarantee that Body is always
    // non-nil, even on responses without a body or responses with
    // a zero-length body. It is the caller's responsibility to
    // close Body. The default HTTP client's Transport does not
    // attempt to reuse HTTP/1.0 or HTTP/1.1 TCP connections
    // ("keep-alive") unless the Body is read to completion and is
    // closed.
    //
    // The Body is automatically dechunked if the server replied
    // with a "chunked" Transfer-Encoding.
    Body io.ReadCloser

    ......
}

As an example, modify the above program as follows:

package main

import (
        "fmt"
        "io"
        "io/ioutil"
        "net/http"
        "os"
        "time"
)

func closeResp(resp *http.Response) {
        _, err := io.Copy(ioutil.Discard, resp.Body)
        resp.Body.Close()
        if err != nil {
                fmt.Fprintf(os.Stderr, "fetch: reading: %v\n", err)
                os.Exit(1)
        }
}

func main() {
        for {
                resp1, err := http.Get("https://www.google.com/")
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
                        os.Exit(1)
                }

                resp2, err := http.Get("https://www.facebook.com/")
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
                        os.Exit(1)
                }

                time.Sleep(45 * time.Second)
                for _, v := range []*http.Response{resp1, resp2} {
                        closeResp(v)
                }
        }
}

During running it, You will see 2 TCP connections, not 1:

# lsof -P -n -p 1982
......
fetch   1982 root    3u    IPv4 0xfffff80013677810              0t0     TCP 192.168.80.129:43793->xxx.xxx.xxx.xxx:8080 (ESTABLISHED)
......
fetch   1982 root    6u    IPv4 0xfffff80013677000              0t0     TCP 192.168.80.129:12105->xxx.xxx.xxx.xxx:8080 (ESTABLISHED)

If you call closeResp function before issuing new HTTP request, the TCP connection can be reused:

package main

import (
        "fmt"
        "io"
        "io/ioutil"
        "net/http"
        "os"
        "time"
)

func closeResp(resp *http.Response) {
        _, err := io.Copy(ioutil.Discard, resp.Body)
        resp.Body.Close()
        if err != nil {
                fmt.Fprintf(os.Stderr, "fetch: reading: %v\n", err)
                os.Exit(1)
        }
}

func main() {
        for {
                resp1, err := http.Get("https://www.google.com/")
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
                        os.Exit(1)
                }
                closeResp(resp1)


                time.Sleep(45 * time.Second)

                resp2, err := http.Get("https://www.facebook.com/")
                if err != nil {
                        fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
                        os.Exit(1)
                }
                closeResp(resp2)
        }
}

P.S., the full code is here.

References:
Package http;
Reusing http connections in Golang;
Is HTTP Shortlived?.

A trick of using “Find usages” in IntelliJ IDEA as Go IDE

Recently, I am using IntelliJ IDEA as a Go IDE to browse Docker Swarm code. When I want to search where Discovery.Watch method intoken package is called, the “Find usages“(Alt+F7) function of IntelliJ IDEA gives me a confusion:

method

It just prints one occurrence: the test code of token package. It doesn’t make sense, where the hell is the Discovery.Watch method called? When I search the usages of Watch in Watcher interface which token.Discovery satisfies by accident, I catch where theDiscovery.Watch method is used:

spec

The caveat from this lesson is if you can’t find where methods of a package are called, you should try to find the interfaces where the packages satisfy, maybe it will give you the answer.

 

Deploy Docker Swarm cluster on one host

Sometimes, you just want to learn the internal mechanics of Docker Swarm, but unfortunately there is only one Linux box at hand, and you don’t want to bother to install Virtual Machines on it. In this scenario, you certainly can build a Docker Swarm cluster on one host, and this tutorial will provide a detailed guide:

(1) Make sure the Go environment has been ready on your system, if not, please follow this document to setup it. Also remember add$GOPATH/bin into $PATH environment variable.

(2) Install Docker Swarm:

# go get -u github.com/docker/swarm

Execute swarm command to check whether Docker Swarm is well equipped:

# swarm
Usage: swarm [OPTIONS] COMMAND [arg...]

A Docker-native clustering system

Version: 1.2.3 (HEAD)

Options:
  --debug                       debug mode [$DEBUG]
  --log-level, -l "info"        Log level (options: debug, info, warn, error, fatal, panic)
  --experimental                enable experimental features
  --help, -h                    show help
  --version, -v                 print the version
......

(3) Modify the Docker configuration file. E.g., on my RHEL 7, the file is /etc/sysconfig/docker:

# systemctl show docker
......
EnvironmentFile=/etc/sysconfig/docker (ignore_errors=yes)
......

Add “-H tcp://127.0.0.1:2375” in OPTIONS field:

# cat /etc/sysconfig/docker
# /etc/sysconfig/docker

# Modify these options if you want to change the way the docker daemon runs
OPTIONS='--selinux-enabled -H tcp://127.0.0.1:2375 -H unix:///var/run/docker.sock'

Restart Docker, and check whether the new OPTIONS takes effect:

# systemctl restart docker
# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-06-08 12:32:19 CST; 10s ago
     Docs: http://docs.docker.com
 Main PID: 14429 (sh)
   CGroup: /system.slice/docker.service
           ├─14429 /bin/sh -c /usr/bin/docker-current daemon $OPTIONS            $DOCKER_STORAGE_OPTIONS            $DOCKER_NETWORK_OPTI...
           ├─14430 /usr/bin/docker-current daemon --selinux-enabled -H tcp://127.0.0.1:2375 -H unix:///var/run/docker.sock --add-registr...
           └─14431 /usr/bin/forward-journald -tag docker
......

(4) Run “swarm create” command to create token for the cluster:

# swarm create
d10eacbda9763b0740548a2a4c2f1a59

(5) Execute swarm join to create a Docker Swarm node:

# swarm join --addr 127.0.0.1:2375 token://d10eacbda9763b0740548a2a4c2f1a59
INFO[0000] Registering on the discovery service every 1m0s...  addr=127.0.0.1:2375 discovery=token://d10eacbda9763b0740548a2a4c2f1a59
......

You should notice that the argument of --addr option is the IP and port of the Docker engine on this host. Since we have set theOPTIONS in Docker configuration file in step 3, the IP should be 127.0.0.1 whilst port is 2375.

(6) Open a new terminal, and create the manager of the cluster. Because port 2375 is occupied by Docker engine, we use another available port:

# swarm manage -H 127.0.0.1:3375 token://d10eacbda9763b0740548a2a4c2f1a59
INFO[0000] Listening for HTTP                            addr=127.0.0.1:3375 proto=tcp
INFO[0001] Registered Engine localhost.localdomain at 127.0.0.1:2375

Through the log, you can see the node and manager have communicated successfully.

Now, you can think a Docker engine is listening on tcp://127.0.0.1:3375, but actually, there is one Docker cluster behindtcp://127.0.0.1:3375, even though the cluster has only one node. You can play docker client commands now, such as get the cluster info:

# docker -H tcp://127.0.0.1:3375 info
Containers: 0
Images: 5
Server Version: swarm/1.2.3
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 1
 localhost.localdomain: 127.0.0.1:2375
  └ ID: ZUIV:BMPV:3B5R:2WBC:JXEI:2S6H:XM3H:66W5:UZQI:NJON:JY4T:HIFB
  └ Status: Healthy
  └ Containers: 0 (0 Running, 0 Paused, 0 Stopped)
  └ Reserved CPUs: 0 / 8
  └ Reserved Memory: 0 B / 12.1 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.10.0-327.el7.x86_64, operatingsystem=Red Hat Network, storagedriver=devicemapper
  └ UpdatedAt: 2016-06-08T04:58:05Z
  └ ServerVersion: 1.9.1
Kernel Version: 3.10.0-327.el7.x86_64
......

Or run a container:

# docker -H tcp://127.0.0.1:3375 run hello-world

Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
......

Enjoy Docker Swarm now!

Reference:
Swarm docs;
Docker Swarm Tutorial and Examples.