The byproducts of reading OpenBSD netcat code

When I attended a training course last year, I heard about netcat for the first time. During the class, the tutor showed some netcat hacks and tricks, which appealed to me and motivated me to learn its internals. Fortunately, I was not too busy over the past 2 months, so I could spend my spare time diving into OpenBSD’s netcat source code, and I got abundant byproducts along the way.

(1) Brushing up on socket programming. I wrote my first network application more than 10 years ago, and I have always thought the socket APIs are marvelous. Just ~10 functions (socket, bind, listen, accept…) with some I/O multiplexing buddies (select, poll, epoll…) connect the whole world. Wonderful! Since then, I have developed a habit: whenever I pick up a new programming language, network programming is an essential exercise. Even though I don’t write socket-related code now, reading netcat’s socket code did refresh my knowledge and teach me new things.

(2) Writing a tutorial about netcat. I am a mediocre programmer and forget things when I don’t use them for a long time, so I take notes on whatever I think is useful. IMHO, this “tutorial” is not really meant to teach others; it is just a journal I can refer to when I need it in the future.

(3) Submitting patches to netcat. While reading the code, I also found bugs and some possible enhancements. Though these are trivial contributions to OpenBSD, I am still happy about them and enjoyed the process.

(4) Implementing a C++ encapsulation of libtls. OpenBSD’s netcat supports TLS/SSL connections, but it requires you to take full care of resource management (memory, sockets, etc.); a small mistake can lead to a resource leak, which is fatal for long-lived applications (in fact, the two bugs I reported to OpenBSD are both related to resource leaks). Therefore I developed a simple C++ library that wraps libtls, hoping it frees developers from this troublesome problem so they can put more energy into application logic.

Long story short, reading classic source code is a rewarding process, and you should consider trying it yourself.

Be aware of spaces when issuing HTTP requests

Use netcat to issue an HTTP request:

# nc -cv www.google.com https
Connection to www.google.com 443 port [tcp/https] succeeded!
......
GET / HTTP/1.1
Host: www.google.com
 Connection: close

HTTP/1.1 400 Bad Request
Content-Length: 54
Content-Type: text/html; charset=UTF-8
Date: Mon, 27 Aug 2018 09:37:22 GMT
Connection: close

<html><title>Error 400 (Bad Request)!!1</title></html>

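A superfluous space before “Connection: close” causes an error response. A header line that begins with whitespace is not read as a new header field; HTTP/1.1 treats it as an obsolete folded continuation of the previous line, which a server is allowed to reject with 400 Bad Request. Be aware of it!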
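To make the pitfall concrete, here is a minimal sketch of a check that could be run on a request before sending it. The function name has_folded_header is hypothetical and only illustrates the rule; it is not part of netcat or any HTTP library, and it assumes the request is a NUL-terminated string.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if any header line (a line after the
 * request line and before the blank line that ends the head) starts
 * with a space or a tab, 0 otherwise.
 */
int
has_folded_header(const char *req)
{
    const char *line = strchr(req, '\n');   /* end of the request line */

    while (line != NULL) {
        line++;                             /* first byte of the next line */
        if (*line == '\0' || *line == '\r' || *line == '\n')
            break;                          /* end of the header section */
        if (*line == ' ' || *line == '\t')
            return 1;                       /* leading whitespace found */
        line = strchr(line, '\n');
    }
    return 0;
}
```

Called on the request typed in the session above (with the stray space before “Connection: close”), the check returns 1; without the space it returns 0.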

 

Learn socket programming tips from netcat

Since netcat is honored as the “TCP/IP Swiss army knife”, I read its OpenBSD source code and summarized some socket programming tips:

(1) Client connects in non-blocking mode:

......
s = socket(res->ai_family, res->ai_socktype |
            SOCK_NONBLOCK, res->ai_protocol);


......  
if ((ret = connect(s, name, namelen)) != 0 && errno == EINPROGRESS) {
        pfd.fd = s;
        pfd.events = POLLOUT;
        ret = poll(&pfd, 1, timeout);
}
......

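Create the socket with SOCK_NONBLOCK set, then call connect(). If ret is 0, the connection was established immediately. If connect() fails with errno set to EINPROGRESS, the connection is still being set up, and poll() with a timeout controls how long to wait for the socket to become writable; once it is, the real outcome of the attempt still has to be fetched with getsockopt() and SO_ERROR. Any other errno means the connection can’t be established.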
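The steps above can be put together in a small, self-contained sketch. It is not netcat’s code: the function name tcp_connect_timeout, the IPv4-only address handling and the millisecond timeout parameter are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of netcat): connect to an IPv4 address,
 * waiting at most timeout_ms milliseconds. Returns the connected socket
 * (still in non-blocking mode), or -1 with errno set.
 */
int
tcp_connect_timeout(const char *ip, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct pollfd pfd;
    socklen_t optlen;
    int s, ret, optval, saved;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    /* SOCK_NONBLOCK in the type argument works on OpenBSD and Linux. */
    s = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (s == -1)
        return -1;

    ret = connect(s, (struct sockaddr *)&addr, sizeof(addr));
    if (ret != 0 && errno == EINPROGRESS) {
        pfd.fd = s;
        pfd.events = POLLOUT;
        ret = poll(&pfd, 1, timeout_ms);
        if (ret == 1) {
            /* poll() woke up: fetch the real result of the attempt. */
            optlen = sizeof(optval);
            ret = getsockopt(s, SOL_SOCKET, SO_ERROR, &optval, &optlen);
            if (ret == 0 && optval != 0) {
                errno = optval;
                ret = -1;
            }
        } else if (ret == 0) {
            errno = ETIMEDOUT;      /* nothing happened in time */
            ret = -1;
        }
        /* ret == -1 here means poll() itself failed; errno is set. */
    }

    if (ret != 0) {
        saved = errno;              /* close() may overwrite errno */
        close(s);
        errno = saved;
        return -1;
    }
    return s;
}
```

For example, tcp_connect_timeout("192.0.2.1", 80, 3000) would give up after about three seconds with errno set to ETIMEDOUT if the host never answers.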

(2) The usage of poll():

......
/* stdin */
pfd[POLL_STDIN].fd = stdin_fd;
pfd[POLL_STDIN].events = POLLIN;

/* network out */
pfd[POLL_NETOUT].fd = net_fd;
pfd[POLL_NETOUT].events = 0;

/* network in */
pfd[POLL_NETIN].fd = net_fd;
pfd[POLL_NETIN].events = POLLIN;

/* stdout */
pfd[POLL_STDOUT].fd = stdout_fd;
pfd[POLL_STDOUT].events = 0;

......
/* poll */
num_fds = poll(pfd, 4, timeout);

/* treat poll errors */
if (num_fds == -1)
    err(1, "polling error");

/* timeout happened */
if (num_fds == 0)
    return;

/* treat socket error conditions */
for (n = 0; n < 4; n++) {
    if (pfd[n].revents & (POLLERR|POLLNVAL)) {
        pfd[n].fd = -1;
    }
}
/* reading is possible after HUP */
if (pfd[POLL_STDIN].events & POLLIN &&
    pfd[POLL_STDIN].revents & POLLHUP &&
    !(pfd[POLL_STDIN].revents & POLLIN))
    pfd[POLL_STDIN].fd = -1;

Usually, we only need to watch file descriptors for reading:

pfd[POLL_STDIN].fd = stdin_fd;
pfd[POLL_STDIN].events = POLLIN;

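There is no need to watch file descriptors for writing up front, so their events start as 0 (netcat switches POLLOUT on later, only once it has buffered data to send):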

/* network out */
pfd[POLL_NETOUT].fd = net_fd;
pfd[POLL_NETOUT].events = 0;

According to the OpenBSD poll() manual, if “high-priority” (maybe out-of-band) data is of no interest, POLLIN is enough; otherwise the monitored events should be POLLIN|POLLPRI. The same applies to POLLOUT and POLLWRBAND.
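The asymmetry between reading and writing can be sketched with a tiny relay loop. This is an illustration, not netcat’s readwrite() function: the name relay, the single 4096-byte buffer and the timeout parameter are assumptions, and the descriptors are assumed to be in blocking mode. POLLOUT is requested only while bytes are pending, and the input entry is disabled with fd = -1 while the buffer is full.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything from in_fd to out_fd until EOF.
 * Returns the number of bytes copied, or -1 on error or timeout.
 */
ssize_t
relay(int in_fd, int out_fd, int timeout_ms)
{
    struct pollfd pfd[2];
    char buf[4096];
    size_t off = 0, len = 0;        /* pending bytes are buf[off..len) */
    ssize_t n, total = 0;
    int eof = 0, want_read, r;

    pfd[1].fd = out_fd;

    while (!eof || off < len) {
        /* Read only when the buffer is empty; a negative fd is skipped. */
        want_read = !eof && off == len;
        pfd[0].fd = want_read ? in_fd : -1;
        pfd[0].events = POLLIN;
        /* Ask for POLLOUT only while there is something to write. */
        pfd[1].events = (off < len) ? POLLOUT : 0;

        r = poll(pfd, 2, timeout_ms);
        if (r == -1)
            return -1;
        if (r == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if ((pfd[0].revents & (POLLERR | POLLNVAL)) ||
            (pfd[1].revents & (POLLERR | POLLHUP | POLLNVAL)))
            return -1;

        if (pfd[0].revents & (POLLIN | POLLHUP)) {
            n = read(in_fd, buf, sizeof(buf));
            if (n == -1)
                return -1;
            if (n == 0) {
                eof = 1;            /* input is finished */
            } else {
                off = 0;
                len = (size_t)n;
            }
        }
        if (pfd[1].revents & POLLOUT) {
            n = write(out_fd, buf + off, len - off);
            if (n == -1)
                return -1;
            off += (size_t)n;
            total += n;
        }
    }
    return total;
}
```

Leaving events at 0 for the output descriptor still lets poll() report POLLERR or POLLHUP on it, so a broken output is noticed even while the loop is only waiting for input.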

There are 3 values (POLLERR, POLLNVAL and POLLHUP) which are only used in struct pollfd’s revents. If POLLERR or POLLNVAL is detected, there is no point in polling this file descriptor any further; setting fd to -1 makes poll() ignore the entry:

if (pfd[n].revents & (POLLERR|POLLNVAL)) {
    pfd[n].fd = -1;
}

POLLHUP deserves more attention. Two passages from the manual are relevant:
(a)

POLLHUP

The device or socket has been disconnected. This event and POLLOUT are mutually-exclusive; a descriptor can never be writable if a hangup has occurred. However, this event and POLLIN, POLLRDNORM, POLLRDBAND, or POLLPRI are not mutually-exclusive. This flag is only valid in the revents bitmask; it is ignored in the events member.

(b)

The second difference is that on EOF there is no guarantee that POLLIN will be set in revents, the caller must also check for POLLHUP.

So if POLLHUP and POLLIN are both set in revents, there is still something to read (remaining data, or possibly just EOF); if only POLLHUP is set, there is nothing left to read.
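A minimal sketch of a reader that follows this rule is shown below. The name drain_until_hangup and the 512-byte buffer are assumptions for illustration; the point is only the order of the checks: read while POLLIN is set, and stop on POLLHUP once POLLIN is no longer reported.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: read and discard data from fd until EOF or
 * hangup. Returns the number of bytes read, or -1 on error or timeout.
 */
ssize_t
drain_until_hangup(int fd, int timeout_ms)
{
    struct pollfd pfd;
    char buf[512];
    ssize_t n, total = 0;

    pfd.fd = fd;
    pfd.events = POLLIN;

    for (;;) {
        if (poll(&pfd, 1, timeout_ms) <= 0)
            return -1;              /* poll() error or timeout */
        if (pfd.revents & (POLLERR | POLLNVAL))
            return -1;
        if (pfd.revents & POLLIN) {
            /* Data (or EOF) is readable, even if POLLHUP is set too. */
            n = read(fd, buf, sizeof(buf));
            if (n == -1)
                return -1;
            if (n == 0)
                return total;       /* EOF reported through POLLIN */
            total += n;
            continue;
        }
        if (pfd.revents & POLLHUP)
            return total;           /* hangup and nothing left to read */
    }
}
```

With a pipe whose write end was closed after writing three bytes, the loop first reads the three bytes and only then returns on the hangup, so no data is lost.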

 

Use network analyzer to learn SSH session establishment

The establishment of an SSH session consists of 2 parts: building the encrypted channel and authenticating the user. To understand the whole flow better, I used tcpdump/Wireshark to capture and analyze the packets. The server runs OpenBSD 6.1 and the client runs Arch Linux. The tcpdump command looks like this:

sudo tcpdump -A -s 0 'net 192.168.38.176' -i enp7s0f0 -w capture.pcap

(1) Connect to the server for the first time:

[Figure 1]

The captured packets:

[Figure C1]

We can see that the client and server negotiated the SSH version first, then exchanged public keys to generate the secret key. (In fact, the client and server send their SSH versions simultaneously, so please don’t assume that the client sent first and the server then responded. Use the “nc 192.168.38.176 22” command to check.) The server issued the “New Keys” message and waited for the client to answer.
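The version negotiation is a plain-text line that each side sends, of the form SSH-protoversion-softwareversion, optionally followed by a space and comments, and terminated by CR LF (RFC 4253, section 4.2). As a small sketch, such a line can be split like this; the function name parse_ssh_banner is hypothetical, and the banner used in the usage note is only an example, not a value taken from the capture.

```c
#include <string.h>

/*
 * Hypothetical helper: split "SSH-protoversion-softwareversion ..."
 * into its two parts. Returns 0 on success, -1 if the line is not an
 * SSH identification string or a part does not fit its buffer.
 */
int
parse_ssh_banner(const char *line, char *proto, size_t protolen,
    char *software, size_t swlen)
{
    const char *p, *dash, *end;
    size_t n;

    if (strncmp(line, "SSH-", 4) != 0)
        return -1;

    /* protoversion: everything up to the next '-' */
    p = line + 4;
    dash = strchr(p, '-');
    if (dash == NULL)
        return -1;
    n = (size_t)(dash - p);
    if (n == 0 || n >= protolen)
        return -1;
    memcpy(proto, p, n);
    proto[n] = '\0';

    /* softwareversion: up to a space (comments follow), CR or LF */
    p = dash + 1;
    end = p + strcspn(p, " \r\n");
    n = (size_t)(end - p);
    if (n == 0 || n >= swlen)
        return -1;
    memcpy(software, p, n);
    software[n] = '\0';
    return 0;
}
```

Given an example line such as "SSH-2.0-OpenSSH_7.5\r\n", the function fills proto with "2.0" and software with "OpenSSH_7.5".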

(2) Accept the server’s public key but don’t enter the password:

[Figure 2]

The captured packets:

[Figure C2]

The first packet should be the client acknowledging the server’s “New Keys” message, followed by some further interactions. Now the encrypted channel is set up.

(3) Enter the password to authenticate the user:

[Figure 3]

The captured packets:

[Figure C3]

These packets are all encrypted data. If the user’s password is correct, the whole SSH session is ready, and you can administer the server now.

Reference:
Understanding the SSH Encryption and Connection Process.