Nan Xiao's Blog

A core dump related to jemalloc

Recently I came across a core dump related to jemalloc:

#0  extent_arena_ind_get (extent=0x0) at include/jemalloc/internal/extent_inlines.h:40
#1  je_tcache_bin_flush_small (tsd=tsd@entry=0x7ffff7f717f8, tcache=tcache@entry=0x7ffff7f719e8,
    tbin=tbin@entry=0x7ffff7f71a70, binind=binind@entry=5, rem=<optimized out>) at src/tcache.c:159
#2  0x00007ffff356b97b in je_tcache_event_hard (tsd=tsd@entry=0x7ffff7f717f8,
    tcache=tcache@entry=0x7ffff7f719e8) at src/tcache.c:55
#3  0x00007ffff3512d49 in tcache_event (tcache=<optimized out>, tsd=<optimized out>)
    at include/jemalloc/internal/tcache_inlines.h:37
#4  tcache_dalloc_large (slow_path=<optimized out>, binind=<optimized out>, ptr=<optimized out>,
    tcache=<optimized out>, tsd=<optimized out>) at include/jemalloc/internal/tcache_inlines.h:212
#5  arena_dalloc_large (slow_path=<optimized out>, szind=<optimized out>, tcache=<optimized out>,
    ptr=<optimized out>, tsdn=<optimized out>) at include/jemalloc/internal/arena_inlines_b.h:276
#6  arena_dalloc (slow_path=<optimized out>, alloc_ctx=<optimized out>, tcache=<optimized out>,
    ptr=<optimized out>, tsdn=<optimized out>) at include/jemalloc/internal/arena_inlines_b.h:323
#7  idalloctm (slow_path=<optimized out>, is_internal=<optimized out>, alloc_ctx=<optimized out>,
    tcache=<optimized out>, ptr=<optimized out>, tsdn=<optimized out>)
    at include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
#8  ifree (slow_path=<optimized out>, tcache=<optimized out>, ptr=<optimized out>,
    tsd=<optimized out>) at src/jemalloc.c:2589
#9  je_free_default (ptr=0x7fff2ccf53c0) at src/jemalloc.c:2799

The sanitizers helped me find the root cause: a classic “double-free” memory bug. One thing to note is that the sanitizers and jemalloc can’t be used at the same time, because both intercept the memory allocation/free functions. Consider the following code:

# cat memory-leak.c
#include <stdlib.h>
void *p;
int main() {
  p = malloc(7);
  p = 0; // The memory is leaked here.
  return 0;
}

Build with both Sanitizers and jemalloc:

# gcc -fsanitize=address -g memory-leak.c -L`jemalloc-config --libdir` -Wl,-rpath,`jemalloc-config --libdir` -ljemalloc `jemalloc-config --libs`
# ldd a.out
    linux-vdso.so.1 (0x00007ffdb85a6000)
    libasan.so.6 => /usr/lib/libasan.so.6 (0x00007ffb04605000)
    libjemalloc.so.2 => /usr/lib/libjemalloc.so.2 (0x00007ffb04362000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007ffb0421d000)
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007ffb03fb4000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007ffb03fae000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007ffb03f8d000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb03dc5000)
    librt.so.1 => /usr/lib64/../lib64/librt.so.1 (0x00007ffb03dba000)
    libgcc_s.so.1 => /usr/lib64/../lib64/libgcc_s.so.1 (0x00007ffb03da0000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb04fdc000)

Use gdb to debug the program:

......
Breakpoint 2, 0x00007ffff76a90a4 in malloc () from /usr/lib/libasan.so.6
(gdb) bt
#0  0x00007ffff76a90a4 in malloc () from /usr/lib/libasan.so.6
#1  0x0000555555555183 in main () at memory-leak.c:4
......

The program calls the sanitizer’s malloc() (from libasan.so.6) instead of jemalloc’s.
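As a side note on the root cause mentioned above: a double-free means the same block is handed to free() twice. The sketch below is not the code from the crashing program (which is not shown here); FREE_AND_RESET and release_twice_safely are hypothetical names used only for illustration. It shows one common defensive idiom: clear the pointer after freeing it, so that a second release through the same variable becomes free(NULL), which is a no-op.

```c
#include <stdlib.h>

/* The bug pattern, shown as a comment because executing it is undefined behaviour:
 *
 *     char *buf = malloc(16);
 *     free(buf);
 *     free(buf);   // double-free: allocator metadata may be corrupted
 */

/* Hypothetical macro: free the block and clear the variable that pointed to it. */
#define FREE_AND_RESET(p) do { free(p); (p) = NULL; } while (0)

/* Releases the same variable twice; returns 1 if that was handled safely. */
int release_twice_safely(void) {
    char *buf = malloc(16);
    if (buf == NULL) {
        return 0;                /* allocation failed, nothing to demonstrate */
    }
    FREE_AND_RESET(buf);         /* really frees the block */
    FREE_AND_RESET(buf);         /* now free(NULL), which is defined as a no-op */
    return buf == NULL;
}
```

This idiom only protects repeated frees through the same variable. A second pointer that aliases the block can still be freed twice, and that is the kind of bug the sanitizers are good at catching.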

An AES encryption/decryption program

I wrote a simple AES encryption/decryption program. I don’t recommend using it, but it shows some basic concepts:

(1) As described in my previous post, initialise EVP_CIPHER_CTX only once, which improves efficiency:

......
    EVP_CIPHER_CTX *enc_ctx = EVP_CIPHER_CTX_new();

    if (EVP_EncryptInit_ex(enc_ctx, EVP_aes_128_ecb(), NULL, key, NULL) == 0) {
        goto END;
    }
......

(2) Because the AES block size is 128 bits (regardless of the key length), the cipher text length is a multiple of 16 bytes. The plain text length is 98, so EVP_EncryptUpdate() encrypts the first 96 bytes, and EVP_EncryptFinal_ex() pads the remaining 2 bytes to a full block and encrypts it. The total length of the encrypted text is 112.

......
        if (EVP_EncryptUpdate(enc_ctx, ct, &ct_len, pt, sizeof(pt)) == 0) {
            goto END;
        }

        if (EVP_EncryptFinal_ex(enc_ctx, ct + ct_len, &len) == 0) {
            goto END;
        }

        ct_len += len;
......
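The arithmetic behind the 96 and 112 figures can be sketched in plain C. This is only an illustration of the length calculation, assuming the default PKCS padding is left enabled; AES_BLOCK_BYTES, whole_block_bytes and padded_len are hypothetical names, not OpenSSL APIs.

```c
#include <stddef.h>

#define AES_BLOCK_BYTES 16 /* AES block size; the same for 128/192/256-bit keys */

/* Bytes that can be encrypted as whole blocks before the final call. */
size_t whole_block_bytes(size_t plain_len) {
    return plain_len - plain_len % AES_BLOCK_BYTES;
}

/* Total cipher text length with padding, which always adds 1 to 16 bytes. */
size_t padded_len(size_t plain_len) {
    return (plain_len / AES_BLOCK_BYTES + 1) * AES_BLOCK_BYTES;
}
```

For the 98-byte plain text, whole_block_bytes(98) is 96 and padded_len(98) is 112. Note that a plain text whose length is already a multiple of 16 still grows by one full block: padded_len(96) is also 112.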

Correspondingly, EVP_DecryptUpdate() decrypts the first 96 bytes, and EVP_DecryptFinal_ex() decrypts the last block, strips the padding and returns the trailing 2 bytes:

......
        if (EVP_DecryptUpdate(dec_ctx, decrypted, &decrypted_len, ct, ct_len) == 0) {
            goto END;
        }

        if (EVP_DecryptFinal_ex(dec_ctx, decrypted + decrypted_len, &len) == 0) {
            goto END;
        }

        decrypted_len += len;
......
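Why the final call yields only 2 bytes can be illustrated with a small padding-removal sketch. It is not OpenSSL's implementation; pkcs7_data_len is a hypothetical helper, and it assumes the standard padding scheme in which every padding byte holds the padding length.

```c
#include <stddef.h>

/* Given the last decrypted 16-byte block, return how many leading bytes are
 * real data, or -1 if the padding is malformed. Not constant-time. */
int pkcs7_data_len(const unsigned char block[16]) {
    unsigned char pad = block[15];          /* last byte holds the padding length */
    if (pad < 1 || pad > 16) {
        return -1;
    }
    for (size_t i = 16 - (size_t)pad; i < 16; i++) {
        if (block[i] != pad) {              /* every padding byte must equal pad */
            return -1;
        }
    }
    return 16 - pad;
}
```

With a 98-byte plain text, the last block holds 2 data bytes followed by 14 bytes of value 14, so the helper returns 2.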

Why does the SSL client report Google’s certificate as “self-signed”?

In a previous post, I implemented a simple HTTPS client, but the program has a small flaw: when connecting to “www.google.com:443”, it reports the following error while verifying the certificate:

error code is 18:self signed certificate

The error code comes from SSL_get_verify_result():

long SSL_get_verify_result(const SSL *ssl)
{
    return ssl->verify_result;
}

and 18 maps to X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT, which means “self-signed certificate”. For other websites, e.g., facebook.com, no error is reported.

Use OpenSSL’s client-arg program to test:

# LD_LIBRARY_PATH=/root/openssl/build gdb --args ./client-arg -connect "www.google.com:443"
......
Thread 2 hit Breakpoint 1, main (argc=3, argv=0xfffffc7fffdf4c38) at client-arg.c:99
99      BIO_puts(sbio, "GET / HTTP/1.0\n\n");
(gdb) p ssl->verify_result
$1 = 18
(gdb)

The same error code: 18. But openssl s_client shows that the certificate is not “self-signed”:

# LD_LIBRARY_PATH=/root/openssl/build openssl s_client -connect google.com:443
CONNECTED(00000004)
depth=2 OU = GlobalSign Root CA - R2, O = GlobalSign, CN = GlobalSign
verify return:1
depth=1 C = US, O = Google Trust Services, CN = GTS CA 1O1
verify return:1
depth=0 C = US, ST = California, L = Mountain View, O = Google LLC, CN = *.google.com
verify return:1
---
Certificate chain
 0 s:C = US, ST = California, L = Mountain View, O = Google LLC, CN = *.google.com
   i:C = US, O = Google Trust Services, CN = GTS CA 1O1
 1 s:C = US, O = Google Trust Services, CN = GTS CA 1O1
   i:OU = GlobalSign Root CA - R2, O = GlobalSign, CN = GlobalSign
---
......

Hmm, I need to find the root cause.

First of all, I searched the code to see when X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT is set, and found only one spot:

if (self_signed)
    return verify_cb_cert(ctx, NULL, num - 1,
                          sk_X509_num(ctx->chain) == 1
                          ? X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
                          : X509_V_ERR_SELF_SIGNED_CERT_IN_CHAIN);

Interestingly, the number of certificates in the chain is only 1, but according to the openssl s_client output above, there are 2 certificates in the chain. OK, let’s look at the content of this “self-signed” certificate.

After some debugging, I finally found tls_process_server_certificate, which is used to process the server’s certificate. With the help of gdb, I can dump the content of the certificate:

# gdb --args ./client www.google.com:443
.......
(gdb) b tls_process_server_certificate
......
Thread 2 hit Breakpoint 1, tls_process_server_certificate (s=0xf09e90, pkt=0xfffffc7fffdefe30)
    at ../ssl/statem/statem_clnt.c:1768
1768        X509 *x = NULL;
......
1806            if (certbytes != (certstart + cert_len)) {
(gdb)
1811            if (SSL_IS_TLS13(s)) {
(gdb) dump binary memory cert certstart certstart + cert_len
......

Check the cert file:

# cat cert
......
�0� *�H��       (No SNI provided; please fix your client.10Uinvalid2.invalid0�"0
��bO����
.....

The reason is obvious: “No SNI provided; please fix your client.” Ah, I need to set SNI explicitly. After invoking SSL_set_tlsext_host_name, the certificate chain becomes correct (the new code can be downloaded here).

Summary: I am not an SSL/TLS expert, and the OpenSSL project is complex and daunting. But with some basic SSL/TLS knowledge and the help of a debugger, I can find the root cause of issues independently. Don’t give up: digest the code bit by bit, and eventually you will win!

Write a simple server using libev

I came across libev about two months ago and gave it a try: I implemented a simple server that loops forever and just prints each client’s address and port, like this:

# ./server
Connection from ::ffff:192.168.1.7:59362
......

P.S., the code can be downloaded here.
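The “::ffff:” prefix in the output above indicates an IPv4 client seen through an IPv6 socket (an IPv4-mapped IPv6 address). As a rough sketch of how such a line can be produced, independent of libev and not taken from the downloadable code, the hypothetical helper format_peer below turns a peer address (as filled in by accept()) into “address:port” text.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: write "address:port" for an IPv6 peer into buf.
 * Returns 0 on success, -1 on failure or if buf is too small. */
int format_peer(const struct sockaddr_in6 *peer, char *buf, size_t buf_len) {
    char addr[INET6_ADDRSTRLEN];
    if (inet_ntop(AF_INET6, &peer->sin6_addr, addr, sizeof(addr)) == NULL) {
        return -1;
    }
    /* sin6_port is stored in network byte order. */
    int n = snprintf(buf, buf_len, "%s:%u", addr, (unsigned)ntohs(peer->sin6_port));
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}
```

In an accept callback, the struct sockaddr_in6 filled in by accept() would be passed to format_peer, and the resulting string printed after “Connection from ”.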

A simple HTTPS client demo

This post has a simple demo program that shows how to program with OpenSSL. Unfortunately, I think it is a little dated and not correct, so I refactored it in the hope that it can serve as a reference for learning OpenSSL. The code can be downloaded here.