Fix “Address 0xxxxxxxxx out of bounds” issue

Yesterday, I came across a crash that occurred when a program accessed “out of bounds” memory:

......
func = 0x7ffff7eb1dc8 <Address 0x7ffff7eb1dc8 out of bounds>
......

After some debugging, I found the cause: the program uses dlopen to load a dynamic library and records a memory address inside the library, but still reads that memory after calling dlclose on the library.
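One way to avoid this kind of bug is to keep the library handle and every address obtained from it in the same structure, and to clear them together on unload. The following is only a minimal sketch: struct plugin, plugin_load and plugin_unload are hypothetical names, and the function pointer type is an assumption chosen for illustration.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical signature of the symbol we look up. */
typedef double (*unary_fn)(double);

/* Keep the handle and the addresses derived from it together. */
struct plugin {
    void *handle;
    unary_fn func;
};

/* Load `lib` and resolve `symbol`; returns 0 on success, -1 on failure. */
int plugin_load(struct plugin *p, const char *lib, const char *symbol)
{
    p->func = NULL;
    p->handle = dlopen(lib, RTLD_NOW);
    if (p->handle == NULL)
        return -1;

    p->func = (unary_fn)dlsym(p->handle, symbol);
    if (p->func == NULL) {
        dlclose(p->handle);
        p->handle = NULL;
        return -1;
    }
    return 0;
}

/* Unload the library and drop every address that pointed into it. */
void plugin_unload(struct plugin *p)
{
    if (p->handle != NULL)
        dlclose(p->handle);
    p->handle = NULL;
    p->func = NULL; /* a stale pointer here is what gdb reports as out of bounds */
}
```

Callers then check p->func for NULL before using it, instead of trusting an address recorded earlier. On older glibc versions the program may need to be linked with -ldl.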

__COUNTER__ macro in gcc/clang

Both gcc and clang define the __COUNTER__ macro:

Defined to an integer value that starts at zero and is incremented each time the __COUNTER__ macro is expanded.

Check the following code:

# cat foo.c
#include <stdio.h>

void foo1(void)
{
    printf("%s:%d\n", __func__, __COUNTER__);
}

void foo2(void)
{
    printf("%s:%d\n", __func__, __COUNTER__);
}
# cat bar.c
#include <stdio.h>

void bar1(void)
{
    printf("%s:%d\n", __func__, __COUNTER__);
}

void bar2(void)
{
    printf("%s:%d\n", __func__, __COUNTER__);
}
# cat main.c
#include "foo.h"
#include "bar.h"

int main(void)
{
    foo1();
    foo2();
    bar1();
    bar2();
    return 0;
}

Run the program:

# ./main
foo1:0
foo2:1
bar1:0
bar2:1

You can see that in every translation unit (.c file), __COUNTER__ begins at 0.
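A common use of __COUNTER__ is generating unique identifiers inside one translation unit. The sketch below is an illustration only: UNIQUE_NAME, CONCAT and the slot_ prefix are hypothetical names. It deliberately avoids relying on absolute counter values, since any earlier expansion in the same translation unit shifts them.

```c
#include <stdio.h>

/* Two-level concatenation so that __COUNTER__ is expanded before pasting. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

/* Hypothetical helper: builds identifiers such as slot_2, slot_3, ... */
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

/* Two consecutive expansions always differ by exactly one. */
enum { FIRST_ID = __COUNTER__, SECOND_ID = __COUNTER__ };

/* Each line declares a distinct variable, so the definitions do not collide. */
int UNIQUE_NAME(slot_) = 10;
int UNIQUE_NAME(slot_) = 20;

/* Returns the distance between the two enum values above. */
int counter_step(void)
{
    return SECOND_ID - FIRST_ID;
}
```

Because the counter restarts in every translation unit, names generated this way are only unique within a single .c file; giving them external linkage in several files could produce clashes at link time.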

P.S., the code can be referenced here.

Introduction of Unix pseudo-random number functions

I couldn’t find a detailed introduction to the Unix pseudo-random number functions, so I decided to write one myself. Please note that many platforms, such as Linux, also provide reentrant versions. Since the non-reentrant versions simply invoke the reentrant versions with global data (refer to glibc), I will use the non-reentrant versions for the demonstrations in this post.
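For reference, here is a minimal sketch of the reentrant interface as glibc provides it (random_r and initstate_r are glibc extensions, not POSIX). The helper name first_random_r is hypothetical; the point is only that the caller owns both the state array and the struct random_data, so no global data is involved.

```c
#define _DEFAULT_SOURCE /* exposes random_r() and initstate_r() on glibc */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the first number produced by a private generator seeded with
 * `seed`, or -1 on failure.
 */
long first_random_r(unsigned int seed)
{
    struct random_data buf;
    _Alignas(int32_t) char state[128]; /* glibc treats the array as int32_t words */
    int32_t result;

    /* glibc expects the structure to be zeroed before initstate_r(). */
    memset(&buf, 0, sizeof(buf));

    if (initstate_r(seed, state, sizeof(state), &buf) != 0)
        return -1;
    if (random_r(&buf, &result) != 0)
        return -1;
    return result;
}
```

With the same seed and the same state size, this private generator is expected to start with the same number that random() returns after srandom() with that seed.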

(1) Call random() only:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run it twice:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

The outputs are the same, as expected, because the numbers are only “pseudo”-random.

(2) Call srandom(1). According to spec:

Like rand(), random() shall produce by default a sequence of numbers that can be duplicated by calling srandom() with 1 as the seed.

Verify it:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    srandom(1);
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

Yes, the output is the same as in the first test (call random() only).

(3) Call srandom(2):

#include <stdio.h>
#include <stdlib.h>

int main()
{
    srandom(2);
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Build and run it:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061

Hmm, this time the output is different.

(4) Let’s see an example of using initstate():

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 1;
    char state[128];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

From the spec:

The initstate() function allows a state array, pointed to by the state argument, to be initialized for future use.

So how does initstate() initialize the state array? Let’s look at the glibc implementation:

  ......
  int32_t *state = &((int32_t *) arg_state)[1]; /* First location.  */
  /* Must set END_PTR before srandom.  */
  buf->end_ptr = &state[degree];

  buf->state = state;

  __srandom_r (seed, buf);
  ......

initstate() actually calls __srandom_r(), the reentrant function behind srandom(), to initialize the state array. Build and run the program:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

The output is the same as in the first test (call random() only), which is consistent with another quote from the spec:

If initstate() has not been called, then random() shall behave as though initstate() had been called with seed=1 and size=128.

Change seed from 1 to 2:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 2;
    char state[128];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run again:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061

This time, the output is the same as in the third test (call srandom(2) only). Of course, you can also change the size of the state array and modify the seed at run time, like this:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 2;
    char state[64];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        srandom(seed + i);
        printf("%ld\n", random());
    }
    return 0;
}

(5) Finally, let’s see setstate(). Check the following example:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned seed = 1;
    char state_1[128], state_2[128];

    if (initstate(seed, state_1, sizeof(state_1)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    seed = 2;
    if (initstate(seed, state_2, sizeof(state_2)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }

    if (setstate(state_1) == NULL)
    {
        printf("setstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }

    return 0;
}

Compile and run the program:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061
1804289383
846930886
1681692777
1714636915
1957747793

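You can see the first 5 numbers are the same as those produced after srandom(2), whilst the last 5 numbers are the same as those produced after srandom(1).

Last but not least, the state array must stay valid for as long as these pseudo-random number functions use it. For example, a state array on the stack must not go out of scope while it is still the active state.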

The tips of optimising OpenSSL applications

This post introduces some tips for optimising applications which use OpenSSL.

(1) Use a per-thread memory pool. Because OpenSSL allocates and frees memory heavily in its internals, implementing bespoke memory management functions that use a per-thread memory pool (and registering them with CRYPTO_set_mem_functions) can improve performance.
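A very small sketch of such functions is shown below. It is an illustration under stated assumptions, not a production allocator: the pool_* names, the 256-byte block size and the cache limit are all hypothetical, and the callback signatures assume OpenSSL 1.1.0 or later, where the hooks receive the source file and line. The code itself does not include any OpenSSL header; registration is only described in the comment at the end.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_BLOCK_SIZE 256 /* assumed payload size served from the cache */
#define POOL_MAX_CACHED 64  /* assumed per-thread cache limit */

/* Header placed in front of every payload; the union keeps it aligned. */
typedef union block_header {
    struct {
        size_t size;              /* payload capacity in bytes */
        union block_header *next; /* link used while the block is cached */
    } info;
    max_align_t align;
} block_header;

static _Thread_local block_header *free_list; /* per-thread cache */
static _Thread_local size_t cached;

void *pool_malloc(size_t size, const char *file, int line)
{
    block_header *h;
    size_t capacity;

    (void)file;
    (void)line;
    if (size <= POOL_BLOCK_SIZE && free_list != NULL) {
        h = free_list; /* reuse a cached block, no lock needed */
        free_list = h->info.next;
        cached--;
        return h + 1;
    }
    capacity = size <= POOL_BLOCK_SIZE ? POOL_BLOCK_SIZE : size;
    h = malloc(sizeof(*h) + capacity);
    if (h == NULL)
        return NULL;
    h->info.size = capacity;
    return h + 1;
}

void pool_free(void *ptr, const char *file, int line)
{
    block_header *h;

    (void)file;
    (void)line;
    if (ptr == NULL)
        return;
    h = (block_header *)ptr - 1;
    if (h->info.size == POOL_BLOCK_SIZE && cached < POOL_MAX_CACHED) {
        h->info.next = free_list; /* keep the block for this thread */
        free_list = h;
        cached++;
        return;
    }
    free(h);
}

void *pool_realloc(void *ptr, size_t size, const char *file, int line)
{
    block_header *h;
    void *fresh;

    if (ptr == NULL)
        return pool_malloc(size, file, line);
    if (size == 0) {
        pool_free(ptr, file, line);
        return NULL;
    }
    h = (block_header *)ptr - 1;
    if (size <= h->info.size)
        return ptr; /* the block is already large enough */
    fresh = pool_malloc(size, file, line);
    if (fresh == NULL)
        return NULL;
    memcpy(fresh, ptr, h->info.size);
    pool_free(ptr, file, line);
    return fresh;
}

/*
 * Registration (requires <openssl/crypto.h>), before any other OpenSSL call:
 *     CRYPTO_set_mem_functions(pool_malloc, pool_realloc, pool_free);
 */
```

Blocks still cached when a thread exits are not released in this sketch; a real implementation would need a thread-exit hook and measurements to confirm that the pool actually helps.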

(2) Reuse contexts. For example, since EVP_PKEY_derive_init() initialises the content of an EVP_PKEY_CTX every time it is called, each thread can pre-allocate one EVP_PKEY_CTX and avoid allocating and freeing EVP_PKEY_CTX frequently. Furthermore, from the documentation:

The function EVP_PKEY_derive() can be called more than once on the same context if several operations are performed using the same parameters.

Another example is EVP_CIPHER_CTX; a typical program looks like this:

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    EVP_EncryptInit_ex(ctx, EVP_aes_128_gcm(), NULL, key, nonce);
    EVP_EncryptUpdate(ctx, NULL, &len, aad, sizeof(aad));
    EVP_EncryptUpdate(ctx, ct, &ct_len, pt, sizeof(pt));
    EVP_EncryptFinal_ex(ctx, ct + ct_len, &len);
    EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, 16, ct + ct_len);
    EVP_CIPHER_CTX_free(ctx);

Actually, we can create a dedicated EVP_CIPHER_CTX for one EVP_CIPHER, i.e.:

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    EVP_EncryptInit_ex(ctx, EVP_aes_128_gcm(), NULL, NULL, NULL);

Then, in the program, we can reuse this EVP_CIPHER_CTX and just modify the other parameters:

    EVP_EncryptInit_ex(ctx, NULL, NULL, key, nonce);
    EVP_EncryptUpdate(ctx, NULL, &len, aad, sizeof(aad));
    EVP_EncryptUpdate(ctx, ct, &ct_len, pt, sizeof(pt));
    EVP_EncryptFinal_ex(ctx, ct + ct_len, &len);
    EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, 16, ct + ct_len);

This also applies to EVP_Decrypt* functions.