Today my colleague fixed a bug caused by an out-of-bounds array access: a hash function returns the index of the selected array element, but its return type is int, so in a corner case, when the hash computation overflows, the value can become negative and the code then accesses an invalid element of the array. The lessons I learnt from this bug:
(1) Review the return type and value range of the hash function;
(2) Pay attention to the index when accessing an array: is an out-of-bounds access possible?
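A minimal sketch of both lessons, assuming a hypothetical FNV-1a string hash (hash_bytes) and helper names (hash_to_index, signed_hash_to_index) that are not from the original code. Doing the arithmetic in unsigned types means overflow wraps around instead of producing a negative value, and the modulo keeps the index inside the table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash function: FNV-1a over a C string, computed in
 * unsigned arithmetic so that overflow wraps around and never
 * yields a negative value. */
static uint32_t
hash_bytes(const char *s)
{
    uint32_t h = 2166136261u;      /* FNV offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}

/* Map an unsigned hash value to a valid slot of a table with
 * table_size entries. */
static size_t
hash_to_index(uint32_t hash, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)hash % table_size;
}

/* Defensive variant for callers that are stuck with an int hash:
 * convert to unsigned first, so a negative value cannot become a
 * negative index. */
static size_t
signed_hash_to_index(int hash, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)(unsigned int)hash % table_size;
}
```

With these helpers, a lookup would look like table[hash_to_index(hash_bytes(key), table_size)], and the index is always smaller than table_size.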
The caveat of thread name length in glibc
Recently, our team met an interesting bug: the process is configured to spawn 16 threads, but only spawns 10 threads in reality. The thread code is like this:
static void *
stat_consumer_thread_run(void *data)
{
    stat_consumer_thread_t *thread = data;
    char thread_name[64];
    snprintf(thread_name, sizeof(thread_name), "stat.consumer.%d",
             thread->id);
    int rc = pthread_setname_np(pthread_self(), thread_name);
    if (rc != 0) {
        return NULL;
    }
    ......
    return NULL;
}
After checking the pthread_setname_np manual page, we found:
The thread name is a meaningful C language string, whose length is restricted to 16 characters, including the terminating null byte ('\0').
So the thread name is restricted to 16 bytes including the terminator, which leaves 15 visible characters. "stat.consumer.0" ~ "stat.consumer.9" (15 characters each) are set successfully, but "stat.consumer.10" ~ "stat.consumer.15" (16 characters each) are not: pthread_setname_np returns ERANGE, the thread function returns immediately, and the corresponding threads never do any work.
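One way to catch this earlier is to build the name directly into a 16-byte buffer and check whether it fits. The sketch below is only an illustration: THREAD_NAME_MAX and make_thread_name are hypothetical names, and the 16-byte limit is the one quoted from the manual page above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Limit from the pthread_setname_np manual page: 16 bytes,
 * including the terminating '\0'. Hypothetical macro name. */
#define THREAD_NAME_MAX 16

/* Build "<prefix><id>" into buf, which must hold THREAD_NAME_MAX bytes.
 * Returns 0 if the whole name fits, -1 if snprintf had to truncate it
 * (buf then holds the first 15 characters, still '\0'-terminated). */
static int
make_thread_name(char *buf, const char *prefix, int id)
{
    assert(buf != NULL && prefix != NULL);
    int n = snprintf(buf, THREAD_NAME_MAX, "%s%d", prefix, id);
    return (n >= 0 && n < THREAD_NAME_MAX) ? 0 : -1;
}
```

When -1 comes back, the caller can pick a shorter prefix (for example "stat.cons.") or knowingly pass the truncated name on. Independently of that, treating a failed pthread_setname_np() as a warning instead of returning from the thread function would have kept all 16 threads running.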
Bisection assert is a good debug methodology
Recently, I fixed an issue related to an uninitialised bit-field in the C programming language. Because the bit-field can be either 0 or 1, the bug occurs randomly. The good news is that the reproduction rate is very high, nearly 50%. Though I was not familiar with the code, I used bisection assert to help:
{
    ......
    assert(bit-field == 0);
    ......
    assert(bit-field == 0);
    ......
}
If the first assert is not triggered but the second one is, I know the bug lies in the code between them. I then bisect that block and add asserts again, until the root cause is found.
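The idea can be sketched with a small, made-up example. Everything here is hypothetical (struct conn, the step_* functions, the CHECKPOINT macro); the macro is a non-aborting stand-in for assert() that records the first checkpoint at which the invariant no longer holds, so the example can run to completion. In a real debugging session plain assert() does the same job and stops at the first failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure holding the bit-field under suspicion. */
struct conn {
    unsigned int closed : 1;
    unsigned int id : 31;
};

/* Number of the first checkpoint whose condition failed; 0 means none. */
static int first_bad_checkpoint = 0;

#define CHECKPOINT(n, cond)                            \
    do {                                               \
        if (!(cond) && first_bad_checkpoint == 0)      \
            first_bad_checkpoint = (n);                \
    } while (0)

static void step_a(struct conn *c) { c->id = 7; }
static void step_b(struct conn *c) { c->closed = 1; }  /* planted bug */
static void step_c(struct conn *c) { c->id += 1; }

/* The invariant "closed == 0" is checked between every pair of steps. */
static void
process(struct conn *c)
{
    assert(c != NULL);
    CHECKPOINT(1, c->closed == 0);
    step_a(c);
    CHECKPOINT(2, c->closed == 0);
    step_b(c);
    CHECKPOINT(3, c->closed == 0);
    step_c(c);
    CHECKPOINT(4, c->closed == 0);
}
```

If process() is run on a zeroed struct conn, checkpoint 2 passes and checkpoint 3 is the first to fail, which points at step_b() as the code between them.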
An issue related to uninitialised memory
Today I met an interesting bug: a C program behaved differently between debug (gcc -O0) and release (gcc -O3) modes.
First of all, I compared the logs of the two modes and pinned down the function in which the logs began to diverge.
Secondly, I used gdb to debug the two programs side by side and checked the variables' values. I found a variable whose value differed between the two, which made the programs take different branches of an if-else statement. Hmm, this was the root cause.
My gut feeling was that the release build might be fetching a stale value, but after reviewing the code carefully, I found the real reason: the block of heap memory that the variable belonged to was never initialised, which introduces the notorious “undefined behaviour”.
As far as I know, there are two reasons for leaving variables uninitialised:
(1) The programmer forgets;
(2) The programmer reckons the variable will be assigned a correct value before use, and initialising a block of memory may carry a performance penalty.
Anyway, the lesson I learnt today is: unless you are 100% sure it is OK to leave a variable uninitialised, please initialise it. This can save you several hours in the future.
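A minimal sketch of the "initialise by default" habit for heap memory, assuming a hypothetical struct request and constructor request_new (neither is from the program discussed above). calloc() hands back zero-filled memory, so every field, including bit-fields, starts from a known value:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure; any field left unset would otherwise hold
 * whatever bytes malloc() happened to return. */
struct request {
    int id;
    unsigned int urgent : 1;
    char *payload;
};

/* Allocate a request with every field in a known state.
 * Returns NULL if the allocation fails. */
static struct request *
request_new(int id)
{
    assert(id >= 0);               /* hypothetical precondition */
    struct request *r = calloc(1, sizeof(*r));
    if (r == NULL) {
        return NULL;
    }
    r->id = id;
    r->payload = NULL;             /* explicit, not relying on all-zero bits */
    return r;
}
```

The zero fill costs a little time per allocation; for most structures that is far cheaper than hunting a bug that only shows up at one optimisation level.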
Beware of using GNU libc basename() function
From the manual page, we know there are two implementations of basename(). One is the POSIX-compliant version:
#include <libgen.h>
char *basename(char *path);
The other is the GNU version:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <string.h>
But the manual doesn't mention that the prototype of the GNU version differs from the POSIX-compliant one (the parameter type is const char *, not char *):
char *basename (const char *__filename)
The implementation is also simple; it just invokes strrchr():
char *
__basename (const char *filename)
{
    char *p = strrchr (filename, '/');
    return p ? p + 1 : (char *) filename;
}
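One consequence of this implementation is worth seeing in action. The sketch below repeats the same strrchr() logic under a hypothetical name, gnu_style_basename, so it can be tried without worrying about which header's basename() is in effect. Because it only looks for the last '/', a path with a trailing slash yields an empty string, whereas the POSIX version may modify its argument and returns "lib" for "/usr/lib/".

```c
#include <assert.h>
#include <string.h>

/* Same logic as the __basename() shown above, under a hypothetical
 * name: return the text after the last '/', or the whole string if
 * there is no '/'. The argument is never modified. */
static const char *
gnu_style_basename(const char *filename)
{
    assert(filename != NULL);
    const char *p = strrchr(filename, '/');
    return p ? p + 1 : filename;
}
```

For "/usr/lib" this gives "lib", for "file" it gives "file", and for "/usr/lib/" it gives "", so code that switches between the two headers can silently change behaviour on paths ending in a slash.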