The Dilemma brought by “outdated technology”

This week, I came across an interesting post: Do You Know Cobol? If So, There Might Be a Job for You. The gist is that many big finance companies still run systems written in Cobol, an ancient programming language. Few people have mastered Cobol today, and worse, most of them have retired or soon will. IMHO, this article gives a good example of the dilemma that “outdated technology” brings to both companies and individual engineers.

Let me tell two stories first:
(1) I once worked at HPE for nearly two years. HPE has its own Unix operating system, HP-UX. My team had actually done HP-UX-related work before I joined, but by then all of the team’s work had switched to Linux. Why? Because Linux had come to dominate the server market. To earn money, more and more resources had to be invested in Linux, while basic maintenance and development were considered enough for HP-UX. As far as I know, HP-UX is still the backbone of many critical services at banks, telecommunication operators, etc. But even within HPE, HP-UX’s priority is now very low.

(2) There is a service that was launched in the mid-1990s. It is a 32-bit program written in C, and stable enough to serve people all over the world every day. About 20 years later, one engineer noticed the Year 2038 problem, because the program uses time_t, which is 32 bits wide in that build. He began discussing with his leader how to move the program to 64-bit, but both of them knew it was not as simple as adding the -m64 compile option. After about 20 years, the code had become “mature”: the repository was very large, about 40 engineers had committed code to it, and some modules had turned into complete “black boxes” whose logic no one understood, yet they worked like a charm! After a switch to 64-bit, the code might well compile, but no one would know whether it really worked. Finding out takes careful code review and sufficient testing, which did not seem worth it for a problem that would only occur some 20 years later. So the task stays on the “Todo” list year after year, and everyone prays that the service will be shut down before 2038.

These examples all point to one fact: most “outdated technologies”, such as Cobol, HP-UX or 32-bit programs, are not really “bad”; for various reasons they are simply no longer mainstream. Most companies will not accept an overhaul of the services built on these “outdated technologies”: besides the considerable cost in time and people, a single glitch can have catastrophic results and can even put the company out of business. On the other hand, the number of engineers who have mastered these “outdated technologies” keeps shrinking, so companies can mostly only tap the potential of their own staff.

For engineers, there is a dilemma too. You can pick up an “outdated technology” as a hobby, but adopting it as a full-time job carries a huge risk. One day you may need to look for a job again, and many biased companies will reject you just because they think you don’t know some “hyped technology” and can only tame dinosaurs. Ridiculous, isn’t it? But it is the reality we must adapt to.

Forgetting “-pthread” option may give you a big surprise!

Today, I wrote a small pthread program to do some testing:

#include <pthread.h>

int main(void)
{
        pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
        pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

        pthread_mutex_lock(&mutex);
        pthread_cond_wait(&cv, &mutex);
        return 0;
} 

Build and test it on OpenBSD-current (version is 6.4):

# cc cv_test.c -o cv_test
# ./cv_test

The program blocks there, which is the result I expected. Switch to Arch Linux (kernel version is 4.18.9):

# cc cv_test.c -o cv_test
# ./cv_test
#

The program exits immediately. At first I suspected a “spurious wakeup”, but I could not find a convincing explanation. So I used ldd to check the program. On OpenBSD:

# ldd cv_test
cv_test:
        Start            End              Type  Open Ref GrpRef Name
        000000d4c3a00000 000000d4c3c02000 exe   1    0   0      cv_test
        000000d6e6007000 000000d6e62f6000 rlib  0    1   0      /usr/lib/libc.so.92.5
        000000d6db100000 000000d6db100000 ld.so 0    1   0      /usr/libexec/ld.so

On Arch Linux:

# ldd cv_test
        linux-vdso.so.1 (0x00007ffde91c6000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f3e3169b000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f3e3187a000)

Nothing special. After I asked for help on Stack Overflow, the answer was that I needed to add the -pthread option:

# cc -pthread cv_test.c -o cv_test
# ./cv_test

This time it worked perfectly. Check the linked libraries:

# ldd cv_test
        linux-vdso.so.1 (0x00007fff48be8000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa46f84c000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fa46f688000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fa46f888000)

Why doesn’t Linux give me a link error telling me that I need to link libpthread? That does not seem to make sense.

First taste of MPI

Unlike OpenMP, which focuses on multiple threads within one process, MPI defines how multiple processes collaborate with each other. In this post, I use Open MPI on Arch Linux to do a simple test.

The “Hello World” program is from here. I built and ran it on a single node, not on a cluster containing many nodes:

$ mpirun mpi_hello_world
Hello world from processor tesla-p100, rank 16 out of 52 processors
Hello world from processor tesla-p100, rank 34 out of 52 processors
Hello world from processor tesla-p100, rank 35 out of 52 processors
......

Check the CPU information:

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              104
On-line CPU(s) list: 0-103
Thread(s) per core:  2
Core(s) per socket:  26
Socket(s):           2
NUMA node(s):        2
......

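The system has 2 sockets with 26 cores each, i.e. 52 physical cores and 104 hardware threads. mpirun started 52 processes, which matches the number of physical cores across both sockets rather than the number of logical CPUs: by default, Open MPI counts one slot per core, not per hardware thread. Modify the program to output the process ID: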

......
#include <unistd.h>
......
printf("Hello world from process %d, processor %s, rank %d out of %d processors\n",
         getpid(), processor_name, world_rank, world_size);
......

Now you can confirm that separate processes are spawned:

$ mpirun mpi_hello_world
Hello world from process 52528, processor tesla-p100, rank 21 out of 52 processors
Hello world from process 52557, processor tesla-p100, rank 31 out of 52 processors
Hello world from process 52597, processor tesla-p100, rank 43 out of 52 processors
......

P.S. If you run mpirun as root, add the --allow-run-as-root option:

# mpirun --allow-run-as-root mpi_hello_world

Build SPDZ-2 on Arch Linux

To build SPDZ-2 on Arch Linux, besides installing the necessary packages (mpir, libsodium, etc.), you also need to do the following steps:

(1) Add the following line to CONFIG.mine:

MY_CFLAGS = -DINSECURE

Otherwise, you will hit errors when executing Scripts/setup-online.sh:

# Scripts/setup-online.sh
terminate called after throwing an instance of 'std::runtime_error'
  what():  You are trying to use insecure benchmarking functionality for preprocessing.
You can activate this at compile time by adding -DINSECURE to the compiler options.
Make sure to run make clean as well.
Scripts/setup-online.sh: line 33: 10355 Aborted                 (core dumped) $SPDZROOT/Fake-Offline.x ${players} -lgp ${bits} -lg2 ${g} --default ${default}
dd: failed to open 'Player-Data/Private-Input-0': No such file or directory
dd: failed to open 'Player-Data/Private-Input-1': No such file or directory

(2) Execute make command;

(3) Run Scripts/setup-online.sh;

(4) SPDZ-2 requires python2, but the default python is python3 on Arch Linux. So you need to install python2 manually:

# pacman -S python2

Then change the first line (the shebang) of compile.py to:

#!/usr/bin/env python2

Otherwise, you will encounter the following error when running ./compile.py tutorial:

# ./compile.py tutorial
Traceback (most recent call last):
  File "./compile.py", line 19, in <module>
    import Compiler
  File "/root/SPDZ-2/Compiler/__init__.py", line 3, in <module>
    import compilerLib, program, instructions, types, library, floatingpoint
ModuleNotFoundError: No module named 'compilerLib'

Execute ./compile.py tutorial.

Now you can run the example:

# ./Server.x 2 5000 &
# Scripts/run-online.sh tutorial

A brief introduction to the OpenBSD nohup command

When you run a command in the foreground of a terminal and the connection drops unexpectedly, the running process is terminated by the SIGHUP signal. The nohup command lets the process keep running in that situation.

OpenBSD’s nohup implementation is neat. It actually only does 4 things:

(1) If stdout is a terminal, redirect it to the nohup.out file (created in the current directory or, failing that, in the directory given by the HOME environment variable):

......
if (isatty(STDOUT_FILENO))
    dofile();
......

In the dofile function:

......
if (dup2(fd, STDOUT_FILENO) == -1)
    err(EXIT_MISC, NULL);
......

(2) If stderr is a terminal, redirect it to stdout. In this case, stderr and stdout point to the same file:

if (isatty(STDERR_FILENO) && dup2(STDOUT_FILENO, STDERR_FILENO) == -1) {
    ......
}

(3) Ignore SIGHUP signal:

......
(void)signal(SIGHUP, SIG_IGN);
......

(4) Execute the intended command:

execvp(argv[1], &argv[1]);

That’s all!