September, 2017 | Nan Xiao's Blog

Be careful of using grpc::ClientStreamingInterface::Finish() function

No matter server-side streaming RPC or bidirectional streaming RPC, if there is still message for Client to read, but Client callsgrpc::ClientStreamingInterface::Finish(), it will block here forever. The call stack is like this:

#0  0x00007ffff5b24076 in epoll_pwait () from /usr/lib/libc.so.6
#1  0x00007ffff71c6ae5 in ?? () from /usr/lib/libgrpc.so.4
#2  0x00007ffff71e3a26 in ?? () from /usr/lib/libgrpc.so.4
#3  0x000055555558300f in grpc::CompletionQueue::Pluck (this=0x555555fd5340, tag=0x7fffffffd800)
    at /usr/include/grpc++/impl/codegen/completion_queue.h:224
#4  0x0000555555585e46 in grpc::ClientReaderWriter<..., ...>::Finish (
    this=0x555555fd5320) at /usr/include/grpc++/impl/codegen/sync_stream.h:494

So please pay attention to it. Before calling grpc::ClientStreamingInterface::Finish(), make sure the grpc::ReaderInterface::Read() return false:

while (stream->Read(&server_note)) {
  ...
}
Status status = stream->Finish();

First taste of gRPC

In the past month, I made my hand dirty on gRPC, and tried to use it in my current project. The effect seems good until now, I want to share some feeling here:

(1) gRPC lets you concentrate on message definition, and it will generate all necessary code for you. This is really very neat! The APIs are also simple and handy: they help process all kinds of exceptional cases which will save users a lot of energy.

(2) Now gRPC‘s maximum message length is INT_MAX (please refer this post) , so on general 64-bit platform, its value is 2^31 - 1, nearly 2GiB. Personally, I think maybe LONG_MAX, 2^63 - 1, is better, since it is nearly equal to say there is no limitation for message length. BTW, gRPC useProtocol Buffers under the hood. From Protocol Buffers document:

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

I haven’t dived into Protocol Buffers code, so not sure what is the negative effect of transmitting large message. But anyway, whether support and whether support well are 2 different things.

The basics of Client/Server socket programming

While Client/Server communication model is ubiquitous nowadays, most of them involve socket programming knowledge. In this post, I will introduce some rudimentary aspects of it:

(1) Short/Long-lived TCP connection.
Short-lived TCP connection refers to following pattern: Client creates a connection to server; send message, then close the connection. If Client wants to transmit information again, repeat the above steps. Because establishing and destroying TCP sessions have overhead, if Client needs to have transactions with Server frequently, long-lived connection may be a better choice: connect Server; deliver message, deliver message, …, disconnect Server. A caveat about long-lived connection is Client may need to send heartbeat message to Server to keep the TCP session active.

(2) Synchronous/Asynchronous communication.
After Client sends the request, it can block here to wait for Server’s response, this is called synchronous mode. Certainly the Client should set a timer in case the response never come. The Client can also choose not to block, and continue to do other things. On the contrary, this is called asynchronous mode. If the Client can send multiple requests before receiving responses, it needs to add ID for every message, so it can distinguish the corresponding response for every request.

(3) Error handling.
A big pain point of socket programming is you need to consider so many exceptional cases. For example, the Server suddenly crashes; the network cable is plugged out, or the response message is half-received, etc. So your code should process as many abnormalities as possible. It is no exaggeration to say that error-handling code quality is the cornerstone of program’s robustness.

(4) Portability.
Different *NIX flavors may have small divergences on socket programming, so the program works well on Linux may not guarantee it also run as you expect on FreeBSD. BTW, I summarized a post about tips of Solaris/illumos socket programming before, and you can read it if you happen to work on these platforms.

(5) Leverage sniffer tools.
Tcpdump/Wireshark/snoop are amazing tools for debugging network programming, and they can tell you what really happens under the hood. Try to be sophisticated at these tools, and they will save you at one day, trust me!

The header file and library search path of cc on OpenBSD

On my OpenBSD 6.1/amd64 system, the default header file search path of cc compiler is only /usr/include:

# echo | cc -E -Wp,-v -
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include
End of search list.
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"

The default library search path is only /usr/lib:

# cc -Xlinker --verbose  2>/dev/null | grep SEARCH | sed 's/SEARCH_DIR("=\?\([^"]\+\)"); */\1\n/g'  | grep -vE '^$'
SEARCH_DIR("/usr/lib");

So if you want to include other header file or link libraries in other directory, you need to specify it explicitly. For example:

# cc -I/usr/local/include -L/usr/local/lib ...

Reference:
Finding out what the GCC include path is;
How to print the ld(linker) search path.

lscpu for OpenBSD/FreeBSD

There is a neat command, lscpu, which is very handy to display CPU information on GNU/Linux OS:

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           2
......

But unfortunately, the BSD OSs lack this command, maybe one reason is lscpu relies heavily on /proc file system which BSD don’t provide, :-). TakeOpenBSD as an example, if I want to know CPU information, dmesg should be one choice:

# dmesg | grep -i cpu
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz, 2527.35 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: apic clock running at 266MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
......

But the output makes me feeling messy, not very clear. As for dmidecode, it used to be another option, but now can’t work out-of-box because it will access /dev/mem which for security reason, OpenBSD doesn’t allow by default (You can refer this discussion):

# ./dmidecode
# dmidecode 3.1
Scanning /dev/mem for entry point.
/dev/mem: Operation not permitted

Based on above situation, I want a specified command for showing CPU information for my BSD box. So in the past 2 weeks, I developed a lscpu program for OpenBSD/FreeBSD, or more accurately, OpenBSD/FreeBSD on x86 architecture since I only have some Intel processors at hand. The application getsCPU metrics from 2 sources:

(1) sysctl functions.
The BSD OSs provide sysctl interface which I can use to get general CPU particulars, such as how many CPUs the system contains, the byte-order of CPU, etc.

(2) CPUID instruction. For x86 architecture, CPUID instruction can obtain very detail information of CPU. This coding work is a little tedious and error-prone, not only because I need to reference both Intel and AMD specifications since these 2 vendors have minor distinctions, but also I need to parse the bits of register values.

The code is here, and if you run OpenBSD/FreeBSD on x86 processors, please try it. It will be better you can give some feedback or report the issues, and I appreciate it very much. In the future if I have other CPUs resource, such as ARM or SPARC64, maybe I will enrich this small program.