Postmortem of NTL::Vec type

I am playing with NTL, and come across a core dump issue which is related with NTL::ZZX variable:

(gdb) p msg
$2 = (NTL::ZZX &) @0x7fff680601f0: {rep = {_vec(long double,...)( *) = {rep = 0xab629d0}}}

The NTL::ZZX actually contains one member, rep:

class ZZX {

public:

vec_ZZ rep;

......

ZZ& operator[](long i) { return rep[i]; }
const ZZ& operator[](long i) const { return rep[i]; }

......
}

The vec_ZZ is a vector (not std::vector, NTL::Vec instead) in fact:

typedef Vec<ZZ> vec_ZZ;

The error occurs when getting the 8191-st element. Unfortunately, I can’t use gdb to access the element in vector directly:

(gdb) p i
$3 = 8191
(gdb) p msg[i]
You can't do that without a process to debug.

After referring this doc, it gives me the idea that seems be the gdb‘s limitation of accessing container. So I try to access the member straightaway.NTL::Vec is just a template class containing one public member:

template<class T>
class Vec {  
public:  
    ......
    WrappedPtr<T, _vec_deleter> _vec__rep;
    ......
};

While WrappedPtr is nothing but another template class:

template<class T, class Deleter>
class WrappedPtr {
   ......
public:
   typedef T * raw_ptr;

   raw_ptr rep;
   ......
}

We can see the rep member in WrappedPtr points to the start address of the content in vector. Read the 8191-st element’s value:

(gdb) p sizeof(*msg.rep._vec__rep.rep)
$23 = 8
(gdb) x/16xb msg.rep._vec__rep.rep+8191
0xab729c8:      0x77    0x01    0x00    0x00    0x00    0x00    0x00    0x00
0xab729d0:      0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

The valid value should be a memory address, 0x177 is definitely not. So the next thing is to find out why this isn’t correct …

Use network analyzer to learn SSH session establishment

The establishment of SSH session consists of 2 parts: build up the encryption channel and authenticate user. To understand the whole flow better, I usetcpdump/Wireshark to capture and analyze the packets. Server is OpenBSD 6.1 and client is ArchLinux. The tcpdump command is like this:

sudo tcpdump -A -s 0 'net 192.168.38.176' -i enp7s0f0 -w capture.pcap

(1) Connect server first time:

1

The captured packets:

C1

We can see the client/server negotiated SSH version firstĀ (In fact, client and server sentĀ SSH version simultaneously, so please don’t misunderstand client sent first, then server responded. Use “nc 192.168.38.176 22” command to check.)

, then exchanged public key to generate secret key. The server issued “New Keys” message, and waited for client to answer.

(2) Accept server’s public key but not input password:

2

The captured packets:

C2

The first packet should be client acknowledged server’s “New Keys” message, then there are some interactions. Now the encryption channel is set up.

(3) Enter password and authenticate user:

3

The captured packets:

C3

These packets are all encrypted data. If user’s password is correct, the whole SSH session will be ready, and you can administrator server now.

Reference:
Understanding the SSH Encryption and Connection Process.

Be careful of clear/release methods in gRPC

My project uses gRPC and generates code like this:

......
inline void AddDBResponse::clear_last_record() {
  if (has_last_record()) {
    delete msg_.last_record_;
    clear_has_msg();
  }
}
......
inline ::privdb::DB* AddDBResponse::release_last_record() {
  // @@protoc_insertion_point(field_release:privdb.AddDBResponse.last_record)
  if (has_last_record()) {
    clear_has_msg();
    ::privdb::DB* temp = msg_.last_record_;
    msg_.last_record_ = NULL;
    return temp;
  } else {
    return NULL;
  }
}
inline void AddDBResponse::set_allocated_last_record(::privdb::DB* last_record) {
  clear_msg();
  if (last_record) {
    set_has_last_record();
    msg_.last_record_ = last_record;
  }
  // @@protoc_insertion_point(field_set_allocated:privdb.AddDBResponse.last_record)
}
......

In clear_last_record() method, it will assume the msg_.last_record_ is allocated in heap, so free it; while release_last_record() not. So if you callset_allocated_last_record() function and last_record points the data which resides on stack literally, please use release_last_record(). Otherwise, the scary memory corruption will occur!:-)

Message length setting in gRPC

The default send/receive message length of gRPC is defined here:

/** Default send/receive message size limits in bytes. -1 for unlimited. */
/** TODO(roth) Make this match the default receive limit after next release */
#define GRPC_DEFAULT_MAX_SEND_MESSAGE_LENGTH -1
#define GRPC_DEFAULT_MAX_RECV_MESSAGE_LENGTH (4 * 1024 * 1024)

We can know that send message length is no limited (-1), while Server/Client can only receive 4 Mi bytes by default.

You can change receive message length to unlimited in Client:

grpc::ChannelArguments ch_args;
ch_args.SetMaxReceiveMessageSize(-1);
std::shared_ptr<grpc::Channel> ch = 
        grpc::CreateCustomChannel("localhost:50051", grpc::InsecureChannelCredentials(), ch_args);

But this doesn’t work in Server:

ServerBuilder builder;
builder.SetMaxReceiveMessageSize(-1);

Because in server_builder.cc, the parameter only takes effect when it is positive:

std::unique_ptr<Server> ServerBuilder::BuildAndStart() {
    ......
    if (max_receive_message_size_ >= 0) {
      args.SetInt(GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH, max_receive_message_size_);
    }
    ......
}

So we can use builder.SetMaxReceiveMessageSize(INT_MAX); as a work-around.

BTW, check message length limit is in get_message_size_limits function.

Use clang to build OpenBSD on amd64/i386

I install the newest OpenBSD 6.1, and try to build -curr kernel. But unfortunately the make reports following errors:

# make
cat /usr/src/sys/arch/amd64/amd64/genassym.cf /usr/src/sys/arch/amd64/amd64/genassym.cf |  sh /usr/src/sys/kern/genassym.sh cc -no-integrated-as -g -Werror -Wall -Wimplicit-function-declaration  -Wno-uninitialized -Wno-pointer-sign  -Wno-address-of-packed-member -Wno-constant-conversion  -Wframe-larger-than=2047 -mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow  -mno-mmx -msoft-float -fno-omit-frame-pointer -ffreestanding -fno-pie -O2 -pipe -nostdinc -I/usr/src/sys -I/usr/src/sys/arch/amd64/compile/GENERIC.MP/obj -I/usr/src/sys/arch -DDDB -DDIAGNOSTIC -DKTRACE -DACCOUNTING -DKMEMSTATS -DPTRACE -DPOOL_DEBUG -DCRYPTO -DSYSVMSG -DSYSVSEM -DSYSVSHM -DUVM_SWAP_ENCRYPT -DFFS -DFFS2 -DFFS_SOFTUPDATES -DUFS_DIRHASH -DQUOTA -DEXT2FS -DMFS -DNFSCLIENT -DNFSSERVER -DCD9660 -DUDF -DMSDOSFS -DFIFO -DFUSE -DSOCKET_SPLICE -DTCP_SACK -DTCP_ECN -DTCP_SIGNATURE -DINET6 -DIPSEC -DPPP_BSDCOMP -DPPP_DEFLATE -DPIPEX -DMROUTING -DMPLS -DBOOT_CONFIG -DUSER_PCICONF -DAPERTURE -DMTRR -DNTFS -DHIBERNATE -DPCIVERBOSE -DUSBVERBOSE -DWSDISPLAY_COMPAT_USL -DWSDISPLAY_COMPAT_RAWKBD -DWSDISPLAY_DEFAULTSCREENS="6" -DX86EMU -DONEWIREVERBOSE -DMULTIPROCESSOR -DMAXUSERS=80 -D_KERNEL -MD -MP -MF assym.P > assym.h.tmp
cc: unrecognized option '-no-integrated-as'
cc1: error: unrecognized command line option "-Wno-address-of-packed-member"
cc1: error: unrecognized command line option "-Wno-constant-conversion"
*** Error 1 in /usr/src/sys/arch/amd64/compile/GENERIC.MP (Makefile:938 'assym.h')

From this mail, I learn that clang has been the default compiler on amd64/i386 platforms for OpenBSD, so I switch to use clang to build kernel:

# CC=clang make
.....

Now it can compile!