int $0x80

Your daily work interruption

Static Initialization of Globals in C++

I just started working on my C++ kernel, and the first problem I faced is the initialization of global (non-POD) variables. Here is what I found and how I solved the troubles I faced.

What’s wrong?

The simple case

Let’s start with a simple class:

#include <iostream>

struct Test
{
  Test()
  {
    std::cout << "hello" << std::endl;
  }
};

Every time the Test class is instantiated, the constructor must be called to perform the required initialization (here, it just prints “hello” to standard output).

In a simple case like this:

void some_function()
{
  Test test_obj;

  /* do something intelligent here */
}

the machinery at work behind the scenes is pretty easy to understand. When some_function is called, memory is reserved on the stack to store test_obj and the constructor of the Test class is called to initialize it. When the constructor returns (remember, a constructor is essentially a regular member function), execution continues until the end of the function, where destructors are called if necessary, the stack frame is released and we return.

The actual problem

Now, if we have something like this:

Test test_obj;

void some_function()
{
  /* do something intelligent with test_obj */
}

what happens? How and when is test_obj initialized and brought to a usable state for other functions to use it?

Again, it’s pretty simple:

  • the compiler sees that you have a global object that requires initialization;
  • it generates a small function that is responsible for calling the constructor of that object;
  • it then adds a pointer to that function in the .ctors section of the output object file;
  • at link time, all the .ctors sections of all object files are merged into a section called .init_array;
  • finally, at run time, before your main function gets called, the C++ runtime calls every function pointer in the .init_array section, thereby initializing every object that requires it.
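To make the first three steps more concrete, here is a rough hosted-C++ sketch of what the compiler conceptually produces for a global like test_obj. It is only an illustration: the names test_obj_storage, test_obj_ptr, global_init_test_obj, fake_ctors and run_fake_startup are made up, a plain array stands in for the .ctors section, and the Test constructor sets a value instead of printing so the effect is easy to observe.

```cpp
#include <new>

struct Test {
  int value;
  Test() : value(42) {}  // stands in for the real initialization work
};

// What `Test test_obj;` at namespace scope roughly boils down to:
// raw, zero-initialized storage reserved at link time...
alignas(Test) unsigned char test_obj_storage[sizeof(Test)];
Test* test_obj_ptr = nullptr;  // hypothetical: set once the constructor has run

// ...plus a small generated function that runs the constructor in place.
void global_init_test_obj() {
  test_obj_ptr = new (test_obj_storage) Test();
}

// The compiler records a pointer to that function. Here an ordinary
// array plays the role of the .ctors section (an assumption for the demo).
using ctor_fn = void (*)();
const ctor_fn fake_ctors[] = { global_init_test_obj };

// The runtime's job before main(): call every recorded pointer.
void run_fake_startup() {
  for (ctor_fn fn : fake_ctors) {
    fn();
  }
}
```

Until run_fake_startup() has been called, the storage exists but no constructor has run, which is exactly the state a kernel is in when the boot loader hands over control.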

Now, in kernel land, we do not have runtime support. So these functions cannot be called “automagically”. We have to do it ourselves.

How can I fix it?

We want to collect all the function pointers present in object files’ .ctors sections, and call them one by one.

I use a custom linker script for my kernel, which is useful to control alignment of sections, drop some of them and keep the others. First of all, I added this to my linker script:

.ctors ALIGN(0x1000) : {
  PROVIDE(_ctors_start = .);
  *(.ctors)
  PROVIDE(_ctors_end = .);
}

This tells the linker to collect every .ctors section from the input files and merge them into a single .ctors section in the output. It also defines two symbols, _ctors_start and _ctors_end, marking the beginning and the end of the resulting array of function pointers. My runtime code uses these to know the boundaries of the array.

Next, in the kxx_entry function (the kernel’s first function, the one called by GRUB), I call every function pointer present in the .ctors section. This is done with a simple loop in assembly:

kxx_entry:
  # initialize stack and push GRUB's arguments
  mov $_stack_end, %esp
  xor %ebp, %ebp
  push %ebx
  push %eax

  # call ctors list
  mov $_ctors_start, %esi
  1:
    cmp $_ctors_end, %esi
    je 1f
    mov (%esi), %edi
    call *%edi
    add $4, %esi
    jmp 1b
  1:

  # jump into the kernel's main function
  call kxx_main

And that’s it! Everything runs as expected and I have a working static initialization of my variables before calling the kxx_main function (which is the kernel’s main function).
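For readers more comfortable with C++ than with AT&T assembly, the same walk over the array could be sketched as below. This is not the code used in the kernel, just an equivalent loop. The function call_ctors and the demo_* names are hypothetical, and the extern declarations in the comment are an assumption about how the linker-script symbols could be declared.

```cpp
#include <cstddef>

using ctor_fn = void (*)();

// Calls every function pointer in [begin, end), in order, and returns
// how many were called. Mirrors the assembly loop: compare, call, advance.
std::size_t call_ctors(const ctor_fn* begin, const ctor_fn* end) {
  std::size_t called = 0;
  for (const ctor_fn* p = begin; p != end; ++p) {
    (*p)();
    ++called;
  }
  return called;
}

// In the kernel, the bounds would come from the linker script, e.g.
// (assumed declarations, not taken from the original code):
//   extern "C" const ctor_fn _ctors_start[];
//   extern "C" const ctor_fn _ctors_end[];
//   call_ctors(_ctors_start, _ctors_end);

// Hosted stand-ins so the loop can be exercised without a linker script.
int demo_counter = 0;
void demo_ctor_a() { demo_counter += 1; }
void demo_ctor_b() { demo_counter += 10; }
const ctor_fn demo_table[] = { demo_ctor_a, demo_ctor_b };
```

Calling call_ctors(demo_table, demo_table + 2) runs both stand-in constructors. One caveat if the loop is written in C++ inside the kernel: it must not itself rely on any global object, since none of them has been constructed yet.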

Conclusion

Here I gave a simple solution to the static variable initialization problem. I ignored the trouble caused by objects with destructors: if a destructor is present, linking fails, with the linker complaining about missing __cxa_atexit and __dso_handle symbols. I may explain these in another article. But for now, as long as I do not define an explicit destructor for my globals, I am OK and I can continue with my project.
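As a hint of what those missing symbols are for: under the Itanium C++ ABI, the compiler registers each global's destructor through a call of the form __cxa_atexit(destructor, object, dso_handle). The sketch below shows one way such a registry could look. It is an assumption-laden illustration, not what k++ does: the names kxx_cxa_atexit, kxx_run_atexit, AtexitEntry and the table size are all made up, and a real kernel would have to export the function under the name __cxa_atexit with C linkage and define a __dso_handle object.

```cpp
#include <cstddef>

// One registered destructor call: function plus the object it applies to.
struct AtexitEntry {
  void (*destructor)(void*);
  void* object;
};

constexpr std::size_t kMaxAtexit = 32;  // arbitrary capacity for the sketch
AtexitEntry atexit_table[kMaxAtexit];
std::size_t atexit_count = 0;

// Same parameter list as __cxa_atexit; returns 0 on success, non-zero on
// failure. The third argument (the DSO handle) is ignored here.
int kxx_cxa_atexit(void (*destructor)(void*), void* object, void* /*dso*/) {
  if (atexit_count == kMaxAtexit) {
    return -1;
  }
  atexit_table[atexit_count++] = AtexitEntry{destructor, object};
  return 0;
}

// Runs the registered destructors in reverse order of registration.
void kxx_run_atexit() {
  while (atexit_count > 0) {
    const AtexitEntry entry = atexit_table[--atexit_count];
    entry.destructor(entry.object);
  }
}
```

Since a kernel usually never "exits", a stub that records nothing and simply returns 0 may be enough in practice; the registry above only matters if the destructors are ever meant to run.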

A Kernel in C++

I am interested in kernel development, and I also enjoy playing with the intricacies of the C++ language, be it syntax details or compiler and runtime implementation. Today, I am starting a new project: a toy kernel in C++.

My goal is to use the C++ language to its full extent in a kernel environment. So I don’t want only classes and inheritance: I also want exceptions, RTTI, decent standard library support, etc.

At the end of the day, I expect to have a better understanding of the runtime support required to make C++ programs run; and to see if there is a true benefit in kernel development with “high level” languages.

For now, I don’t have a clear list of the features I want in my kernel, so I will start with the assignments from the k course (hence the name for my kernel: k++) and I’ll see where it goes.

I will use the GCC toolchain (compiler, linker, etc…) and QEMU for testing. The code is available here.

RFC 6106: DNS Configuration for IPv6 SLAAC

Bringing IPv6 internet access to your home network when your ISP only routes your IPv4 packets is fairly simple today, thanks to the various tunnel brokers like Hurricane Electric or SixXS. Once you have set up your tunnel, the next step is to configure your hosts to get an IPv6 address and a recursive DNS server, and you’re done.

Manual configuration can be a viable option for some specific use cases, but most networks will use either DHCPv6, stateless address autoconfiguration (SLAAC), or both.

  • DHCPv6 is basically an adaptation of DHCP for IPv6. For instance, it does not use broadcast but instead takes advantage of the ff02::1:2 multicast address to find available DHCP servers on the network.
  • Another configuration method for IPv6 hosts is the Router Advertisement (RA) / Router Discovery (RD) couple, commonly referred to as SLAAC. Both DHCPv6 and SLAAC are able to give a full 128-bit IPv6 address to each host on the network, but only DHCPv6 provides information on available recursive DNS servers, so it is impossible to rely only on RA to configure your network today.

RFC 5006, published in September 2007, was a first attempt to overcome this issue. The proposed idea was to add an option to RA packets containing a list of recursive DNS servers, so that hosts, after getting their IPv6 address configured, could update their local DNS server database (/etc/resolv.conf on most Unix systems) accordingly. RFC 6106, published in November 2010, extends the original RFC 5006 and thus obsoletes it.

Options introduced by RFC 6106

RDNSS: Recursive DNS Server option

This option is basically the same as the one presented in RFC 5006. As said before, it provides each host with a list of recursive DNS servers and their associated lifetime.

The format of the option is as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Length    |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Lifetime                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
:            Addresses of IPv6 Recursive DNS Servers            :
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
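To show how the diagram maps to bytes, here is a small parsing sketch. The struct and function names (RdnssOption, parse_rdnss, read_be32) are made up for the example. The numeric details come from RFC 6106: the option type is 25, the Length field counts units of 8 octets (so it is 3 for one address and grows by 2 for each additional one), and Lifetime is a 32-bit value in seconds in network byte order.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Ipv6Address = std::array<std::uint8_t, 16>;

struct RdnssOption {
  std::uint32_t lifetime_seconds = 0;
  std::vector<Ipv6Address> servers;
};

// Reads a 32-bit big-endian (network byte order) integer.
inline std::uint32_t read_be32(const std::uint8_t* p) {
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Parses one RDNSS option starting at `data`. Returns false if the buffer
// is too short, the type is wrong, or the Length field is inconsistent.
bool parse_rdnss(const std::uint8_t* data, std::size_t size, RdnssOption& out) {
  constexpr std::uint8_t kTypeRdnss = 25;
  constexpr std::size_t kHeaderSize = 8;  // Type, Length, Reserved, Lifetime
  if (size < kHeaderSize || data[0] != kTypeRdnss) return false;

  // Length is in units of 8 octets: 1 unit of header + 2 units per address,
  // so a valid value is odd and at least 3.
  const std::size_t units = data[1];
  if (units < 3 || units % 2 == 0) return false;
  if (units * 8 > size) return false;

  out.lifetime_seconds = read_be32(data + 4);
  out.servers.clear();
  const std::size_t count = (units - 1) / 2;
  for (std::size_t i = 0; i < count; ++i) {
    const std::uint8_t* src = data + kHeaderSize + 16 * i;
    Ipv6Address address;
    std::copy(src, src + 16, address.begin());
    out.servers.push_back(address);
  }
  return true;
}
```

A real client would additionally have to honor the lifetime (a value of 0 means the servers should no longer be used) and check that the option arrived in a valid RA packet; both are left out here.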

DNSSL: DNS Search List option

This RFC also adds an option that was not present in the original RFC 5006: the ability to give hosts a search list. A search list lets you specify only the host name of a machine and skip the domain name. For example, if I am on EPITA’s network and I want to ssh to a machine called maya, a correctly configured search list allows me to type ssh maya instead of ssh maya.epita.fr. You can configure this in your /etc/resolv.conf with the domain and search keywords.

The format of this option is similar to that of the RDNSS option:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Length    |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Lifetime                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
:                Domain Names of DNS Search List                :
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
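The only real difference with RDNSS is the payload. In RFC 6106 the DNSSL option has type 31, and each domain name is encoded as a sequence of length-prefixed labels terminated by a zero byte (the classic DNS wire format, without compression), with zero padding up to a multiple of 8 octets. A decoding sketch, with the hypothetical name parse_dnssl, could look like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Parses one DNSSL option starting at `data` and fills in the lifetime
// (in seconds) and the decoded domain names. Returns false on malformed input.
bool parse_dnssl(const std::uint8_t* data, std::size_t size,
                 std::uint32_t& lifetime_seconds,
                 std::vector<std::string>& domains) {
  constexpr std::uint8_t kTypeDnssl = 31;
  if (size < 8 || data[0] != kTypeDnssl) return false;

  // Length is in units of 8 octets: the header plus at least one unit of names.
  const std::size_t total = static_cast<std::size_t>(data[1]) * 8;
  if (data[1] < 2 || total > size) return false;

  // Lifetime is a 32-bit integer in network byte order.
  lifetime_seconds = (static_cast<std::uint32_t>(data[4]) << 24) |
                     (static_cast<std::uint32_t>(data[5]) << 16) |
                     (static_cast<std::uint32_t>(data[6]) << 8) |
                     static_cast<std::uint32_t>(data[7]);

  domains.clear();
  std::string current;
  std::size_t pos = 8;
  while (pos < total) {
    const std::uint8_t label_length = data[pos++];
    if (label_length == 0) {
      // Either the end of a name or a padding byte.
      if (!current.empty()) {
        domains.push_back(current);
        current.clear();
      }
      continue;
    }
    if (label_length > 63 || pos + label_length > total) return false;
    if (!current.empty()) current += '.';
    current.append(reinterpret_cast<const char*>(data + pos), label_length);
    pos += label_length;
  }
  // A name that is not terminated by a zero byte is malformed.
  return current.empty();
}
```

For instance, the search domain epita.fr is sent as the bytes 5 "epita" 2 "fr" 0, followed by enough zero bytes to pad the option to 24 octets.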

Problems

This RFC comes with its set of problems, mostly related to implementation in general purpose operating systems. If we take the case of DHCP/DHCPv6, everything is done by a userland daemon that communicates with the server, gets the packets, parses them, and updates addresses of network interfaces and /etc/resolv.conf accordingly. SLAAC does not work exactly the same way. Data received in RA packets is processed in the kernel and addresses are updated from there. This works pretty well with the original specifications of router advertisement, but can cause some problems with options that need to modify files belonging to the userland. As pointed out by section 6 of this RFC, a communication channel must be established between the userland and the kernel to export information about RA packets and allow further processing.

Another problem that can arise is concurrent access to the local DNS server database, which can leave it in an inconsistent state. If we take the example of a dual-stacked machine getting IPv4 information from DHCP and IPv6 information from RA packets, with both trying to update /etc/resolv.conf, we have a non-negligible chance of ending up with only IPv4 or only IPv6 DNS servers. The proposed solution is to use an external tool that serializes accesses to the file, and to make both daemons use it.

Another thing we might consider a problem is the lack of security measures in these options, but obviously this is inherited from the assumptions IPv6 makes regarding local networks. So I’d say this is “nothing new”.

Implementation status

Every major operating system (be it Linux, Windows or any BSD) has IPv6 support today and is able to configure its interfaces with information gathered from RA packets. On a Linux system for example, one just needs to set the net.ipv6.conf.all.accept_ra sysctl knob to 1 to allow RA configuration.

When it comes to RFC 6106 support, not all systems are equal:

  • Linux has full server-side support with radvd, so it can emit RA packets with both RDNSS and DNSSL options. For client-side support, there was rdnssd (now merged in the ndisc6 suite) which could handle RFC 5006 options, and radns which is said to work for Linux and supports full RFC 6106.

  • FreeBSD will have full RFC 6106 support in the upcoming 9.0 release, with rtsold and rtadvd.

  • OpenBSD has no support for RFC 6106 at all, but I’m currently working on a server-side patch for rtadvd. The code can be found here, in patches/rtadvd-rfc6106.patch.

    Edit: The patch has been submitted to the OpenBSD tree.

  • Windows has no support at all either.