int $0x80 Your daily work interruption

Static Initialization Of Globals In C++

02 Jan 2012

I just started working on my C++ kernel, and the first problem I faced was the initialization of global (non-POD) variables. Here is what I found and how I solved it.

What's wrong?

The simple case

Let's start with a simple class:

#include <iostream>

struct Test
{
  Test()
  {
    std::cout << "hello" << std::endl;
  }
};

Every time the Test class is instantiated, the constructor must be called to perform the required initialization (here, it just prints "hello" to standard output).

In a simple case like this:

void some_function()
{
  Test test_obj;

  /* do something intelligent here */
}

the machinery behind the scenes is pretty easy to understand. When some_function is called, space is reserved on the stack to store test_obj and the constructor of the Test class is called to initialize it. When the constructor returns (remember, a constructor is essentially a regular method), execution continues until the end of the function, where destructors are called if necessary, the stack space is released and we return.
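To make that sequence visible, here is a minimal sketch. It is not exactly the Test class from above: I gave it a destructor and replaced the std::cout output with an event_log string (both are assumptions made for illustration) so that the order of events can be checked.

```cpp
#include <cassert>
#include <string>

// Records the order of events so it can be inspected afterwards.
// (Illustration only; the article's Test prints to std::cout instead.)
static std::string event_log;

struct Test
{
  Test()  { event_log += "ctor;"; }
  ~Test() { event_log += "dtor;"; }  // destructor added for this sketch
};

void some_function()
{
  Test test_obj;          // stack space is reserved, then the constructor runs
  event_log += "body;";   // the function's actual work
}                         // the destructor runs here, then the stack space is released

// Usage check: run the function once and verify the order of events.
void check_order()
{
  event_log.clear();
  some_function();
  assert(event_log == "ctor;body;dtor;");
}
```

Calling check_order() passes silently, which confirms that construction happens on entry and destruction on exit, with no outside help needed.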

The actual problem

Now, if we have something like this:

Test test_obj;

int some_function()
{
  /* do something intelligent with test_obj */
}

what happens? How and when is test_obj initialized and brought to a usable state for other functions to use it?

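Again, it's pretty simple: for every source file that defines such globals, the compiler generates a small function that calls their constructors, and it places a pointer to that function in the object file's .ctors section. In user land, the runtime's startup code calls each of these functions before main.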

Now, in kernel land, we do not have runtime support. So these functions cannot be called "automagically". We have to do it ourselves.
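The sketch below imitates by hand what that runtime support normally does, as far as I understand it. Every name in it (constructed_count, test_obj_storage, init_globals_of_this_file, fake_ctors_section, fake_runtime_startup) is hypothetical, and the real compiler output differs in its details. The point is only to show the shape of the mechanism: raw storage, a hidden init function, and a table of pointers that something has to walk.

```cpp
#include <cassert>
#include <new>   // placement new

// Counts constructor calls so the effect can be observed (illustration only).
static int constructed_count = 0;

struct Test
{
  Test() { ++constructed_count; }
};

// Raw storage for the global: before startup code runs, no Test lives here.
alignas(Test) static unsigned char test_obj_storage[sizeof(Test)];

// Stand-in for the hidden per-file function the compiler generates.
static void init_globals_of_this_file()
{
  new (test_obj_storage) Test();   // run the constructor in place
}

// Stand-in for the .ctors section: a plain table of function pointers.
using ctor_fn = void (*)();
static ctor_fn fake_ctors_section[] = { init_globals_of_this_file };

// Stand-in for what a user-land runtime does before calling main.
void fake_runtime_startup()
{
  for (ctor_fn fn : fake_ctors_section)
    fn();
}

// Usage check: nothing is constructed until the table is walked.
void check_startup()
{
  assert(constructed_count == 0);
  fake_runtime_startup();
  assert(constructed_count == 1);
}
```

If nobody calls fake_runtime_startup, the constructor simply never runs. That is the situation a freestanding kernel starts in.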

How can I fix it?

We want to collect all the function pointers present in object files' .ctors sections, and call them one by one.

I use a custom linker script for my kernel, which is useful to control alignment of sections, drop some of them and keep the others. First of all, I added this to my linker script:

.ctors ALIGN(0x1000) : {
  PROVIDE(_ctors_start = .);
  *(.ctors)
  PROVIDE(_ctors_end = .);
}

This tells the linker to find every .ctors section in the input files and merge them into a single .ctors section in the output. Moreover, it generates two symbols, _ctors_start and _ctors_end, marking the beginning and the end of the resulting array of function pointers. My runtime code will use these to know the boundaries of the array.

Next, in the kxx_entry function (the kernel's first function, the one called by GRUB), I call every function pointer present in the .ctors section. This is done with a simple loop in assembly:

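kxx_entry:
  # initialize stack and push GRUB's arguments
  mov $_stack_end, %esp
  xor %ebp, %ebp
  push %ebx
  push %eax

  # call ctors list (%esi and %edi are callee-saved, so they survive each call)
  mov $_ctors_start, %esi
  1:
    cmp $_ctors_end, %esi   # reached the end of the array?
    je 2f
    mov (%esi), %edi        # load the next function pointer
    call *%edi
    add $4, %esi            # advance to the next 4-byte entry
    jmp 1b
  2:

  # jump into the kernel's main function
  call kxx_main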

And that's it! Everything runs as expected and I have a working static initialization of my variables before calling the kxx_main function (which is the kernel's main function).
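For comparison, the same loop can be sketched in C++. This is not what my kernel uses (the loop above is in assembly). The helper name run_ctors and the fake_* stand-ins are hypothetical, and they exist only so that the loop can be tried on a normal machine without a linker script.

```cpp
#include <cassert>

// Type of each .ctors entry: a function taking no arguments and returning nothing.
using ctor_fn = void (*)();

// Hypothetical helper: call every function pointer in [begin, end).
void run_ctors(ctor_fn* begin, ctor_fn* end)
{
  for (ctor_fn* entry = begin; entry != end; ++entry)
    (*entry)();
}

// In the kernel, assuming the linker script shown earlier, the bounds would be
// declared as
//   extern "C" ctor_fn _ctors_start[], _ctors_end[];
// and the helper called as run_ctors(_ctors_start, _ctors_end) before kxx_main.

// Hosted stand-ins for real constructors, so the loop can be exercised here.
static int calls = 0;
static void fake_ctor_a() { calls += 1; }
static void fake_ctor_b() { calls += 10; }
static ctor_fn fake_ctors[] = { fake_ctor_a, fake_ctor_b };

// Usage check: both entries are called once; an empty range calls nothing.
void check_run_ctors()
{
  run_ctors(fake_ctors, fake_ctors + 2);
  assert(calls == 11);
  run_ctors(fake_ctors, fake_ctors);
  assert(calls == 11);
}
```

A C++ version like this can only run once the stack is set up, and it must not itself depend on any global that still needs construction. That is one reason to keep the loop in the assembly entry point.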

Conclusion

Here I gave a simple solution to the static variable initialization problem. I ignored the trouble that arises when the object has a destructor: in that case the link fails, with the linker complaining about missing __cxa_atexit and __dso_handle symbols. I may explain these in another article. But for now, as long as none of my globals needs a destructor, I am OK and I can continue with my project.