I just started working on my C++ kernel, and the first problem I ran into was the initialization of global (non-POD) variables. Here is what I found and how I solved it.
What’s wrong?
The simple case
Let’s start with a simple class:
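A minimal sketch of such a class, assuming nothing beyond a constructor that prints "hello". The ready() accessor and its backing member are illustrative additions that make it possible to observe whether the constructor ran:

```cpp
#include <cstdio>

class Test {
public:
    // Runs on every instantiation: prints "hello" and marks the object usable.
    Test() : ready_(true) { std::puts("hello"); }

    // Illustrative addition: lets callers check that the constructor ran.
    bool ready() const { return ready_; }

private:
    bool ready_;
};
```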
Every time the Test class is instantiated, the constructor must be called to
do the required initialization tasks (here, it just prints “hello” to standard
output).
In a simple case like this:
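A minimal sketch of the local-variable case, assuming a Test class whose constructor prints "hello". The ready() accessor and the bool return value are illustrative additions:

```cpp
#include <cstdio>

class Test {
public:
    Test() : ready_(true) { std::puts("hello"); }
    bool ready() const { return ready_; }  // illustrative addition
private:
    bool ready_;
};

bool some_function() {
    Test test_obj;            // stack storage is reserved, then Test::Test() runs
    return test_obj.ready();  // the object is fully initialized at this point
}   // scope ends: destructors (if any) run and the stack frame is released
```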
the machinery going on behind the scenes is pretty easy to understand. When
some_function is called, memory is allocated on the stack to store test_obj
and the constructor of the Test class is called to initialize it. When the
constructor returns (remember, a constructor is essentially a regular method),
execution continues until the end of the function, where destructors are called
if necessary, the stack frame is released, and the function returns.
The actual problem
Now, if we have something like this:
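A minimal sketch of the global case, assuming the same kind of Test class (ready() is an illustrative addition). On a hosted system some_function returns true, because the C++ runtime has already run the constructor. In a kernel that never runs it, and assuming the object's storage starts out zeroed, ready() would still report false:

```cpp
#include <cstdio>

class Test {
public:
    Test() : ready_(true) { std::puts("hello"); }
    bool ready() const { return ready_; }  // illustrative addition
private:
    bool ready_;
};

Test test_obj;  // global, non-POD: its constructor must run before first use

bool some_function() {
    return test_obj.ready();  // only meaningful once the constructor has run
}
```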
what happens? How and when is test_obj initialized and brought to a usable
state for other functions to use it?
Again, it’s pretty simple:
- the compiler sees that you have a global object that requires initialization;
- it generates a small function that is responsible for calling the constructor of that object;
- it then adds a pointer to that function in the .ctors section of the output object file;
- at link time, all the .ctors sections of all object files are merged into a section called .init_array;
- finally, at run time, before your main function gets called, the C++ runtime calls every function pointer in the .init_array section, subsequently initializing every object that requires it.
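The steps above can be pictured in plain C++. This is only a hand-written model of what the compiler emits, not its real output: the names init_test_obj, test_obj_storage, test_obj_ptr and ctor_table are hypothetical, and the table stands in for the pointers collected from the .ctors sections.

```cpp
#include <cstdio>
#include <new>

class Test {
public:
    Test() : ready_(true) { std::puts("hello"); }
    bool ready() const { return ready_; }  // illustrative addition
private:
    bool ready_;
};

// Raw static storage standing in for the global object itself.
alignas(Test) static unsigned char test_obj_storage[sizeof(Test)];

// Stays null until the initializer below has been called.
static Test* test_obj_ptr = nullptr;

// Hypothetical stand-in for the small function the compiler generates:
// it constructs the object in place inside its static storage.
static void init_test_obj() {
    test_obj_ptr = new (test_obj_storage) Test();
}

using ctor_fn = void (*)();

// Stand-in for the merged array of constructor pointers.
static ctor_fn ctor_table[] = { init_test_obj };
```

Until something calls the entries of ctor_table, test_obj_ptr stays null and the object is never constructed, which is exactly the situation a kernel starts in.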
Now, in kernel land, we do not have runtime support. So these functions cannot be called “automagically”. We have to do it ourselves.
How can I fix it?
We want to collect all the function pointers present in object files’ .ctors
sections, and call them one by one.
I use a custom linker script for my kernel, which is useful to control alignment of sections, drop some of them and keep the others. First of all, I added this to my linker script:
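A minimal sketch of such a fragment in GNU ld syntax, meant to sit inside the script's existing SECTIONS block. The KEEP wrapper is an assumption added here so that the entries survive section garbage collection; the symbol names are the ones used in this article.

```
/* Inside the SECTIONS { ... } block of the kernel's linker script. */
.ctors :
{
    _ctors_start = .;   /* address of the first constructor pointer */
    KEEP(*(.ctors))     /* gather the .ctors section of every input file */
    _ctors_end = .;     /* address just past the last constructor pointer */
}
```

Some toolchains emit constructor pointers into .init_array rather than .ctors; in that case the input pattern would have to match that section name instead.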
This tells the linker to find every .ctors section in the input files, and
merge them into a single .ctors section in the output. Moreover, it generates two
symbols, _ctors_start and _ctors_end, marking the beginning and the end of the
array. My runtime code will use these to know the boundaries of the array.
Next, in the kxx_entry function (the kernel’s first function, the one called
by GRUB), I call every function pointer present in the .ctors section. This is
done with a simple loop in assembly:
And that’s it! Everything runs as expected and I have a working static
initialization of my variables before calling the kxx_main function (which is
the kernel’s main function).
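For readers who prefer C++ to assembly, the same loop can be sketched as a small function. This is an alternative formulation, not the assembly used in the kernel; run_ctors and ctor_fn are hypothetical names, and the declarations shown in the comment assume the _ctors_start and _ctors_end symbols from the linker script.

```cpp
// Type of one entry in the constructor array: a function taking no
// arguments and returning nothing.
using ctor_fn = void (*)();

// Calls every constructor pointer in the half-open range [begin, end).
inline void run_ctors(const ctor_fn* begin, const ctor_fn* end) {
    for (const ctor_fn* fn = begin; fn != end; ++fn) {
        (*fn)();
    }
}

// In the kernel, the bounds would come from the linker script, e.g.:
//
//   extern "C" ctor_fn _ctors_start[];
//   extern "C" ctor_fn _ctors_end[];
//
//   run_ctors(_ctors_start, _ctors_end);  // before calling kxx_main
```

Taking the bounds as parameters keeps the loop independent of the linker symbols, so it can be exercised on a hosted system with an ordinary array.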
Conclusion
Here I gave a simple solution to the static variable initialization problem. I
left out the trouble caused by objects that have a destructor: if a
destructor is present, linking fails, with the linker
complaining about missing __cxa_atexit and __dso_handle symbols. I may
explain these in another article. But for now, as long as I do not define an
explicit destructor for my globals, I am OK and I can continue with my project.