Writing a kernel and testing it (with libraries)

For the last few weeks we have been working on library loading, and now it works (even if we still have some small issues to fix). You may not notice how awesome this is, so let me explain how we work and how we have been doing some things until now.

All (well, most) of our development is test driven. We think of a feature, we write a test and we run it. When you are running a test for something as basic as bootstrap, message lookup, or a primitive, things get a bit different. These things are part of the kernel, and in order to test them we write an executable kernel file (bee.exe). This file has an entry point that performs basic initialization and then looks at the command line arguments to know what to do. From our host image, our test generates the file, executes it, and checks the return value. A return value of 1 is considered success, and 0 is an error. Of course, debugging at this stage is done with native debuggers. This worked so well for the first baby steps of the kernel that we wrote lots of tests to cover lookup, allocation, primitives and more. But at some point this process became too slow.
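The host-side half of that loop can be sketched in a few lines. This is only an illustration in Python, not our actual Smalltalk test code: `run_kernel_test` and `fake_kernel` are hypothetical names, and the "kernel" here is a stand-in child process that simply exits with a given code, since the real bee.exe command line is not described in this post.

```python
import subprocess
import sys

SUCCESS = 1  # the convention from the post: 1 means the test passed
FAILURE = 0  # and 0 means it failed


def run_kernel_test(command, timeout=60):
    """Run an executable and report whether it signalled success.

    `command` is the executable path followed by its arguments.
    """
    completed = subprocess.run(command, timeout=timeout)
    return completed.returncode == SUCCESS


def fake_kernel(exit_code):
    # Stand-in for a generated kernel file (assumption): a child
    # process that does nothing but exit with the requested code.
    return [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]


if __name__ == "__main__":
    print(run_kernel_test(fake_kernel(SUCCESS)))
    print(run_kernel_test(fake_kernel(FAILURE)))
```

Note that this convention is the opposite of the usual one for processes, where 0 means success, so a harness like this has to compare against 1 explicitly.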

Until now, as libraries didn't work, to make a test run (after writing it) we had to add all of its code and objects to the kernel and output a new kernel file. This worked fine, but generating a full kernel can take more than a minute on my old machine, so it slowed test execution a bit and development a lot. To solve this slow-test problem, we manually generated a minimal kernel instead. That is, a reduced version of the (already small) Smalltalk kernel that included only the parts that were actually used in tests. This reduced execution time, but in turn it was a painful, slow and error-prone development task. As the kernel rarely changes, the correct solution is to build it once, and then just build small libraries from projects. This is not only very fast, but it also doesn't need a closure to be generated manually, so common errors like forgetting to add some method disappear. We also plan to have modularity in Bee via loadable libraries, so libraries are on the roadmap, the sooner the better!
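To give an idea of what "generating a closure" means here, the following Python sketch computes the set of methods reachable from a test by following message sends. It is a toy model, not how Bee does it: the `sends` table and the method names in it are made up for illustration, and a real closure also has to include classes and other objects.

```python
def method_closure(roots, sends):
    """Return every method reachable from `roots` by following `sends`.

    `sends` maps a method name to the methods it may invoke.
    """
    closure = set()
    pending = list(roots)
    while pending:
        method = pending.pop()
        if method in closure:
            continue  # already visited, avoids looping on cycles
        closure.add(method)
        pending.extend(sends.get(method, ()))
    return closure


# Hypothetical call graph, only for illustration.
sends = {
    "LookupTest>>testSend": ["Object>>foo", "Behavior>>lookup:"],
    "Behavior>>lookup:": ["MethodDictionary>>at:"],
    "Object>>foo": [],
    "MethodDictionary>>at:": [],
    "String>>asUppercase": [],
}

minimal_kernel = method_closure(["LookupTest>>testSend"], sends)
```

Doing this by hand is where methods got forgotten: miss one entry and the minimal kernel fails at run time. With a full kernel built once, nothing has to be left out.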

But for this to work, there are some subtleties that need to be resolved. Libraries were made to support loading objects, not code. That doesn't mean you couldn't save a compiled method, but it meant that before saving it you had to throw away its native code. This is because native code was jitted when needed and wasn't actually visible to the image. In our new scheme, native code gets reified as an object that has code (a byteArray) and an array of references to other objects (selector, compiledMethod, lookup, etc.), among other things. This native code object is written into the library just like all other objects.
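As a rough picture of that reified object, here is a Python sketch. The class and field names are assumptions chosen for illustration (the real objects are Smalltalk objects with more state than this), and the machine code bytes are just a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class NativeCode:
    # Raw machine code. To a library writer this is an opaque byte array.
    code: bytes
    # Ordinary object slots: the objects the machine code refers to.
    references: List[Any] = field(default_factory=list)


@dataclass
class CompiledMethod:
    selector: str
    native_code: Optional[NativeCode] = None


def attach_native_code(method, machine_code):
    """Reify jitted code as an object reachable from the method."""
    method.native_code = NativeCode(
        code=machine_code,
        references=[method.selector, method],
    )
    return method.native_code


method = CompiledMethod("printOn:")
attach_native_code(method, b"\x90\xc3")  # placeholder bytes, not real jitted code
```

Because the native code is now reachable from the method as a plain object, the library writer can save it without any special case.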

When the library is loaded, all references from the native code byteArray are broken, since the byteArray is just that: a byte array in which the loader doesn't see any slots to fix. So before writing the library we mark all the nativeCodes as "not fresh" (they have an instance variable for that). Then, at lookup time, we check that the native code is fresh before executing it, and refresh it if necessary. To refresh, we use the references array, which is made of normal objects, so its references get updated correctly by the loader. We think this check will have little negative impact on performance, as the inline cache avoids lookup completely, and on the other hand, in the future we'll be able to benefit from not having to re-jit after each GC.
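The mechanism can be simulated in Python as follows. This is a sketch under several assumptions that are not in the post: object references are modelled as plain integer addresses, each one is embedded in the code as an 8-byte little-endian word, and the native code keeps a hypothetical `offsets` list saying where each reference lives inside the code. All function names are made up too.

```python
import struct
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NativeCode:
    code: bytearray        # machine code with addresses embedded in it
    references: List[int]  # the same addresses, held in ordinary slots
    offsets: List[int]     # assumption: position of each address in `code`
    fresh: bool = True

    def refresh(self):
        # Re-embed the (already updated) references into the code bytes.
        for address, offset in zip(self.references, self.offsets):
            struct.pack_into("<Q", self.code, offset, address)
        self.fresh = True


def mark_for_writing(native_code):
    # Before writing the library, every native code is marked "not fresh".
    native_code.fresh = False


def load(native_code, relocation: Dict[int, int]):
    # The loader fixes ordinary slots only; the code bytes stay untouched.
    native_code.references = [relocation[a] for a in native_code.references]


def code_for_execution(native_code):
    # The check done at lookup time: refresh stale code before running it.
    if not native_code.fresh:
        native_code.refresh()
    return bytes(native_code.code)
```

After `load`, the bytes still hold the old addresses until `code_for_execution` is called once. From then on the flag is set again and the check costs a single comparison.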

All this is going to let us build a very small Bee Kernel Tests project, which will get written to disk super fast and then get loaded at startup for test execution. That makes the development process a lot faster, and it is now possible because we got library support working.

