BayLibre continued contributing to the Linux community in Linux kernel 5.3, released on Sunday, September 15th, 2019. An excellent summary of this release can be found at KernelNewbies.

If you check out LWN.net’s 5.3 Kernel development cycle statistics article, you’ll see that BayLibre made it onto the top 20 contributors list, showing that we were one of the most active companies (when measured by changes) this development cycle.

Here is a summary of our contributions, organized by SoC family and a summary graph of contributions by developer.

Amlogic SoCs

In this release we contributed heavily to mainline support for Amlogic Meson SoCs, with lots of work from Neil Armstrong, Jerome Brunet, and Maxime Jourdan on the G12A (S905X2/S905D2) and G12B SoCs, including:

Initial support for the new G12B SoC family
Enabling the IR controller on the SEI510, u200, and X96 Max boards
Fixups for the AXG TDM formatter driver
Adding support for dynamic OTG switching with the ID change interrupt
Enabling the sound card on the Hardkernel Odroid-N2
Increasing the Bluetooth bus speed to 2 Mbaud
Enabling the WiFi SDIO module
Adding gigabit ethernet support for the X96 Max board
Enabling the hwrng module for the next generation SM1 SoC family
Enabling SD and eMMC on the g12a u200 board
A V4L2 m2m video decoder driver
Support for the XBGR8888 and ABGR8888 formats to the graphics controller

GPIO subsystem

Bartosz Golaszewski contributed changes for the GPIO subsystem including:

Fixes for warnings when the gpiolib is disabled
A fix for a use-after-free bug
Various other cleanups and fixes

RISC-V

With the continued interest and adoption of RISC-V, we wanted an easy way to build an upstream kernel for the SiFive Unleashed board. Loys Ollivier submitted a patch that enables support in the default RISC-V kernel config.

DaVinci SoC

We’ve talked about updating the existing TI DaVinci SoC timer driver in the past, and this release Bartosz Golaszewski implemented a much simpler — and more modern — version with clockevents and clocksource support. He also enabled cpufreq support.

Misc

Beyond the above, we’ve also contributed patches to various subsystems and drivers.

Jerome Brunet contributed miscellaneous fixes and cleanups for the ASoC subsystem
Fabien Parent added AUDSYS clock support for MediaTek’s MT8516 SoC
Neil Armstrong enabled the Lima driver (ARM Mali 400/450 GPU) for arm64 and ARMv7 boards because it will be useful for KernelCI boot and runtime testing.
Bartosz Golaszewski contributed cleanup patches for the at24 EEPROM driver and a new selector stepping option for voltage regulators.

Conventional wisdom says you should normally apply small microcontrollers to dedicated applications with constrained resources. 8-bit microcontrollers with a few kilobytes of memory are still plentiful today. 32-bit microcontrollers with a couple of dozen kilobytes of memory are also very popular. In the latter case, it is typical to rely on a small RTOS to provide basic software interfaces and services.

The Zephyr Project provides such an RTOS. It supports many ARM-based microcontrollers, as well as other architectures including ARC, Xtensa, RISC-V (32-bit) and x86 (32-bit).

Yet some people are designing products with computing needs that are simple enough to be fulfilled by a small RTOS like Zephyr, but with memory addressing needs that cannot be described by kilobytes or megabytes, but that actually require gigabytes! So it was quite a surprise when BayLibre was asked to port Zephyr to the 64-bit RISC-V architecture.

Where to start

The 64-bit port required a lot of cleanups. Initially, the actual RISCV64 support was the least of our concerns. Zephyr supports a virtual “board” configuration that fakes basic hardware on one side and interfaces with a POSIX environment on the other, which allows a Zephyr application to be compiled into a standard Linux process. This has enormous benefits, such as the ability to use native Linux development tools. For example, you can use gdb to look at core dumps without fiddling with a remote debugging setup or emulators such as QEMU.

Until this point, this “POSIX” architecture only created 32-bit executables. We started by only testing the generic Zephyr code in 64-bit mode. It was only a matter of flipping some compiler arguments to attempt a 64-bit build. But unsurprisingly, it failed.

The 32-bit legacy

Since its inception, the Zephyr RTOS targeted 32-bit architectures. The assumption that everything can be represented by an int32_t variable was everywhere. Code patterns like the following were ubiquitous:

static inline void mbox_async_free(struct k_mbox_async *async)
{
        k_stack_push(&async_msg_free, (u32_t)async);
}

Here the async pointer gets truncated on a 64-bit build. Fortunately, the compiler does flag those occurrences:

In function ‘mbox_async_free’:
warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
k_stack_push(&async_msg_free, (u32_t)async);
^

Therefore the actual work started with a simple task: converting all u32_t variables and parameters that may carry pointers into uintptr_t. After several days of work, the Hello_world demo application could finally be built successfully. Yay!
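The conversion can be sketched as follows. This is only an illustration, not Zephyr code: `word_stack`, `stack_push`, `stack_pop` and `roundtrip_pointer` are hypothetical names standing in for a kernel stack of machine words. The point is that a slot typed `uintptr_t` keeps every bit of a pointer on both 32-bit and 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel stack of machine words. */
struct word_stack {
        uintptr_t slots[8];
        size_t count;
};

static void stack_push(struct word_stack *s, uintptr_t value)
{
        assert(s->count < sizeof(s->slots) / sizeof(s->slots[0]));
        s->slots[s->count++] = value;
}

static uintptr_t stack_pop(struct word_stack *s)
{
        assert(s->count > 0);
        return s->slots[--s->count];
}

/* Round-trips a pointer through the stack without truncating it. */
static void *roundtrip_pointer(void *p)
{
        struct word_stack s = { .count = 0 };

        stack_push(&s, (uintptr_t)p);   /* previously: (u32_t)p */
        return (void *)stack_pop(&s);
}
```

With a 32-bit slot type, the same round trip would return a different address whenever the original pointer had any of its upper 32 bits set.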

But attempting to execute it resulted in a segmentation fault. The investigation phase began.

Chasing bugs

While the compiler could identify bad u32_t usage when a cast to or from a pointer was involved, some other cases could be found only by manual code inspection. Still, Zephyr is a significant body of code to review and catching all issues, especially the subtle ones, couldn’t happen without some code execution tracing in gdb.

A much more complicated issue involved linked list pointers that ended up referring to non-existent list nodes for no obvious reason, and the bug only occurred after another item was removed from the list. This issue was only noticeable with a subsequent list search that followed the rogue pointer into Lalaland. And it didn’t trigger every time.

The header file for list operations starts with this:

#ifdef __LP64__
typedef u64_t unative_t;
#else
typedef u32_t unative_t;
#endif

So one would quickly presume that the code is already 64-bit ready. From a quick glance, it does use unative_t everywhere. What is easily missed is this:

#define SYS_SFLIST_FLAGS_MASK 0x3U

static inline sys_sfnode_t *z_sfnode_next_peek(sys_sfnode_t *node)
{
        return (sys_sfnode_t *)(node->next_and_flags & ~SYS_SFLIST_FLAGS_MASK);
}

Here we return the next pointer after masking out the bottom 2 flag bits. But the compiler treats 0x3U as an unsigned int, and therefore a 32-bit value, so ~0x3U equals 0xFFFFFFFC. Because node->next_and_flags is a u64_t, the unsigned 0xFFFFFFFC is zero-extended to 0x00000000FFFFFFFC, effectively truncating the returned pointer to its bottom 32 bits. So everything worked when the next node in the list was allocated in heap memory, which is typically below the 4GB mark, but not for nodes allocated on the stack, which is typically located towards the top of the address space on Linux.

The fix? Turning 0x3U into 0x3UL. The addition of that single character required many hours of debugging, and this is only one example. Other equally difficult bugs were also found.
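The effect of the mask width can be reproduced in isolation. The sketch below is not the Zephyr header: `next_buggy`, `next_fixed` and the two mask macros are illustrative names, and the mask is widened with an explicit `uint64_t` cast rather than the `UL` suffix so that the example behaves the same on hosts where `long` is only 32 bits wide. It assumes a 32-bit `unsigned int`, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

#define FLAGS_MASK_32 0x3U              /* unsigned int: 32 bits wide */
#define FLAGS_MASK_64 ((uint64_t)0x3)   /* as wide as the stored word */

/* Buggy: ~0x3U is 0xFFFFFFFC, zero-extended to 64 bits before the AND. */
static uint64_t next_buggy(uint64_t next_and_flags)
{
        return next_and_flags & ~FLAGS_MASK_32;
}

/* Fixed: the mask is inverted at 64 bits, so the upper half survives. */
static uint64_t next_fixed(uint64_t next_and_flags)
{
        return next_and_flags & ~FLAGS_MASK_64;
}
```

For a stack-like value such as 0x00007FFFDEADBEE1, `next_buggy()` yields 0x00000000DEADBEE0 while `next_fixed()` yields 0x00007FFFDEADBEE0. For any value below 4GB the two functions agree, which is why the bug stayed hidden for heap-allocated nodes.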

The unsuspecting C library

One major change with most 64-bit targets is the width of pointers, but another issue is the change in width of long integer variables. This means that the printf() family of functions has to behave differently when the “l” conversion modifier is provided, as in “%ld”. On a 32-bit only target, all the printf() modifiers can be ignored as they all refer to a 32-bit integer (except for “%lld” but that isn’t supported by Zephyr). For 64-bit, this shortcut can no longer be used.
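To illustrate why the modifier matters, here is a rough sketch of how a formatter has to fetch its arguments. It is not Zephyr's printf() implementation: `fetch_signed` and `sum_args` are hypothetical helpers, and the one-character "kinds" string ('d' for int, 'l' for long) is a stand-in for real format parsing.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: fetch the next signed integer argument for a
 * "%d" or "%ld" conversion. Where int and long are both 32 bits,
 * ignoring the "l" happens to work; where long is 64 bits, the
 * argument must be read at its full width.
 */
static long fetch_signed(va_list *ap, bool has_l_modifier)
{
        if (has_l_modifier) {
                return va_arg(*ap, long);
        }
        return va_arg(*ap, int);
}

/* Sums arguments described by a string of 'd' (int) and 'l' (long). */
static long sum_args(const char *kinds, ...)
{
        va_list ap;
        long total = 0;

        assert(kinds != NULL);
        va_start(ap, kinds);
        for (; *kinds != '\0'; kinds++) {
                total += fetch_signed(&ap, *kinds == 'l');
        }
        va_end(ap);
        return total;
}
```

Reading a long argument with `va_arg(ap, int)` is undefined behavior in C, and on a 64-bit target it typically loses the upper half of the value, so the modifier has to select the type passed to `va_arg()`.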

Alignment considerations are different too. For example, memory allocators must return pointers that are naturally aligned to 64-bit boundaries on 64-bit targets, which has implications for the actual allocator design. The memcpy() implementation can exploit larger memory words to optimize data transfer, but a larger alignment is necessary. Structures and unions may need adjustments to remain space efficient in the presence of wider pointers and longs.
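As a small illustration of the alignment point, an allocator might round addresses up with a helper like the one below. This is a generic sketch rather than Zephyr's allocator code; `align_up` and `native_align` are hypothetical names. Note that the mask is built at `uintptr_t` width, which avoids the same kind of narrow-mask truncation discussed earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
        assert(align != 0 && (align & (align - 1)) == 0);
        return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/* Pointer-sized alignment: 4 bytes on ILP32 targets, 8 on LP64. */
static size_t native_align(void)
{
        return sizeof(void *);
}
```

For example, `align_up(13, 8)` gives 16, while an address that is already aligned, such as 16, is returned unchanged.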

Test, test and test

One great thing about Zephyr is its extensive test suite. Once all the above was dealt with, it was time to find out if the test suite was happy. And of course it wasn’t. In fact, the majority of the tests failed. At least the Hello_world demo application worked at that point.

Writing good tests is difficult. The goal is to exercise code paths that ought to work, but it is even better when tests try to simulate normal failure conditions to make sure the core code returns with proper error codes. That often requires some programming trickery (read: type casting) within test code that is less portable than regular application code. This means that many tests had to be fixed to be compatible with a 64-bit build. And when core code bugs only affecting 64-bit builds were found, fixing them typically improved results in large portions of the tests all at once.

OK, but where does RV64 fit in this story?

We wrote the RV64 support at the very end of this project. In fact, it represented less than 10% of the whole development effort. Once Zephyr reached 64-bit maturity, it was quite easy to abstract register save/restore and pointer accesses in the assembly code to support RV64 within the existing RV32 code thanks to RISC-V’s highly symmetric architecture. Testing was also easy with QEMU since it can be instructed to use either an RV32 or an RV64 core with the same machine model.

Taking full advantage of 64-bit RISC-V cores on Zephyr may require additional work depending on the actual target where it would be deployed. For example, Zephyr doesn’t support hardware floating point context switching or SMP with either 32-bit or 64-bit RISC-V flavors yet.

But the groundwork is now done and merged into the mainline Zephyr project repository. Our RV64 port makes Zephyr RTOS 2.0.0 a milestone release — it’s the first Zephyr version to support both 32-bit and 64-bit architectures.
