BayLibre’s contributions to GCC 14

GCC 14 was released on May 7, 2024. It extends the support for C23 (including _BitInt), C++23 and C++26, adds the first Fortran 2023 features and the experimental gccrs Rust compiler, and also extends Ada, Go and Modula-2. GCC 14 further improves the static analyser and the support for AArch64, x86-64, LoongArch, RISC-V, … and includes a plethora of other improvements.

GCC 14 also extends the support for OpenMP and OpenACC, which permit thread-parallel execution and offloading of program parts to GPUs via directive-based language extensions and a run-time library. Both OpenMP and OpenACC are widely used in scientific programs, which run on anything from laptops to large supercomputers such as the pictured Frontier system (photo: OLCF at ORNL, which also runs that system). GCC supports offloading to Nvidia and AMD GPUs and is one of the main compilers on Frontier, where each node has compute GPUs.
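To illustrate what "directive-based" means here, the following is a minimal sketch of an OpenMP offload region in C. The function name saxpy and the build line in the comment are illustrative assumptions, not taken from the GCC sources; the build line also assumes a GCC installation configured with AMD GPU offloading support. Without -fopenmp the directive is ignored and the loop simply runs on the host.

```c
#include <stddef.h>

/* Assumed build line for an AMD GPU (adjust the -march value to your card):
 *   gcc -fopenmp -foffload=amdgcn-amdhsa -foffload-options=-march=gfx90a saxpy.c
 */

/* Computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device before the region starts;
     * map(tofrom:) copies y to the device and back to the host afterwards. */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

If no offload device is available at run time, the OpenMP run-time library falls back to executing the region on the host by default, so the same source works on a laptop and on a GPU node.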

BayLibre’s newly formed Compiler & Toolchain Services team contributed 72 commits by 7 developers to GCC 14, all since the team was formed in mid to late January. The sections below cover GCC 14’s OpenMP, OpenACC and GPU offloading highlights, as BayLibre’s GCC contributions were mostly in that area.

AMD GPU Support

GCC 14 added a low-latency memory allocator for OpenMP and improved performance, especially for the high-end Instinct MI100 (gfx908) and MI200 (gfx90a) series. Offloading is now also supported on the following consumer cards: AMD Radeon gfx90c (GCN5), gfx1030, gfx1036 (RDNA2), gfx1100 and gfx1103 (RDNA3).

The BayLibre developers (in particular, Andrew Stubbs and Thomas Schwinge) solved several issues to properly support the gfx10 and gfx11 cards and enabled the gfx1103, which can be found as an APU in laptops. Andrew Stubbs, as one of GCC’s GCN maintainers, additionally reviewed several patches, including the ones adding the gfx90c and gfx1036 support.

OpenMP

GCC 14 extended the coverage of the OpenMP 5.0, 5.1, and 5.2 specifications.

In particular, memory management was extended on all fronts: the allocate and allocators directives were implemented, the libnuma library is now supported, and low-latency allocators were added on GPUs. Imperfectly nested loops are now permitted, as is the present modifier in map clauses. Function pointers that point to a host function can now be invoked on the GPU, provided that the function is marked declare target with the indirect clause. On the OpenMP 5.2 side, OMP_TARGET_OFFLOAD=mandatory now follows the new semantics, Fortran’s pure procedures are now more widely permitted, and the semantic changes for firstprivate and the syntax changes for the destroy clause are now also honored. The OpenMP 6.0 preview TR12 added the ‘decl’ attribute to C++ and, as C23 now also supports attributes, attribute support to C, and GCC 14 already supports this!
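Two of these features can be sketched in a few lines of C. The function names (fill_scaled, square, negate, apply_on_device) are hypothetical and chosen only for illustration; the sketch assumes a compiler with OpenMP 5.1 support such as GCC 14 and degrades to plain serial code when OpenMP is disabled.

```c
#include <stddef.h>

/* Imperfectly nested loops: the statement between the two loops
 * ("intervening code") is allowed inside a collapse(2) nest. */
void fill_scaled(size_t rows, size_t cols, double *a)
{
    #pragma omp parallel for collapse(2)
    for (size_t i = 0; i < rows; i++) {
        double scale = 1.0 / (double)(i + 1);   /* intervening code */
        for (size_t j = 0; j < cols; j++)
            a[i * cols + j] = scale * (double)j;
    }
}

/* The indirect clause makes these functions callable on the device
 * through a function pointer that was taken on the host. */
#pragma omp begin declare target indirect
int square(int x) { return x * x; }
int negate(int x) { return -x; }
#pragma omp end declare target

/* Calls fn(x) inside a target region; fn holds a host function address. */
int apply_on_device(int (*fn)(int), int x)
{
    int result = 0;
    #pragma omp target map(to: fn, x) map(from: result)
    result = fn(x);
    return result;
}
```

For example, apply_on_device(square, 7) passes the host address of square into the target region, where the run-time translates it to the device version before the call.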

On the performance side, omp_target_memcpy_rect got a boost when copying noncontiguous memory to or from GPU devices. The documentation was also substantially improved, especially the coverage of library routines, memory management and environment variables.
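As a rough sketch of what omp_target_memcpy_rect does, the helper below copies a rectangular sub-block out of a larger row-major matrix. The helper name extract_block and its parameters are hypothetical. To stay runnable without a GPU, both device numbers are set to the host device; when the code is built without OpenMP, a plain row-by-row memcpy stands in for the library routine.

```c
#include <stddef.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Copies the rows x cols block starting at (row0, col0) of the row-major
 * matrix src (src_rows x src_cols) into the compact rows x cols matrix dst.
 * Returns 0 on success. */
int extract_block(double *dst, const double *src,
                  size_t src_rows, size_t src_cols,
                  size_t row0, size_t col0, size_t rows, size_t cols)
{
#ifdef _OPENMP
    const size_t volume[2]      = { rows, cols };         /* block extent */
    const size_t dst_offsets[2] = { 0, 0 };
    const size_t src_offsets[2] = { row0, col0 };
    const size_t dst_dims[2]    = { rows, cols };
    const size_t src_dims[2]    = { src_rows, src_cols };
    int host = omp_get_initial_device();   /* host-to-host copy in this sketch */
    return omp_target_memcpy_rect(dst, src, sizeof(double), 2, volume,
                                  dst_offsets, src_offsets,
                                  dst_dims, src_dims, host, host);
#else
    (void)src_rows;   /* only needed by the OpenMP variant */
    for (size_t r = 0; r < rows; r++)
        memcpy(dst + r * cols,
               src + (row0 + r) * src_cols + col0,
               cols * sizeof(double));
    return 0;
#endif
}
```

For a copy to or from a GPU, one side would instead be a pointer obtained from omp_target_alloc together with the matching device number; that noncontiguous host/device case is the one described above as having become faster.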

BayLibre developers (in particular Kwok Cheung Yeung and Tobias Burnus) added the indirect clause, contributed to the documentation, fixed a corner-case issue with firstprivate of C++ member variables and fixed a couple of bugs. Tobias Burnus, as one of GCC’s OpenMP maintainers, also reviewed some patches.

OpenACC

The OpenACC 2.7 support was extended. The self clause can now be used on compute constructs and the default clause on data constructs. The readonly modifier is now handled in the copyin clause and the cache directive. Additionally, the memory-handling API routines, which were previously only supported in C and C++, are now also available in Fortran (header and module file), as specified by OpenACC 3.2.
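The following sketch shows how the self clause and the readonly modifier might look in C. The function name scale_array and the force_host parameter are hypothetical; the sketch assumes a compiler with OpenACC 2.7 support for these clauses. Without -fopenacc the directive is ignored and the loop runs serially.

```c
#include <stddef.h>

/* Computes out[i] = factor * in[i] for i in [0, n). */
void scale_array(size_t n, float factor, const float *in, float *out,
                 int force_host)
{
    /* self(cond): when cond is true, the region runs on the local (host)
     * device instead of the accelerator.
     * copyin(readonly: ...): the region promises not to modify in[]. */
    #pragma acc parallel loop self(force_host) \
            copyin(readonly: in[0:n]) copyout(out[0:n])
    for (size_t i = 0; i < n; i++)
        out[i] = factor * in[i];
}
```

Passing a nonzero force_host is one way to compare host and accelerator results from the same source without recompiling.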

BayLibre developers contributed to this work. In particular, Chung-Lin Tang added support for the readonly modifier and adjusted the reference-counter handling for acc_map_data and acc_unmap_data to match OpenACC 2.7, while Tobias Burnus added the Fortran interface for the memory-handling routines. Thomas Schwinge did some housekeeping work and, as GCC’s main OpenACC maintainer, reviewed several patches.