AI & ML on Embedded Hardware: NPU Firmware, Edge Inference & Framework Optimization

From custom AI co-processor firmware to on-device TensorFlow Lite and PyTorch inference — we handle the full software stack for AI silicon.

Developed Zephyr support for Meta’s MTIA line of AI co-processors

Optimizing RISC-V ATen operators in PyTorch on behalf of the RISE Project

Contracted by ARM to train a vision model for vending machines on the OpenMV Cam H7 platform with TensorFlow Lite

BayLibre operates at the intersection of AI and the hardware it runs on. We bring up operating systems and firmware for custom AI accelerators designed to serve billions. We optimize deep learning frameworks at the operator level for emerging instruction set architectures. And we deploy trained vision models onto resource-constrained edge devices where every byte and cycle counts. From custom silicon enablement to on-device inference, our engineers understand the full vertical — the math, the compilers, the kernels, and the boards. When AI needs to leave the data center and land on real hardware, BayLibre makes it work.
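To make that last step concrete, here is a minimal sketch of on-device inference with the TensorFlow Lite interpreter (the Python tflite_runtime package; MCU-class targets use the equivalent C++ API from TensorFlow Lite for Microcontrollers). The model filename and the zero-filled input frame are illustrative placeholders, not a specific customer deployment.

    import numpy as np
    from tflite_runtime.interpreter import Interpreter  # pip install tflite-runtime

    # Load a compiled .tflite model; the path is illustrative.
    interpreter = Interpreter(model_path="vision_model_int8.tflite")
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]

    # A dummy frame standing in for camera input; real code would
    # capture and preprocess an image to the model's expected shape.
    frame = np.zeros(input_details["shape"], dtype=input_details["dtype"])

    interpreter.set_tensor(input_details["index"], frame)
    interpreter.invoke()
    scores = interpreter.get_tensor(output_details["index"])
    print("top class:", int(np.argmax(scores)))

The same handful of calls covers most edge targets; the hard engineering is in what sits underneath them, from the operator kernels to the accelerator delegate to the board support.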

Develop model deployment strategies
Train small models for deployment on edge compute hardware (see the quantization sketch below)
Optimize the full software pipeline for emerging hardware accelerators and GPU offload
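As an example of what the edge-deployment item above involves, here is a hedged sketch of post-training full-integer quantization, the standard way to shrink a trained vision model so it fits int8-only NPUs and microcontrollers. The SavedModel path, the 96x96 input shape, and the random calibration data are placeholders for a real training pipeline.

    import numpy as np
    import tensorflow as tf

    # Representative samples calibrate the int8 quantization ranges;
    # random data stands in here for a slice of the real training set.
    def representative_dataset():
        for _ in range(100):
            yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

    # "saved_model/" is a placeholder for a trained TensorFlow export.
    converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Force full-integer kernels so the model runs on int8-only hardware.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    with open("vision_model_int8.tflite", "wb") as f:
        f.write(tflite_model)

Full-integer conversion typically cuts model size by roughly 4x versus float32 and unlocks integer-only accelerators, at the cost of a calibration step and a small accuracy trade-off that has to be measured per model.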

Our Case Studies

Real-world examples of our expertise in action

Strengthening GCC for GPU Offloading in HPC Systems

Enabling Python Data & AI Ecosystem on RISC-V (RISE Initiative)

Scaling Embedded Linux & Android Platform Development for Point-of-Sale Devices on Qualcomm QCS5430 / QCS6490

Contact us!

Our expert team in the US or Europe will contact you within one business day to schedule a technical discovery call and kick things off.
