In March 2026 we attended Embedded World in Nuremberg — the world's leading trade fair for embedded systems, with over 36,000 visitors from nearly 90 countries. This year the dominant theme was unmistakable: embedded artificial intelligence. Not as a theoretical concept, but as production-ready technology integrated directly into the microcontrollers we use every day in our electronic design projects.
What we saw confirms a direction Nexilica has been following for some time: local inference is replacing the cloud in a growing number of industrial IoT applications. In this article we share what we observed at the fair and how these technologies translate into real opportunities for SMEs, manufacturers and system integrators.
Edge AI brings model inference directly onto the device: millisecond latency, offline operation, data that never leaves the device. With new NPU-equipped microcontrollers, embedded AI is no longer a compromise — it's a superior design choice.
What is Edge AI and why it changes the rules
Until a few years ago, integrating artificial intelligence into a product meant depending on the cloud: the device collected data, sent it to a remote server, and waited for the response. This approach works for some applications, but has structural limitations that make it unsuitable for many industrial contexts:
- Latency: the cloud round-trip introduces delays incompatible with real-time control (motors, production lines, robotics)
- Connectivity: many industrial environments cannot guarantee a stable, low-latency connection
- Privacy and security: sensitive production data or personal data should never leave the company perimeter
- Operating costs: every cloud call has a recurring cost that scales with the number of devices
Edge AI solves these problems by moving inference — the execution of the already-trained model — directly onto the embedded hardware. The device analyzes data locally, makes decisions autonomously, and communicates only relevant results to the cloud when needed.
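This "analyze locally, report only what matters" pattern can be sketched in a few lines. Everything here is illustrative: `sensor_read`, `infer` and `publish` are hypothetical callbacks standing in for the device's sensor driver, the on-device model runtime and the cloud client.

```python
def edge_loop(sensor_read, infer, publish, threshold=0.8):
    """One pass of an edge-AI loop: acquire data locally, run the
    already-trained model on-device, and contact the cloud only
    when there is a relevant result to report."""
    sample = sensor_read()
    label, confidence = infer(sample)  # local inference: no network round-trip
    if label != "normal" and confidence >= threshold:
        # only significant events leave the device
        publish({"label": label, "confidence": confidence})
        return True
    return False  # nothing worth transmitting this cycle
```

The key design point is that the network call sits behind the decision, not in front of it: latency and connectivity affect reporting, never the control decision itself.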
TinyML: neural networks on microcontrollers
TinyML is the subset of machine learning devoted to running models on extremely resource-constrained devices: microcontrollers with just a few KB of RAM, battery-powered, costing a few euros. The principle is simple: you don't need a server to tell whether a vibration is anomalous or whether a gesture has been performed.
The typical TinyML project workflow includes:
- Data collection: acquisition of representative datasets from the real operating environment
- Training: model training on workstation or cloud using frameworks like TensorFlow or PyTorch
- Optimization: quantization (float32 to int8), pruning and compression to reduce size and computational complexity
- On-device deployment: model conversion to optimized C code for the target MCU, integration into embedded firmware
- Validation: testing accuracy, latency and power consumption on the actual device using dedicated test benches
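The optimization step is usually what decides whether a model fits on an MCU at all. As a framework-free illustration of what float32-to-int8 quantization does, here is a minimal sketch of the affine scheme used by converters such as TensorFlow Lite; in a real project the converter derives these parameters from a calibration dataset rather than hand-rolled code like this.

```python
def quantize_int8(values):
    """Affine quantization of float32 values to int8.
    A (scale, zero_point) pair maps the observed float range
    onto the integer range [-128, 127]."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain 0.0
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for constant inputs
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.42, 0.0, 0.13, 0.9, -1.1]
q, s, z = quantize_int8(weights)
approx = dequantize(q, s, z)
# each reconstructed weight lands within one quantization step of the original
assert all(abs(a - w) <= s for a, w in zip(approx, weights))
```

The payoff is a 4x reduction in weight storage and the ability to use the MCU's integer (or NPU) arithmetic, at the cost of a bounded rounding error of at most one quantization step per weight.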
STM32N6: the game-changer we saw at Embedded World
Among the most impressive demonstrations we saw in Nuremberg, STMicroelectronics' STM32N6 deserves a chapter of its own. It is the first STM32 microcontroller with an integrated neural processing unit (NPU): the Neural-ART Accelerator.
The numbers speak for themselves:
- 600 GOPS (Giga Operations Per Second) inference throughput — 600x more than an STM32H7
- Cortex-M55 core at 800 MHz, the fastest in the STM32 family
- Up to 4.2 MB of on-chip SRAM — sufficient for vision, audio and time-series models
- Native support for camera interface up to 16 Mpixel, designed for computer vision applications
Real-world industrial use cases
Edge AI is not a technology looking for an application — the applications already exist and are entering production:
Predictive maintenance
Vibration, temperature and current sensors feed a local model that detects anomalies in the behavior of motors, pumps and compressors before a failure occurs. The advantage over traditional monitoring: the model learns the specific "healthy" pattern of that machine and flags subtle deviations that fixed thresholds miss.
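As a deliberately simplified stand-in for such a learned model, the sketch below fits per-machine statistics from a healthy calibration run and flags readings outside a k-sigma band. The `VibrationBaseline` class and its API are hypothetical; a production system would typically learn a richer signature (for example an autoencoder over spectral features), but the idea is the same: the threshold comes from that machine's own healthy behavior, not from a fixed datasheet value.

```python
import math

class VibrationBaseline:
    """Learns the 'healthy' vibration signature of one specific machine
    from a calibration run, then flags readings that deviate from it."""

    def __init__(self, k=3.0):
        self.k = k            # how many standard deviations count as anomalous
        self.mean = None
        self.std = None

    def fit(self, healthy_rms):
        """Fit mean/std from RMS vibration samples taken while healthy."""
        n = len(healthy_rms)
        self.mean = sum(healthy_rms) / n
        var = sum((x - self.mean) ** 2 for x in healthy_rms) / n
        self.std = math.sqrt(var) or 1e-9   # guard against a constant signal

    def is_anomalous(self, rms):
        """True if this reading falls outside the machine's learned band."""
        return abs(rms - self.mean) > self.k * self.std
```

A machine whose healthy RMS hovers around 1.0 with little spread would pass 1.0 and flag 3.0, even though a generic fixed threshold tuned for a noisier machine might let that deviation through.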
Visual quality control
Cameras integrated into the production line, connected to an MCU with NPU, classify defects in real time — scratches, missing components, cold solder joints — without sending images to external servers. Latency under 50 ms, production data privacy guaranteed.
Human-machine interfaces
Gesture recognition and voice commands executed locally on wearable devices or industrial control panels. Full offline operation, instant response, no cloud infrastructure needed.
Smart environmental sensors
IoT nodes that classify sounds, vibrations or environmental patterns directly in the field, transmitting only significant events. Data traffic drops by up to 99%, and battery life is extended many times over.
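The "transmit only significant events" logic can be as simple as suppressing the idle class and collapsing consecutive duplicates, so a continuous stream of classifications becomes a handful of event reports. The `filter_events` helper and the labels below are hypothetical, chosen only to show the shape of the filter.

```python
def filter_events(classified_frames, idle_label="background"):
    """Keep only frames whose on-device classification is a significant
    event, collapsing runs of the same label so each event is reported
    once instead of once per frame."""
    events, last = [], None
    for timestamp, label in classified_frames:
        if label != idle_label and label != last:
            events.append((timestamp, label))
        last = label
    return events
```

On a node classifying audio frames many times per second, almost every frame is `background`; after this filter, radio traffic scales with the number of events rather than with uptime, which is where the order-of-magnitude traffic reduction comes from.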
Cyber Resilience Act: embedded AI in the 2026 regulatory context
A cross-cutting theme at Embedded World 2026 was the European Cyber Resilience Act (CRA), whose first operational deadline — mandatory 24-hour vulnerability reporting — hits in September 2026. For those designing devices with embedded AI, this means:
- Secure boot and signed firmware updates are requirements, not options
- The AI model must be securely updatable (OTA with cryptographic validation)
- A Software Bill of Materials (SBOM) that includes ML framework dependencies is required
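As a toy illustration of the cryptographic gate on updates: the `verify_firmware` function below is hypothetical, and it uses HMAC-SHA256 from the Python standard library purely to show the accept/reject logic. A real CRA-oriented design would use an asymmetric signature (for example ECDSA) verified by an immutable secure-boot ROM, so the device never needs to hold a signing secret.

```python
import hashlib
import hmac

def verify_firmware(image: bytes, expected_tag: bytes, key: bytes) -> bool:
    """Reject an OTA image unless its authentication tag checks out.
    compare_digest is constant-time, avoiding timing side channels."""
    tag = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected_tag)
```

The essential property is that the update path fails closed: an image that does not verify is never handed to the bootloader, whether it was corrupted in transit or tampered with deliberately.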
The STM32N6 natively integrates secure boot, secure firmware update and SBOM generation mechanisms — a concrete advantage for designing CRA-compliant products. Our technology consulting service includes regulatory compliance support from the earliest project phases.
The Nexilica approach to embedded machine learning
At Nexilica, AI integration in embedded systems is not a standalone service — it's part of our end-to-end design approach. When a project requires local intelligence, we work on three levels simultaneously:
- Hardware: microcontroller selection (with or without NPU) based on performance, power and cost requirements; PCB design optimized for sensing (sensor placement, shielding, clean power supply)
- Firmware: inference runtime integration in embedded firmware, model lifecycle management, secure OTA update
- Model: collaboration with the client for data collection, model optimization for target hardware (quantization, pruning), performance validation on the actual device
The advantage of having a single partner managing hardware, firmware and model is that design decisions are coherent from the start. You don't design a PCB only to discover the microcontroller doesn't have enough SRAM for the model, or that the layout introduces noise on the sensor signal.
Edge AI is no longer a promise — it's a mature technology, ready for industrial production. With microcontrollers like the STM32N6, the cost/performance ratio is finally accessible even for medium-low volumes. If you're considering integrating artificial intelligence into your next embedded product, contact us: we can help you from platform selection through to series production.