In March 2026 we attended Embedded World in Nuremberg — the world's leading trade fair for embedded systems, with over 36,000 visitors from nearly 90 countries. This year the dominant theme was unmistakable: embedded artificial intelligence. Not as a theoretical concept, but as production-ready technology integrated directly into the microcontrollers we use every day in our electronic design projects.
What we saw confirms a direction Nexilica has been following for some time: local inference is replacing the cloud in a growing number of industrial IoT applications. In this article we share what we observed at the fair and how these technologies translate into real opportunities for SMEs, manufacturers and system integrators.
Edge AI brings model inference directly onto the device: millisecond latency, offline operation, data that never leaves the device. With new NPU-equipped microcontrollers, embedded AI is no longer a compromise — it's a superior design choice.
What is Edge AI and why it changes the rules
Until a few years ago, integrating artificial intelligence into a product meant depending on the cloud: the device collected data, sent it to a remote server, and waited for the response. This approach works for some applications, but has structural limitations that make it unsuitable for many industrial contexts:
- Latency: the cloud round-trip introduces delays incompatible with real-time control (motors, production lines, robotics)
- Connectivity: many industrial environments cannot guarantee a stable, low-latency connection
- Privacy and security: sensitive production data or personal data should never leave the company perimeter
- Operating costs: every cloud call has a recurring cost that scales with the number of devices
Edge AI solves these problems by moving inference — the execution of the already-trained model — directly onto the embedded hardware. The device analyzes data locally, makes decisions autonomously, and communicates only relevant results to the cloud when needed.
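This "analyze locally, report only what matters" pattern can be sketched in a few lines. Everything here is illustrative: `sensor_read`, `infer` and `publish` are hypothetical callbacks standing in for the device's sensor driver, the on-device model runtime and the cloud client.

```python
def edge_loop(sensor_read, infer, publish, threshold=0.8):
    """One pass of an edge-AI loop: acquire data locally, run the
    already-trained model on-device, and contact the cloud only
    when there is a relevant result to report."""
    sample = sensor_read()
    label, confidence = infer(sample)  # local inference: no network round-trip
    if label != "normal" and confidence >= threshold:
        # only significant events leave the device
        publish({"label": label, "confidence": confidence})
        return True
    return False  # nothing worth transmitting this cycle
```

The key design point is that the network call sits behind the decision, not in front of it: latency and connectivity affect reporting, never the control decision itself.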
TinyML: neural networks on microcontrollers
TinyML is the subset of machine learning devoted to running models on extremely resource-constrained devices: microcontrollers with just a few KB of RAM, battery-powered, costing a few euros. The principle is simple: you don't need a server to tell whether a vibration is anomalous or whether a gesture has been performed.
The typical TinyML project workflow includes:
- Data collection: acquisition of representative datasets from the real operating environment
- Training: model training on workstation or cloud using frameworks like TensorFlow or PyTorch
- Optimization: quantization (float32 to int8), pruning and compression to reduce size and computational complexity
- On-device deployment: model conversion to optimized C code for the target MCU, integration into embedded firmware
- Validation: testing accuracy, latency and power consumption on the actual device using dedicated test benches
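The optimization step is usually what decides whether a model fits on an MCU at all. As a framework-free illustration of what float32-to-int8 quantization does, here is a minimal sketch of the affine scheme used by converters such as TensorFlow Lite; in a real project the converter derives these parameters from a calibration dataset rather than hand-rolled code like this.

```python
def quantize_int8(values):
    """Affine quantization of float32 values to int8.
    A (scale, zero_point) pair maps the observed float range
    onto the integer range [-128, 127]."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain 0.0
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for constant inputs
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.42, 0.0, 0.13, 0.9, -1.1]
q, s, z = quantize_int8(weights)
approx = dequantize(q, s, z)
# each reconstructed weight lands within one quantization step of the original
assert all(abs(a - w) <= s for a, w in zip(approx, weights))
```

The payoff is a 4x reduction in weight storage and the ability to use the MCU's integer (or NPU) arithmetic, at the cost of a bounded rounding error of at most one quantization step per weight.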
STM32N6: the game-changer we saw at Embedded World
Among the most impressive demonstrations we saw in Nuremberg, STMicroelectronics' STM32N6 deserves a chapter of its own. It is the first STM32 microcontroller with an integrated neural processing unit (NPU): the Neural-ART Accelerator.
The numbers speak for themselves:
- 600 GOPS (Giga Operations Per Second) inference throughput — 600x more than an STM32H7
- Cortex-M55 core at 800 MHz, the fastest in the STM32 family
- Up to 4.2 MB of on-chip SRAM — sufficient for vision, audio and time-series models
- Native support for camera interface up to 16 Mpixel, designed for computer vision applications
Real-world industrial use cases
Edge AI is not a technology looking for an application — the applications already exist and are entering production:
Predictive maintenance
Vibration, temperature and current sensors feed a local model that detects anomalies in the behavior of motors, pumps and compressors before a failure occurs. The advantage over traditional monitoring: the model learns the specific "healthy" pattern of that machine and flags subtle deviations that fixed thresholds miss.
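As a deliberately simplified stand-in for such a learned model, the sketch below fits per-machine statistics from a healthy calibration run and flags readings outside a k-sigma band. The `VibrationBaseline` class and its API are hypothetical; a production system would typically learn a richer signature (for example an autoencoder over spectral features), but the idea is the same: the threshold comes from that machine's own healthy behavior, not from a fixed datasheet value.

```python
import math

class VibrationBaseline:
    """Learns the 'healthy' vibration signature of one specific machine
    from a calibration run, then flags readings that deviate from it."""

    def __init__(self, k=3.0):
        self.k = k            # how many standard deviations count as anomalous
        self.mean = None
        self.std = None

    def fit(self, healthy_rms):
        """Fit mean/std from RMS vibration samples taken while healthy."""
        n = len(healthy_rms)
        self.mean = sum(healthy_rms) / n
        var = sum((x - self.mean) ** 2 for x in healthy_rms) / n
        self.std = math.sqrt(var) or 1e-9   # guard against a constant signal

    def is_anomalous(self, rms):
        """True if this reading falls outside the machine's learned band."""
        return abs(rms - self.mean) > self.k * self.std
```

A machine whose healthy RMS hovers around 1.0 with little spread would pass 1.0 and flag 3.0, even though a generic fixed threshold tuned for a noisier machine might let that deviation through.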
Visual quality control
Cameras integrated into the production line, connected to an MCU with NPU, classify defects in real time — scratches, missing components, cold solder joints — without sending images to external servers. Latency under 50 ms, production data privacy guaranteed.
Human-machine interfaces
Gesture recognition and voice commands executed locally on wearable devices or industrial control panels. Full offline operation, instant response, no cloud infrastructure needed.
Smart environmental sensors
IoT nodes that classify sounds, vibrations or environmental patterns directly in the field, transmitting only significant events. Data traffic drops by up to 99%, and battery life is extended many times over.
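The "transmit only significant events" logic can be as simple as suppressing the idle class and collapsing consecutive duplicates, so a continuous stream of classifications becomes a handful of event reports. The `filter_events` helper and the labels below are hypothetical, chosen only to show the shape of the filter.

```python
def filter_events(classified_frames, idle_label="background"):
    """Keep only frames whose on-device classification is a significant
    event, collapsing runs of the same label so each event is reported
    once instead of once per frame."""
    events, last = [], None
    for timestamp, label in classified_frames:
        if label != idle_label and label != last:
            events.append((timestamp, label))
        last = label
    return events
```

On a node classifying audio frames many times per second, almost every frame is `background`; after this filter, radio traffic scales with the number of events rather than with uptime, which is where the order-of-magnitude traffic reduction comes from.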
Cyber Resilience Act: embedded AI in the 2026 regulatory context
A cross-cutting theme at Embedded World 2026 was the European Cyber Resilience Act (CRA), whose first operational deadline — mandatory 24-hour vulnerability reporting — hits in September 2026. For those designing devices with embedded AI, this means:
- Secure boot and signed firmware updates are requirements, not options
- The AI model must be securely updatable (OTA with cryptographic validation)
- A Software Bill of Materials (SBOM) that includes ML framework dependencies is required
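As a toy illustration of the cryptographic gate on updates: the `verify_firmware` function below is hypothetical, and it uses HMAC-SHA256 from the Python standard library purely to show the accept/reject logic. A real CRA-oriented design would use an asymmetric signature (for example ECDSA) verified by an immutable secure-boot ROM, so the device never needs to hold a signing secret.

```python
import hashlib
import hmac

def verify_firmware(image: bytes, expected_tag: bytes, key: bytes) -> bool:
    """Reject an OTA image unless its authentication tag checks out.
    compare_digest is constant-time, avoiding timing side channels."""
    tag = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected_tag)
```

The essential property is that the update path fails closed: an image that does not verify is never handed to the bootloader, whether it was corrupted in transit or tampered with deliberately.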
The STM32N6 natively integrates secure boot, secure firmware update and SBOM generation mechanisms — a concrete advantage for designing CRA-compliant products. Our technology consulting service includes regulatory compliance support from the earliest project phases.
The Nexilica approach to embedded machine learning
At Nexilica, AI integration in embedded systems is not a standalone service — it's part of our end-to-end design approach. When a project requires local intelligence, we work on three levels simultaneously:
- Hardware: microcontroller selection (with or without NPU) based on performance, power and cost requirements; PCB design optimized for sensing (sensor placement, shielding, clean power supply)
- Firmware: inference runtime integration in embedded firmware, model lifecycle management, secure OTA update
- Model: collaboration with the client for data collection, model optimization for target hardware (quantization, pruning), performance validation on the actual device
The advantage of having a single partner managing hardware, firmware and model is that design decisions are coherent from the start. You don't design a PCB only to discover the microcontroller doesn't have enough SRAM for the model, or that the layout introduces noise on the sensor signal.
Edge AI is no longer a promise — it's a mature technology, ready for industrial production. With microcontrollers like the STM32N6, the cost/performance ratio is finally accessible even for medium-low volumes. If you're considering integrating artificial intelligence into your next embedded product, contact us: we can help you from platform selection through to series production.