
Silex Knowledge Pool

What kind of device is suitable for your on-device AI inference?

Edge AI device deployment scenarios

The term "Edge AI device" can refer to different types of products. The best system architecture depends on many factors, including required accuracy, AI inference latency, network availability, per-device power consumption, device cost, and so on.

  • AI inference on an edge device
    • Remote cameras, sensors, individual robots, handheld devices, etc.
  • AI inference on a controller managing multiple devices
    • Centralized controller unit for automation system, multi-input video recorder, etc.
  • AI inference on an on-premise edge computer
    • e.g. on-premise servers

1. Edge devices


Performing AI inference on the edge device provides fast response times and no dependency on network connectivity or reliability. Because it limits the amount of data sent over the network (when the devices are connected at all), the impact on overall network performance is minimal. Hence, these devices are less likely to affect other processes that require data exchange over the network.

On the other hand, AI-capable SoCs or MCUs consume more power than those that are not capable of AI inference. In addition, the cost of such controllers and the entire system can be higher than that of non-AI-capable devices.

Therefore, this solution is suitable for devices that:

  • must always operate as intended,
  • must not affect the operation of other networked devices,
  • require real-time response, and
  • can afford additional cost for improved performance.

An SoC with an integrated NPU or AI accelerator is the most likely option for enabling such a device. To manage AI model deployment, or to trigger alerts on events detected by AI, these devices tend to include networking capabilities. Reliable, industrial-grade connectivity is essential to ensure robust system operation.
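As a rough sketch of the data flow such a device enables, the hypothetical Python snippet below runs inference locally and emits only compact alert records, so raw frames never leave the device. The model call is a stub; a real product would invoke its NPU vendor's runtime here, and all names and thresholds are illustrative assumptions.

```python
# Hypothetical stand-in for an NPU-accelerated model; a real device would
# call its vendor SDK (e.g. a TFLite/ONNX runtime delegate) at this point.
def run_inference(frame):
    """Return a detection confidence in [0, 1] for one camera frame."""
    return frame["motion_score"]  # stub: treat the motion score as confidence

def process_frames(frames, alert_threshold=0.8):
    """Run inference on-device; queue a small alert event only when needed.

    Raw frames stay on the device -- only these compact alert records would
    ever travel over the network, keeping bandwidth usage minimal.
    """
    alerts = []
    for frame in frames:
        confidence = run_inference(frame)
        if confidence >= alert_threshold:
            alerts.append({"frame_id": frame["id"], "confidence": confidence})
    return alerts

frames = [{"id": i, "motion_score": s}
          for i, s in enumerate([0.1, 0.95, 0.3, 0.85])]
print(process_frames(frames))  # only frames 1 and 3 trigger alerts
```

The point of the sketch is the shape of the loop, not the model: inference happens where the data is produced, and the network carries only small, event-driven messages.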

Example use cases include:

  • Drones
  • Material handling robots
  • Surveillance cameras
  • Defect detection cameras
  • Handheld devices with cameras (e.g., endoscopes)
  • AGVs (automated guided vehicles)
  • AMRs (autonomous mobile robots)
  • Patient monitoring

2. Centralized controllers


An example of such a device is a controller used as part of an automation system or a predictive maintenance system. End devices such as tiny cameras, sensors, and motor controllers constantly interact with the AI controller over a wired communication protocol or a wireless network. These end devices are typically small, low-power, and cost-effective, which limits their ability to support AI capabilities on their own. This is especially true when there are many end devices, each performing only lightweight tasks. In such cases, data from the end devices, and the instructions sent back to them, are better managed by a centralized controller. This controller may resemble an industrial computer or even a desktop computer.

Selection of the SoC depends heavily on the system requirements, so it is hard to name a single best architecture for such a device. Key factors include:

  • the amount of data the controller has to handle,
  • the number of devices to monitor or control,
  • the frequency of data exchange with each connected device, and
  • the latency allowed when responding to inference results for each device.

The controller often exchanges data with each connected device over either a wired or wireless network. This connectivity must be reliable and robust, and redundancy should be considered to prevent system malfunctions caused by network disconnections. In addition to connection reliability, performance, particularly latency and throughput, is crucial for supporting real-time automation systems.
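The polling-with-redundancy pattern described above can be sketched as follows. This is a minimal, hypothetical simulation: the links and device readings are stubs, and a real controller would use its fieldbus or Wi-Fi stack instead, but the failover logic is the part of interest.

```python
# Hypothetical sketch: a centralized controller polls many end devices and
# falls back to a backup link when the primary one fails. Links and readings
# are simulated for illustration.
class Link:
    def __init__(self, name, up=True):
        self.name, self.up = name, up

    def read(self, device_id):
        if not self.up:
            raise ConnectionError(f"{self.name} link down")
        return {"device": device_id, "value": device_id * 10}  # stub reading

def poll_devices(device_ids, primary, backup):
    """Collect one reading per device, preferring the primary link."""
    readings, used_links = [], []
    for dev in device_ids:
        try:
            readings.append(primary.read(dev))
            used_links.append(primary.name)
        except ConnectionError:
            readings.append(backup.read(dev))  # redundancy path
            used_links.append(backup.name)
    return readings, used_links

primary = Link("wired", up=False)  # simulate a broken primary link
backup = Link("wireless")
readings, used = poll_devices([1, 2, 3], primary, backup)
print(used)  # every reading arrived over the backup link
```

In a production controller the same structure applies, but the failover decision would also feed a health monitor so operators learn that the system is running on its backup path.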

Example use cases:

  • Industrial computer for process automation
  • AI-enabled PLC
  • AOI (Automatic Optical Inspection) system
  • IoT gateway
  • Surgeon-assist and guided surgery systems

3. On-premise edge computers

When the data used for AI inference is large, the AI model has many parameters, high accuracy is required, data privacy is a concern, and latency is not a priority, an on-premise edge computer, such as a server, is the best choice. Server architecture is generally scalable and can host discrete GPUs, allowing the server to run heavy AI inference tasks. For SMEs, a desktop computer can substitute for a server; even then, a combination of a high-end CPU and a graphics card is a viable option.
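One reason a server handles heavy inference efficiently is request batching: when latency is not the top priority, many inputs can be grouped into a single model call to keep the GPU busy. The hypothetical sketch below simulates this with a stubbed model; all names and sizes are illustrative assumptions.

```python
# Hypothetical sketch (names and sizes illustrative): an on-premise server
# groups incoming requests into batches so one heavy model call serves many
# inputs, trading a little latency for much higher throughput.
calls = {"n": 0}  # count how many times the (simulated) GPU model runs

def model_forward(batch):
    """Stand-in for a heavy GPU model call; returns one score per input."""
    calls["n"] += 1
    return [0.5 for _ in batch]  # dummy per-input result

def batched_inference(requests, batch_size=4):
    """Run the model on fixed-size batches instead of one call per request."""
    results = []
    for start in range(0, len(requests), batch_size):
        results.extend(model_forward(requests[start:start + batch_size]))
    return results

requests = ["scan-%d" % i for i in range(10)]
out = batched_inference(requests, batch_size=4)
print(len(out), calls["n"])  # 10 results from only 3 model calls
```

On real server hardware the batch size is tuned to the GPU's memory and the application's latency budget; the structure of the loop stays the same.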

Example use cases:

  • AI assisted medical imaging diagnostics
  • Genome analysis
  • Early drug discovery
  • Generative AI or agentic AI on an on-premise server for enterprise workflow assistance