Platform and SoM Knowledge Pool

What AI models can the EP-200Q support?

Written by Satoru Kumashiro | Nov 24, 2025 5:59:59 PM

AI inference on the EP-200Q

The EP-200Q can run AI inference on the CPU, the GPU, or the NPU. Here is a brief summary of the pros and cons of each processor. In general, we recommend the NPU, since it can be dedicated to AI inference, whereas the CPU and GPU share their resources with other tasks.

              CPU                         GPU                         NPU
Model format  FP32, FP16, Int16, Int8     FP32, FP16, Int16, Int8     Int16, Int8
Runtime       SNPE, QNN, TensorFlow Lite  SNPE, QNN, TensorFlow Lite  SNPE, QNN, TensorFlow Lite
Accuracy      High with FP32              High with FP32              Lower than FP32
Latency       Slow                        Slow                        Fast
Memory usage  Same for all three: number of AI model parameters × 4 bytes (FP32), × 2 bytes (FP16/Int16), or × 1 byte (Int8)
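To make the memory-usage row concrete, here is a minimal sketch of the estimate; the 25-million-parameter model size is only an illustrative assumption.

```python
# Rough weight-memory estimate following the "Memory usage" rule in the table:
# bytes per parameter depend on the model's numeric format.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "Int16": 2, "Int8": 1}

def weight_memory_mb(num_params: int, fmt: str) -> float:
    """Approximate weight memory in MB for a given parameter format."""
    return num_params * BYTES_PER_PARAM[fmt] / (1024 * 1024)

num_params = 25_000_000  # hypothetical 25-million-parameter model
for fmt in ("FP32", "FP16", "Int8"):
    print(f"{fmt}: ~{weight_memory_mb(num_params, fmt):.0f} MB")
# Prints roughly: FP32 ~95 MB, FP16 ~48 MB, Int8 ~24 MB
```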

 
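If you use the TensorFlow Lite runtime, offloading inference to the NPU is typically done through a delegate. The sketch below assumes a Qualcomm QNN TensorFlow Lite delegate library named libQnnTFLiteDelegate.so and a placeholder model path; the exact delegate name, options, and setup depend on the SDK shipped with the EP-200Q, so treat this only as an outline.

```python
import numpy as np
import tensorflow as tf

# Minimal sketch: run an Int8 TFLite model through a Qualcomm delegate so the
# work is offloaded to the NPU instead of the CPU.
# NOTE: the delegate library name and the empty option set are assumptions;
# check the EP-200Q SDK documentation for the exact delegate configuration.
npu_delegate = tf.lite.experimental.load_delegate("libQnnTFLiteDelegate.so")

interpreter = tf.lite.Interpreter(
    model_path="model_int8.tflite",        # placeholder path
    experimental_delegates=[npu_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input just to exercise the pipeline; replace with real data.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
print(result.shape)
```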

What if the available AI models are in a different framework?

We provide a Dockerfile that builds a development environment for converting AI models from ONNX, TensorFlow, and PyTorch into one of the usable runtimes (SNPE, QNN, or TensorFlow Lite).
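As one example of what such a conversion step can look like inside that environment, the sketch below quantizes a TensorFlow SavedModel to an Int8 TensorFlow Lite model, which matches the NPU's Int8 requirement from the table above. The SavedModel path, input shape, and calibration data are placeholders.

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a few calibration samples so the converter can pick quantization
    # ranges; replace the random data with real preprocessed inputs.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```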

This Docker-based environment is one workflow for converting ONNX, TensorFlow, and PyTorch models and developing AI applications. Another option is to use Qualcomm AI Hub, which requires the end user to obtain a Qualcomm ID.

Please note that there are some constraints when converting ONNX models.
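The exact constraints depend on the converter and SDK version, but a common first step is to validate the ONNX model and check which opset and operators it uses before attempting conversion; a minimal sketch:

```python
import onnx

# Load the ONNX model and confirm it is structurally valid.
model = onnx.load("model.onnx")  # placeholder path
onnx.checker.check_model(model)

# Converters usually support only certain opset versions and operators,
# so it helps to inspect them up front.
print("Opsets:", [(imp.domain or "ai.onnx", imp.version) for imp in model.opset_import])
print("Operators used:", sorted({node.op_type for node in model.graph.node}))
```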


If you're interested in learning more about the development environment for the EP-200Q, please contact us.
